Learning Objectives
Following this assignment students should be able to:
- understand the basic plot function of ggplot2
- import ‘messy’ data with missing values and extra lines
- execute and visualize a regression analysis
Reading
- Topics: ggplot
- Readings
Exercises
-- Mass vs Metabolism --
The relationship between an organism's body size and its metabolic rate is one of the best studied, and still one of the most controversial, areas of organismal physiology. We want to graph this relationship in the Artiodactyla using a subset of data from a large compilation of body size data (Savage et al. 2004). You can copy and paste this data frame into your program:
size_mr_data <- data.frame(
  body_mass = c(32000, 37800, 347000, 4200, 196500, 100000, 4290, 32000,
                65000, 69125, 9600, 133300, 150000, 407000, 115000, 67000,
                325000, 21500, 58588, 65320, 85000, 135000, 20500, 1613, 1618),
  metabolic_rate = c(49.984, 51.981, 306.770, 10.075, 230.073, 148.949,
                     11.966, 46.414, 123.287, 106.663, 20.619, 180.150,
                     200.830, 224.779, 148.940, 112.430, 286.847, 46.347,
                     142.863, 106.670, 119.660, 104.150, 33.165, 4.900, 4.865)
)
Now make three plots with appropriate axis labels:
- A graph of body mass vs. metabolic rate
- A graph of log(body mass) vs. log(metabolic rate) (you can do this transformation inside the call to aes())
- A graph of body mass vs. metabolic rate, with logarithmically scaled axes (this is different from number 2), and the point size set to 5.
Think about what the shape of these graphs tells you about the form of the relationship between mass and metabolic rate.
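The difference between transforming the data and rescaling the axes can be easy to miss. As a minimal sketch, using a small made-up data frame (toy, with hypothetical columns x and y, not the assignment data), the following contrasts the two approaches:

```r
library(ggplot2)

# Hypothetical toy data that follows an exact power law, y = 2 * x^0.75.
# These values are invented for illustration only.
toy <- data.frame(x = c(1, 10, 100, 1000, 10000))
toy$y <- 2 * toy$x^0.75

# Option A: transform the values inside aes().
# The axes then display the logged values themselves.
p_transformed <- ggplot(toy, aes(x = log(x), y = log(y))) +
  geom_point() +
  labs(x = "log(x)", y = "log(y)")

# Option B: keep the raw values in aes() and rescale the axes instead.
# The tick labels stay in the original units, spaced logarithmically.
p_scaled <- ggplot(toy, aes(x = x, y = y)) +
  geom_point(size = 5) +
  scale_x_log10() +
  scale_y_log10() +
  labs(x = "x", y = "y")
```

Printing p_transformed and p_scaled shows the same straight-line pattern, which is what a power-law relationship looks like on log-log axes. Note that log() is the natural log while scale_x_log10() uses base 10, so the axis numbers differ even though the shape of the point cloud does not.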
-- Adult vs Newborn Size 1 --
It makes sense that larger organisms have larger offspring, but what the mathematical form of this relationship should be is unclear. Let’s look at the problem empirically for mammals.
Download some mammal life history data from the web. You can either read it directly in your program using read.csv(), or download the file to your computer using your browser, save it in the data subdirectory, and import it from there.
When you import the data, there are some extra blank lines at the end of this file. Get rid of them by using the optional read.csv() argument nrows = 1440 to select the valid 1440 rows.
Missing data in this file is specified by -999 and -999.00. Tell R that these are missing values (NA) using the optional read.csv() argument na.strings = c("-999", "-999.00"). This will stop them from being plotted.
- Graph adult mass vs. newborn mass. Label the axes with clearer labels than the column names.
- It looks like there’s a regular pattern here, but it’s definitely not linear. Let’s see if log-transformation straightens it out. Graph adult mass vs. newborn mass, with both axes scaled logarithmically. Label the axes.
- This looks like a pretty regular pattern, so you wonder if it varies among different groups. Graph adult mass vs. newborn mass, with both axes scaled logarithmically, and the data points colored by order. Label the axes.
- Coloring the points was useful, but there are a lot of points and it’s kind of hard to see what’s going on with all of the orders. Use facet_wrap to create a subplot for each order.
- Now let’s visualize the relationships between the variables using a simple linear model. Create a new graph like your faceted plot, but using geom_smooth to fit a linear model to each order. You can do this using the optional argument method = "lm" in geom_smooth.
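To see what the nrows and na.strings arguments to read.csv() do when importing a messy file, here is a minimal sketch on a tiny made-up file. The column names (order, mass_g, newborn_g), the values, and the comma-only trailing rows are assumptions for illustration; the real file has its own column names and 1440 valid rows.

```r
# Hypothetical miniature file: 3 valid rows, sentinel values for missing
# data, and 2 junk rows at the end. All values are invented.
csv_lines <- c(
  "order,mass_g,newborn_g",
  "Artiodactyla,45375,3246.36",
  "Carnivora,5000,-999",
  "Rodentia,120,-999.00",
  ",,",
  ",,"
)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# Naive import: the junk rows are kept and -999 is read as a real number.
raw <- read.csv(tmp)

# Cleaner import: read only the valid rows and treat the sentinels as NA.
clean <- read.csv(tmp, nrows = 3, na.strings = c("-999", "-999.00"))

summary(raw$newborn_g)    # includes -999 as if it were a measurement
summary(clean$newborn_g)  # reports the sentinels as NA instead
```

Comparing the two summaries is a quick way to check that the sentinel values were converted before you start plotting.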
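As a rough sketch of how facet_wrap and geom_smooth combine, the example below uses simulated data rather than the mammal data. The data frame toy, its columns (group, adult, newborn), and the two power-law relationships are all made up for illustration.

```r
library(ggplot2)

# Hypothetical stand-in data: two invented groups, each following its own
# power law with a little multiplicative noise.
set.seed(1)
toy <- data.frame(
  group = rep(c("A", "B"), each = 20),
  adult = rep(10^seq(1, 4, length.out = 20), times = 2)
)
toy$newborn <- ifelse(toy$group == "A",
                      0.1 * toy$adult^0.9,
                      0.5 * toy$adult^0.7) * exp(rnorm(40, sd = 0.1))

p <- ggplot(toy, aes(x = adult, y = newborn)) +
  geom_point() +
  geom_smooth(method = "lm") +   # one linear fit per panel
  scale_x_log10() +
  scale_y_log10() +
  facet_wrap(~group) +           # one subplot per group
  labs(x = "Adult mass", y = "Newborn mass")
```

Printing p draws one panel per group, each with its own fitted line. Because ggplot2 applies scale transformations before computing statistics, the line here is a linear fit to the log-scaled values.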