Learning Objectives

Following this assignment students should be able to:

  • download public data using R packages
  • combine data from multiple public data sources
  • integrate public data with their own data
  • use public data to enhance analysis of their own data

Reading

Exercises

  1. -- Adult vs Newborn Size 2 --

    This is a follow up to Adult vs Newborn Size 1.

    We’ve graphed the relationship between adult size and newborn size in mammals, and now it’s time to analyze that relationship statistically.

    1. Do a regression using the lm() function where x is log10(adult mass) and y is log10(newborn mass).
    2. Print the summary statistics for this regression.
    3. Using ggplot, make a graph that shows both the data points and the regression line through those points. Either the axes or the data should be logarithmically scaled to match your regression analysis. You won’t need to add the regression results yourself, since geom_smooth will graph the linear model along with the data. Label the axes.
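
    The three steps above can be sketched on made-up data as follows. This is only an illustration under stated assumptions: the data frame `mammals` and its columns `adult_mass_g` and `newborn_mass_g` are hypothetical stand-ins, not the names used in the course dataset, and the ggplot2 package is assumed to be installed.

    ```r
    library(ggplot2)

    # Synthetic stand-in data. The column names and the simulated
    # relationship are assumptions made for this sketch only.
    set.seed(42)
    adult_mass_g <- 10^runif(200, min = 1, max = 6)
    newborn_mass_g <- 10^(-0.7 + 0.9 * log10(adult_mass_g) + rnorm(200, sd = 0.2))
    mammals <- data.frame(adult_mass_g, newborn_mass_g)

    # Steps 1 and 2: regress log10(newborn mass) on log10(adult mass)
    # and print the summary statistics.
    size_model <- lm(log10(newborn_mass_g) ~ log10(adult_mass_g), data = mammals)
    print(summary(size_model))

    # Step 3: plot the points with log-scaled axes. geom_smooth() fits its
    # line to the transformed values, so it matches the log-log regression.
    size_plot <- ggplot(mammals, aes(x = adult_mass_g, y = newborn_mass_g)) +
      geom_point(alpha = 0.5) +
      geom_smooth(method = "lm", formula = y ~ x) +
      scale_x_log10() +
      scale_y_log10() +
      labs(x = "Adult mass (g)", y = "Newborn mass (g)")
    print(size_plot)
    ```

    Scaling the axes with scale_x_log10() and scale_y_log10() keeps the tick labels in the original units; transforming the columns inside aes() would work as well, but the axes would then show log10 values.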

    Optional: Plot a histogram of the regression residuals to check that they are roughly normally distributed (this takes just a single line of code).
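
    One possible form of that single line, shown on a toy model so the sketch runs on its own. The variables `x`, `y`, and `toy_model` are hypothetical and stand in for your own regression object.

    ```r
    # Toy data and model, invented for this sketch only.
    set.seed(1)
    x <- runif(100, min = 1, max = 6)
    y <- -0.7 + 0.9 * x + rnorm(100, sd = 0.2)
    toy_model <- lm(y ~ x)

    # The one-line check: a histogram of the model residuals.
    hist(residuals(toy_model))
    ```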

  2. -- Shrub Volume 4 --

    This is a follow up to Shrub Volume 3.

    Dr. Granger wants you to run an ANOVA to determine if the different experimental treatments lead to differences in shrub carbon.

    1. Import the data and your results table that you exported in ‘Combining Basics’.
    2. Do an ANOVA, using aov(), to determine whether the experiment has an influence on shrub carbon, and print the results in a standard ANOVA table.
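
    A minimal sketch of a one-way ANOVA with aov(), using invented data. The data frame `shrubs`, its columns `experiment` and `carbon`, and the treatment labels are all hypothetical; substitute the names from your own results table.

    ```r
    # Synthetic stand-in data: three made-up treatments, ten shrubs each.
    set.seed(7)
    shrubs <- data.frame(
      experiment = factor(rep(c("control", "burn", "rainout"), each = 10)),
      carbon = c(rnorm(10, mean = 5), rnorm(10, mean = 8), rnorm(10, mean = 5.5))
    )

    # Fit the one-way ANOVA; summary() prints the standard ANOVA table
    # (Df, Sum Sq, Mean Sq, F value, Pr(>F)).
    carbon_aov <- aov(carbon ~ experiment, data = shrubs)
    print(summary(carbon_aov))
    ```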
  3. -- Mixed Model Analysis --

    This is a follow up to R Markdown Data Analysis.

    Some time later you decide to try analyzing the Efaw_Freeze2014.xlsx dataset with a mixed effects model.

    1. If you haven’t already, import the Efaw freeze data.
    2. Use the lmer() function from the lme4 package to fit a linear mixed effects model with yield (BUAC) as the dependent variable, treatment as a fixed effect, and replication as a random intercept.
    3. You decide to check whether the quantity of nitrogen applied and the method of application differ in their effect on yield. Separate your UAN column into one column of amounts (0, 10, and 20 gallons per acre) and another column that contains the method of application. Fit a linear mixed effects model with yield as the dependent variable, your two new columns as fixed effects, and replication as a random intercept.
    4. You are not sure whether adding new terms is really justified. Use a likelihood ratio test to determine whether method of application or quantity of UAN is significant.
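
    The model-fitting steps above might look roughly like the sketch below, which runs on synthetic data rather than the Efaw file. Everything about the data is an assumption: the column names `rep`, `uan`, and `buac`, the "amount method" text format of `uan`, and the method labels are invented for illustration, and the real spreadsheet may need a different way of splitting the column. The lme4 package is assumed to be installed.

    ```r
    library(lme4)

    # Synthetic stand-in data: 6 replications x 6 made-up treatments.
    set.seed(11)
    treatments <- c("0 broadcast", "10 broadcast", "20 broadcast",
                    "0 stream", "10 stream", "20 stream")
    freeze <- expand.grid(rep = factor(1:6), uan = treatments,
                          stringsAsFactors = FALSE)

    # Split the assumed "amount method" text into two new columns.
    freeze$amount <- factor(sub(" .*$", "", freeze$uan), levels = c("0", "10", "20"))
    freeze$method <- factor(sub("^[^ ]* ", "", freeze$uan))

    # Simulated yield: an amount effect, a per-replication shift, and noise.
    rep_effect <- rnorm(6, sd = 3)
    freeze$buac <- 40 + 0.5 * as.numeric(as.character(freeze$amount)) +
      rep_effect[as.integer(freeze$rep)] + rnorm(nrow(freeze), sd = 2)

    # Step 2: treatment as a fixed effect, replication as a random intercept.
    treatment_model <- lmer(buac ~ uan + (1 | rep), data = freeze)
    print(summary(treatment_model))

    # Step 3: amount and method as separate fixed effects.
    # REML = FALSE fits by maximum likelihood, which the comparison below needs.
    full_model <- lmer(buac ~ amount + method + (1 | rep), data = freeze, REML = FALSE)

    # Step 4: drop one term at a time and compare against the full model.
    no_method <- lmer(buac ~ amount + (1 | rep), data = freeze, REML = FALSE)
    no_amount <- lmer(buac ~ method + (1 | rep), data = freeze, REML = FALSE)
    method_test <- anova(no_method, full_model)
    amount_test <- anova(no_amount, full_model)
    print(method_test)
    print(amount_test)
    ```

    Calling anova() on two nested lmer() fits reports a chi-square likelihood ratio test; a small p-value for a comparison suggests the dropped term improves the model.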