Chapter 2

2.1 The Correlation Coefficient

Here are the Sparrowhawk data from the 2.1 lecture notes:

> returning = c(74, 66, 81, 52, 73, 62, 52, 45, 62, 46, 60, 46, 38)  # %Returning adults
> new = c(5, 6, 8, 11, 12, 15, 16, 17, 18, 18, 19, 20, 20)  # #New adults

(The “#” character indicates that the rest of the line is a comment, not interpreted by R.)

Scatterplot: plot()

plot(x, y, main, sub, xlab, ylab) makes a scatterplot of the points whose coordinates are in the vectors x and y. The optional arguments add a main title (main), a subtitle (sub), an x-axis label (xlab), and a y-axis label (ylab). With this many parameters, I like to name each argument supplied to the function (writing “x = returning” as the first argument instead of just “returning”, etc.):

> plot(x = returning, y = new, main = "13 Sparrowhawk colonies", xlab = "% Returning adults", 
+     ylab = "#New adults")

[Figure: scatterplot of #New adults against % Returning adults for the 13 Sparrowhawk colonies]

Correlation: cor()

The correlation between vectors x and y is given by cor(x, y):

> cor(returning, new)
[1] -0.7485
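
As a check on what cor() computes, here is a minimal sketch that builds the correlation by hand. It assumes cor() is used with its default (Pearson) method, and the names dx, dy, and r.by.hand are mine, not from the lecture notes. The data are repeated so the block runs on its own.

```r
# Sparrowhawk data, repeated so this block is self-contained
returning = c(74, 66, 81, 52, 73, 62, 52, 45, 62, 46, 60, 46, 38)
new = c(5, 6, 8, 11, 12, 15, 16, 17, 18, 18, 19, 20, 20)

# Deviation of each observation from its mean
dx = returning - mean(returning)
dy = new - mean(new)

# Pearson correlation: sum of cross-products divided by the
# square root of the product of the two sums of squares
r.by.hand = sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

r.by.hand
cor(returning, new)  # should agree with r.by.hand
```

The two printed values should match, which is a quick way to convince yourself that cor() is doing nothing more mysterious than the formula.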

2.2 The Least-Squares Line

Regression line: lm()

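The linear regression model for predicting y from x is given by lm(y ~ x) (“lm” stands for “linear model”). Note that the response y goes on the left of the “~”, so the order is the reverse of plot(x, y):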

> model = lm(new ~ returning)

See the coefficients via coef(), which here returns a named vector of length 2 (the y-intercept first, then the slope):

> coef(model)
(Intercept)   returning 
     31.934      -0.304 
> y.intercept = coef(model)[1]
> y.intercept
(Intercept) 
      31.93 
> slope = coef(model)[2]
> slope
returning 
   -0.304 
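
The coefficients can also be recovered from summary statistics. The sketch below assumes the usual textbook formulas for the least-squares line, slope = r * s_y / s_x and intercept = mean(y) - slope * mean(x); the names slope.by.hand and intercept.by.hand are mine. The data and model are repeated so the block runs on its own.

```r
# Sparrowhawk data and model, repeated so this block is self-contained
returning = c(74, 66, 81, 52, 73, 62, 52, 45, 62, 46, 60, 46, 38)
new = c(5, 6, 8, 11, 12, 15, 16, 17, 18, 18, 19, 20, 20)
model = lm(new ~ returning)

# Slope: correlation times the ratio of standard deviations (y over x)
r = cor(returning, new)
slope.by.hand = r * sd(new) / sd(returning)

# Intercept: chosen so the line passes through (mean(x), mean(y))
intercept.by.hand = mean(new) - slope.by.hand * mean(returning)

c(intercept.by.hand, slope.by.hand)
unname(coef(model))  # should agree; unname() just drops the labels
```

Both lines should print the same pair of numbers, intercept first and slope second.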

Add a line with y-intercept a and slope b to a plot with abline(a, b):

> plot(x = returning, y = new, main = "13 Sparrowhawk colonies", xlab = "% Returning adults", 
+     ylab = "#New adults")  # same as last plot
> abline(a = y.intercept, b = slope)  # add regression line

[Figure: the same scatterplot with the least-squares regression line added]
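
A possible shortcut, offered as a sketch rather than as the method used in the lecture notes: abline() also accepts a fitted model directly and reads the intercept and slope from its coefficients, so the two intermediate variables are not strictly needed. The data and model are repeated so the block runs on its own.

```r
# Sparrowhawk data and model, repeated so this block is self-contained
returning = c(74, 66, 81, 52, 73, 62, 52, 45, 62, 46, 60, 46, 38)
new = c(5, 6, 8, 11, 12, 15, 16, 17, 18, 18, 19, 20, 20)
model = lm(new ~ returning)

# Same scatterplot as before
plot(x = returning, y = new, main = "13 Sparrowhawk colonies",
     xlab = "% Returning adults", ylab = "#New adults")

# Pass the model itself; abline() takes the intercept and slope from coef(model)
abline(model)
```

Extracting y.intercept and slope by hand, as above, is still worthwhile when you want to use those numbers elsewhere, for example to compute a prediction.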

2.3 Features and Limitations of the Least-Squares Line

(There's nothing to see here, folks. Move along.)