This lesson is designed to introduce students to correlation between two variables and the line of
best fit.
These activities can be done individually or in groups of as many as four students. Allow 1.5-2
hours of class time for the entire lesson if all portions are done in class.
Objectives
Upon completion of this lesson, students will:
have plotted bivariate data onto a scatter plot
have seen the line of best fit for several different scatter plots
be able to estimate the lines of best fit for data sets
be able to estimate the correlation coefficient for data sets
Standards Addressed:
Grade 10
Statistics and Probability
The student demonstrates an ability to classify and organize data.
The student demonstrates an ability to analyze data (comparing, explaining, interpreting, evaluating, making predictions, describing trends; drawing, formulating, or justifying conclusions).
Grade 6
Statistics and Probability
The student demonstrates an ability to analyze data (comparing, explaining, interpreting, evaluating; drawing or justifying conclusions).
Grade 7
Statistics and Probability
The student demonstrates an ability to analyze data (comparing, explaining, interpreting, evaluating, making predictions; drawing or justifying conclusions).
Grade 8
Statistics and Probability
The student demonstrates an ability to analyze data (comparing, explaining, interpreting, evaluating, making predictions, describing trends; drawing, formulating, or justifying conclusions).
Grade 9
Statistics and Probability
The student demonstrates an ability to classify and organize data.
The student demonstrates an ability to analyze data (comparing, explaining, interpreting, evaluating, making predictions, describing trends; drawing, formulating, or justifying conclusions).
Statistics and Probability
Interpreting Categorical and Quantitative Data
Summarize, represent, and interpret data on two categorical and quantitative variables
Interpret linear models
Grades 9-12
Algebra
Understand patterns, relations, and functions
Data Analysis and Probability
Formulate questions that can be addressed with data and collect, organize, and display relevant data to answer them
Select and use appropriate statistical methods to analyze data
Algebra I
Data Analysis and Probability
Competency Goal 3: The learner will collect, organize, and interpret data with matrices and linear models to solve problems.
Student Prerequisites
Arithmetic: Students must be able to:
plot points on the Cartesian coordinate system
Statistics: Students must:
have a very basic understanding of correlation
Technological: Students must be able to:
perform basic mouse manipulations such as point, click and drag
use a browser for experimenting with the activities
Teacher Preparation
Students will need:
Access to a browser
Scatter Plot Exploration Questions
Graph paper and pencil
Key Terms
correlation
A statistical measure referring to the relationship between two random variables. It is a positive correlation when each variable tends to increase or decrease as the other does, and a negative or inverse correlation if one tends to increase as the other decreases.
correlation coefficient
A numerical value between -1 and +1 that measures the strength and direction of the linear relationship between two variables. A value of +1 indicates an exact positive linear relationship, -1 indicates an exact inverse linear relationship, and 0 indicates no linear relationship between the variables.
line of best fit
A straight line used as the best approximation, or summary, of all the points in a scatter plot. The position and slope of the line depend on the correlation between the two paired variables that generate the scatter plot. This line can be used to predict the value of one of the paired variables when only the other value in the pair is known.
linear regression
An attempt to model the relationship between two variables by fitting a linear equation to observed data. One variable is considered as the independent variable, and the other is considered as the dependent variable.
residual
The observed value minus the predicted value, that is, the difference between a result obtained by observation and the result computed from a formula.
scatter plot
A graphical representation of the distribution of two random variables as a set of points whose coordinates represent their observed paired values.
slope of a linear function
The slope of the line y = mx + b is the rate at which y is changing per unit of change in x. The units of measurement of the slope are units of y per unit of x (cf. Linear Functions Discussion).
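For teachers who want to check these definitions numerically, here is a minimal Python sketch. It is not part of the Regression activity. It assumes that the line of best fit is the ordinary least-squares line, and the function names and the sample data are hypothetical, chosen only for illustration.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Correlation coefficient r for paired data (assumes neither list is constant)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def best_fit_line(xs, ys):
    """Slope m and intercept b of the least-squares line y = mx + b."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    m = sxy / sxx
    b = my - m * mx
    return m, b


def residuals(xs, ys, m, b):
    """Observed value minus predicted value for each data point."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]


# Hypothetical sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b = best_fit_line(xs, ys)
print("slope:", round(m, 2), "intercept:", round(b, 2))
print("r:", round(pearson_r(xs, ys), 2))
print("residuals:", [round(e, 2) for e in residuals(xs, ys, m, b)])
```

For this sample data the slope is 0.6, the intercept is 2.2, and r is about 0.77. The residuals add up to zero (apart from rounding), which is true of every least-squares line.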
Lesson Outline
Focus and Review
Review with the class the concept of correlation. Have the students begin to think about the words
and ideas of this lesson:
What are two variables that have no correlation with one another? Can anyone give me an
example of two variables that have some sort of correlation with one another? Is this a
positive or a negative correlation?
Objectives
Let the students know what it is that they will be doing and learning today. Say something like
this:
Today, class, we are going to learn more about correlation between two variables and be
introduced to the line of best fit.
We are going to use the computers to learn more about correlation, but please do not turn your
computers on until I ask you to. I want to show you a little about this activity first.
Teacher Input
Lead a discussion on correlation of variables and the purpose of the line of best fit.
Lead a discussion on the correlation coefficient, r, and how it varies depending on the relationship
of the data on the scatter plot.
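To prepare for the discussion of r, it may help to have a few small data sets with known outcomes. The Python sketch below uses three made-up data sets (a rising line, a falling line, and a symmetric curve) and a hypothetical helper function, pearson_r, to show how r responds to each pattern.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation coefficient r for paired data (assumes neither list is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up x values shared by all three illustrative data sets.
xs = [-2, -1, 0, 1, 2]

examples = {
    "rising line (y = 2x + 1)": [2 * x + 1 for x in xs],
    "falling line (y = -3x)": [-3 * x for x in xs],
    "symmetric curve (y = x^2)": [x ** 2 for x in xs],
}

for label, ys in examples.items():
    print(f"{label}: r = {pearson_r(xs, ys):.2f}")
```

The two straight-line data sets give r = 1 and r = -1. The curve gives r = 0 even though y is completely determined by x, which can start a useful conversation: r measures only how well a straight line describes the data.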
Guided Practice
As a class, complete the Scatter Plot Exploration Questions. Have the students draw a scatter plot
of the class data on a sheet of graph paper. Ask the class where they predict the line of best fit
will lie and what they think the correlation coefficient is. Together, graph this data using the
Regression activity, look at the actual results, and compare them with the class's predictions.
Independent Practice
Have the students use the Regression activity to estimate the line of best fit for their own data
sets and then see where the line of best fit actually lies. Encourage them to experiment with data
sets that include outliers. Also, have the students experiment with creating scatter plots that
will have a specific correlation coefficient.
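If students need a starting point for the outlier experiment, the following Python sketch shows the kind of effect they can look for. It assumes a least-squares line of best fit; the function fit_summary and the data are hypothetical and exist only for this illustration.

```python
from math import sqrt


def fit_summary(xs, ys):
    """Return (r, slope, intercept) for the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sxy / sqrt(sxx * syy), slope, intercept


# Made-up data lying exactly on the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

# The same data with one added outlier at (6, 0).
xs_outlier = xs + [6]
ys_outlier = ys + [0]

for label, (px, py) in {
    "without outlier": (xs, ys),
    "with outlier": (xs_outlier, ys_outlier),
}.items():
    r, slope, intercept = fit_summary(px, py)
    print(f"{label}: r = {r:.2f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

In this made-up example a single point at (6, 0) lowers r from 1 to about 0.14 and flattens the slope from 2 to about 0.29. Students can reproduce a similar effect by adding one distant point to their own scatter plots.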
Closure
You may wish to bring the class back together for a discussion on the findings. Once the students
have been allowed to share what they have found, summarize the results of the lesson.
Alternate Outline
This lesson can be rearranged in several ways.
omit the discussion of the correlation coefficient
omit the scatter plot worksheet
Before splitting the class into groups, have the students plot specific points on the Regression
activity and have each student draw the line of best fit that they imagine. Then, have them select
the true line of best fit and see whose estimate was closest.
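One way to decide whose estimate was closest is to compare how well each guessed line fits the plotted points. The Python sketch below ranks guesses by their sum of squared residuals, where a smaller total means a better fit. This scoring rule is an assumption, not something the Regression activity prescribes, and the points, group names, and guessed lines are all made up for illustration.

```python
def sum_squared_residuals(points, slope, intercept):
    """Total of (observed y - predicted y)^2 over all points for the line y = slope*x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)


# Made-up points that the class plotted together.
points = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]

# Hypothetical guesses, written as (slope, intercept).
guesses = {
    "Group A": (1.0, 1.0),
    "Group B": (0.5, 2.5),
    "Group C": (0.0, 4.0),
}

# Sort the groups from best fit (smallest total) to worst fit.
ranking = sorted(
    guesses,
    key=lambda name: sum_squared_residuals(points, *guesses[name]),
)

for name in ranking:
    slope, intercept = guesses[name]
    total = sum_squared_residuals(points, slope, intercept)
    print(f"{name}: y = {slope}x + {intercept}, sum of squared residuals = {total}")
```

With these made-up guesses, Group B has the smallest total (2.5), followed by Group A (4.0) and Group C (6.0). The least-squares line for the same points, y = 0.6x + 2.2, has a total of 2.4, so no guess can score lower than that.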