Chapter 12
Correlation
Correlation - Definition
Correlation: a statistical technique that measures and describes
the degree of linear relationship between two variables
Dataset
Obs  X  Y
A    1  1
B    1  3
C    3  2
D    4  5
E    6  4
F    7  5
[Figure: scatterplot of Y against X for this dataset]
Characteristics
• Direction
– Positive (+) or Negative (-)
• Degree of association
– Between –1 and 1
– Absolute values signify strength
• Form
– Linear or Non-linear
– We will work with linear only
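The distinction between direction (the sign) and degree of association (the absolute value) can be sketched in a few lines of Python. The coefficient values are made up for illustration, and `describe` is a hypothetical helper name, not something from the slides.

```python
# Hypothetical correlation coefficients, chosen only for illustration.
coefficients = [0.40, -0.85, 0.10]

def describe(r):
    """Return (direction, strength) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, abs(r)

# Strength is compared on absolute values, so the sign is ignored.
strongest = max(coefficients, key=abs)
weakest = min(coefficients, key=abs)

print(strongest, describe(strongest))  # -0.85 ('negative', 0.85)
print(weakest, describe(weakest))      # 0.1 ('positive', 0.1)
```

Note that the strongest value here is negative: a large negative coefficient signals a stronger relationship than a small positive one.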
Direction
Positive
Large values of X associated
with large values of Y,
small values of X associated
with small values of Y.
e.g. IQ and SAT
Negative
Large values of X associated
with small values of Y
& vice versa
e.g. SPEED and ACCURACY
Degree of association
• If the points do not fall along a straight line, then there is NO
linear association.
• If the points fall nearly along a straight line, then there is a
STRONG linear association.
• If the points fall exactly along a straight line, then there is a
PERFECT linear association.
Strong: tight cloud of points
Weak: diffuse cloud of points
Practice
• Which value represents the strongest
relationship?
1. .56
2. -.32
3. .24
4. -.77
Practice
• Which value represents the weakest
relationship?
1. .56
2. -.32
3. .24
4. -.77
Practice
• Which value represents the strongest
relationship?
1. .89
2. .22
3. -.66
4. -.15
Practice
• The older we get, the less sleep we tend to
require. What is the nature of this
relationship?
1. Positive relationship
2. Negative relationship
Practice
• The more education we receive, the higher
our salary when we enter the workforce.
What is the nature of this relationship?
1. Positive relationship
2. Negative relationship
Practice
• The better an employee feels about his or
her job, the less often he or she will call in sick.
What is the nature of this relationship?
1. Positive relationship
2. Negative relationship
Types of Correlations
• For interval/ratio data use Pearson’s r
• For ordinal data use Spearman’s r_s (rho)
• For nominal data use the phi coefficient
Pearson’s r
• One way to calculate the correlation is to use
Pearson’s r
• Can use a deviation score formula
– r is a fraction that captures
(covariation of X and Y) / (variation of X and Y separately)
$r = \frac{SP}{\sqrt{SS_X \, SS_Y}}$
– where
$SP = \sum (X - \bar{X})(Y - \bar{Y})$
Deviation Score Formula
      Femur (X)  Humerus (Y)  (X - X̄)  (Y - Ȳ)  (X - X̄)²  (Y - Ȳ)²  (X - X̄)(Y - Ȳ)
A     38         41           -20.2    -25      408.04    625       505
B     56         63           -2.2     -3       4.84      9         6.6
C     59         70           0.8      4        0.64      16        3.2
D     64         72           5.8      6        33.64     36        34.8
E     74         84           15.8     18       249.64    324       284.4
mean  58.2       66.00
sum                                             696.8     1010      834
                                                (SS_X)    (SS_Y)    (SP)
$r = \frac{SP}{\sqrt{SS_X \, SS_Y}} = \frac{834}{\sqrt{(696.8)(1010)}} = .99$
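The deviation score calculation can be reproduced with a short Python sketch. The function name `pearson_r_deviation` is an illustrative choice; the data are the femur and humerus measurements from the table.

```python
from math import sqrt

femur = [38, 56, 59, 64, 74]     # X
humerus = [41, 63, 70, 72, 84]   # Y

def pearson_r_deviation(xs, ys):
    """Pearson's r from deviation scores: r = SP / sqrt(SS_X * SS_Y)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]              # (X - mean of X)
    dy = [y - mean_y for y in ys]              # (Y - mean of Y)
    sp = sum(a * b for a, b in zip(dx, dy))    # sum of products, SP
    ss_x = sum(a * a for a in dx)              # sum of squares for X
    ss_y = sum(b * b for b in dy)              # sum of squares for Y
    return sp / sqrt(ss_x * ss_y)

r = pearson_r_deviation(femur, humerus)
print(round(r, 2))  # 0.99
```

The sketch does not guard against a variable with no variation (SS of zero), in which case r is undefined.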
The Computational Formula
$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$
What are the preliminary steps to
calculating a correlation coefficient?
• When calculating the correlation coefficient, one begins with scores on two variables.
• The illustration below involves scores on a reading readiness test, and scores later obtained by these same students on a reading achievement test.
          Reading Readiness Scores   Reading Achievement Scores
Todd      10                         19
Andrea    16                         25
Kristen   19                         23
Luis      22                         31
Scott     28                         27
What are the preliminary steps to
calculating a correlation coefficient?
• The formula used in the calculation involves six different values obtained from the X and Y variables.
• The first two values are simply the sum of the X values and the sum of the Y values. Those sums are 95 and 125 for these particular test scores.
          X (Reading Readiness Scores)   Y (Reading Achievement Scores)
Todd      10                             19
Andrea    16                             25
Kristen   19                             23
Luis      22                             31
Scott     28                             27
Sum       95                             125
$\sum X = 95$
$\sum Y = 125$
What are the preliminary steps to
calculating a correlation coefficient?
• The next step involves squaring each of the X and Y values
• and then summing them.
X²     X    Y     Y²
100    10   19    361
256    16   25    625
361    19   23    529
484    22   31    961
784    28   27    729
1985   95   125   3205
• Using the summation notation…
$\sum X = 95$
$\sum Y = 125$
$\sum X^2 = 1985$
$\sum Y^2 = 3205$
What are the preliminary steps to
calculating a correlation coefficient?
• In the next step, the product of each pair of X and Y scores is obtained
• and then summed.
X²     X    XY     Y     Y²
100    10   190    19    361
256    16   400    25    625
361    19   437    23    529
484    22   682    31    961
784    28   756    27    729
1985   95   2465   125   3205
• Using the summation notation…
$\sum X = 95$
$\sum Y = 125$
$\sum X^2 = 1985$
$\sum Y^2 = 3205$
$\sum XY = 2465$
What are the preliminary steps to
calculating a correlation coefficient?
• The last of the preliminary steps is to simply determine the number of people being included in the calculations. In this case, the calculations involve 5 students. Therefore...
$n = 5$
X²     X    XY     Y     Y²
100    10   190    19    361
256    16   400    25    625
361    19   437    23    529
484    22   682    31    961
784    28   756    27    729
1985   95   2465   125   3205
What are the preliminary steps to
calculating a correlation coefficient?
• In summary, our six values used to calculate the correlation coefficient are…
$n = 5$
$\sum X = 95$
$\sum Y = 125$
$\sum X^2 = 1985$
$\sum Y^2 = 3205$
$\sum XY = 2465$
X²     X    XY     Y     Y²
100    10   190    19    361
256    16   400    25    625
361    19   437    23    529
484    22   682    31    961
784    28   756    27    729
1985   95   2465   125   3205
Using the computational formula...
• A somewhat impressive-looking formula uses these six values to compute the correlation coefficient; however, the formula turns out not to be very difficult to use.
• The formula is...
$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$
• The variables in this formula consist of only the six previously calculated values:
$n = 5$
$\sum X = 95$
$\sum Y = 125$
$\sum X^2 = 1985$
$\sum Y^2 = 3205$
$\sum XY = 2465$
• Here is the formula with these values inserted...
$r = \frac{(5)(2465) - (95)(125)}{\sqrt{[(5)(1985) - (95)^2][(5)(3205) - (125)^2]}} = \frac{450}{\sqrt{(900)(400)}} = \frac{450}{600} = 0.75$
• The correlation between these students' reading readiness scores and later reading achievement scores is 0.75.
          X (Reading Readiness Scores)   Y (Reading Achievement Scores)
Todd      10                             19
Andrea    16                             25
Kristen   19                             23
Luis      22                             31
Scott     28                             27
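As a check on the hand calculation, the six preliminary values and the computational formula can be sketched in Python. The helper names `six_values` and `pearson_r_computational` are illustrative and do not come from the slides.

```python
from math import sqrt

readiness = [10, 16, 19, 22, 28]     # X
achievement = [19, 25, 23, 31, 27]   # Y

def six_values(xs, ys):
    """Collect n and the five sums used by the computational formula."""
    return {
        "n": len(xs),
        "sum_x": sum(xs),
        "sum_y": sum(ys),
        "sum_x2": sum(x * x for x in xs),
        "sum_y2": sum(y * y for y in ys),
        "sum_xy": sum(x * y for x, y in zip(xs, ys)),
    }

def pearson_r_computational(v):
    """r = [n*SumXY - SumX*SumY] / sqrt([n*SumX2 - SumX^2][n*SumY2 - SumY^2])."""
    numerator = v["n"] * v["sum_xy"] - v["sum_x"] * v["sum_y"]
    denominator = sqrt(
        (v["n"] * v["sum_x2"] - v["sum_x"] ** 2)
        * (v["n"] * v["sum_y2"] - v["sum_y"] ** 2)
    )
    return numerator / denominator

values = six_values(readiness, achievement)
r = pearson_r_computational(values)
print(values)
print(r)  # 0.75
```

Keeping the six values in a dictionary mirrors the slide-by-slide build-up: each entry corresponds to one preliminary step.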
Determining Significance
►Test whether the association is greater than can be
expected by chance
►Hypotheses
– H0: ρ = 0
– H1: ρ ≠ 0
►df = n – 2
– n is the total number of subjects
►Use the Pearson correlation table
►If the absolute value of your correlation is greater than the value given
in the table (critical value), then your correlation is
significant
Now it's your turn...
• Below are the scores of four students on a spelling test and a vocabulary test. Can you calculate the correlation coefficient?
         X (Spelling)   Y (Vocabulary)
Sandra   8              10
Neil     5              6
Laura    4              7
Jerome   1              3
Now it's your turn...
• On your own paper, calculate these six values:
$n$, $\sum X$, $\sum Y$, $\sum X^2$, $\sum Y^2$, $\sum XY$
         X (Spelling)   Y (Vocabulary)
Sandra   8              10
Neil     5              6
Laura    4              7
Jerome   1              3
Now it's your turn...
• You should get these values:
$n = 4$
$\sum X = 18$
$\sum Y = 26$
$\sum X^2 = 106$
$\sum Y^2 = 194$
$\sum XY = 141$
X²    X    XY    Y    Y²
64    8    80    10   100
25    5    30    6    36
16    4    28    7    49
1     1    3     3    9
106   18   141   26   194
Now it's your turn...
• Now insert these values in the equation
$n = 4$
$\sum X = 18$
$\sum Y = 26$
$\sum X^2 = 106$
$\sum Y^2 = 194$
$\sum XY = 141$
$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$
$r = \frac{(4)(141) - (18)(26)}{\sqrt{[(4)(106) - (18)^2][(4)(194) - (26)^2]}}$
$r = \frac{96}{100} = 0.96$
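The deviation score formula and the computational formula are algebraically equivalent, so they should agree on the spelling and vocabulary scores. A minimal Python sketch of that check follows; both function names are illustrative.

```python
from math import sqrt

spelling = [8, 5, 4, 1]      # X
vocabulary = [10, 6, 7, 3]   # Y

def r_deviation(xs, ys):
    """r = SP / sqrt(SS_X * SS_Y), computed from deviations about the means."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

def r_computational(xs, ys):
    """r computed from n and the five raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    sy2 = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

r_dev = r_deviation(spelling, vocabulary)
r_comp = r_computational(spelling, vocabulary)
print(r_dev, r_comp)  # 0.96 0.96
```

The computational form avoids computing the means first, which is why it is convenient for hand calculation.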
Significant at alpha = .05?
►What is the critical value?
1. .95
2. .90
3. .811
4. .632
Significant?
►Is this correlation significant?
1. Yes
2. No
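The table lookup can be sketched in Python as follows. The slides refer only to "the Pearson correlation table"; the few rows below are standard two-tailed critical values at alpha = .05 and are assumed to match that table. `is_significant` is a hypothetical helper name.

```python
# Two-tailed critical values of Pearson's r at alpha = .05, keyed by df = n - 2.
# Only a few rows of a standard table are reproduced (assumed to match the
# table used with the slides).
CRITICAL_R_05 = {1: 0.997, 2: 0.950, 3: 0.878, 4: 0.811, 8: 0.632}

def is_significant(r, n, table=CRITICAL_R_05):
    """Compare |r| with the critical value for df = n - 2."""
    df = n - 2
    if df not in table:
        raise KeyError(f"no critical value stored for df = {df}")
    return abs(r) > table[df]

print(is_significant(0.96, 4))  # True: 0.96 exceeds 0.950 at df = 2
print(is_significant(0.75, 5))  # False: 0.75 is below 0.878 at df = 3
```

With so few subjects, even a correlation of 0.75 does not reach significance, which is why the degrees of freedom matter as much as the size of r.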
Regression
The Linear Equation
• If two variables are linearly related it is
possible to develop a simple equation to
represent the relationship
• E.g. centigrade to Fahrenheit:
– F = 1.8C + 32
– this formula gives a specific straight line
The Linear Equation
• Equation of the line (Y = bX + a)
– a and b are constants in a given line;
– X and Y change
[Figure: straight line plotted on axes labeled Predictor and Criterion]
The Linear Equation
• Equation of the line (Y = bX + a)
– The slope (b)
• the amount of change in y with one unit change in x
• On a graph, it is represented by how steep the line is.
The Linear Equation
• When b changes (different formulas)
[Figure: lines with different values of b, on axes labeled Predictor and Criterion]
The Linear Equation
• Equation of the line (Y = bX + a)
– The intercept (a)
• the value of y when x is zero
• On a graph, it is represented by where the line crosses
the y axis
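The meaning of the slope and the intercept can be illustrated with the centigrade-to-Fahrenheit line from the earlier slide. This is a small sketch; the function name `fahrenheit` and its parameter names are illustrative.

```python
def fahrenheit(celsius, slope=1.8, intercept=32.0):
    """Evaluate the line Y = bX + a with b = 1.8 and a = 32."""
    return slope * celsius + intercept

# The intercept is the value of Y when X is zero.
intercept_value = fahrenheit(0)
# The slope is the change in Y for a one-unit change in X.
slope_value = fahrenheit(11) - fahrenheit(10)

print(intercept_value)         # 32.0
print(round(slope_value, 1))   # 1.8
print(fahrenheit(100))         # 212.0
```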
The Linear Equation
• When a changes (different formulas)
[Figure: lines with different values of a, on axes labeled Predictor and Criterion]
Practice
• Y = 32(.3) + 10
• Identify the slope
1. 32
2. .3
3. 10
Practice
• Y = 32(.3) + 10
• Identify the Y intercept
1. 32
2. .3
3. 10
The Regression Line
• Relationships are rarely perfect. Scores are
“scattered”.
• The regression line is a straight line which is
drawn through a scatterplot, to summarize
the relationship between X and Y
• It is the line that minimizes the sum of the squared
deviations, $\sum (Y - Y')^2$, where Y' is the value of Y predicted by the line
• We call these vertical deviations “residuals”
When there is some linear association, the
regression line fits as closely to the points as possible
[Figure: The 2001 Mets. Weight in pounds (vertical axis, 150 to 250) plotted against height in inches (horizontal axis, 67 to 77)]
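The claim that the regression line minimizes the squared residuals can be checked numerically. The sketch below uses the six-observation dataset from the start of the chapter and the standard least-squares formulas b = SP / SS_X and a = M_Y - b·M_X; the function names are illustrative.

```python
xs = [1, 1, 3, 4, 6, 7]   # X values for observations A to F
ys = [1, 3, 2, 5, 4, 5]   # Y values for observations A to F

def fit_line(xs, ys):
    """Least-squares slope b and intercept a for Y' = bX + a."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    b = sp / ss_x
    return b, my - b * mx

def sum_squared_residuals(xs, ys, b, a):
    """Sum of squared vertical deviations (Y - Y')^2 from the line."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))

b, a = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, b, a)

# Nudging the slope or intercept away from the fitted values can only
# increase the sum of squared residuals.
nudged = [
    sum_squared_residuals(xs, ys, b + db, a + da)
    for db in (-0.1, 0.0, 0.1)
    for da in (-0.5, 0.0, 0.5)
    if (db, da) != (0.0, 0.0)
]
print(all(best < other for other in nudged))  # True
```

Trying a handful of nearby lines is not a proof, but it makes the "best fit" property concrete.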
Calculating the regression line
► Below are the scores of four students on a spelling test and a vocabulary test.
► Sallie has just taken the spelling test and scored a 6. What do you predict her vocabulary score to be?
         X (Spelling)   Y (Vocabulary)
Sandra   6              8
Neil     5              6
Laura    4              7
Jerome   1              3
Means, Sums, and Products
X (Spelling)   Y (Vocabulary)   X - M_X   Y - M_Y   (X - M_X)(Y - M_Y)   (X - M_X)²
6              8                2         2         4                    4
5              6                1         0         0                    1
4              7                0         1         0                    0
1              3                -3        -3        9                    9
M = 4          M = 6                                SP = 13              SS_X = 14
Now the formulas
X (Spelling)   Y (Vocabulary)   X - M_X   Y - M_Y   (X - M_X)(Y - M_Y)   (X - M_X)²
6              8                2         2         4                    4
5              6                1         0         0                    1
4              7                0         1         0                    0
1              3                -3        -3        9                    9
M = 4          M = 6                                SP = 13              SS_X = 14
$b = \frac{SP}{SS_X} = \frac{13}{14} = .93$
$a = M_Y - bM_X = 6 - .93(4) = 2.28$
$\hat{Y} = bX + a = .93(6) + 2.28 = 7.86$
Sallie's predicted vocabulary score is 7.86
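The slope, intercept, and prediction can be reproduced with a short Python sketch; `regression_line` is an illustrative name. Because the code keeps b unrounded, the intercept comes out as about 2.29 rather than the 2.28 obtained by hand from the rounded slope, while the prediction still rounds to 7.86.

```python
spelling = [6, 5, 4, 1]     # X
vocabulary = [8, 6, 7, 3]   # Y

def regression_line(xs, ys):
    """Return slope b = SP / SS_X and intercept a = M_Y - b * M_X."""
    m_x = sum(xs) / len(xs)
    m_y = sum(ys) / len(ys)
    sp = sum((x - m_x) * (y - m_y) for x, y in zip(xs, ys))
    ss_x = sum((x - m_x) ** 2 for x in xs)
    b = sp / ss_x
    a = m_y - b * m_x
    return b, a

b, a = regression_line(spelling, vocabulary)
predicted = b * 6 + a   # Sallie's spelling score is 6
print(round(b, 2), round(a, 2), round(predicted, 2))  # 0.93 2.29 7.86
```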
Causation
• A strong relationship between variables does
not always mean that changes in one variable
cause changes in the other variable.
Causation
• The relationship between two variables is
often influenced by other variables lurking in
the background.
“Beware the lurking variable!”
Causation
• The best evidence of causation comes from
randomized comparative experiments.
The Chi-Square Analysis
Chi-Square
• Examines nominal data or ordinal data that is
being treated as a category
• Called a non-parametric test
– Chi-square requires no assumptions about the
shape of the population distribution from which a
sample is drawn.
• The test examines the difference between
observed counts and expected values
Chi-square Goodness of Fit
• Two ways to use the chi-square
• First way to use the chi-square is called the
Goodness of Fit test
– Determines whether a frequency distribution
follows a claimed distribution
• Hypothesis test
– H0: the variable follows the claimed distribution
– H1: the variable does not follow the claimed
distribution
Chi-square Goodness of Fit
• The FBI compiles data on crime and crime rates and publishes the information in Crime in the United States. A violent crime is classified by the FBI as murder, forcible rape, robbery, or aggravated assault.
Crime Distribution for 1995
Types of violent crime   Relative frequency
Murder                   0.012
Forcible rape            0.054
Robbery                  0.323
Agg. assault             0.611
Total                    1.000
Last Year
Types of violent crime   Frequency
Murder                   9
Forcible rape            26
Robbery                  144
Agg. assault             321
Total                    500
Chi-square Goodness of Fit
• Do the data provide sufficient evidence to conclude that last
year’s distribution of violent crimes has changed from the
1995 distribution?
• Get expected frequency
$E = Np$
Types of violent crime   Relative frequency (p)   Expected frequency (Np = E)
Murder                   0.012                    (500)(0.012) = 6.0
Forcible rape            0.054                    (500)(0.054) = 27.0
Robbery                  0.323                    (500)(0.323) = 161.5
Agg. assault             0.611                    (500)(0.611) = 305.5
Chi-square Goodness of Fit
• Then calculate the chi-square statistic
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Cell            O     E       O-E     (O-E)²   (O-E)²/E
Murder          9     6       3       9        1.5
Forcible Rape   26    27      -1      1        0.037
Robbery         144   161.5   -17.5   306.25   1.896
Agg. Assault    321   305.5   15.5    240.25   0.786
$\chi^2 = 4.219$
Chi-square Goodness of Fit
• Finally
– Use Table to find critical value
– df = k – 1, where k is the number of cells
– Example – df = 3
– Critical value is 7.815
– Our value is 4.219, which is less than the critical value, so we fail to reject the null hypothesis
– This means the data do not provide sufficient evidence that the pattern of crime
has changed between 1995 and last year.
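The goodness of fit calculation can be sketched in Python using the counts and 1995 proportions from the tables. The function name is illustrative, and the critical value is the one quoted on the slide. Summing the unrounded terms gives about 4.220 rather than 4.219, which the slide obtains by adding terms already rounded to three decimals; the conclusion is unchanged.

```python
observed = {"Murder": 9, "Forcible rape": 26, "Robbery": 144, "Agg. assault": 321}
claimed = {"Murder": 0.012, "Forcible rape": 0.054, "Robbery": 0.323, "Agg. assault": 0.611}

def chi_square_goodness_of_fit(observed, proportions):
    """Return (chi-square statistic, df) for observed counts vs claimed proportions."""
    n = sum(observed.values())
    statistic = 0.0
    for cell, o in observed.items():
        e = n * proportions[cell]          # expected frequency E = Np
        statistic += (o - e) ** 2 / e
    return statistic, len(observed) - 1    # df = k - 1

chi2, df = chi_square_goodness_of_fit(observed, claimed)
CRITICAL_VALUE = 7.815   # critical value quoted on the slide for df = 3
reject_null = chi2 > CRITICAL_VALUE
print(round(chi2, 2), df, reject_null)  # 4.22 3 False
```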
Chi-square Test of Independence
• Second way to use a chi-square is the test of
independence
– Hypotheses
• H0: Variables Are Independent
• Ha: Variables Are Related (Dependent)
Chi-square Test of Independence
• We are interested in whether single men vs.
women are more likely to own cats vs. dogs.
• Notice that both variables are categorical.
– Kind of pet: people are classified as owning cats or
dogs. We can count the number of people
belonging to each category
– Sex: people are male or female. We count the
number of people in each category
Chi-square Test of Independence
• Are these differences
because there is a real
relationship between
gender and pet
ownership?
• Or is there actually no
relationship between
these variables?
         Cat   Dog   Total
Male     20    30    50
Female   30    20    50
Total    50    50    100
Chi-square Test of Independence
• To answer this question, we need to know
what we would expect to observe if the null
hypothesis were true
• The differences between these expected
values and the observed values are
aggregated according to the Chi-square
formula
Chi-square Test of Independence
• To find expected value for a cell
of the table, multiply the
corresponding row total by the
column total, and divide by the
grand total
• For the first cell (and all other
cells), (50 x 50)/100 = 25
• Thus, if the two variables are
unrelated, we would expect to
observe 25 people in each cell
         Cat   Dog   Total
Male     20    30    50
Female   30    20    50
Total    50    50    100
Chi-square Test of Independence
• Then apply the same chi-square formula
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Cell            O    E    O-E   (O-E)²   (O-E)²/E
Male w/ Cat     20   25   -5    25       1
Male w/ Dog     30   25   5     25       1
Female w/ Cat   30   25   5     25       1
Female w/ Dog   20   25   -5    25       1
$\chi^2 = 4$
Chi-square Test of Independence
• Compare to critical value from chi-square table.
• Degrees of freedom is
– (number of rows – 1)(number of columns -1)
– In our example (2-1)(2-1)= 1
– Critical value is 3.841
– Our value of 4 is greater than the critical value, so we reject the null hypothesis.
         Cat   Dog   Total
Male     20    30    50
Female   30    20    50
Total    50    50    100
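The test of independence can be sketched in Python for the pet ownership table. The function name is illustrative; the expected counts follow the rule stated earlier (row total times column total, divided by the grand total), and the critical value is the one quoted on the slide.

```python
observed = [[20, 30],   # Male:   cat, dog
            [30, 20]]   # Female: cat, dog

def chi_square_independence(table):
    """Return (chi-square statistic, df) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            # Expected count if the two variables are independent.
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

chi2, df = chi_square_independence(observed)
CRITICAL_VALUE = 3.841   # critical value quoted on the slide for df = 1
print(chi2, df, chi2 > CRITICAL_VALUE)  # 4.0 1 True
```

The same function works for larger tables, since the degrees of freedom and expected counts are computed from the table's own dimensions and margins.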

More Related Content

What's hot

Pearson product moment correlation
Pearson product moment correlationPearson product moment correlation
Pearson product moment correlationSharlaine Ruth
 
Correlation new 2017 black
Correlation new 2017 blackCorrelation new 2017 black
Correlation new 2017 blackfizjadoon
 
PEARSON PRODUCT MOMENT CORRELATION COEFFICIENT
PEARSON PRODUCT MOMENT CORRELATION COEFFICIENTPEARSON PRODUCT MOMENT CORRELATION COEFFICIENT
PEARSON PRODUCT MOMENT CORRELATION COEFFICIENTONE Virtual Services
 
Simple linear regression and correlation
Simple linear regression and correlationSimple linear regression and correlation
Simple linear regression and correlationShakeel Nouman
 
Linear regression without tears
Linear regression without tearsLinear regression without tears
Linear regression without tearsAnkit Sharma
 
multiple regression
multiple regressionmultiple regression
multiple regressionPriya Sharma
 
Regression analysis
Regression analysisRegression analysis
Regression analysisAwais Salman
 
Introduction to correlation and regression analysis
Introduction to correlation and regression analysisIntroduction to correlation and regression analysis
Introduction to correlation and regression analysisFarzad Javidanrad
 
Basics of Regression analysis
 Basics of Regression analysis Basics of Regression analysis
Basics of Regression analysisMahak Vijayvargiya
 
Regression (Linear Regression and Logistic Regression) by Akanksha Bali
Regression (Linear Regression and Logistic Regression) by Akanksha BaliRegression (Linear Regression and Logistic Regression) by Akanksha Bali
Regression (Linear Regression and Logistic Regression) by Akanksha BaliAkanksha Bali
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inferenceKemal İnciroğlu
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regressiondessybudiyanti
 
Chap12 multiple regression
Chap12 multiple regressionChap12 multiple regression
Chap12 multiple regressionJudianto Nugroho
 
Direct variation-ppt
Direct variation-pptDirect variation-ppt
Direct variation-pptREYHISONA2
 

What's hot (20)

Pearson product moment correlation
Pearson product moment correlationPearson product moment correlation
Pearson product moment correlation
 
Correlation new 2017 black
Correlation new 2017 blackCorrelation new 2017 black
Correlation new 2017 black
 
Glm
GlmGlm
Glm
 
S2 pb
S2 pbS2 pb
S2 pb
 
Correlation and Regression
Correlation and Regression Correlation and Regression
Correlation and Regression
 
Simple linear regression
Simple linear regressionSimple linear regression
Simple linear regression
 
PEARSON PRODUCT MOMENT CORRELATION COEFFICIENT
PEARSON PRODUCT MOMENT CORRELATION COEFFICIENTPEARSON PRODUCT MOMENT CORRELATION COEFFICIENT
PEARSON PRODUCT MOMENT CORRELATION COEFFICIENT
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Simple linear regression and correlation
Simple linear regression and correlationSimple linear regression and correlation
Simple linear regression and correlation
 
Linear regression without tears
Linear regression without tearsLinear regression without tears
Linear regression without tears
 
multiple regression
multiple regressionmultiple regression
multiple regression
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Introduction to correlation and regression analysis
Introduction to correlation and regression analysisIntroduction to correlation and regression analysis
Introduction to correlation and regression analysis
 
Basics of Regression analysis
 Basics of Regression analysis Basics of Regression analysis
Basics of Regression analysis
 
Regression (Linear Regression and Logistic Regression) by Akanksha Bali
Regression (Linear Regression and Logistic Regression) by Akanksha BaliRegression (Linear Regression and Logistic Regression) by Akanksha Bali
Regression (Linear Regression and Logistic Regression) by Akanksha Bali
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inference
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regression
 
04 regression
04 regression04 regression
04 regression
 
Chap12 multiple regression
Chap12 multiple regressionChap12 multiple regression
Chap12 multiple regression
 
Direct variation-ppt
Direct variation-pptDirect variation-ppt
Direct variation-ppt
 

Viewers also liked

Tourism English 2
Tourism English 2Tourism English 2
Tourism English 2Les Davy
 
Возможности Казахстана на урановом рынке
Возможности Казахстана на урановом рынкеВозможности Казахстана на урановом рынке
Возможности Казахстана на урановом рынкеАО "Самрук-Казына"
 
Private sector - recommendations from AIGLIA2014
Private sector - recommendations from AIGLIA2014Private sector - recommendations from AIGLIA2014
Private sector - recommendations from AIGLIA2014futureagricultures
 
Вторичный рынок "Новой Москвы"
Вторичный рынок "Новой Москвы"Вторичный рынок "Новой Москвы"
Вторичный рынок "Новой Москвы"МИЭЛЬ
 
Visual Aid
Visual AidVisual Aid
Visual Aidcoshik26
 
Retraining a racehorse
Retraining a racehorseRetraining a racehorse
Retraining a racehorseraquel63485
 
Ancillary magazine making
Ancillary magazine makingAncillary magazine making
Ancillary magazine makingaq101824
 
20087067 choi mun jung presentation
20087067 choi mun jung presentation20087067 choi mun jung presentation
20087067 choi mun jung presentation문정 최
 
Evaluation Question 2
Evaluation Question 2Evaluation Question 2
Evaluation Question 2Sammi Wilde
 
Putting the wow into your school's wom, NYSAIS Presentation
Putting the wow into your school's wom, NYSAIS PresentationPutting the wow into your school's wom, NYSAIS Presentation
Putting the wow into your school's wom, NYSAIS PresentationRick Newberry
 
一個民宿老闆教我的事
一個民宿老闆教我的事一個民宿老闆教我的事
一個民宿老闆教我的事Fa Zhou Shi
 
如何掌控自己的时间和生活(完整版)By louiechot
如何掌控自己的时间和生活(完整版)By louiechot如何掌控自己的时间和生活(完整版)By louiechot
如何掌控自己的时间和生活(完整版)By louiechotliaohuanzhuo
 
Improving Land Governance for Inclusive and Sustainable Agriculture Transform...
Improving Land Governance for Inclusive and Sustainable Agriculture Transform...Improving Land Governance for Inclusive and Sustainable Agriculture Transform...
Improving Land Governance for Inclusive and Sustainable Agriculture Transform...futureagricultures
 
How to use of moodle
How to use of moodleHow to use of moodle
How to use of moodlehayate19996
 

Viewers also liked (20)

Tourism English 2
Tourism English 2Tourism English 2
Tourism English 2
 
Tibet
TibetTibet
Tibet
 
Возможности Казахстана на урановом рынке
Возможности Казахстана на урановом рынкеВозможности Казахстана на урановом рынке
Возможности Казахстана на урановом рынке
 
Private sector - recommendations from AIGLIA2014
Private sector - recommendations from AIGLIA2014Private sector - recommendations from AIGLIA2014
Private sector - recommendations from AIGLIA2014
 
Heart Attack
Heart AttackHeart Attack
Heart Attack
 
Apres pi pcbc
Apres pi pcbcApres pi pcbc
Apres pi pcbc
 
Вторичный рынок "Новой Москвы"
Вторичный рынок "Новой Москвы"Вторичный рынок "Новой Москвы"
Вторичный рынок "Новой Москвы"
 
Visual Aid
Visual AidVisual Aid
Visual Aid
 
Retraining a racehorse
Retraining a racehorseRetraining a racehorse
Retraining a racehorse
 
affTA00 - 10 Daftar Isi
affTA00 - 10 Daftar IsiaffTA00 - 10 Daftar Isi
affTA00 - 10 Daftar Isi
 
Ancillary magazine making
Ancillary magazine makingAncillary magazine making
Ancillary magazine making
 
20087067 choi mun jung presentation
20087067 choi mun jung presentation20087067 choi mun jung presentation
20087067 choi mun jung presentation
 
Evaluation Question 2
Evaluation Question 2Evaluation Question 2
Evaluation Question 2
 
Putting the wow into your school's wom, NYSAIS Presentation
Putting the wow into your school's wom, NYSAIS PresentationPutting the wow into your school's wom, NYSAIS Presentation
Putting the wow into your school's wom, NYSAIS Presentation
 
一個民宿老闆教我的事
一個民宿老闆教我的事一個民宿老闆教我的事
一個民宿老闆教我的事
 
Ts 4783 1
Ts 4783 1Ts 4783 1
Ts 4783 1
 
如何掌控自己的时间和生活(完整版)By louiechot
如何掌控自己的时间和生活(完整版)By louiechot如何掌控自己的时间和生活(完整版)By louiechot
如何掌控自己的时间和生活(完整版)By louiechot
 
Improving Land Governance for Inclusive and Sustainable Agriculture Transform...
Improving Land Governance for Inclusive and Sustainable Agriculture Transform...Improving Land Governance for Inclusive and Sustainable Agriculture Transform...
Improving Land Governance for Inclusive and Sustainable Agriculture Transform...
 
1 18
1 181 18
1 18
 
How to use of moodle
How to use of moodleHow to use of moodle
How to use of moodle
 

Similar to Chapter 12

correlationcoefficient-20090414 0531.pdf
correlationcoefficient-20090414 0531.pdfcorrelationcoefficient-20090414 0531.pdf
correlationcoefficient-20090414 0531.pdfDrAmanSaxena
 
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...RekhaChoudhary24
 
Lesson 27 using statistical techniques in analyzing data
Lesson 27 using statistical techniques in analyzing dataLesson 27 using statistical techniques in analyzing data
Lesson 27 using statistical techniques in analyzing datamjlobetos
 
Module 2_ Regression Models..pptx
Module 2_ Regression Models..pptxModule 2_ Regression Models..pptx
Module 2_ Regression Models..pptxnikshaikh786
 
Lecture 07 Regression Analysis Part 1
Lecture 07 Regression Analysis Part 1Lecture 07 Regression Analysis Part 1
Lecture 07 Regression Analysis Part 1Riri Ariyanty
 
correlation and regression
correlation and regressioncorrelation and regression
correlation and regressionUnsa Shakir
 
Regression and corelation (Biostatistics)
Regression and corelation (Biostatistics)Regression and corelation (Biostatistics)
Regression and corelation (Biostatistics)Muhammadasif909
 
슬로우캠퍼스: scikit-learn & 머신러닝 (강박사)
슬로우캠퍼스:  scikit-learn & 머신러닝 (강박사)슬로우캠퍼스:  scikit-learn & 머신러닝 (강박사)
슬로우캠퍼스: scikit-learn & 머신러닝 (강박사)마이캠퍼스
 
Correlation_and_Regression-3.ppt
Correlation_and_Regression-3.pptCorrelation_and_Regression-3.ppt
Correlation_and_Regression-3.pptRidaIrfan10
 
Regression and correlation in statistics
Regression and correlation in statisticsRegression and correlation in statistics
Regression and correlation in statisticsiphone4s4
 
Math4 presentation.ppsx
Math4 presentation.ppsxMath4 presentation.ppsx
Math4 presentation.ppsxRaviPal876687
 
Unit 1 Correlation- BSRM.pdf
Unit 1 Correlation- BSRM.pdfUnit 1 Correlation- BSRM.pdf
Unit 1 Correlation- BSRM.pdfRavinandan A P
 
1Bivariate RegressionStraight Lines¾ Simple way to.docx
1Bivariate RegressionStraight Lines¾ Simple way to.docx1Bivariate RegressionStraight Lines¾ Simple way to.docx
1Bivariate RegressionStraight Lines¾ Simple way to.docxaulasnilda
 
Pearson's correlation coefficient
Pearson's correlation coefficientPearson's correlation coefficient
Pearson's correlation coefficientWaleed Zaghal
 
Regression and Co-Relation
Regression and Co-RelationRegression and Co-Relation
Regression and Co-Relationnuwan udugampala
 
Correlation and Regression
Correlation and RegressionCorrelation and Regression
Correlation and RegressionNeha Dokania
 

Similar to Chapter 12 (20)

correlationcoefficient-20090414 0531.pdf

 


Chapter 12

  • 3. Correlation - Definition. Correlation: a statistical technique that measures and describes the degree of linear relationship between two variables.
      Dataset:
        Obs   X   Y
        A     1   1
        B     1   3
        C     3   2
        D     4   5
        E     6   4
        F     7   5
      [Scatterplot of Y against X]
  • 4. Characteristics • Direction – Positive (+) or Negative (-) • Degree of association – Between –1 and 1 – Absolute values signify strength • Form – Linear or Non-linear – We will work with linear only
  • 5. Direction • Positive: large values of X associated with large values of Y, small values of X associated with small values of Y (e.g. IQ and SAT) • Negative: large values of X associated with small values of Y, and vice versa (e.g. SPEED and ACCURACY)
  • 6. Degree of association • If the points do not fall along a straight line, then there is NO linear association. • If the points fall nearly along a straight line, then there is a STRONG linear association. • If the points fall exactly along a straight line, then there is a PERFECT linear association. Strong (tight cloud) Weak (diffuse cloud)
  • 7.
  • 8. Practice • Which value represents the strongest relationship? 1. .56 2. -.32 3. .24 4. -.77
  • 9. Practice • Which value represents the weakest relationship? 1. .56 2. -.32 3. .24 4. -.77
  • 10. Practice • Which value represents the strongest relationship? 1. .89 2. .22 3. -.66 4. -.15
  • 11. Practice • The older we get, the less sleep we tend to require. What is the nature of this relationship? 1. Positive relationship 2. Negative relationship
  • 12. Practice • The more education we receive, the higher our salary when we enter the workforce. What is the nature of this relationship? 1. Positive relationship 2. Negative relationship
  • 13. Practice • The better an employee feels about his or her job, the less often he or she will call in sick. What is the nature of this relationship? 1. Positive relationship 2. Negative relationship
  • 14. Types of Correlations • For interval/ratio data use Pearson’s r • For ordinal data use Spearman’s r • For nominal data use the phi coefficient
  • 15. Pearson’s r • One way to calculate the correlation is to use Pearson’s r • Can use a deviation score formula – r is a fraction that captures the covariation of X and Y relative to the variation of X and Y separately: $r = \frac{SP}{\sqrt{SS_X SS_Y}}$ – where $SP = \sum (X - \bar{X})(Y - \bar{Y})$
  • 16. Deviation Score Formula
        Obs    Femur (X)   Humerus (Y)
        A      38          41
        B      56          63
        C      59          70
        D      64          72
        E      74          84
        mean   58.2        66.00
      Columns still to be filled in: (X - X̄), (Y - Ȳ), (X - X̄)², (Y - Ȳ)², (X - X̄)(Y - Ȳ), with sums SS_X, SS_Y and SP. $r = \frac{SP}{\sqrt{SS_X SS_Y}}$
  • 17. Deviation Score Formula
        Obs    Femur (X)   Humerus (Y)   (X - X̄)   (Y - Ȳ)
        A      38          41            -20.2     -25
        B      56          63            -2.2      -3
        C      59          70            0.8       4
        D      64          72            5.8       6
        E      74          84            15.8      18
        mean   58.2        66.00
      Columns still to be filled in: (X - X̄)², (Y - Ȳ)², (X - X̄)(Y - Ȳ), with sums SS_X, SS_Y and SP. $r = \frac{SP}{\sqrt{SS_X SS_Y}}$
  • 18. Deviation Score Formula
        Obs    Femur (X)   Humerus (Y)   (X - X̄)   (Y - Ȳ)   (X - X̄)²   (Y - Ȳ)²   (X - X̄)(Y - Ȳ)
        A      38          41            -20.2     -25       408.04     625        505
        B      56          63            -2.2      -3        4.84       9          6.6
        C      59          70            0.8       4         .64        16         3.2
        D      64          72            5.8       6         33.64      36         34.8
        E      74          84            15.8      18        249.64     324        284.4
        mean   58.2        66.00
      Sums SS_X, SS_Y and SP still to be filled in. $r = \frac{SP}{\sqrt{SS_X SS_Y}}$
  • 19. Deviation Score Formula
        Obs    Femur (X)   Humerus (Y)   (X - X̄)   (Y - Ȳ)   (X - X̄)²   (Y - Ȳ)²   (X - X̄)(Y - Ȳ)
        A      38          41            -20.2     -25       408.04     625        505
        B      56          63            -2.2      -3        4.84       9          6.6
        C      59          70            0.8       4         .64        16         3.2
        D      64          72            5.8       6         33.64      36         34.8
        E      74          84            15.8      18        249.64     324        284.4
        mean   58.2        66.00
        sum                                                  696.8      1010       834
                                                             (SS_X)     (SS_Y)     (SP)
      $r = \frac{SP}{\sqrt{SS_X SS_Y}} = \frac{834}{\sqrt{(696.8)(1010)}} = .99$
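
The deviation score calculation on slides 16 to 19 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `pearson_r_deviation` and the variable names are assumptions introduced here, not part of the slides, and only the standard library is used.

```python
from math import sqrt

def pearson_r_deviation(xs, ys):
    """Pearson's r from deviation scores: r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]            # (X - mean of X)
    dy = [y - mean_y for y in ys]            # (Y - mean of Y)
    ss_x = sum(d * d for d in dx)            # SS_X: sum of squared X deviations
    ss_y = sum(d * d for d in dy)            # SS_Y: sum of squared Y deviations
    sp = sum(a * b for a, b in zip(dx, dy))  # SP: sum of products of deviations
    return sp / sqrt(ss_x * ss_y)

# Femur and humerus values from the slide table
femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]
r_bones = pearson_r_deviation(femur, humerus)
print(round(r_bones, 2))  # 0.99
```

Keeping SS_X, SS_Y and SP as separate intermediate values mirrors the columns of the table, so each can be checked against the slide (696.8, 1010 and 834).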
  • 20. The Computational Formula $r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$
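
The computational formula is the deviation score formula rewritten in terms of raw sums. The short derivation below is a sketch added for clarity; it is not on the slides, and it uses only the standard identities for sums of squares and products.

\[
SP = \sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{(\sum X)(\sum Y)}{n}, \qquad
SS_X = \sum X^2 - \frac{(\sum X)^2}{n}, \qquad
SS_Y = \sum Y^2 - \frac{(\sum Y)^2}{n}
\]

\[
r = \frac{SP}{\sqrt{SS_X SS_Y}}
  = \frac{n \cdot SP}{\sqrt{(n \cdot SS_X)(n \cdot SS_Y)}}
  = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}
\]

Multiplying the numerator and the denominator by n removes the fractions, which is why the two formulas always give the same r.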
  • 21. What are the preliminary steps to calculating a correlation coefficient? • When calculating the correlation coefficient, one begins with scores on two variables.
  • 22. What are the preliminary steps to calculating a correlation coefficient? • When calculating the correlation coefficient, one begins with scores on two variables. • The illustration on the right involves scores on a reading readiness test, and scores later obtained by these same students on a reading achievement test.
        Student   Reading Readiness Scores   Reading Achievement Scores
        Todd      10                         19
        Andrea    16                         25
        Kristen   19                         23
        Luis      22                         31
        Scott     28                         27
  • 23. What are the preliminary steps to calculating a correlation coefficient? • The formula used in the calculation involves six different values obtained from the X and Y variables. • The first two values are simply the sum of X values and Y values. Those sums are 95 and 125 for these particular test scores.
        Student   X (Reading Readiness Scores)   Y (Reading Achievement Scores)
        Todd      10                             19
        Andrea    16                             25
        Kristen   19                             23
        Luis      22                             31
        Scott     28                             27
  • 24. What are the preliminary steps to calculating a correlation coefficient? • The formula used in the calculation involves six different values obtained from the X and Y variables. • The first two values are simply the sum of X values and Y values. Those sums are 95 and 125 for these particular test scores.
        Student   X (Reading Readiness Scores)   Y (Reading Achievement Scores)
        Todd      10                             19
        Andrea    16                             25
        Kristen   19                             23
        Luis      22                             31
        Scott     28                             27
        Sum       95                             125
      $\sum X = 95$, $\sum Y = 125$
  • 25. What are the preliminary steps to calculating a correlation coefficient? • The next step involves squaring each of the X and Y values.
        X     Y
        10    19
        16    25
        19    23
        22    31
        28    27
        95    125   (sums)
  • 26. What are the preliminary steps to calculating a correlation coefficient? • The next step involves squaring each of the X and Y values • and then summing them.
        X²     X     Y     Y²
        100    10    19    361
        256    16    25    625
        361    19    23    529
        484    22    31    961
        784    28    27    729
        1985   95    125   3205   (sums)
  • 27. What are the preliminary steps to calculating a correlation coefficient? • Using the summation notation…
        X²     X     Y     Y²
        100    10    19    361
        256    16    25    625
        361    19    23    529
        484    22    31    961
        784    28    27    729
        1985   95    125   3205   (sums)
      $\sum X = 95$, $\sum Y = 125$, $\sum X^2 = 1985$, $\sum Y^2 = 3205$
  • 28. What are the preliminary steps to calculating a correlation coefficient? • In the next step, the product of each pair of X and Y scores is obtained.
        X²     X     Y     Y²
        100    10    19    361
        256    16    25    625
        361    19    23    529
        484    22    31    961
        784    28    27    729
        1985   95    125   3205   (sums)
  • 29. What are the preliminary steps to calculating a correlation coefficient? • In the next step, the product of each pair of X and Y scores is obtained • and then summed.
        X²     X     XY     Y     Y²
        100    10    190    19    361
        256    16    400    25    625
        361    19    437    23    529
        484    22    682    31    961
        784    28    756    27    729
        1985   95    2465   125   3205   (sums)
  • 30. What are the preliminary steps to calculating a correlation coefficient? • Using the summation notation…
        X²     X     XY     Y     Y²
        100    10    190    19    361
        256    16    400    25    625
        361    19    437    23    529
        484    22    682    31    961
        784    28    756    27    729
        1985   95    2465   125   3205   (sums)
      $\sum X = 95$, $\sum Y = 125$, $\sum X^2 = 1985$, $\sum Y^2 = 3205$, $\sum XY = 2465$
  • 31. What are the preliminary steps to calculating a correlation coefficient? • The last of the preliminary steps is to simply determine the number of people being included in the calculations. In this case, the calculations involve 5 students. Therefore...
        X²     X     XY     Y     Y²
        100    10    190    19    361
        256    16    400    25    625
        361    19    437    23    529
        484    22    682    31    961
        784    28    756    27    729
        1985   95    2465   125   3205   (sums)
  • 32. What are the preliminary steps to calculating a correlation coefficient? • The last of the preliminary steps is to simply determine the number of people being included in the calculations. In this case, the calculations involve 5 students. Therefore...
        X²     X     XY     Y     Y²
        100    10    190    19    361
        256    16    400    25    625
        361    19    437    23    529
        484    22    682    31    961
        784    28    756    27    729
        1985   95    2465   125   3205   (sums)
      $n = 5$
  • 33. What are the preliminary steps to calculating a correlation coefficient? • In summary, our six values used to calculate the correlation coefficient are… $n = 5$, $\sum X = 95$, $\sum Y = 125$, $\sum X^2 = 1985$, $\sum Y^2 = 3205$, $\sum XY = 2465$
        X²     X     XY     Y     Y²
        100    10    190    19    361
        256    16    400    25    625
        361    19    437    23    529
        484    22    682    31    961
        784    28    756    27    729
        1985   95    2465   125   3205   (sums)
  • 34. Using the computational formula... $n = 5$, $\sum X = 95$, $\sum Y = 125$, $\sum X^2 = 1985$, $\sum Y^2 = 3205$, $\sum XY = 2465$
  • 35. Using the computational formula... A somewhat impressive looking formula uses these six values to compute the correlation coefficient... $n = 5$, $\sum X = 95$, $\sum Y = 125$, $\sum X^2 = 1985$, $\sum Y^2 = 3205$, $\sum XY = 2465$
  • 36. Using the computational formula... A somewhat impressive looking formula uses these six values to compute the correlation coefficient…, however the formula turns out not to be very difficult to use. $n = 5$, $\sum X = 95$, $\sum Y = 125$, $\sum X^2 = 1985$, $\sum Y^2 = 3205$, $\sum XY = 2465$
  • 37. Using the computational formula... The formula is... $r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$ with $n = 5$, $\sum X = 95$, $\sum Y = 125$, $\sum X^2 = 1985$, $\sum Y^2 = 3205$, $\sum XY = 2465$
  • 38. Using the computational formula... The variables in this formula consist of only the six previously calculated values: $n = 5$, $\sum X = 95$, $\sum Y = 125$, $\sum X^2 = 1985$, $\sum Y^2 = 3205$, $\sum XY = 2465$. $r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$
  • 39. Using the computational formula... Here is the formula with these values inserted... $r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}} = \frac{(5)(2465) - (95)(125)}{\sqrt{[(5)(1985) - (95)^2][(5)(3205) - (125)^2]}}$
  • 40. Using the computational formula… The correlation between these students' reading readiness scores and later reading achievement scores is 0.75.
        Student   X (Reading Readiness Scores)   Y (Reading Achievement Scores)
        Todd      10                             19
        Andrea    16                             25
        Kristen   19                             23
        Luis      22                             31
        Scott     28                             27
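
The computational formula can be checked with a short Python sketch. The function name `pearson_r_computational` and the list names are illustrative assumptions, not part of the slides; the data are the five pairs of reading scores.

```python
from math import sqrt

def pearson_r_computational(xs, ys):
    """Pearson's r from the six raw values: n, sum X, sum Y, sum X^2, sum Y^2, sum XY."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

readiness = [10, 16, 19, 22, 28]    # X: reading readiness scores
achievement = [19, 25, 23, 31, 27]  # Y: reading achievement scores
r_reading = pearson_r_computational(readiness, achievement)
print(r_reading)  # 0.75
```

With these scores the numerator is 12325 - 11875 = 450 and the denominator is the square root of (900)(400), which is 600, so r = 450/600 = 0.75.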
  • 41. Determining Significance ► Test whether the association is greater than can be expected by chance ► Hypotheses – H0: ρ = 0 – H1: ρ ≠ 0 ► df = n – 2 – n is the total number of subjects ► Use the Pearson correlation table ► If the absolute value of your correlation is greater than the value given in the table (critical value), then your correlation is significant
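
The table lookup described on slide 41 might be sketched as follows. This is an assumption-laden illustration: the dictionary `CRITICAL_R_05` holds a few standard two-tailed critical values of Pearson's r at alpha = .05 as they appear in commonly printed tables, and the function name `is_significant` is hypothetical. A real analysis would use the full table from the textbook.

```python
# Two-tailed critical values of Pearson's r at alpha = .05, keyed by df = n - 2.
# Only the first ten rows of a standard table are included (an assumption of this sketch).
CRITICAL_R_05 = {
    1: 0.997, 2: 0.950, 3: 0.878, 4: 0.811, 5: 0.754,
    6: 0.707, 7: 0.666, 8: 0.632, 9: 0.602, 10: 0.576,
}

def is_significant(r, n, table=CRITICAL_R_05):
    """Return True if |r| exceeds the critical value for df = n - 2."""
    df = n - 2
    return abs(r) > table[df]

# Reading example: r = 0.75 with n = 5 students, so df = 3 and the critical value is .878
print(is_significant(0.75, 5))  # False
```

Comparing the absolute value of r keeps the test two-tailed, which matches the hypotheses H0: ρ = 0 and H1: ρ ≠ 0.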
  • 42. Now it's your turn...
  • 43. Now it's your turn... • To the right are the scores of four students on a spelling test and a vocabulary test. Can you calculate the correlation coefficient?
        Student   X (Spelling)   Y (Vocabulary)
        Sandra    8              10
        Neil      5              6
        Laura     4              7
        Jerome    1              3
  • 44. Now it's your turn... • On your own paper, calculate these six values: $n$, $\sum X$, $\sum Y$, $\sum X^2$, $\sum Y^2$, $\sum XY$
        Student   X (Spelling)   Y (Vocabulary)
        Sandra    8              10
        Neil      5              6
        Laura     4              7
        Jerome    1              3
  • 45. Now it's your turn... • You should get these values: $n = 4$, $\sum X = 18$, $\sum Y = 26$, $\sum X^2 = 106$, $\sum Y^2 = 194$, $\sum XY = 141$
        X²    X    XY    Y    Y²
        64    8    80    10   100
        25    5    30    6    36
        16    4    28    7    49
        1     1    3     3    9
        106   18   141   26   194   (sums)
  • 46. Now it's your turn... • Now insert these values in the equation: $r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}} = \frac{(4)(141) - (18)(26)}{\sqrt{[(4)(106) - (18)^2][(4)(194) - (26)^2]}} = \frac{96}{100} = 0.96$
  • 47. Significant at alpha = .05? ►What is the critical value? 1. .95 2. .90 3. .811 4. .632
  • 48. Significant? ►Is this correlation significant? 1.Yes 2.No
  • 50. The Linear Equation • If two variables are linearly related it is possible to develop a simple equation to represent the relationship • E.g. centigrade to Fahrenheit: – F = 1.8C + 32 – this formula gives a specific straight line
  • 51. The Linear Equation • Equation of the line (Y = bX + a) – a and b are constants in a given line; – X and Y change [Graph with axes labeled Predictor and Criterion]
  • 52. The Linear Equation • Equation of the line (Y = bX + a) – The slope (b) • the amount of change in Y with one unit change in X • On a graph, it is represented by how steep the line is.
  • 53. The Linear Equation • When b changes (different formulas) [Graph with axes labeled Predictor and Criterion]
  • 54. The Linear Equation • Equation of the line (Y = bX + a) – The intercept (a) • the value of Y when X is zero • On a graph, it is represented by where the line crosses the Y axis
  • 55. The Linear Equation • When a changes (different formulas) [Graph with axes labeled Predictor and Criterion]
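
As a small illustration of Y = bX + a, the centigrade to Fahrenheit line from slide 50 can be written as a function with slope b = 1.8 and intercept a = 32. The function name `predict` is an assumption made for this sketch.

```python
def predict(x, b, a):
    """Return Y = bX + a for slope b and intercept a."""
    return b * x + a

# Centigrade to Fahrenheit: slope 1.8, intercept 32
freezing_f = predict(0, b=1.8, a=32)   # the intercept: the value of Y when X is zero
boiling_f = predict(100, b=1.8, a=32)
print(freezing_f, boiling_f)
```

Changing b tilts the line, while changing a shifts it up or down without altering its steepness.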
  • 56. Practice • Y = 32(.3) + 10 • Identify the slope 1. 32 2. .3 3. 10
  • 57. Practice • Y = 32(.3) + 10 • Identify the Y intercept 1. 32 2. .3 3. 10
  • 58. The Regression Line • Relationships are rarely perfect. Scores are “scattered”. • The regression line is a straight line which is drawn through a scatterplot, to summarize the relationship between X and Y • It is the line that minimizes the sum of squared deviations, $\sum (Y - Y')^2$, where Y’ is the value of Y predicted by the line • We call these vertical deviations “residuals”
  • 59. When there is some linear association, the regression line fits as close to the points as possible. [Scatterplot: The 2001 Mets, Weight in Pounds and Height in Inches, with the regression line]
  • 60. Calculating the regression line ► To the right are the scores of four students on a spelling test and a vocabulary test. ► Sallie has just taken the spelling test and scored a 6. What do you predict her vocabulary score to be?
        Student   X (Spelling)   Y (Vocabulary)
        Sandra    6              8
        Neil      5              6
        Laura     4              7
        Jerome    1              3
  • 61. Means, Sums, and Products
        X (Spelling)   Y (Vocabulary)
        6              8
        5              6
        4              7
        1              3
        M=4            M=6
  • 62. Means, Sums, and Products
        X (Spelling)   Y (Vocabulary)   X-Mx   Y-My
        6              8                2      2
        5              6                1      0
        4              7                0      1
        1              3                -3     -3
        M=4            M=6
  • 63. Means, Sums, and Products
        X (Spelling)   Y (Vocabulary)   X-Mx   Y-My   (X-Mx)(Y-My)
        6              8                2      2      4
        5              6                1      0      0
        4              7                0      1      0
        1              3                -3     -3     9
        M=4            M=6                            13=SP
  • 64. Means, Sums, and Products
        X (Spelling)   Y (Vocabulary)   X-Mx   Y-My   (X-Mx)(Y-My)   (X-Mx)²
        6              8                2      2      4              4
        5              6                1      0      0              1
        4              7                0      1      0              0
        1              3                -3     -3     9              9
        M=4            M=6                            13=SP          14=SSx
  • 65. Now the formulas
        X (Spelling)   Y (Vocabulary)   X-Mx   Y-My   (X-Mx)(Y-My)   (X-Mx)²
        6              8                2      2      4              4
        5              6                1      0      0              1
        4              7                0      1      0              0
        1              3                -3     -3     9              9
        M=4            M=6                            13=SP          14=SSx
      $b = \frac{SP}{SS_X} = \frac{13}{14} = .93$, $a = M_Y - bM_X = 6 - .93(4) = 2.28$
  • 66. Now the formulas $\hat{Y} = bX + a = .93(6) + 2.28 = 7.86$ Sallie should get a vocabulary score of 7.86
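
The slope, intercept and prediction from slides 60 to 66 can be reproduced with the sketch below. The function name `fit_line` and the variable names are illustrative assumptions; the formulas are b = SP / SS_X and a = M_Y - b·M_X, as on the slides.

```python
def fit_line(xs, ys):
    """Least-squares slope b = SP / SS_X and intercept a = M_Y - b * M_X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x
    a = mean_y - b * mean_x
    return b, a

spelling = [6, 5, 4, 1]     # X
vocabulary = [8, 6, 7, 3]   # Y
b, a = fit_line(spelling, vocabulary)
predicted = b * 6 + a       # Sallie scored 6 on the spelling test
print(round(b, 2), round(a, 2), round(predicted, 2))  # 0.93 2.29 7.86
```

The unrounded intercept is about 2.29 rather than the 2.28 on the slide, because the slide rounds b to .93 before computing a; the predicted score still rounds to 7.86 either way.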
  • 67. Causation • A strong relationship between variables does not always mean that changes in one variable cause changes in the other variable.
  • 68. Causation • The relationship between two variables is often influenced by other variables lurking in the background. “Beware the lurking variable!”
  • 69. Causation • The best evidence of causation comes from randomized comparative experiments.
  • 71. Chi-Square • Examines nominal data or ordinal data that is being treated as a category • Called a non-parametric test – Chi-square requires no assumptions about the shape of the population distribution from which a sample is drawn. • The test examines the difference between observed counts and expected values
  • 72. Chi-square Goodness of Fit • Two ways to use the chi-square • First way to use the chi-square is called the Goodness of Fit test – Determines whether a frequency distribution follows a claimed distribution • Hypothesis test – H0: the variable follows the claimed distribution – H1: the variable does not follow the claimed distribution
  • 73. Chi-square Goodness of Fit • The FBI compiles data on crime and crime rates and publishes the information in Crime in the United States. A violent crime is classified by the FBI as murder, forcible rape, robbery, or aggravated assault.
      Crime Distribution for 1995:
        Type of violent crime   Relative frequency
        Murder                  0.012
        Forcible rape           0.054
        Robbery                 0.323
        Agg. assault            0.611
        Total                   1.000
      Last Year:
        Type of violent crime   Frequency
        Murder                  9
        Forcible rape           26
        Robbery                 144
        Agg. assault            321
        Total                   500
  • 74. Chi-square Goodness of Fit • Do the data provide sufficient evidence to conclude that last year’s distribution of violent crimes has changed from the 1995 distribution? • Get expected frequency E = Np
        Type of violent crime   Relative frequency p   Expected frequency Np = E
        Murder                  0.012                  (500)(0.012) = 6.0
        Forcible rape           0.054                  (500)(0.054) = 27.0
        Robbery                 0.323                  (500)(0.323) = 161.5
        Agg. assault            0.611                  (500)(0.611) = 305.5
  • 75. Chi-square Goodness of Fit • Then calculate the chi-square statistic $\chi^2 = \sum \frac{(O - E)^2}{E}$
        Cell            O     E       O-E     (O-E)²    (O-E)²/E
        Murder          9     6       3       9         1.5
        Forcible Rape   26    27      -1      1         0.037
        Robbery         144   161.5   -17.5   306.25    1.896
        Agg. Assault    321   305.5   15.5    240.25    0.786
      $\chi^2 = 4.219$
  • 76. Chi-square Goodness of Fit • Finally – Use Table to find critical value – df = k – 1, where k is the number of cells – Example – df = 4 – 1 = 3 – Critical value is 7.815 – Our value is 4.219, which is less than the critical value, so we fail to reject H0 – This means the data do not provide sufficient evidence that the pattern of crime changed between 1995 and last year.
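
The goodness of fit calculation on slides 73 to 76 can be sketched as below. The function name `chi_square_gof` and the variable names are assumptions introduced for illustration; the critical value 7.815 (df = 3) is the one quoted on slide 76.

```python
def chi_square_gof(observed, proportions):
    """Chi-square goodness of fit: sum of (O - E)^2 / E with E = N * p."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Murder, forcible rape, robbery, aggravated assault
observed_crimes = [9, 26, 144, 321]          # last year's frequencies (N = 500)
claimed_1995 = [0.012, 0.054, 0.323, 0.611]  # 1995 relative frequencies
chi2_crime = chi_square_gof(observed_crimes, claimed_1995)
critical_value = 7.815                       # from the slide, df = 3
print(round(chi2_crime, 2), chi2_crime > critical_value)
```

Without intermediate rounding the statistic is about 4.22; the 4.219 on the slide comes from adding the four terms after each was rounded to three decimals. The conclusion is the same: the value is below 7.815, so the null hypothesis is not rejected.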
  • 77. Chi-square Test of Independence • Second way to use a chi-square is the test of independence – Hypotheses • H0: Variables Are Independent • Ha: Variables Are Related (Dependent)
  • 78. Chi-square Test of Independence • We are interested in whether single men vs. women are more likely to own cats vs. dogs. • Notice that both variables are categorical. – Kind of pet: people are classified as owning cats or dogs. We can count the number of people belonging to each category – Sex: people are male or female. We count the number of people in each category
  • 79. Chi-square Test of Independence • Are these differences because there is a real relationship between gender and pet ownership? • Or is there actually no relationship between these variables?
                 Cat   Dog   Total
        Male     20    30    50
        Female   30    20    50
        Total    50    50    100
  • 80. Chi-square Test of Independence • To answer this question, we need to know what we would expect to observe if the null hypothesis were true • The differences between these expected values and the observed values are aggregated according to the Chi-square formula
  • 81. Chi-square Test of Independence • To find expected value for a cell of the table, multiply the corresponding row total by the column total, and divide by the grand total • For the first cell (and all other cells), (50 x 50)/100 = 25 • Thus, if the two variables are unrelated, we would expect to observe 25 people in each cell
                 Cat   Dog   Total
        Male     20    30    50
        Female   30    20    50
        Total    50    50    100
  • 82. Chi-square Test of Independence • Then apply the same chi-square formula $\chi^2 = \sum \frac{(O - E)^2}{E}$
        Cell            O    E    O-E   (O-E)²   (O-E)²/E
        Male w/ Cat     20   25   -5    25       1
        Male w/ Dog     30   25   5     25       1
        Female w/ Cat   30   25   5     25       1
        Female w/ Dog   20   25   -5    25       1
      $\chi^2 = 4$
  • 83. Chi-square Test of Independence • Compare to critical value from chi-square table. • Degrees of freedom is – (number of rows – 1)(number of columns – 1) – In our example (2-1)(2-1) = 1 – Critical value is 3.841 – Our value of 4 is greater than the critical value so reject the null.
                 Cat   Dog   Total
        Male     20    30    50
        Female   30    20    50
        Total    50    50    100
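
The test of independence on slides 79 to 83 can be sketched in Python as follows. The function name `chi_square_independence` and the variable names are illustrative assumptions; expected counts are computed as (row total × column total) / grand total, as described on slide 81.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Rows: male, female; columns: cat, dog
pets = [[20, 30],
        [30, 20]]
chi2_pets, df_pets = chi_square_independence(pets)
print(chi2_pets, df_pets)  # 4.0 1
```

Every expected count here is 25, each cell contributes 25/25 = 1, and the total of 4 exceeds the critical value of 3.841 quoted on slide 83 for df = 1.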

Editor's Notes

  1. Early developments: Sir Francis Galton was very interested in these issues in the 1880s. Galton was the cousin of Darwin, and thus he became interested in evolution and heredity. Galton had an intuition that heredity could be understood in terms of deviations from means. He began to measure characteristics of plants and animals, including people.
     Lots of normality: Galton found many characteristics that were normally distributed: height in humans (by gender), weight in humans (by gender), length of animal bones, and weights of seeds of plants.
     Sweet peas: The next step that Galton took was to look at distributions of measurements of parents and offspring together. He first looked at sweet pea plants because the female plants can self fertilize. This makes examining the data easier because only one parent influences the characteristics of the offspring. He plotted mother seed sizes against daughter seed sizes.