Correlation

Dr.C.Hemamalini
Assistant Professor
Department of Economics
Ethiraj College for Women
Chennai 600 008
India

 The method of correlation was developed by Francis Galton in 1885.
 Correlation is a statistical technique that can reveal whether and how
strongly pairs of variables are associated.
 Correlation is a term used to measure the strength of a linear relationship between
two quantitative variables.
 Correlation is used to measure the closeness of the relationship between
the variables. Example: price and demand.


 Simpson and Kafka
“Correlation analysis deals with the association between two or more variables”.
 Ya-Lun Chou
“Correlation analysis attempts to determine the degree of relationship between variables”.
 Croxton and Cowden
“When the relationship is of a quantitative nature, the appropriate statistical tool for
discovering and measuring the relationship and expressing it in a brief formula is known as
correlation”.

 Correlation measures the degree of relationship existing between the variables, that is,
the strength of their linear relationship.
 Correlation analysis contributes to the understanding of economic behaviour.
 Correlation helps executives to estimate costs, prices and other variables.
 The effect of correlation is to reduce the range of uncertainty. A prediction based on
correlation analysis is likely to be more reliable and nearer to reality.

 It does not tell us anything about the cause and effect relationship.
 It establishes only covariation. The correlation may be due to pure chance, especially
in a small sample.
 The variables may be mutually influencing each other, so that neither can be
designated as the cause and the other as the effect.

Correlation
Positive and Negative
Simple, Partial & Multiple
Linear & Non-Linear

 Whether the correlation between the variables is positive or negative depends on
the direction of change.
 Two variables are positively correlated when they move together in the same
direction. Example: the quantity supplied increases as the price increases.
 The coefficient of a positive correlation lies between 0 and +1.
X 10 12 15 18 20
Y 15 20 22 25 37

 A negative correlation is a relationship between two variables in which an increase in one
variable is associated with a decrease in the other. Example: as the price of a product decreases,
the quantity demanded increases.
 It is an inverse relation between the variables. The coefficient of a negative correlation lies between 0 and -1.
 A zero correlation exists when there is no relationship between two variables. Example: there is no
relationship between the amount of tea drunk and the level of intelligence.

X 100 90 60 40 30
Y 10 20 30 40 50

Figure: Positive Correlation (axes: Price, Quantity): y = 1.8382x - 3.7735, R² = 0.8485
Figure: Negative Correlation (axes: Price, Quantity): y = -0.5108x + 62.688, R² = 0.9704
Figure: Zero Correlation (axes: Level of Intelligence, Amount of Tea Drunk): y = 0.1042x + 72.271, R² = 1E-04
Source: Primary Data
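The trend lines quoted for the positive and negative examples can be checked against the two X and Y data tables given earlier. The following is a minimal sketch, assuming an ordinary least-squares fit; the function name fit_line is illustrative and not part of the original slides.

```python
def fit_line(xs, ys):
    """Return slope, intercept and R^2 of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Data from the positive and negative correlation tables
positive = fit_line([10, 12, 15, 18, 20], [15, 20, 22, 25, 37])
negative = fit_line([100, 90, 60, 40, 30], [10, 20, 30, 40, 50])

print("positive: y = %.4fx %+.4f, R^2 = %.4f" % positive)
print("negative: y = %.4fx %+.4f, R^2 = %.4f" % negative)
```

The sign of the slope gives the direction of the correlation, and the square root of R², taken with that sign, is the coefficient of correlation r.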

 The correlation is said to be simple when only two variables are studied.
 The correlation is said to be multiple when three or more variables are studied
simultaneously. Example: the study of the relationship between the yield of wheat per
acre and the amounts of fertilizer and rainfall.
 In partial correlation more than two variables are studied, but only two of them are
considered to be influencing each other, while the effect of the other influencing
variables is kept constant. Example: the study of the relationship between yield and
fertilizers used during particular periods, with the other influencing variables held constant.
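For three variables, a partial correlation can be computed from the three simple correlations. The sketch below assumes the standard first-order partial correlation formula, which the slides do not state; the function name and the three input values are hypothetical and chosen only for illustration.

```python
import math


def partial_r(r12, r13, r23):
    """First-order partial correlation between variables 1 and 2, holding variable 3 constant."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


# Hypothetical simple correlations (not taken from the slides):
r_yield_fertilizer = 0.8     # variable 1 with variable 2
r_yield_rainfall = 0.6       # variable 1 with variable 3
r_fertilizer_rainfall = 0.5  # variable 2 with variable 3

r_partial = partial_r(r_yield_fertilizer, r_yield_rainfall, r_fertilizer_rainfall)
print(round(r_partial, 4))
```

With these made-up inputs the partial coefficient is smaller than the simple coefficient of 0.8, because part of the yield and fertilizer association is shared with rainfall.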

 The correlation is linear when the amount of change in one variable tends to bear a
constant ratio to the amount of change in the other variable, that is, the ratio of
change between the variables is the same throughout.
 The correlation is called non-linear or curvilinear when the amount of change in
one variable does not bear a constant ratio to the amount of change in the other
variable. Example: if the amount of fertilizer is doubled, the yield of wheat would not
necessarily be doubled.
X 10 20 30 40 50
Y 20 40 60 80 100

Figure: Linear Correlation (axes: Variable X, Variable Y): y = 2x, R² = 1
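The constant-ratio test for linearity can be checked directly on the X and Y series above. This is a minimal sketch: change_ratios is an illustrative helper, and the fertilizer and wheat figures in the second series are hypothetical, made up only to show a curvilinear pattern.

```python
def change_ratios(xs, ys):
    """Ratio of the change in Y to the change in X between consecutive pairs."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]


# Series from the slide: every step of 10 in X adds 20 to Y
linear = change_ratios([10, 20, 30, 40, 50], [20, 40, 60, 80, 100])

# Hypothetical fertilizer (X) and wheat yield (Y) figures with diminishing returns
curvilinear = change_ratios([10, 20, 30, 40, 50], [20, 35, 45, 50, 52])

print(linear)       # the same ratio at every step: linear
print(curvilinear)  # the ratio keeps changing: non-linear
```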

Graphic Method
Scatter Diagram Method
Karl Pearson’s Coefficient of Correlation Method
Spearman’s Rank Correlation Method
Concurrent Deviation Method
Method of Least Squares

 The values of the independent series are plotted on the X axis and those of the dependent
series on the Y axis of graph paper.
 If the graph lines of the two series move together in the upward direction -
Positive Correlation.
 If the graph line of one series moves upward from left to right and that of
the other series moves downward from left to right -
Negative Correlation.

 When the pairs of values are plotted on graph paper, a graph of dots is obtained. It is called
a scatter diagram or dotogram.
 When the dots lie on a straight line that rises from the origin at a 45° angle to the
X axis - Perfect Positive Correlation.
 When the dots lie on a straight line that falls from left to right at a 45° angle to the
X axis - Perfect Negative Correlation.

 Merits
 It is a very simple method of studying the correlation between two variables
 It shows whether the values of the variables are related or not
 The scatter diagram indicates whether the relationship is positive or negative
 Demerits
 The scatter diagram does not measure the precise extent of correlation
 It gives only an approximate idea of the relationship
 It is only a qualitative expression of the quantitative change

 Karl Pearson’s Coefficient of Correlation is used to calculate the degree and direction of
the relationship between linearly related variables.
 Pearson’s method is known as the Pearson Coefficient of Correlation. It is denoted by “r”.
 Pearson’s coefficient of correlation can be transformed into the formula
r = Ʃxy / √(Ʃx² × Ʃy²), where x and y are the deviations of X and Y from their respective means.

Calculate Karl Pearson’s coefficient of correlation from the following data and interpret
its value:
Roll No. of Students   1   2   3   4   5
Marks in Accountancy  48  35  17  23  47
Marks in Statistics   45  20  40  25  45
Solution:
Here x = X - 34 and y = Y - 35 are the deviations of the marks from their means.
Roll No   X    x     x²    Y    y     y²    xy
1         48   14    196   45   10    100   140
2         35   1     1     20   -15   225   -15
3         17   -17   289   40   5     25    -85
4         23   -11   121   25   -10   100   110
5         47   13    169   45   10    100   130
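The solution for the Accountancy and Statistics marks can be sketched in a few lines. This is a minimal sketch, assuming the deviations are taken from the actual means (34 and 35) and that r is the sum of xy divided by the square root of the product of the sums of x² and y²; the function and variable names are illustrative.

```python
import math

accountancy = [48, 35, 17, 23, 47]      # X
statistics_marks = [45, 20, 40, 25, 45]  # Y


def pearson_r(xs, ys):
    """Pearson's r from deviations taken about the actual means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dev_x = [x - mean_x for x in xs]  # the x column of the table
    dev_y = [y - mean_y for y in ys]  # the y column of the table
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_x2 = sum(a * a for a in dev_x)
    sum_y2 = sum(b * b for b in dev_y)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


r = pearson_r(accountancy, statistics_marks)
print(round(r, 4))
```

The value comes to roughly 0.43, which points to a moderate positive correlation between the two sets of marks.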

x   9   8   7   6   5   4   3   2   1
y   15  16  14  13  11  12  10  8   9
Solution
x   x²   y    y²    xy
9   81   15   225   135
8   64   16   256   128
7   49   14   196   98
6   36   13   169   78
5   25   11   121   55
4   16   12   144   48
3   9    10   100   30
2   4    8    64    16
1   1    9    81    9
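The table of x, x², y, y² and xy lends itself to the direct method, in which r is computed from the raw sums without taking deviations. The sketch below assumes that formula, since the worked solution itself is not reproduced in the text; variable names are illustrative.

```python
import math

x = [9, 8, 7, 6, 5, 4, 3, 2, 1]
y = [15, 16, 14, 13, 11, 12, 10, 8, 9]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(v * v for v in x)             # total of the x² column
sum_y2 = sum(v * v for v in y)             # total of the y² column
sum_xy = sum(a * b for a, b in zip(x, y))  # total of the xy column

# Direct method: r = (N·Ʃxy - Ʃx·Ʃy) / sqrt((N·Ʃx² - (Ʃx)²)(N·Ʃy² - (Ʃy)²))
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x2, sum_y2, sum_xy, round(r, 2))
```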

X   100  200  300  400  500  600  700
Y   30   50   60   80   100  110  130
Solution
X     X' = X/100   x = X' - 4   x²   Y     Y' = Y/10   y = Y' - 8   y²   xy
100   1            -3           9    30    3           -5           25   15
200   2            -2           4    50    5           -3           9    6
300   3            -1           1    60    6           -2           4    2
400   4            0            0    80    8           0            0    0
500   5            1            1    100   10          2            4    2
600   6            2            4    110   11          3            9    6
700   7            3            9    130   13          5            25   15
ƩX' = 28   Ʃx = 0   Ʃx² = 28   ƩY' = 56   Ʃy = 0   Ʃy² = 76   Ʃxy = 46
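Dividing X by 100 and Y by 10 before taking deviations does not alter the coefficient, because r is unaffected by a change of origin or scale. A minimal sketch, assuming the deviation formula r = Ʃxy / √(Ʃx² × Ʃy²); the helper name is illustrative.

```python
import math

X = [100, 200, 300, 400, 500, 600, 700]
Y = [30, 50, 60, 80, 100, 110, 130]


def r_from_deviations(dev_x, dev_y):
    """r = sum(xy) / sqrt(sum(x²) · sum(y²)) for deviations taken from the means."""
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_x2 = sum(a * a for a in dev_x)
    sum_y2 = sum(b * b for b in dev_y)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


# Coded deviations as in the table: x = X/100 - 4 and y = Y/10 - 8
coded_x = [v // 100 - 4 for v in X]
coded_y = [v // 10 - 8 for v in Y]
r_coded = r_from_deviations(coded_x, coded_y)

# Same coefficient from the raw values, with deviations from the actual means
mean_x, mean_y = sum(X) / len(X), sum(Y) / len(Y)
r_raw = r_from_deviations([v - mean_x for v in X], [v - mean_y for v in Y])

print(round(r_coded, 4), round(r_raw, 4))
```

Both routes give the same value, which is why the smaller coded figures can be used to shorten the arithmetic.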

WHEN DEVIATIONS ARE FROM AN ASSUMED MEAN
Calculate the coefficient of correlation and calculate the probable error.
X    dx = X - 69   dx²   Y     dy = Y - 112   dy²    dxdy
78   9             81    125   13             169    117
89   20            400   137   25             625    500
99   30            900   156   44             1936   1320
60   -9            81    112   0              0      0
59   -10           100   107   -5             25     50
79   10            100   136   24             576    240
68   -1            1     123   11             121    -11
61   -8            64    108   -4             16     32
ƩX = 593   Ʃdx = 41   Ʃdx² = 1727   ƩY = 1004   Ʃdy = 108   Ʃdy² = 3468   Ʃdxdy = 2248
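The coefficient for this table can be computed from the column totals. The sketch below assumes the usual assumed-mean formula, which corrects for the fact that Ʃdx and Ʃdy are not zero; the assumed means 69 and 112 come from the column headings, and the variable names are illustrative.

```python
import math

X = [78, 89, 99, 60, 59, 79, 68, 61]
Y = [125, 137, 156, 112, 107, 136, 123, 108]
ASSUMED_MEAN_X = 69
ASSUMED_MEAN_Y = 112

dx = [x - ASSUMED_MEAN_X for x in X]
dy = [y - ASSUMED_MEAN_Y for y in Y]

n = len(X)
sum_dx, sum_dy = sum(dx), sum(dy)
sum_dx2 = sum(v * v for v in dx)
sum_dy2 = sum(v * v for v in dy)
sum_dxdy = sum(a * b for a, b in zip(dx, dy))

# r = (N·Ʃdxdy - Ʃdx·Ʃdy) / sqrt((N·Ʃdx² - (Ʃdx)²)(N·Ʃdy² - (Ʃdy)²))
r = (n * sum_dxdy - sum_dx * sum_dy) / math.sqrt(
    (n * sum_dx2 - sum_dx ** 2) * (n * sum_dy2 - sum_dy ** 2)
)

print(sum_dx, sum_dy, sum_dx2, sum_dy2, sum_dxdy, round(r, 4))
```

The result is about 0.97, a very high positive correlation. Any other pair of assumed means would give the same coefficient.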

 Conditions of Probable Error
 The data must approximate to the bell-shaped curve (Normal Frequency Curve).
 The statistical measure for which the probable error is computed must have been taken from
a sample.
 The sample items must be selected in an unbiased manner and must be independent of
each other.
 The Probable Error of the Correlation Coefficient helps in determining the accuracy and
reliability of the value of the coefficient, in so far as it depends on random
sampling.
 Probable Error = 0.6745 × (1 - r²) / √N, where r is the coefficient of correlation and N is
the number of pairs of observations.
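The probable error is a one-line calculation once r and N are known. This sketch assumes the standard expression 0.6745 × (1 - r²) / √N; the function name and the inputs r = 0.97 and N = 8 are illustrative values, not results quoted in the slides.

```python
import math


def probable_error(r, n):
    """Probable error of a correlation coefficient r based on n pairs of observations."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


r, n = 0.97, 8  # illustrative inputs
pe = probable_error(r, n)

# Limits within which the coefficient may be expected to lie
lower, upper = r - pe, r + pe
print(round(pe, 4), round(lower, 4), round(upper, 4))
```

The probable error shrinks as r approaches ±1 and as the number of pairs grows.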

Definition
•Spearman’s Rank Correlation Coefficient is a technique which can be used to
summarise the strength and direction (negative or positive) of a relationship between
two variables. The result will always lie between +1 and -1.
•R = 1 - 6ƩD² / (N(N² - 1))
•Where R denotes the Rank Correlation Coefficient
•D refers to the difference of rank between paired items in the two series
•N refers to the number of pairs of observations
•Rank Correlation (when ranks are not given)
•Ranks can be assigned by taking either the highest value as 1 or the lowest value as 1

 The value of this coefficient of correlation lies between +1 and -1.
 The sum of the differences between the corresponding ranks is zero, i.e. ∑d = 0.
 It is independent of the nature of the distribution from which the sample data are collected for
calculation of the coefficient.
 It is calculated on the basis of the ranks of the individual items rather than their actual
values.
 Its result equals that of Karl Pearson’s coefficient of correlation unless there is a
repetition of any rank. This is because Spearman’s correlation is nothing more than
Pearson’s coefficient of correlation between the ranks.

Rank (English)   Rank (Maths)   d = (Rx - Ry)   d²
6                3              3               9
5                8              -3              9
3                4              -1              1
10               9              1               1
2                1              1               1
4                6              -2              4
9                10             -1              1
7                7              0               0
8                5              3               9
1                2              -1              1
Ʃd² = 36
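With the ranks already given, the coefficient follows directly from the sum of squared rank differences. A minimal sketch, assuming Spearman's formula R = 1 - 6Ʃd² / (N(N² - 1)); variable names are illustrative.

```python
rank_english = [6, 5, 3, 10, 2, 4, 9, 7, 8, 1]
rank_maths = [3, 8, 4, 9, 1, 6, 10, 7, 5, 2]

# Signed rank differences d = Rx - Ry; they must add up to zero
d = [rx - ry for rx, ry in zip(rank_english, rank_maths)]
sum_d2 = sum(v * v for v in d)

n = len(d)
R = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum(d), sum_d2, round(R, 4))
```

The value is about 0.78, so the ranks in English and Maths agree fairly closely.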

x   50  66  34  21  15  79  42
y   31  64  53  41  17  73  29
Solution
Marks by X   Rx   Marks by Y   Ry   D² = (Rx - Ry)²
50           5    31           3    4
66           6    64           6    0
34           3    53           5    4
21           2    41           4    4
15           1    17           1    0
79           7    73           7    0
42           4    29           2    4
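When only the marks are given, the ranks have to be assigned first. The sketch below takes the lowest value as rank 1, matching the table, and assumes there are no repeated values; assign_ranks is an illustrative helper, not a library function.

```python
x = [50, 66, 34, 21, 15, 79, 42]
y = [31, 64, 53, 41, 17, 73, 29]


def assign_ranks(values):
    """Rank each value, lowest as 1; valid only when no value is repeated."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


rx = assign_ranks(x)
ry = assign_ranks(y)

sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
n = len(x)
R = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(rx, ry, sum_d2, round(R, 4))
```

With seven pairs the largest possible rank is 7, so the mark of 79 takes rank 7. The coefficient comes to about 0.71.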

Marks in Commerce   15  20  28  12  40  60  20  80
Marks in Maths      40  30  50  30  20  10  30  60
Marks in Commerce (X)   Rank (Rx)   Marks in Mathematics (Y)   Rank (Ry)   D = (Rx - Ry)   D²
15                      2           40                         6           -4              16
20                      3.5         30                         4           -0.5            0.25
28                      5           50                         7           -2              4
12                      1           30                         4           -3              9
40                      6           20                         2           4               16
60                      7           10                         1           6               36
20                      3.5         30                         4           -0.5            0.25
80                      8           60                         8           0               0
Marks in Commerce and Mathematics are uncorrelated
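The conclusion can be reproduced by giving tied marks the average of the ranks they occupy and adding a correction for each group of ties. This sketch assumes the usual correction term (m³ - m)/12 for a value repeated m times, which is not spelled out in the text but is consistent with the stated result; the helper names are illustrative.

```python
from collections import Counter

commerce = [15, 20, 28, 12, 40, 60, 20, 80]
maths = [40, 30, 50, 30, 20, 10, 30, 60]


def average_ranks(values):
    """Rank values with the lowest as 1; tied values share the mean of their positions."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def tie_correction(values):
    """Sum of (m³ - m)/12 over every value that occurs m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)


rx = average_ranks(commerce)
ry = average_ranks(maths)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))

n = len(commerce)
adjusted = sum_d2 + tie_correction(commerce) + tie_correction(maths)
R = 1 - 6 * adjusted / (n * (n ** 2 - 1))

print(rx, ry, sum_d2, R)
```

Here ƩD² is 81.5 and the two corrections add 0.5 and 2, so the adjusted total is 84 and R works out to zero.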

 Reference Books
 S.P. Gupta, Statistical Methods, Sultan Chand & Sons, New Delhi, 2017
 Web Source
 https://www.youtube.com/watch?v=4EXNedimDMs
 https://www.youtube.com/watch?v=YoeV_1M3xuc
