This presentation explains the theory of frequency distributions, central tendency, and measures of dispersion. It also includes worked numerical examples of finding measures of central tendency (CT) for grouped and ungrouped data.
2. “The table into which the data are grouped is referred to as
frequency distribution, the average that can be computed are
measures of Central Tendency or Central location of the data,
and the measure of spread around the average is a measure
of Dispersion”
Source: Morris Hamburg- Statistical analysis for decision making
3.
Frequency distribution
Ungrouped
Grouped
Central Tendency
Mean
Median
Mode
Measures of Dispersion
Range
Mean deviation
Standard Deviation
4.
Frequency distribution is a way of organizing your
data so that it makes more sense.
A frequency distribution of data can be shown in a
table or graph.
Some common methods of showing frequency
distributions include frequency tables, histograms or
bar charts.
Frequency distribution
5.
Given below are marks obtained by 20 students in Math
out of 25.
21, 23, 19, 17, 12, 15, 15, 17, 17, 19, 23, 23, 21, 23, 25, 25,
21, 19, 19, 19
Frequency distribution of ungrouped data:
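As a minimal sketch (Python is our choice here, and the variable names are illustrative rather than taken from the slides), the ungrouped frequency distribution of these marks can be tallied by counting how often each distinct mark occurs:

```python
from collections import Counter

# Marks of the 20 students, copied from the slide.
marks = [21, 23, 19, 17, 12, 15, 15, 17, 17, 19,
         23, 23, 21, 23, 25, 25, 21, 19, 19, 19]

# Counter maps each distinct mark to the number of times it occurs.
freq = Counter(marks)

print("Marks  Frequency")
for value in sorted(freq):
    print(f"{value:>5}  {freq[value]:>9}")
print(f"Total  {sum(freq.values()):>9}")
```

Sorting the keys before printing gives the table in ascending order of marks, which is how such tables are usually written.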
6. The earlier data can also be arranged into groups. These groups are called classes or
class intervals.
Each class interval is bounded by two figures called
the class limits.
Frequency distribution of grouped data:
Marks obtained by 20 students in Math out of 25.
21, 23, 19, 17, 12, 15, 15, 17, 17, 19, 23, 23, 21, 23, 25, 25, 21, 19, 19, 19
7. Exclusive form of data:
Here the class intervals are 0 - 10, 10 - 20, 20 - 30. Each
class includes its lower limit but excludes its upper limit.
So, 10 - 20 means values of 10 or more but less
than 20.
20 - 30 means values of 20 or more but less
than 30.
Marks obtained by 20 students in Math out of 25.
21, 23, 19, 17, 12, 15, 15, 17, 17, 19, 23, 23, 21, 23, 25, 25, 21, 19, 19, 19
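The exclusive rule (lower limit included, upper limit excluded) can be sketched as below. The helper name `exclusive_counts` is hypothetical, introduced only for this illustration:

```python
def exclusive_counts(data, edges):
    """Count values per class where each class is lower <= x < upper."""
    counts = {}
    for lower, upper in zip(edges, edges[1:]):
        counts[(lower, upper)] = sum(1 for x in data if lower <= x < upper)
    return counts

# Marks of the 20 students, copied from the slide.
marks = [21, 23, 19, 17, 12, 15, 15, 17, 17, 19,
         23, 23, 21, 23, 25, 25, 21, 19, 19, 19]

# Class boundaries 0, 10, 20, 30 give the classes 0-10, 10-20, 20-30.
table = exclusive_counts(marks, [0, 10, 20, 30])
for (lower, upper), count in table.items():
    print(f"{lower} - {upper}: {count}")
```

Because the upper limit is excluded, a mark of exactly 20 would fall in the class 20 - 30, never in 10 - 20.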
8. Data in the inclusive form:
Marks obtained by 20 students of class VIII in a Math test are given below.
23, 0, 14, 10, 15, 3, 8, 16, 18, 20, 1, 3, 20, 23, 24, 15, 24, 22, 14, 13
Let us represent this data in the inclusive form.
Here too we arrange the data into groups called class intervals,
i.e., 0 - 10, 11 - 20, 21 - 30. Both limits belong to the class.
0 - 10 means values from 0 to 10, including both 0 and 10.
Here, 0 is the lower limit and 10 is the upper limit.
11 - 20 means values from 11 to 20, including both 11 and 20.
Here, 11 is the lower limit and 20 is the upper limit.
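A minimal sketch of the inclusive rule, where both class limits belong to the class. The helper name `inclusive_counts` is an assumption made for this example:

```python
def inclusive_counts(data, classes):
    """Count values per class where each class is lower <= x <= upper."""
    return {(lower, upper): sum(1 for x in data if lower <= x <= upper)
            for lower, upper in classes}

# Marks of the 20 class VIII students, copied from the slide.
marks = [23, 0, 14, 10, 15, 3, 8, 16, 18, 20,
         1, 3, 20, 23, 24, 15, 24, 22, 14, 13]

table = inclusive_counts(marks, [(0, 10), (11, 20), (21, 30)])
for (lower, upper), count in table.items():
    print(f"{lower} - {upper}: {count}")
```

Here a mark of exactly 10 is counted in 0 - 10 and a mark of exactly 20 in 11 - 20, since the upper limits are included.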
9. A frequency table is a simple way to display the number of
occurrences of a particular value or characteristic.
The absolute frequency describes the number of times a
particular value for a variable (data item) has been
observed to occur.
A relative frequency describes the number of times a
particular value for a variable (data item) has been
observed to occur in relation to the total number of values
for that variable:
Relative frequency = Class frequency / Total frequency
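To make the two definitions concrete, the sketch below computes absolute and relative frequencies. It assumes the marks of the 20 students from the earlier ungrouped example; the variable names are illustrative:

```python
from collections import Counter

# Marks from the earlier ungrouped example (20 students).
marks = [21, 23, 19, 17, 12, 15, 15, 17, 17, 19,
         23, 23, 21, 23, 25, 25, 21, 19, 19, 19]

absolute = Counter(marks)          # absolute frequency of each mark
total = sum(absolute.values())     # total frequency

# Relative frequency = class frequency / total frequency.
relative = {value: count / total for value, count in absolute.items()}

for value in sorted(absolute):
    print(value, absolute[value], relative[value])
```

The relative frequencies add up to 1 (or 100 if expressed as percentages), which is a quick check on the table.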
10.
Frequency Graphs
Histograms and bar charts are both visual displays
of frequencies using columns plotted on a graph.
The Y-axis (vertical axis) generally represents the
frequency count, while the X-axis (horizontal axis)
generally represents the variable being measured.
11. A histogram is a type of graph in which each
column represents a numeric variable, in particular
that which is continuous and/or grouped.
12. A bar chart is a type of graph in which each column
(plotted either vertically or horizontally) represents a
categorical variable or a discrete ungrouped numeric
variable.
13. Example: Let’s say you have a list of IQ scores for a gifted
classroom in a particular elementary school. The IQ scores are: 118,
123, 124, 125, 127, 128, 129, 130, 130, 133, 136, 138, 141, 142, 149, 150,
154. That list doesn’t tell you much about anything. You could
draw a frequency distribution table, which will give a better
picture of your data.
Step 1: Figure out how many classes (categories) you need.
There are no hard rules about how many classes to pick, but
there are a couple of general guidelines:
Pick between 5 and 20 classes. Eg. For the list of IQs above, we
picked 5 classes.
Make sure you have a few items in each category. For example, if
you have 20 items, choose 5 classes (4 items per category), not 20
classes (which would give you only 1 item per category).
How to Draw a Frequency
Distribution Table: Steps.
14. Note: There is a more mathematical way to choose the number of classes. The formula
is log(observations) / log(2), with the answer rounded up to the next
integer. For example, log(17) / log(2) ≈ 4.1, which is rounded up to 5.
Step 2: Subtract the minimum data value from the
maximum data value. For example, our IQ list above had
a minimum value of 118 and a maximum value of 154, so:
154 – 118 = 36
Step 3: Divide your answer in Step 2 by the number of
classes you chose in Step 1.
36 / 5 = 7.2
Step 4: Round the number from Step 3 up to a whole
number to get the class width.
Rounded up, 7.2 becomes 8.
Step 5: Write down the lowest data value as the lower limit of
the first class: The lowest value is 118
15. Step 6: Add the class width from Step 4 to the value from Step 5 to get the next lower class
limit:
118 + 8 = 126
Step 7: Repeat Step 6 for the remaining lower class limits (in other words, keep
on adding the class width to the previous lower limit) until you have
created the number of classes you chose in Step 1. We chose 5 classes, so our 5
lower class limits are:
118
126 (118 + 8)
134 (126 + 8)
142 (134 + 8)
150 (142 + 8)
Step 8: Write down the upper class limits. These are the highest values that
can be in each class, so in most cases you can subtract 1 from the class
width and add that to the lower class limit. For example:
118 + (8 – 1) = 125
118 – 125
126 – 133
134 – 141
142 – 149
150 – 157
16. Step 9: Add a second column for the number of items in each class, and label the
columns with appropriate headings:
IQ NUMBER
118-125
126-133
134-141
142-149
150-157
Step 10: Count the number of items in each class, and put the total in the second
column. The list of IQ scores is: 118, 123, 124, 125, 127, 128, 129, 130, 130, 133, 136,
138, 141, 142, 149, 150, 154.
IQ NUMBER
118-125 4
126-133 6
134-141 3
142-149 2
150-157 2
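The ten steps above can be sketched in a few lines. This is an illustration under the same choices as the slides (5 classes, inclusive integer class limits); the variable names are our own:

```python
import math

# IQ scores copied from the slide.
iq = [118, 123, 124, 125, 127, 128, 129, 130, 130,
      133, 136, 138, 141, 142, 149, 150, 154]

# Step 1: number of classes, using the log rule from the note.
num_classes = math.ceil(math.log2(len(iq)))        # log(17)/log(2), rounded up

# Steps 2-4: range, divided by the number of classes, rounded up.
data_range = max(iq) - min(iq)                     # 154 - 118
class_width = math.ceil(data_range / num_classes)  # 7.2 rounded up

# Steps 5-7: lower class limits, starting from the lowest value.
lower_limits = [min(iq) + i * class_width for i in range(num_classes)]

# Steps 8-10: upper limits and the count of items in each class.
table = []
for lower in lower_limits:
    upper = lower + (class_width - 1)
    count = sum(1 for score in iq if lower <= score <= upper)
    table.append((lower, upper, count))

print("IQ       NUMBER")
for lower, upper, count in table:
    print(f"{lower}-{upper}  {count}")
```

Rounding the width up rather than to the nearest integer guarantees that the last class still reaches the maximum value.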
17.
Measures of central tendency give us an idea of
where the values concentrate in the central part of
the frequency distribution of the whole data.
The term central tendency refers to the “middle”
value or perhaps a typical value of the data, and is
measured using the mean, median, or mode.
Each of these measures is calculated differently, and
the one that is best to use depends upon the
situation.
Central Tendency
18. Arithmetic mean
The mean is the most commonly used measure of
central tendency. When we talk about an “average”, we
usually mean the arithmetic mean.
The mean is simply the sum of the values divided by
the total number of items in the set. The result is
referred to as the arithmetic mean.
Sometimes it is useful to give more weight to certain
data points; in that case the result is called the weighted
arithmetic mean. E.g. marks weighted by subject credits.
Mean
19.
Example 1:
What is the Mean of these numbers?
6, 11, 7
Add the numbers: 6 + 11 + 7 = 24
Divide by how many numbers (there are 3 numbers):
24 / 3 = 8
The Mean is 8
20. Ungrouped Data
Example 2: Calculate Arithmetic mean for
x      f (frequency)   fx
1      5               5
2      9               18
3      12              36
4      17              68
5      14              70
6      10              60
7      16              112
Total  N = 83          369
x̄ = sum of fx / N
= 369/83
= 4.45 (approx.)
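A minimal sketch of this calculation, computing each fx product directly from the x and f columns rather than from a pre-filled fx column. The variable names are illustrative:

```python
# Values (x) and their frequencies (f), copied from the table.
x = [1, 2, 3, 4, 5, 6, 7]
f = [5, 9, 12, 17, 14, 10, 16]

# fx column: each value multiplied by its frequency.
fx = [value * freq for value, freq in zip(x, f)]

n = sum(f)               # total frequency N
mean = sum(fx) / n       # arithmetic mean = sum of fx / N

for value, freq, product in zip(x, f, fx):
    print(value, freq, product)
print("N =", n, " sum of fx =", sum(fx), " mean =", round(mean, 2))
```

Recomputing the products in code is a cheap guard against slips in a hand-built fx column.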
21. Grouped Data
Example 3: Calculate Arithmetic mean for
Marks   Students (f)   Mid Value (x)   fx
0-10    12             5               60
10-20   18             15              270
20-30   27             25              675
30-40   20             35              700
40-50   17             45              765
50-60   6              55              330
Total   100                            2800
x̄ = sum of fx / N
= 2800/100
= 28
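For grouped data the same formula is applied to the class mid values. The sketch below assumes each class is represented by its midpoint, as in the table; the variable names are our own:

```python
# (lower limit, upper limit, frequency) for each class, from the table.
classes = [(0, 10, 12), (10, 20, 18), (20, 30, 27),
           (30, 40, 20), (40, 50, 17), (50, 60, 6)]

total_f = 0
total_fx = 0.0
for lower, upper, freq in classes:
    mid = (lower + upper) / 2      # mid value x of the class
    total_f += freq
    total_fx += freq * mid         # contribution to sum of fx

mean = total_fx / total_f
print("N =", total_f, " sum of fx =", total_fx, " mean =", mean)
# prints: N = 100  sum of fx = 2800.0  mean = 28.0
```

The result is an estimate, since it treats every observation in a class as if it sat at the class midpoint.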
22.
The Median is the "middle" of a sorted list of numbers,
i.e. it is the middlemost value of the data arranged in
ascending or descending order of magnitude.
Thus the median is a positional average.
Median
23. Example 1:
How to Find the Median Value
3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29
When we put those numbers in order we have:
3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 39, 40, 56
There are fifteen numbers, so the middle one is
the eighth number (seven values lie on either side of it).
The median value of this set of numbers is 23.
24. Example 2: Two Numbers in the Middle
3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29
When we put those numbers in order we have:
3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 40, 56
There are now fourteen numbers, so we don't have just one
middle number; we have a pair of middle numbers, the seventh and the eighth.
In this example the middle numbers are 21 and 23.
To find the value halfway between them, add them together and
divide by 2:
(21 + 23)/2
then 44 ÷ 2 = 22
So the Median in this example is 22.
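Both cases (one middle number, or a pair of middle numbers) can be sketched in one small function. The name `median` below is our own helper, not a library call:

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # halfway between the middle pair

example_1 = [3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29]
example_2 = [3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29]

print(median(example_1))  # prints 23
print(median(example_2))  # prints 22.0
```

Python's standard library offers `statistics.median`, which follows the same rule.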
25. Median of grouped data
$$\text{Median} = L_1 + \frac{\frac{N}{2} - C}{f} \times h$$
L1 = lower class limit of the class interval in which the
median lies (the median class)
N = total frequency (the last cumulative frequency)
C = cumulative frequency of the class preceding the
median class
f = frequency of the median class
h = width of the class interval
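26. Example 3:
Wages (in Rs.)   Labourers (f)   Cumulative Frequency
20-30            3               3
30-40            5               8
40-50            20              28
50-60            10              38
60-70            5               43
N = 43, so N/2 = 21.5. The first class whose cumulative frequency reaches 21.5 is 40-50, so it is the median class: L1 = 40, C = 8, f = 20, h = 10.
$$\text{Median} = L_1 + \frac{\frac{N}{2} - C}{f} \times h = 40 + \frac{21.5 - 8}{20} \times 10 = 40 + 6.75 = 46.75$$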
27. Example 4:
Class Interval Frequency Cumulative
Frequency
0-10 5 5
10-20 14 19
20-30 29 48
30-40 21 69
40-50 25 94
50-60 21 115
60-70 10 125
70-80 7 132
80-90 15 147
90-100 3 150
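N = 150, so N/2 = 75. The first class whose cumulative frequency reaches 75 is 40-50, so L1 = 40, C = 69, f = 25, h = 10.
$$\text{Median} = 40 + \frac{\frac{150}{2} - 69}{25} \times 10 = 40 + 2.4 = 42.4$$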
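The grouped-median calculation in Examples 3 and 4 can be sketched as below. The function name `grouped_median` is hypothetical, and the sketch assumes the classes are listed in ascending order:

```python
def grouped_median(classes):
    """classes: list of (lower limit, upper limit, frequency), in ascending order."""
    n = sum(freq for _, _, freq in classes)   # total frequency N
    cumulative = 0                            # C: cumulative frequency before the current class
    for lower, upper, freq in classes:
        if cumulative + freq >= n / 2:        # first class reaching N/2 is the median class
            h = upper - lower                 # class width
            return lower + (n / 2 - cumulative) / freq * h
        cumulative += freq

wages = [(20, 30, 3), (30, 40, 5), (40, 50, 20), (50, 60, 10), (60, 70, 5)]
example_4 = [(0, 10, 5), (10, 20, 14), (20, 30, 29), (30, 40, 21), (40, 50, 25),
             (50, 60, 21), (60, 70, 10), (70, 80, 7), (80, 90, 15), (90, 100, 3)]

print(round(grouped_median(wages), 2))      # prints 46.75
print(round(grouped_median(example_4), 2))  # prints 42.4
```

Keeping a running cumulative total means the cumulative frequency column does not need to be supplied separately.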
28.
The mode refers to that value in a distribution which
occurs most frequently.
It is an actual value which has the highest
concentration of items in and around it.
Mode
29. Example 1 :
3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29
In order these numbers are:
3, 5, 7, 12, 13, 14, 20, 23, 23, 23, 23, 29, 39, 40, 56
This makes it easy to see which numbers appear most
often.
In this case the mode is 23.
30. More Than One Mode
Example 2:
{1, 3, 3, 3, 4, 4, 6, 6, 6, 9}
3 appears three times, as does 6.
So there are two modes: at 3 and 6
Having two modes is called "bimodal".
Having more than two modes is called "multimodal".
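A minimal sketch that returns every mode, so that bimodal and multimodal data are handled as well. The helper name `modes` is our own:

```python
from collections import Counter

def modes(values):
    """Return all values that share the highest frequency, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)

example_1 = [3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29]
example_2 = [1, 3, 3, 3, 4, 4, 6, 6, 6, 9]

print(modes(example_1))  # prints [23]
print(modes(example_2))  # prints [3, 6]
```

Returning a list rather than a single number avoids silently dropping one of the modes when there is a tie.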
31. Mode for Grouped Data
$$\text{Mode} = L_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$$
f1 = frequency of the modal class
f0 = frequency of the class preceding the modal class
f2 = frequency of the class succeeding the modal class
L1 = lower limit of the modal class
h = width of the class interval
32. Example 3:
Find mode for
Class Frequency
0-5 9
5-10 12
10-15 15
15-20 16
20-25 17
25-30 15
30-35 10
35-40 13
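The modal class is 20-25, the class with the highest frequency: L1 = 20, f1 = 17, f0 = 16 (class 15-20), f2 = 15 (class 25-30), h = 5.
$$\text{Mode} = L_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h = 20 + \frac{17 - 16}{(2 \times 17) - 16 - 15} \times 5 = 20 + \frac{5}{3} \approx 21.667$$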
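The grouped-mode calculation can be sketched as follows. The function name `grouped_mode` is hypothetical; the sketch assumes a single modal class and treats a missing neighbouring class as having frequency 0:

```python
def grouped_mode(classes):
    """classes: list of (lower limit, upper limit, frequency), in ascending order."""
    freqs = [freq for _, _, freq in classes]
    i = freqs.index(max(freqs))                       # position of the modal class
    f1 = freqs[i]                                     # frequency of the modal class
    f0 = freqs[i - 1] if i > 0 else 0                 # class just before the modal class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0    # class just after the modal class
    lower, upper, _ = classes[i]
    h = upper - lower                                 # class width
    # Note: the denominator is zero only when f0 = f1 = f2 (no distinct peak).
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * h

data = [(0, 5, 9), (5, 10, 12), (10, 15, 15), (15, 20, 16),
        (20, 25, 17), (25, 30, 15), (30, 35, 10), (35, 40, 13)]

print(round(grouped_mode(data), 3))  # prints 21.667
```

If two classes tie for the highest frequency, `index` picks the first one, so such data would need separate handling.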
34.
Recap: Levels of Measurement
Level of Measurement: the measurement values can be…
Nominal Scale: distinguished
Ordinal Scale: distinguished and ranked
Interval Scale: distinguished, ranked, and measured with constant units of measurement
Ratio Scale: distinguished, ranked, measured with constant units of measurement, and have a zero point
35.
Dispersion or spread is the degree of scatter or
variation of the variable about a central value.
Measures of dispersion tell us how spread out our data is.
Range
Mean deviation
Variance, Standard Deviation
Measure of dispersion
36.
Range is the simplest measure of dispersion. It is defined
as the difference between the two extreme observations.
Range = Largest value- Smallest value
Find the range for the following data-
3, 6, 7, 9, 10
Range= 10-3 =7
Range
37. Coefficient of Range
$$\text{Coefficient of Range} = \frac{X_{max} - X_{min}}{X_{max} + X_{min}}$$
Range is an absolute measure of dispersion which is
unfit for purposes of comparison if the data are in
different units.
Eg. Range of data of heights and range of data of
weights cannot be compared because measurement
units are not same.
Instead, we can use the coefficient of range, which has no units, to compare the spread
of the two data sets.
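As a short illustration, the range and the coefficient of range can be computed for the data 3, 6, 7, 9, 10 from the range example. The function names are our own:

```python
def value_range(values):
    """Range = largest value - smallest value (in the units of the data)."""
    return max(values) - min(values)

def coefficient_of_range(values):
    """(Xmax - Xmin) / (Xmax + Xmin): a unit-free, relative measure."""
    return (max(values) - min(values)) / (max(values) + min(values))

data = [3, 6, 7, 9, 10]
print(value_range(data))                     # prints 7
print(round(coefficient_of_range(data), 3))  # prints 0.538
```

Because the units cancel in the ratio, the coefficient for a set of heights can be compared directly with the coefficient for a set of weights.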
38.
The Mean deviation is the arithmetic mean of the
absolute deviations of the individual values from an average
(mean, median or mode) of the given data.
Mean deviation is denoted by small delta ‘δ’
δx̄ (mean deviation from mean)
δmd (mean deviation from median)
δmo (mean deviation from mode)
Mean deviation
39. Coefficient of Mean deviation
It gives a relative measure of dispersion suitable
for comparing two or more data sets which are expressed in
different units of measurement.
$$\text{Coefficient of Mean deviation} = \frac{\text{Mean deviation}}{\text{Mean or Median or Mode}}$$
If the result is desired as a percentage, then
$$\text{Coefficient of Mean deviation} = \frac{\text{Mean deviation}}{\text{Mean or Median or Mode}} \times 100$$
40. Example: Find the Mean deviation from Mean, Median and Mode
Sr. No.   Value   Deviation from Mean   Deviation from Median   Deviation from Mode
1         10      11                    11                      10
2         15      6                     6                       5
3         18      3                     3                       2
4         20      1                     1                       0
5         20      1                     1                       0
6         22      1                     1                       2
7         23      2                     2                       3
8         25      4                     4                       5
9         27      6                     6                       7
10        30      9                     9                       10
Total     210     44                    44                      44
Mean = 210/10 = 21
Median = ½ (10+1)th observation or 5.5th observation
= (20+22)/2 = 21
Mode = 20
Deviations are taken as absolute values (signs ignored).
Mean Deviation = 44/10 = 4.4 (from the mean, the median and the mode alike in this example)
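The three mean deviations can be checked with a short sketch. It uses Python's standard `statistics` module for the averages; the helper name `mean_deviation` is our own:

```python
import statistics

def mean_deviation(values, center):
    """Arithmetic mean of the absolute deviations of the values from center."""
    return sum(abs(x - center) for x in values) / len(values)

# The ten values from the table.
values = [10, 15, 18, 20, 20, 22, 23, 25, 27, 30]

mean = statistics.mean(values)      # 21
median = statistics.median(values)  # (20 + 22) / 2 = 21
mode = statistics.mode(values)      # 20

print(mean_deviation(values, mean))    # prints 4.4
print(mean_deviation(values, median))  # prints 4.4
print(mean_deviation(values, mode))    # prints 4.4

# Coefficient of mean deviation about the mean, as a percentage.
print(round(mean_deviation(values, mean) / mean * 100, 2))  # prints 20.95
```

The three results coincide for this data set only; in general the mean deviation depends on which average is used.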
41.
The Standard Deviation is a measure of how spread out
numbers are. Its symbol is σ (the Greek letter sigma).
The formula for SD: the square root of the Variance.
The Variance is defined as the average of
the squared differences from the Mean.
Variance & Standard Deviation
42. The heights (at the shoulders) are: 600mm, 470mm,
170mm, 430mm and 300mm.
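Applying the definitions to these five heights gives the sketch below. It assumes the population form of the variance (dividing by N), which matches "the average of the squared differences from the Mean"; the variable names are illustrative:

```python
import math

# Heights in mm, copied from the slide.
heights = [600, 470, 170, 430, 300]

n = len(heights)
mean = sum(heights) / n                           # 1970 / 5

# Squared difference of each height from the mean.
squared_diffs = [(h - mean) ** 2 for h in heights]

variance = sum(squared_diffs) / n                 # average of the squared differences
std_dev = math.sqrt(variance)                     # standard deviation (sigma)

print("mean =", mean)                     # prints mean = 394.0
print("variance =", variance)             # prints variance = 21704.0
print("std dev =", round(std_dev, 2))     # prints std dev = 147.32
```

With a standard deviation of roughly 147 mm around a mean of 394 mm, heights between about 247 mm and 541 mm lie within one standard deviation of the mean.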
43. Using the Standard Deviation we have a "standard" way of knowing what is normal, and
what is extra large or extra small.
44.
Statistical hypothesis testing is founded on the assessment
of differences and similarities between frequency
distributions.
This assessment also involves measures of central
tendency or averages, such as the mean and median,
and measures of variability or statistical dispersion,
such as the standard deviation or variance.
Why do Planners need to use
FD, CT & SD?