2. Population and sample
Population: unknown; we would like to make inferences (statements) about it
Take a sample
Use the sample to say something about the population
3. Sampling: A Pictorial View
[Figure: the sample, the sampled (study) population, and the target population]
4. Sampling variation
We need to distinguish between a population and a sample drawn from
that population
• Data from a sample:
• The mean of a variable (x̄) is considered an estimate of the
true population mean (µ)
• The standard deviation of the variable (s) estimates the
population standard deviation (σ)
However, if another sample is drawn:
The second sample mean will differ from the first sample mean
• This is called sampling variation
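Sampling variation can be illustrated with a small simulation. This is only a sketch: the population below is hypothetical (10,000 simulated haemoglobin-like values with mean 13 and standard deviation 1.5), and the sample size of 50 is an arbitrary choice.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population (an assumption, not data from the slides).
population = [random.gauss(13.0, 1.5) for _ in range(10_000)]
mu = statistics.mean(population)

# Draw two independent samples of the same size from that population.
sample_1 = random.sample(population, 50)
sample_2 = random.sample(population, 50)

mean_1 = statistics.mean(sample_1)
mean_2 = statistics.mean(sample_2)

# Both sample means estimate mu, but they differ from mu and from each other.
print(f"population mean: {mu:.2f}")
print(f"sample 1 mean:   {mean_1:.2f}")
print(f"sample 2 mean:   {mean_2:.2f}")
```

Each sample mean lands close to µ, but not exactly on it and not on the other sample mean; that difference is the sampling variation.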
5. µ = population mean; x̄1, x̄2 are sample means
[Figure: frequency curves for the population and for two samples, with the sample means x̄1 and x̄2 marked alongside µ]
6. How variable are the sample means?
The Standard Deviation (SD):
• The sample standard deviation (s) estimates the variability of the
individual data in the population (σ)
The Standard Error (SE):
• Represents the variability of the sample means
• Can be estimated from the standard deviation as s/√n
• Standard Error decreases as the sample size increases
The standard deviation represents the variability in the individual data
The standard error represents the variability in the sample means
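The s/√n estimate can be sketched in a few lines. The function name `standard_error` and the simulated weights (mean 70, standard deviation 10) are assumptions made for illustration only.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

def standard_error(sample):
    """Estimate the standard error of the mean as s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(len(sample))

# Hypothetical data: simulated weights, drawn at increasing sample sizes.
se_by_n = {}
for n in (25, 100, 400):
    sample = [random.gauss(70.0, 10.0) for _ in range(n)]
    se_by_n[n] = standard_error(sample)
    print(f"n = {n:3d}  s = {statistics.stdev(sample):5.2f}  SE = {se_by_n[n]:.2f}")
```

The standard deviation stays near 10 whatever the sample size, while the standard error roughly halves each time n is quadrupled.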
8. Definitions
Continuous data are data such as age, weight, height,
haemoglobin.
In descriptive statistics we use the mean and standard
deviation to describe these data
Assumption: the data are approximately Normal
If not, we could use the median and interquartile range (IQR)
to describe the data
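As a rough illustration of the choice between the two summaries, the sketch below computes both on a small, made-up, right-skewed data set (hypothetical lengths of hospital stay in days). The quartiles use the default method of Python's `statistics.quantiles`; other software may give slightly different quartile values.

```python
import statistics

# Hypothetical right-skewed data: one long stay pulls the mean upwards.
stay_days = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 9, 30]

mean = statistics.mean(stay_days)
sd = statistics.stdev(stay_days)

median = statistics.median(stay_days)
q1, _, q3 = statistics.quantiles(stay_days, n=4)  # quartile cut points

print(f"mean = {mean:.1f}, SD = {sd:.1f}")
print(f"median = {median:.1f}, IQR = {q1:.2f} to {q3:.2f}")
```

Here the mean (6 days) sits well above the median (3.5 days) because of the single 30-day stay, which is the kind of situation where the median and IQR describe the data better.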
9. WHY THE NORMAL DISTRIBUTION IS
IMPORTANT
It is a good empirical description of the distribution of many
variables
The sampling distribution of a mean is approximately Normal, even
when the individual observations are not Normally
distributed, provided that the sample size is not too
small
It occupies a central role in statistical analysis:
CIs, P-values, proportions and rates
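The statement about the sampling distribution of a mean can be checked by simulation. The sketch below is an illustration under assumed settings: individual observations come from a right-skewed exponential distribution with mean 1, each sample has 40 observations, and 2,000 samples are drawn.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

def sample_mean(n):
    """Mean of n draws from a right-skewed exponential distribution (mean 1)."""
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

# Sampling distribution of the mean: many samples, one mean per sample.
means = [sample_mean(40) for _ in range(2000)]

centre = statistics.mean(means)
spread = statistics.stdev(means)

# For a roughly Normal distribution about 95% of values lie within 2 SD.
within_2 = sum(abs(m - centre) <= 2 * spread for m in means) / len(means)

print(f"centre of sample means: {centre:.3f}")
print(f"spread of sample means: {spread:.3f}")
print(f"share within 2 spreads: {within_2:.3f}")
```

Although the individual observations are strongly skewed, the sample means cluster symmetrically around 1, with a spread close to 1/√40 and a share within two spreads close to 95%.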
10. What is the Distribution?
Gives us a picture of
the variability and
central tendency.
Can also show the
amount of skewness.
13. Characteristics of Normal Distributions
Mean = Median = Mode
Bilaterally symmetrical
Tails never touch x-axis
Total area under curve = 1
14. Notation
A random sample of size n is taken from the population of
interest.
The mean and standard deviation of the quantitative variable
x in the population and in the sample are given by:
Population: mean µ, standard deviation σ
Sample: mean x̄, standard deviation s
15. In a normal distribution:
~68% of observations lie between –1 and 1 standard deviations from
the mean
~95% of observations lie between –2 and 2 standard deviations from
the mean (more precisely, –1.96 to +1.96)
~99.7% of observations lie between –3 and 3 standard deviations from
the mean
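These percentages can be reproduced from the standard Normal distribution using the error function. The helper name `prob_within` is introduced here purely for illustration.

```python
import math

def prob_within(k):
    """P(|Z| <= k) for a standard Normal variable Z."""
    return math.erf(k / math.sqrt(2))

for k in (1, 1.96, 2, 3):
    print(f"within ±{k} SD: {prob_within(k):.4f}")

# within ±1 SD: 0.6827
# within ±1.96 SD: 0.9500
# within ±2 SD: 0.9545
# within ±3 SD: 0.9973
```

The output shows why ±2 is a convenient rounding of ±1.96 for the 95% range.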
18. Relationship between the mean, standard
deviation and the normal distribution
For symmetric, approximately Normal distributions, about 95% of all
observations lie in the interval x̄ ± 2s
The limits x̄ – 2s and x̄ + 2s are referred to as the 95% tolerance limits
(or 95% spread limits)
Values contained in this interval are commonly termed “THE
NORMAL VALUES”
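A minimal sketch of the calculation, using made-up haemoglobin values (g/dL) that are assumed for illustration and do not come from the slides:

```python
import statistics

# Hypothetical haemoglobin measurements in g/dL.
values = [12.1, 13.4, 14.0, 12.8, 13.9, 15.2, 13.1, 12.5, 14.4, 13.7,
          13.0, 14.8, 12.9, 13.6, 14.1, 11.8, 13.3, 14.6, 13.5, 12.7]

mean = statistics.mean(values)
s = statistics.stdev(values)

# 95% tolerance limits: mean minus 2s and mean plus 2s.
lower, upper = mean - 2 * s, mean + 2 * s

inside = [v for v in values if lower <= v <= upper]

print(f"mean = {mean:.2f}, s = {s:.2f}")
print(f"95% tolerance limits: {lower:.2f} to {upper:.2f}")
print(f"{len(inside)} of {len(values)} values fall inside the limits")
```

With a sample this small, the share of values inside the limits will only roughly match 95%.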
19. In Summary
Many continuous variables are approximately Normally
distributed around the mean, with a spread which is called the
standard deviation.
From any sample of continuous data the mean can be calculated, and
it will vary from sample to sample.
The sample mean will have a distribution which is centered on
the population mean (µ), and will have a spread which is called
the standard error.
The standard error of the sample mean is related to the
standard deviation and the sample size: SE = s/√n.