P value, Power, Type 1 and 2 errors
1. P value, Power & Type I & II error
Dr. S. A. Rizwan, M.D.
Public Health Specialist
SBCM, Joint Program – Riyadh
Ministry of Health, Kingdom of Saudi Arabia
2. Learning objectives
Demystifying statistics! – Lecture 5, SBCM, Joint Program – Riyadh
• Define p value
• Describe the meaning and limitations of p value
• Define power of a test and its meaning
• Describe type 1 and type 2 errors in hypothesis
testing and how they affect the interpretation of
results
• Understand how the p value and type 1 and 2
errors relate to sample size calculation
4. P value
• Defined as the probability of obtaining a result
equal to or more extreme than the one actually
observed, assuming the null hypothesis is true
• First introduced by Karl Pearson in connection
with his chi-squared test
• It can also be seen in relation to the probability of
making a Type I error
5. P value
The vertical coordinate is the probability density of each outcome, computed under the null hypothesis.
The p-value is the area under the curve past the observed data point.
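This tail-area definition can be sketched with Python's standard library (`statistics.NormalDist`); the z values here are only illustrative:

```python
from statistics import NormalDist

def two_sided_p(z: float) -> float:
    """Area under the standard normal curve beyond |z| in both tails."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# An observed z of 1.96 leaves about 5% of the area in the two tails combined.
print(round(two_sided_p(1.96), 2))  # 0.05
```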
6. P value – choice of cut off value
• Arbitrary cut-off of 0.05 (5% chance of a false-positive conclusion)
• If p < 0.05, the result is statistically significant: reject H0 in favour of H1
• If p ≥ 0.05, the result is not statistically significant: fail to reject H0
• When testing potentially harmful interventions, the α value is set
below 0.05
• Depends upon the research question!
7. P value – degrees of magnitude
• Very small (<0.001): the result is said to be
highly significant
• Near 0.05: said to be borderline significant
• Near 1.0: the result provides no evidence against the null hypothesis
8. P value – how to calculate it?
• For each test statistic of interest, critical values
at predetermined significance levels are listed
in statistical tables
• So each type of distribution has its own table
• It is also possible to compute exact p values with
software instead of using such tables
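As a sketch of the table-versus-software point: rather than comparing a statistic against a tabulated critical value, software can return the exact p value directly (the z statistic below is illustrative):

```python
from statistics import NormalDist

nd = NormalDist()
z = 2.20                           # illustrative observed z statistic

# Table approach: compare against the tabulated critical value for alpha = 0.05.
critical = nd.inv_cdf(0.975)       # ~1.96
significant = abs(z) > critical

# Software approach: compute the exact two-sided p value.
exact_p = 2 * (1 - nd.cdf(abs(z)))
print(significant, round(exact_p, 3))  # True 0.028
```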
9. P value – interpretation
• If the results are statistically significant, decide whether the
observed differences are clinically important
• If not significant, check whether the sample size was large enough
not to have missed a clinically important difference
• The power of the study tells us the strength with which we can
conclude that there is no difference between the two groups
10. P value – interpretation
• Statistical significance does not necessarily mean real significance
• If sample size is large, even small differences can have a low p-value
• Lack of significance doesn’t necessarily mean null hypothesis is true
• If sample size is small, there could be a real difference, but we are not
able to detect it
• If you perform a large number of tests in a study, about 1 in 20 will be
significant at the 0.05 level merely by chance
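The 1-in-20 point can be checked with a small simulation: repeatedly test data generated under a true null hypothesis and count how often p < 0.05 (the sample size and number of trials below are arbitrary choices):

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(0)
nd = NormalDist()
n, trials = 30, 2000
false_positives = 0
for _ in range(trials):
    sample = [random.gauss(0, 1) for _ in range(n)]   # null is true: mean 0
    z = mean(sample) * sqrt(n)                        # z test with known sigma = 1
    p = 2 * (1 - nd.cdf(abs(z)))
    if p < 0.05:
        false_positives += 1
print(false_positives / trials)  # close to 0.05
```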
12. What are these errors?
• These are errors that arise when performing hypothesis testing and
decision making
• Type 1 error (false positive conclusion)
• Stating difference when there is no difference, alpha
• Related to p value, how?
• Set at 1/20 or 0.05 or 5%
• For a two-sided test this probability is split between the tails of the
normal curve, i.e., 0.025 in each tail
• Type 2 error (false negative conclusion)
• Stating no difference when there is a difference, beta
• Occurs when sample size is too small.
• Conventional values are 0.1 or 0.2
• Related to power, how?
13. What are these errors?

                                           Reality: no effect          Reality: effect exists
Researcher concludes: fail to reject null  Correct failure to reject   Type 2 error (β)
Researcher concludes: reject null          Type 1 error (α)            Correct rejection (1 − β)
• Advanced learning: Do you know there are type 3 and 4 also?
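The 2×2 table above can be written as a small (hypothetical) helper that labels each combination of underlying reality and test decision:

```python
def classify(effect_exists: bool, rejected_null: bool) -> str:
    """Label a test decision against the (unknown) underlying reality."""
    if rejected_null:
        return "correct rejection (1 - beta)" if effect_exists else "type 1 error (alpha)"
    return "type 2 error (beta)" if effect_exists else "correct failure to reject"

print(classify(effect_exists=False, rejected_null=True))   # type 1 error (alpha)
print(classify(effect_exists=True, rejected_null=False))   # type 2 error (beta)
```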
14. Example 1

15. Example 2

16. Example 3
18. Power of the study
• The ability to detect a statistically significant association
• It can also be seen as the probability of not missing an effect,
due to sampling error, when there really is an effect
• It is also the probability of avoiding a type 2 error, i.e., 1 – beta
• A prospective power analysis is used before collecting data, to
consider design sensitivity
• A retrospective power analysis is used to judge whether
the studies you are interpreting were adequately
designed
19. Factors affecting power
• All else being equal:
1. As sample sizes increase, power increases
2. As population variances decrease, power increases
3. As the difference to be detected increases, power increases
4. Statistical power is greater for one-tailed tests
5. The greater the probability of making a Type I error, the
greater the power
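These five relationships can be verified numerically for a two-sided one-sample z test; the helper below is a sketch of the standard power formula, with illustrative parameter values:

```python
from math import sqrt
from statistics import NormalDist

nd = NormalDist()

def power(delta: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power of a two-sided one-sample z test when the true shift is delta."""
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = abs(delta) * sqrt(n) / sigma
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)

base = power(delta=20, sigma=40, n=16)
assert power(20, 40, 64) > base               # 1. larger n -> more power
assert power(20, 20, 16) > base               # 2. smaller variance -> more power
assert power(30, 40, 16) > base               # 3. larger difference -> more power
one_sided = nd.cdf(20 * sqrt(16) / 40 - nd.inv_cdf(0.95))
assert one_sided > base                       # 4. one-tailed test -> more power
assert power(20, 40, 16, alpha=0.10) > base   # 5. larger alpha -> more power
```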
20. Calculating Power: Example
• A study of n = 16 fails to reject the null hypothesis H0: μ = 170 at
α = 0.05 (two-sided); σ = 40. What was the power of the test under
these conditions to detect a true population mean of 190?
$$1-\beta=\Phi\!\left(-z_{1-\alpha/2}+\frac{|\mu_0-\mu_1|\sqrt{n}}{\sigma}\right)=\Phi\!\left(-1.96+\frac{|170-190|\sqrt{16}}{40}\right)=\Phi(0.04)=0.5160$$
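The arithmetic of this example can be reproduced with Python's standard library (same numbers as the slide):

```python
from math import sqrt
from statistics import NormalDist

nd = NormalDist()
mu0, mu1, sigma, n = 170, 190, 40, 16
z_crit = nd.inv_cdf(0.975)   # ~1.96 for two-sided alpha = 0.05
power = nd.cdf(-z_crit + abs(mu0 - mu1) * sqrt(n) / sigma)
print(round(power, 3))  # 0.516
```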
21. Calculating Power: Example
• Top curve assumes null H is true
• Bottom curve assumes alternative H is true
• α is set to 0.05 (two-sided)
• We will reject null when a sample mean exceeds
189.6 (right tail, top curve)
• The probability of getting a value greater than 189.6
on the bottom curve is 0.5160, corresponding to the
power of the test
22. Power vs. confidence intervals
• Once we have constructed a confidence interval, power calculations
yield no additional insights
• It is pointless to perform power calculations for hypotheses outside
of the confidence interval
• Confidence intervals better inform readers about the possibility of
an inadequate sample size than do post hoc power calculations
23. How do the errors relate to sample size?
• Sample size for one-sample z test:
• 1 – β ≡ desired power
• α ≡ desired significance level (two-sided)
• σ ≡ population standard deviation
• Δ = μ0 – μa ≡ the difference worth detecting
$$n=\frac{\sigma^2\left(z_{1-\beta}+z_{1-\alpha/2}\right)^2}{\Delta^2}$$
24. How do the errors relate to sample size?
• How large a sample is needed for a one-sample z test
with 90% power and α = 0.05 (two-tailed) when σ =
40? Let H0: μ = 170 and Ha: μ = 190 (thus, Δ = μ0 − μa
= 170 – 190 = −20)
• Sample size should be 42 to ensure adequate power.
$$n=\frac{\sigma^2\left(z_{1-\beta}+z_{1-\alpha/2}\right)^2}{\Delta^2}=\frac{40^2\,(1.28+1.96)^2}{(-20)^2}=41.99$$
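A sketch of the same calculation with the standard library. Note that with full-precision z multipliers the raw value is about 42.03, so rounding up gives 43; the slide's 41.99 ≈ 42 comes from using the two-decimal table values 1.28 and 1.96:

```python
from math import ceil
from statistics import NormalDist

nd = NormalDist()

def sample_size(delta: float, sigma: float, power: float = 0.90,
                alpha: float = 0.05) -> float:
    """Raw n for a two-sided one-sample z test."""
    z_beta = nd.inv_cdf(power)            # z_{1-beta}
    z_alpha = nd.inv_cdf(1 - alpha / 2)   # z_{1-alpha/2}
    return (sigma * (z_beta + z_alpha) / delta) ** 2

raw = sample_size(delta=20, sigma=40)     # ~42.03 (slide: 41.99 with rounded z's)
print(ceil(raw))  # 43
```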
26. Take home messages
• P value, type 1 and 2 errors, alpha, beta, power,
critical value, hypothesis testing and sample size
are all related to each other