1. Sample size calculation is an important part of ethical scientific research to avoid underpowered studies.
2. There are different approaches to sample size calculation depending on the study design and endpoints, such as comparing proportions, estimating confidence intervals, or analyzing time to event outcomes.
3. Key steps include defining the research hypothesis, primary and secondary endpoints, how and in whom the endpoints will be measured, and determining what difference is clinically meaningful to detect between study groups.
5. Random Sample
Derived from a defined population
Each individual has the same chance of
being included in the sample
Sampling can be done with minimum
knowledge about the population
Allows externally valid conclusions
6. Sampling Frame
1. Source material from which the sample is drawn
2. A list of all who can be sampled from a population
3. Example: Census
4. Must be representative of the population
5. No elements from outside the population of interest are
present in the frame
Q: Can a telephone directory be used as a sampling frame to
represent the adult population of Mumbai?
Q: Can a sample drawn randomly from it be called a random
sample?
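Drawing a simple random sample from a sampling frame can be sketched as below. This is only an illustration: the frame of 1000 ID numbers, the sample size of 50, and the function name draw_simple_random_sample are all hypothetical and not taken from the slides.

```python
import random


def draw_simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, each with the same chance of selection."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)  # sampling without replacement


# Hypothetical sampling frame: ID numbers of 1000 eligible individuals
frame = list(range(1, 1001))
sample = draw_simple_random_sample(frame, 50, seed=42)
print(sorted(sample)[:5])
```

The sample can only be as representative as the frame: anyone missing from the list has zero chance of selection.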
7. Why Does it Matter?
1. Avoids resource wastage
2. Ensures aims are clear
3. Reduces harm
4. Avoids discouraging much needed future
research
5. Needed for publication and grants
Avoids an unethical underpowered
study
8. Why are underpowered studies unethical?
1. Often yield optimistic differences
2. Confidence intervals around these differences are wider
3. A small reduction in CI width (relative to running no trial) is not justified when the risks to patients are
considered
4. Combined meta-analyses more susceptible to variability in study design and
execution
5. Impairs informed consent - do we inform the patient of the limited benefit from an
underpowered study? - a form of deception
6. Serendipitous results are rare - publication bias makes them seem more
9. J. P. A. Ioannidis, Why most
published research findings are
false. PLoS Med. 2, e124 (2005).
13. Basic Theory
[Figure labels: μ0, μ1, d]
Type I error: probability of rejecting the null hypothesis when it is really true
Type II error: probability of rejecting the alternate hypothesis when it is true
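14. Basic Theory
[Figure labels: μ0, μ1, d]
Type II error: probability of accepting the null hypothesis as true when it is really false
Power of the test: probability of rejecting the null hypothesis when it is really false (1 - Type II error)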
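The relationship between the difference d, the error rates and power can be sketched numerically. The example below assumes a two-sided one-sample z test with a known standard deviation; the function name power_two_sided_z and the input values are illustrative assumptions, not figures from the slides.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided_z(d, sigma, n, alpha=0.05):
    """Approximate power of a two-sided z test to detect a true difference d.

    d     : true difference between mu1 and mu0 (assumed)
    sigma : known standard deviation (assumed)
    n     : sample size
    alpha : Type I error rate
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    shift = abs(d) * sqrt(n) / sigma   # mean of the test statistic when the alternate is true
    # Probability that the statistic lands in either rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Hypothetical inputs: difference of 0.5 units, SD of 1 unit
for n in (10, 20, 32, 50):
    print(n, round(power_two_sided_z(0.5, 1.0, n), 3))
```

With d = 0 the function returns alpha, which is the Type I error rate; power rises as n or d grows.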
20. Basic Principles
1. Define a research hypothesis
2. Define the primary and the secondary endpoints
3. Define the measurement:
a. What to measure
b. In whom to measure
c. Where to measure
d. When to measure
e. Why to measure - most important
21. Sample Size Calculation Example Scenarios
1. Cataract surgery in mobile eye surgical unit:
Safe and viable alternative
2. Topical sodium cromoglycate in management
of chronic non-infectious conjunctivitis: A
Double blind controlled clinical trial
22. Sample size for comparing proportions
1. Endpoint : Cumulative infection rate at 72 hours. Measure : percent or ratio
2. Single sample design
3. “Hopefully” random
4. We approach in two ways:
a. Compare against a “known” rate
b. Estimate the precision of the estimate we generate
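Approach (a), comparing a single-sample rate against a "known" rate, can be sketched with the usual normal-approximation formula for a one-sample test of a proportion. The rates used here (a known rate of 5% and an alternative of 10%), the 80% power and the function name are hypothetical choices for illustration only.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_one_sample_proportion(p0, p1, alpha=0.05, power=0.80):
    """Sample size to detect a true rate p1 against a known rate p0 (two-sided test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    numerator = (z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))) ** 2
    return ceil(numerator / (p1 - p0) ** 2)


# Hypothetical values: known infection rate 5%, rate to detect 10%
n_required = n_one_sample_proportion(p0=0.05, p1=0.10)
print(n_required)
```

A smaller gap between p0 and p1 increases the required sample size sharply, since the difference enters the formula squared.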
25. Sample Size for Confidence Interval Estimates
● Most commonly used for single sample situations
● Confidence intervals indicate the range of plausible values for the
population parameter being estimated.
● Essentially implies that if the same experiment is repeated many times, x% of
the confidence intervals so constructed will contain the true population value.
● Easier to do as a historical precedent need not be present.
26. Sample Size for Confidence Interval Estimates
Endpoint is the precision of estimate of the mean here.
Let us assume that you would be satisfied with a rate of 5% and do not want the
estimate to go beyond 8% (± 5%).
You want the confidence level to be 95%
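A precision-based sample size for a rate can be sketched with the normal-approximation formula n = z² p(1 - p) / margin². The example assumes an expected rate of 5% and reads "not beyond 8%" as a margin of 3 percentage points; that reading, and the function name, are assumptions, so the margin is left as a parameter.

```python
from math import ceil
from statistics import NormalDist


def n_for_proportion_precision(p, margin, confidence=0.95):
    """Sample size so the CI half-width around an expected rate p is about `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Assumed reading of the slide: expected rate 5%, margin of 3 percentage points
n_precision = n_for_proportion_precision(p=0.05, margin=0.03, confidence=0.95)
print(n_precision)
```

The normal approximation is crude for rates this close to zero, so an exact method may be preferable in practice.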
28. Sample Size Calculation Example Scenarios
1. Cataract surgery in mobile eye surgical unit:
Safe and viable alternative
2. Topical sodium cromoglycate in management
of chronic non-infectious conjunctivitis: A
Double blind controlled clinical trial
29. Primary Endpoint
1. What are we measuring : Patient's subjective report of improvement in
symptoms
2. Whom are we measuring it in : Patients with bilateral chronic non-infectious
conjunctivitis
3. Where are we measuring it : In a hospital where the study is being
conducted*
4. When are we measuring it : At 4 weeks
5. Why are we measuring it : Is the drug better than a placebo for this condition?
30. Sample Size : Mean Score
Endpoint is an estimate of the mean score in the questionnaire at 4 weeks
We want to know if the mean score of the patients in the control group is different
from the score in the test group
Assume a random sample
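A comparison of mean scores between two groups can be sized with the standard formula n per group = 2 (z_alpha + z_beta)² σ² / δ². The sketch below assumes equal group sizes and a common standard deviation; the values used (SD of 2 points, difference of 1 point, 80% power) and the function name are hypothetical and do not come from the slides.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(delta, sigma, alpha=0.05, power=0.80):
    """Sample size per group to detect a mean difference delta (two-sided test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


# Hypothetical values: questionnaire score SD of 2 points, difference of 1 point
n_each = n_per_group_two_means(delta=1.0, sigma=2.0)
print(n_each, "per group,", 2 * n_each, "in total")
```

This uses the normal approximation; a t-based calculation gives a slightly larger number for small samples.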
33. Time to Event Endpoint
Endpoint is an estimate of median time taken for the symptom score to normalize
Here, comparing the median times by a t-test approach will fail
What we need is a sample size estimation for a time to event outcome
34. Hazard rates and ratio
Survival curves are often assumed to follow an exponential
distribution.
The probability of surviving for a specific time
period t is given as $P = e^{-ht}$
Here h = the instantaneous hazard rate
$h = \ln(2) / \text{Median Survival Time}$
$h = -\ln(S(T)) / T$, where T is time and S(T) is the proportion
surviving up to time T
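Under the exponential assumption, the hazard rate can be recovered either from the median survival time or from the proportion surviving at a given time. The sketch below shows that the two routes agree; the function names and the 10-week median are illustrative assumptions.

```python
from math import exp, log


def hazard_from_median(median_time):
    """Hazard rate implied by a median survival time: h = ln(2) / median."""
    return log(2) / median_time


def hazard_from_survival(surviving, t):
    """Hazard rate implied by the proportion surviving up to time t: h = -ln(S(t)) / t."""
    return -log(surviving) / t


def survival_probability(h, t):
    """Probability of surviving to time t under a constant hazard h: exp(-h * t)."""
    return exp(-h * t)


# Hypothetical example: median survival of 10 weeks
h = hazard_from_median(10)
print(h, survival_probability(h, 10))  # half are expected to survive to the median
```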
35. Sample Size : Time to Symptomatic Change
Assume that 40% of the patients receiving placebo in the control group improve at 4
weeks.
We consider a clinically meaningful difference exists if the proportion of patients
improving differs by 20 percentage points
● 20% or less improve with drug at 4 weeks - significantly worse
● 60% or more improve with drug at 4 weeks - significantly better
We assume that the rate of improvement over the 4 weeks is constant implying
uniform hazard rate.
36. Sample Size : Time to Symptomatic Change
% improving in 4 weeks in placebo arm : 40%
% not improving in 4 weeks in placebo arm : 60%
Hazard rate of improving (60% not yet improved) : -ln(0.6)/4 ≈ 0.128 per week
% improving in 4 weeks with drug : 60%
Hazard rate of improving (40% not yet improved) : -ln(0.4)/4 ≈ 0.229 per week
Hazard ratio (drug vs placebo) = 0.229 / 0.128 ≈ 1.79
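The hazard calculation can be checked in code by applying h = -ln(S(T))/T, with S(T) taken as the proportion not yet improved at 4 weeks. The second step, converting the hazard ratio into a number of events, uses Schoenfeld's approximation for 1:1 allocation; that formula, the 80% power, and the use of the average event probability to reach a total sample size are assumptions added here and are not stated on the slides.

```python
from math import ceil, log
from statistics import NormalDist


def hazard_rate(prop_without_event, t):
    """Constant hazard implied by the proportion still event-free at time t."""
    return -log(prop_without_event) / t


def events_required(hazard_ratio, alpha=0.05, power=0.80):
    """Schoenfeld approximation: total events needed with 1:1 allocation."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(4 * z_sum ** 2 / log(hazard_ratio) ** 2)


weeks = 4
h_placebo = hazard_rate(0.60, weeks)  # 40% improve, 60% not yet improved
h_drug = hazard_rate(0.40, weeks)     # 60% improve, 40% not yet improved
hr = h_drug / h_placebo

events = events_required(hr)
# Assumed: average probability of improving by 4 weeks across both arms
p_event = (0.40 + 0.60) / 2
total_n = ceil(events / p_event)
print(round(h_placebo, 3), round(h_drug, 3), round(hr, 2), events, total_n)
```

Because the event here is improvement, a hazard ratio above 1 favours the drug.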
38. Summary
1. Sample size calculation is an integral part of valid and ethical scientific research
2. Lots of tools available
3. Important to define the hypothesis and endpoint clearly for a proper sample
size calculation
An unbiased (representative) sample is a set of objects chosen from a population using a selection process that does not depend on the properties of the objects. There are several types of non-random sample, but the best known and most abused is the convenience sample.
Raising the acceptable Type I error threshold (alpha) improves the power, and lowering it reduces the power, when the sample size and effect size are held fixed.
An exponential distribution describes the time between events that occur continuously and independently at a constant average rate.