Dr Maarten van Smeden, UMC Utrecht, M.vanSmeden@umcutrecht.nl
Dr Laure Wynants, Maastricht University, laure.wynants@maastrichtuniversity.nl
• Introduction: prediction models vs everything else
Part A: development
• Biased estimators and Stein’s paradox
• Overfitting/sample size
Part B: validation and beyond
• Metrics
• Validation strategies
• Impact and implementation
Part C: open discussion on prediction and being an early career researcher
5-Feb-21
Cartoon by Jim Borgman, first published by the Cincinnati Enquirer and King Features Syndicate, April 27 1997
Schoenfeld & Ioannidis, Am J Clin Nutr 2013, DOI: 10.3945/ajcn.112.047142
“We selected 50 common ingredients from random recipes of a cookbook”
• veal, salt, pepper spice, flour, egg, bread, pork, butter, tomato, lemon, duck, onion, celery, carrot, parsley, mace, sherry, olive, mushroom, tripe, milk, cheese, coffee, bacon, sugar, lobster, potato, beef, lamb, mustard, nuts, wine, peas, corn, cinnamon, cayenne, orange, tea, rum, raisin, bay leaf, cloves, thyme, vanilla, hickory, molasses, almonds, baking soda, ginger, terrapin
Credits to Peter Tennant for identifying this example
• Explanatory models
  ◦ Theory: interest in regression coefficients
  ◦ Testing and comparing existing causal theories
  e.g. aetiology of illness, effect of treatment
• Predictive models
  ◦ Interest in (risk) predictions of future observations
  ◦ No concern about causality
  ◦ Concerns about overfitting and optimism
  e.g. prognostic or diagnostic prediction model
• Descriptive models
  ◦ Capture the data structure
Figure: causal diagram with exposure (A), outcome (Y) and confounder (L)
Van Smeden et al. Clinical prediction models: diagnosis versus prognosis, JCE, in press
• What is a prediction model?
  ◦ Mathematical formula (usually logistic or Cox regression, sometimes machine learning methods)
  ◦ Combining multiple predictors (independent variables)
  ◦ The outcome (dependent variable) is usually a diagnosis or prognosis
  ◦ Used to predict the outcome in new individuals
• What is the advantage?
  ◦ Uses multiple characteristics simultaneously
  ◦ Gives each of them an appropriate weight
  ◦ Personalized, evidence-based approach to healthcare (hopefully)
• Medical guidelines are usually binary (dichotomania!)
  ◦ Treatment X if: Age>40 OR BMI>30
  ◦ What about a patient of 39 years old, with a BMI of 29?
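The dichotomania point can be illustrated with a minimal sketch. It assumes a hypothetical logistic model: the intercept and coefficients below are invented for illustration and do not come from any published model. The binary rule treats a 41-year-old with BMI 29 but not a 39-year-old with BMI 29, although a continuous model assigns the two patients nearly the same risk.

```python
import math


def guideline_rule(age, bmi):
    """Binary rule from the slide: treatment X if Age > 40 OR BMI > 30."""
    return age > 40 or bmi > 30


def model_risk(age, bmi, intercept=-6.0, b_age=0.05, b_bmi=0.08):
    """Predicted risk from a logistic model.

    The intercept and coefficients are hypothetical, chosen only for illustration.
    """
    linear_predictor = intercept + b_age * age + b_bmi * bmi
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Patients just below and just above the guideline cut-offs
for age, bmi in [(39, 29), (41, 29), (39, 31)]:
    print(f"age={age}, BMI={bmi}: rule says treat={guideline_rule(age, bmi)}, "
          f"model risk={model_risk(age, bmi):.3f}")
```

The rule flips from "no treatment" to "treatment" across the cut-off, while the predicted risk changes only slightly.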
Wells et al., Lancet, 1997. doi: 10.1016/S0140-6736(97)08140-3
Apgar, JAMA, 1958. doi: 10.1001/jama.1958.03000150027007
1. Before getting started
2. Study design
3. Modelling strategy
4. Model fitting
5. Model validation – quantify predictive performance
6. Presentation
7. Reporting
8. Model validation – external test
9. Impact studies
10. Implementation
• Phase 1: model development
• Phase 2: external validation
• Phase 3: impact evaluation
• Phase 4: implementation
• Point of intended use of the risk model
  ◦ Primary care (paper/computer/app)?
  ◦ Secondary care (bedside)?
  ◦ Low resource setting?
• Complexity
  ◦ Number of predictors?
  ◦ Transparency of calculation?
  ◦ Should it be fast?
When one has three or more units (say, individuals), and for each unit one can calculate an average score (say, average blood pressure), then the best guess of future observations for each unit (say, blood pressure tomorrow) is NOT the average score.
James and Stein. Estimation with quadratic loss. Proceedings of the fourth Berkeley symposium on mathematical statistics and probability. Vol. 1. 1961.
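Stein's paradox can be checked with a small simulation. The sketch below is illustrative only: it assumes normally distributed unit averages with a known, common variance `sigma2`, shrinks toward zero (other shrinkage targets are possible), and uses made-up true means. It compares the total squared error of the raw averages with that of the James-Stein estimates.

```python
import random


def james_stein(x, sigma2):
    """James-Stein estimate: shrink k >= 3 observed averages toward zero."""
    k = len(x)
    if k < 3:
        raise ValueError("James-Stein shrinkage needs at least three units")
    shrink = 1.0 - (k - 2) * sigma2 / sum(v * v for v in x)
    return [shrink * v for v in x]


def simulate(k=10, sigma2=1.0, n_rep=2000, seed=2021):
    """Mean total squared error of raw averages vs James-Stein estimates."""
    rng = random.Random(seed)
    theta = [rng.gauss(0.0, 1.0) for _ in range(k)]  # hypothetical true unit means
    sd = sigma2 ** 0.5
    sse_raw = sse_js = 0.0
    for _ in range(n_rep):
        x = [rng.gauss(t, sd) for t in theta]  # one observed average per unit
        js = james_stein(x, sigma2)
        sse_raw += sum((xi - t) ** 2 for xi, t in zip(x, theta))
        sse_js += sum((ji - t) ** 2 for ji, t in zip(js, theta))
    return sse_raw / n_rep, sse_js / n_rep


raw_error, js_error = simulate()
print(f"raw averages: {raw_error:.2f}, James-Stein: {js_error:.2f}")
```

Each individual James-Stein estimate is biased, yet the total squared error summed over all units is lower on average than for the raw averages.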
Squared prediction error reduced from .077 to .022
• Probably among the most surprising (and initially doubted) phenomena in statistics
• Now a large “family”: shrinkage estimators reduce prediction variance to an extent that typically outweighs the bias that is introduced
• Bias/variance trade-off principle has motivated many statistical and machine learning developments
Expected prediction error = irreducible error + bias² + variance
• 5% reduction in MSPE just by shrinkage estimator
• Van Houwelingen and le Cessie’s heuristic shrinkage factor
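The heuristic shrinkage factor is commonly computed as (model χ² − df) / model χ², where model χ² is the likelihood-ratio statistic of the fitted model and df is the number of estimated predictor parameters. A minimal sketch follows; the function names and the numbers in the example are hypothetical and only illustrate the arithmetic.

```python
def heuristic_shrinkage(model_chi2, df):
    """Heuristic shrinkage factor: (model chi-square - df) / model chi-square."""
    if model_chi2 <= 0:
        raise ValueError("model chi-square must be positive")
    return (model_chi2 - df) / model_chi2


def shrink_coefficients(betas, factor):
    """Multiply the predictor coefficients (not the intercept) by the factor.

    After shrinking, the intercept is usually re-estimated so that the
    average predicted risk still matches the observed event rate.
    """
    return [factor * b for b in betas]


# Hypothetical example: likelihood-ratio chi-square of 40 with 8 parameters
factor = heuristic_shrinkage(model_chi2=40.0, df=8)
print(f"shrinkage factor: {factor:.2f}")
print(shrink_coefficients([0.9, -0.4, 1.2], factor))
```

A factor well below 1 signals that the model is likely overfitted and that unshrunken predictions would be too extreme.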
• To explain or to predict?
  ◦ Prediction often benefits from shrinkage; the consequence is that regression coefficients are biased
  ◦ Explanations with focus on coefficients may not benefit from the bias that is introduced!
• When is shrinkage needed?
  ◦ In case risk of overfitting is high
  ◦ Risk of overfitting is high when sample size is small (particularly if modelling choices are data driven)
Events per variable (EPV) for logistic/survival models:
EPV = number of events (smallest outcome group) / number of candidate predictor variables
EPV = 10 is a commonly used minimal criterion
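The EPV calculation is simple arithmetic, as the sketch below shows. The function name and the example counts are hypothetical.

```python
def events_per_variable(n_events, n_nonevents, n_candidate_predictors):
    """EPV: size of the smallest outcome group divided by the number of
    candidate predictor variables."""
    if n_candidate_predictors <= 0:
        raise ValueError("need at least one candidate predictor")
    smallest_group = min(n_events, n_nonevents)
    return smallest_group / n_candidate_predictors


# Hypothetical study: 60 events, 940 non-events, 12 candidate predictors
epv = events_per_variable(n_events=60, n_nonevents=940, n_candidate_predictors=12)
print(f"EPV = {epv:.1f}")
```

Note that the denominator counts candidate predictors considered during model building, not only those retained in the final model.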
“For EPV values of 10 or greater, no major problems occurred. For EPV values less than 10, however, the regression coefficients were biased in both positive and negative directions”
Citations based on Google Scholar, Oct 30 2020
citations: 5,736
“a minimum of 10 EPV […] may be too conservative”
“substantial problems even if the number of EPV exceeds 10”
“For EPV values of 10 or greater, no major problems occurred”
citations: 2,438
citations: 216
• EPV values for reliable selection of predictors from a larger set of candidate predictors may be as large as 50
• Statistical simulation studies on the minimal EPV rules are highly heterogeneous and have large problems
• But what if we just use shrinkage?
“We conclude that, despite improved performance on average, shrinkage often worked poorly in individual datasets, in particular when it was most needed. The results imply that shrinkage methods do not solve problems associated with small sample size or low number of events per variable.”
• In short:
  ◦ Minimal sample size requirements for logistic, survival and continuous outcomes
  ◦ 4 or 5 criteria to meet
    – Minimizing risk of overfitting
    – Ensuring sufficiently precise estimation of risk
• Software in R and Stata to simplify calculations
• Sample size criteria for validation currently under review
Apparent (usually too optimistic), i.e., predictions evaluated on the development data
Internal (optimism-corrected), e.g. bootstrapping
External
n=232 models
22%
48%
20% as part of development study
10% independent
only 5% assessed calibration
doi.org/10.1136/bmj.m1328
doi: 10.1016/j.jclinepi.2020.01.028
NICE Framingham: AUC 77.6, overestimated risk
vs.
QRISK2-2011: AUC 77.1, well calibrated
Treatment threshold 20%: 206 per 1000 men vs. 110 per 1000 men
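The comparison above shows that two models can discriminate almost equally well while differing sharply in calibration, and therefore in how many people cross a treatment threshold. The sketch below reproduces that pattern on simulated data; it does not use the Framingham or QRISK2 data, and the odds multiplier of 2 is an arbitrary assumption. Overestimating every risk leaves the AUC unchanged but lowers the observed/expected ratio and pushes more people above the 20% threshold.

```python
import random


def auc(risks, outcomes):
    """c-statistic: chance that a random event has a higher predicted risk
    than a random non-event (ties count as one half)."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    nonevents = [r for r, y in zip(risks, outcomes) if y == 0]
    wins = sum((e > n) + 0.5 * (e == n) for e in events for n in nonevents)
    return wins / (len(events) * len(nonevents))


def observed_expected_ratio(risks, outcomes):
    """Observed number of events divided by the sum of predicted risks."""
    return sum(outcomes) / sum(risks)


def inflate(risk, odds_multiplier):
    """Overestimate a risk by multiplying its odds (hypothetical miscalibration)."""
    odds = odds_multiplier * risk / (1.0 - risk)
    return odds / (1.0 + odds)


rng = random.Random(1)
true_risk = [rng.uniform(0.02, 0.40) for _ in range(1000)]
outcome = [1 if rng.random() < p else 0 for p in true_risk]
overestimated = [inflate(p, 2.0) for p in true_risk]

for label, risks in [("calibrated", true_risk), ("overestimated", overestimated)]:
    above = sum(r >= 0.20 for r in risks)
    print(f"{label}: AUC={auc(risks, outcome):.3f}, "
          f"O/E={observed_expected_ratio(risks, outcome):.2f}, "
          f"above 20% threshold={above} per 1000")
```

Because the AUC depends only on the ranking of predictions, it cannot detect this kind of systematic overestimation; a calibration measure is needed as well.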
rms package
val.prob.ci.2, doi: 10.1016/j.jclinepi.2015.12.005
doi: 10.1186/s12916-019-1425-3
decisioncurveanalysis.org
Harm: false positives (FP), false negatives (FN)
Benefit: true negatives (TN), true positives (TP)
• Split sample – easy but inefficient
  ◦ Unless
    – huge data
    – meaningful split
    – Qcovid
doi: 10.1016/j.jclinepi.2015.04.005
Optimism-corrected performance = apparent performance – optimism
1. Draw sample* (a bootstrap sample, drawn with replacement)
2. Build model* in sample* (repeat every step, incl. variable selection, non-linearities)
   • Bootstrap performance is performance of model* on sample*
3. Apply model* to original sample
   • Test performance is performance of model* on original sample
4. Optimism = bootstrap performance – test performance
5. Repeat 100 times (optimism is averaged over the repetitions)
rms package
Can be cumbersome for complex modeling
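The bootstrap steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the rms implementation: the "model" is a deliberately simple toy that selects the single predictor (and direction) with the highest apparent AUC, and the data are pure noise, so any apparent discrimination is due to overfitting. The names `fit`, `performance` and `optimism_corrected` are hypothetical.

```python
import random


def auc(scores, outcomes):
    """c-statistic with ties counted as one half."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    nonevents = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((e > n) + 0.5 * (e == n) for e in events for n in nonevents)
    return wins / (len(events) * len(nonevents))


def fit(rows, outcomes):
    """Toy model building step, including variable selection:
    keep the predictor and direction with the highest apparent AUC."""
    best_auc, best_j, best_sign = 0.0, 0, 1
    for j in range(len(rows[0])):
        a = auc([row[j] for row in rows], outcomes)
        for sign, value in ((1, a), (-1, 1.0 - a)):
            if value > best_auc:
                best_auc, best_j, best_sign = value, j, sign
    return lambda row: best_sign * row[best_j]


def performance(model, rows, outcomes):
    return auc([model(row) for row in rows], outcomes)


def optimism_corrected(rows, outcomes, n_boot=100, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    apparent = performance(fit(rows, outcomes), rows, outcomes)
    total_optimism, done = 0.0, 0
    while done < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]           # 1. draw sample*
        b_rows = [rows[i] for i in idx]
        b_out = [outcomes[i] for i in idx]
        if len(set(b_out)) < 2:                              # AUC needs both outcome groups
            continue
        model = fit(b_rows, b_out)                           # 2. build model* in sample*
        boot_perf = performance(model, b_rows, b_out)
        test_perf = performance(model, rows, outcomes)       # 3. apply model* to original sample
        total_optimism += boot_perf - test_perf              # 4. optimism
        done += 1                                            # 5. repeat
    optimism = total_optimism / n_boot
    return apparent, optimism, apparent - optimism


# Simulated data: 60 individuals, 10 noise predictors unrelated to the outcome
rng = random.Random(42)
outcomes = [i % 2 for i in range(60)]
rows = [[rng.gauss(0.0, 1.0) for _ in range(10)] for _ in range(60)]

apparent, optimism, corrected = optimism_corrected(rows, outcomes)
print(f"apparent AUC={apparent:.3f}, optimism={optimism:.3f}, corrected AUC={corrected:.3f}")
```

The key design point is that `fit` is rerun inside every bootstrap sample, so the selection step contributes to the estimated optimism; selecting variables once and bootstrapping only the final model would understate it.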
Heterogeneity
• Test accuracy (affects 70% of MA; Willis, BMC Med Res Meth 2011)
• Miscalibration (Van Calster et al., MDM 2015)
• Disease prevalence (Hilden, Stat Med, 2000)
metamisc R package
Ultrasound-based risk model for preoperative prediction of lymph-node metastases in women with endometrial cancer
DOI: 10.1002/uog.21950
Figure: sensitivity (Se), specificity (Sp) and prevalence (P) per setting, combined into net benefit (NB)
Net Benefit = (true positives – w × false positives)/n
            = Se×P – w × (1-Sp) × (1-P)
Within-setting model
Between-setting model
Distribution of NB
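The two forms of the net benefit formula can be checked against each other with a short sketch. The slide does not define w; the code assumes the usual decision-curve convention that w is the odds of the risk threshold, threshold / (1 − threshold). The function names and the example numbers are hypothetical.

```python
def threshold_weight(threshold):
    """Assumed weight w: odds of the risk threshold (decision-curve convention)."""
    return threshold / (1.0 - threshold)


def net_benefit_counts(true_positives, false_positives, n, threshold):
    """Net benefit = (true positives - w * false positives) / n."""
    w = threshold_weight(threshold)
    return (true_positives - w * false_positives) / n


def net_benefit_rates(se, sp, prevalence, threshold):
    """Net benefit = Se * P - w * (1 - Sp) * (1 - P)."""
    w = threshold_weight(threshold)
    return se * prevalence - w * (1.0 - sp) * (1.0 - prevalence)


# Hypothetical setting: n = 1000, prevalence 20%, Se = 0.80, Sp = 0.70, threshold 20%
# -> 200 with the outcome, 160 true positives; 800 without, 240 false positives
nb_counts = net_benefit_counts(true_positives=160, false_positives=240, n=1000, threshold=0.20)
nb_rates = net_benefit_rates(se=0.80, sp=0.70, prevalence=0.20, threshold=0.20)
print(f"from counts: {nb_counts:.3f}, from Se/Sp/P: {nb_rates:.3f}")
```

Writing net benefit in terms of Se, Sp and P makes explicit why it can vary between settings: each of the three quantities may differ from one setting to the next.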
Model not fit for purpose; Not validated; No impact; Regulatory frameworks; Not adopted in clinical practice
Suboptimal for improving clinical practice
• Analysis of the impact of using the model in clinical practice
  ◦ Calculate it
    – Cost-effectiveness analysis
  ◦ Run an experiment
    – (Cluster-) randomized trials or pre-post intervention studies
    – SPIRIT-AI / CONSORT-AI
Meertens LJE et al. Fetal Diagn Ther. 2018 Jul 18:1-13.
• Before-after study
  ◦ Fewer adverse perinatal outcomes in nulliparous women (OR 0.56, 95% CI 0.32 to 0.94)
  ◦ Lower mean cost per pregnant woman (-2.766 euro, 95% CI -3.700 to -1.825)
doi: 10.1016/j.ajog.2020.02.036
Figure 4: Adherence rates of discussing low-dose aspirin prophylaxis during the study period.
doi: 10.1016/j.ajog.2020.02.036
• ‘No time’
• Black box / new approach / do not trust model predictions / do not believe it is applicable for specific patient
• Not (yet) convinced of improvement
  ◦ Not (yet) aware of current situation
• ‘Aspirin is a medicine, thus potentially harmful’
  ◦ Difficulty in weighing risks

Clinical prediction models: development, validation and beyond

  • 1. 1 Dr Maarten van Smeden, UMC Utrecht, M.vanSmeden@umcutrecht.nl Dr Laure Wynants, Maastricht University, laure.wynants@maastrichtuniversity.nl
  • 2. 2 } Introduction prediction models vs everything else PART A: development } biased estimators and Stein’s paradox } overfitting/sample size Part B: validation and beyond } Metrics } Validation strategies } Impact and implementation PART C: open discussion on prediction and being an early career researcher
  • 4. 4 Cartoon of Jim Borgman, first published by the Cincinnati Inquirer and King Features Syndicate April 27 1997
  • 5. 5 Schoenfeld & Ioannidis, Am J Clin Nutr 2013, DOI: 10.3945/ajcn.112.047142 “We selected 50 common ingredients from random recipes of a cookbook”
  • 6. 6 Schoenfeld & Ioannidis, Am J Clin Nutr 2013, DOI: 10.3945/ajcn.112.047142 } veal, salt, pepper spice, flour, egg, bread, pork, butter, tomato, lemon, duck, onion, celery, carrot, parsley, mace, sherry, olive, mushroom, tripe, milk, cheese, coffee, bacon, sugar, lobster, potato, beef, lamb, mustard, nuts, wine, peas, corn, cinnamon, cayenne, orange, tea, rum, raisin, bay leaf, cloves, thyme, vanilla, hickory, molasses, almonds, baking soda, ginger, terrapin
  • 11. Credits to Peter Tennant for identifying this example
  • 13. Explanatory models: theory, interest in regression coefficients; testing and comparing existing causal theories (e.g. aetiology of illness, effect of treatment). Predictive models: interest in (risk) predictions of future observations; no concern about causality; concerns about overfitting and optimism (e.g. prognostic or diagnostic prediction model). Descriptive models: capture the data structure.
  • 14. [Figure: causal diagram with exposure (A), outcome (Y) and confounder (L)]
  • 16. Van Smeden et al. Clinical prediction models: diagnosis versus prognosis, JCE, in press
  • 17. What is a prediction model? A mathematical formula (usually logistic or Cox regression, sometimes machine learning methods), combining multiple predictors (independent variables); the outcome (dependent variable) is usually a diagnosis or prognosis; used to predict the outcome in new individuals. What is the advantage? Uses multiple characteristics simultaneously, giving each of them appropriate weights; a personalized, evidence-based approach to healthcare (hopefully). Medical guidelines are usually binary (dichotomania!): treatment X if Age>40 OR BMI>30. What about a patient who is 39 years old with a BMI of 29?
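The contrast between a binary guideline and a risk model can be made concrete with a small sketch. The rule below is the one on the slide (Age>40 OR BMI>30); the logistic model and its coefficients are purely hypothetical, invented for illustration and not taken from the talk or from any published model.

```python
import math

def guideline_rule(age, bmi):
    """Binary rule from the slide: treat if Age>40 OR BMI>30."""
    return age > 40 or bmi > 30

def model_risk(age, bmi, intercept=-8.0, b_age=0.08, b_bmi=0.12):
    """Predicted risk from a logistic model.

    The intercept and coefficients are HYPOTHETICAL illustration values.
    """
    linear_predictor = intercept + b_age * age + b_bmi * bmi
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Patient from the slide (just under both cut-offs) versus a patient
# who barely exceeds the age cut-off but has a low BMI.
for age, bmi in [(39, 29), (41, 20)]:
    print(age, bmi, guideline_rule(age, bmi), round(model_risk(age, bmi), 3))
```

Under these assumed coefficients the 39-year-old with a BMI of 29 has the higher predicted risk, yet only the 41-year-old with a BMI of 20 meets the binary rule.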
  • 20. Wells et al., Lancet, 1997. doi: 10.1016/S0140-6736(97)08140-3
  • 21. Apgar, JAMA, 1958. doi: 10.1001/jama.1958.03000150027007
  • 24. (1) Before getting started, (2) Study design, (3) Modelling strategy, (4) Model fitting, (5) Model validation: quantify predictive performance, (6) Presentation, (7) Reporting, (8) Model validation: external test, (9) Impact studies, (10) Implementation. Phase 1: model development. Phase 2: external validation. Phase 3: impact evaluation. Phase 4: implementation.
  • 26. Point of intended use of the risk model: primary care (paper/computer/app)? Secondary care (bedside)? Low resource setting? Complexity: number of predictors? Transparency of calculation? Should it be fast?
  • 29. When one has three or more units (say, individuals), and for each unit one can calculate an average score (say, average blood pressure), then the best guess of future observations for each unit (say, blood pressure tomorrow) is NOT the average score.
  • 30. James and Stein. Estimation with quadratic loss. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, 1961.
  • 32. Squared prediction error reduced from 0.077 to 0.022
  • 33. Probably among the most surprising (and initially doubted) phenomena in statistics. Now a large “family”: shrinkage estimators reduce prediction variance to an extent that typically outweighs the bias that is introduced. The bias/variance trade-off principle has motivated many statistical and machine learning developments. Expected prediction error = irreducible error + bias² + variance
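Stein's paradox can be illustrated with a minimal sketch of a James-Stein type estimator. This version shrinks each unit's average toward the grand mean (the positive-part variant popularised by Efron and Morris), which is an assumption on my part: the slides do not say which variant was shown. It also assumes a known, common sampling variance `sigma2` and needs at least four units when shrinking toward the grand mean. The function names and the simulation settings are hypothetical.

```python
import random

def james_stein(averages, sigma2):
    """Shrink unit averages toward their grand mean (positive-part James-Stein).

    Assumes every average has the same known sampling variance sigma2.
    """
    k = len(averages)
    if k < 4:
        raise ValueError("shrinking toward the grand mean needs at least 4 units")
    grand_mean = sum(averages) / k
    spread = sum((y - grand_mean) ** 2 for y in averages)
    if spread == 0:
        return list(averages)
    # Shrinkage factor c lies in [0, 1]; c = 1 means no shrinkage.
    c = max(0.0, 1.0 - (k - 3) * sigma2 / spread)
    return [grand_mean + c * (y - grand_mean) for y in averages]

def simulate(true_means, sigma, reps=2000, seed=1):
    """Average total squared error of raw averages vs. shrunken estimates."""
    rng = random.Random(seed)
    raw_err = js_err = 0.0
    for _ in range(reps):
        observed = [rng.gauss(m, sigma) for m in true_means]
        shrunk = james_stein(observed, sigma ** 2)
        raw_err += sum((o - m) ** 2 for o, m in zip(observed, true_means))
        js_err += sum((s - m) ** 2 for s, m in zip(shrunk, true_means))
    return raw_err / reps, js_err / reps

raw, js = simulate([0.24, 0.26, 0.27, 0.28, 0.30, 0.25], sigma=0.07)
print(f"raw averages: {raw:.4f}  James-Stein: {js:.4f}")
```

The shrunken estimates are biased toward the grand mean, but in this kind of setting the reduction in variance outweighs that bias, which is the trade-off named on the slide.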
  • 45. • 5% reduction in MSPE just by shrinkage estimator • Van Houwelingen and le Cessie’s heuristic shrinkage factor
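The slide names the heuristic shrinkage factor without giving it. As commonly stated, it is (model χ² − df) / model χ², where model χ² is the likelihood-ratio statistic of the fitted model and df is the number of estimated predictor parameters. The sketch below assumes that form; the function names and the numbers in the usage lines are hypothetical.

```python
def heuristic_shrinkage(model_chi2, df):
    """Heuristic shrinkage factor: (model chi-square - df) / model chi-square.

    model_chi2: likelihood-ratio chi-square of the fitted model
    df: number of predictor parameters estimated
    """
    if model_chi2 <= 0:
        raise ValueError("model chi-square must be positive")
    return (model_chi2 - df) / model_chi2

def shrink_coefficients(coefficients, factor):
    """Multiply predictor coefficients by the shrinkage factor.

    The intercept is not handled here; it is normally re-estimated afterwards.
    """
    return [factor * b for b in coefficients]

s = heuristic_shrinkage(model_chi2=50.0, df=10)   # hypothetical values
print(s, shrink_coefficients([0.9, -0.4], s))
```

A factor well below 1 signals that the model chi-square is small relative to the number of parameters, which is the overfitting situation discussed on the following slides.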
  • 46. To explain or to predict? Prediction often benefits from shrinkage; the consequence is that regression coefficients are biased. Explanations with a focus on coefficients may not benefit from the bias that is introduced! When is shrinkage needed? When the risk of overfitting is high, which is the case when sample size is small (particularly if modelling choices are data driven).
  • 47. Events per variable (EPV) for logistic/survival models: EPV = number of events (smallest outcome group) / number of candidate predictor variables. EPV = 10 is a commonly used minimal criterion.
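As a small worked example of the EPV definition above, the sketch below takes the smaller of the two outcome groups as the number of events. The function name and the counts are hypothetical.

```python
def events_per_variable(n_with_outcome, n_without_outcome, n_candidate_predictors):
    """EPV = size of the smallest outcome group / number of candidate predictors."""
    if n_candidate_predictors <= 0:
        raise ValueError("need at least one candidate predictor")
    n_events = min(n_with_outcome, n_without_outcome)
    return n_events / n_candidate_predictors

# Hypothetical study: 60 events among 1000 patients, 12 candidate predictors.
print(events_per_variable(60, 940, 12))
```

Note that the denominator counts candidate predictors considered during modelling, not only those retained in the final model.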
  • 48. “For EPV values of 10 or greater, no major problems occurred. For EPV values less than 10, however, the regression coefficients were biased in both positive and negative directions”
  • 49. Citations based on Google Scholar, Oct 30 2020. Quotes shown: “For EPV values of 10 or greater, no major problems”; “a minimum of 10 EPV […] may be too conservative”; “substantial problems even if the number of EPV exceeds 10”. Citation counts shown: 5,736; 2,438; 216.
  • 52. • EPV values for reliable selection of predictors from a larger set of candidate predictors may be as large as 50 • Statistical simulation studies on the minimal EPV rules are highly heterogeneous and have large problems • But what if we just use shrinkage?
  • 53. “We conclude that, despite improved performance on average, shrinkage often worked poorly in individual datasets, in particular when it was most needed. The results imply that shrinkage methods do not solve problems associated with small sample size or low number of events per variable.”
  • 57. In short: minimal sample size requirements for logistic, survival and continuous outcomes; 4 or 5 criteria to meet (minimizing risk of overfitting; ensuring sufficiently precise estimation of risk). Software in R and Stata to simplify calculations. Sample size criteria for validation currently under review.
  • 59. Apparent (usually too optimistic), i.e. predictions evaluated on the development data. Internal (optimism-corrected), e.g. bootstrapping. External.
  • 60. Apparent, internal and external validation among n=232 models (doi.org/10.1136/bmj.m1328). Figure labels: 22%; 48%; 20% as part of development study; 10% independent; only 5% assessed calibration.
  • 62. NICE Framingham: AUC 77.6, overestimated risk, vs. QRISK2-2011: AUC 77.1, well calibrated. Treatment threshold 20%: 206 per 1000 men vs. 110 per 1000 men.
  • 63. rms package; val.prob.ci.2; doi: 10.1016/j.jclinepi.2015.12.005
  • 66. Split sample: easy but inefficient, unless: huge data; meaningful split; QCovid. doi: 10.1016/j.jclinepi.2015.04.005
  • 67. doi: 10.1016/j.jclinepi.2015.04.005. Optimism-corrected performance = apparent performance – optimism. 1. Draw bootstrap sample*. 2. Build model* in sample* (repeat every step, incl. variable selection, non-linearities); bootstrap performance is the performance of model* on sample*. 3. Apply model* to the original sample; test performance is the performance of model* on the original sample. 4. Optimism = bootstrap performance – test performance. 5. Repeat 100 times (and average the optimism estimates). rms package. Can be cumbersome for complex modeling.
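The bootstrap steps on this slide can be sketched in plain Python. To stay self-contained, the sketch swaps in a one-predictor least-squares line with R² as the performance measure; this is an illustrative stand-in, not the logistic or Cox models and the rms package referred to in the talk. All function names are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Least-squares intercept and slope for one predictor (the 'model')."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope

def r_squared(model, xs, ys):
    """Performance measure: R-squared of the model's predictions on (xs, ys)."""
    intercept, slope = model
    my = sum(ys) / len(ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - my) ** 2 for y in ys)
    return 1.0 - sse / sst if sst > 0 else 0.0

def optimism_corrected(xs, ys, n_boot=100, seed=0):
    """Return (apparent, optimism, optimism-corrected) performance."""
    rng = random.Random(seed)
    n = len(xs)
    apparent = r_squared(fit_line(xs, ys), xs, ys)
    total_optimism = 0.0
    for _ in range(n_boot):
        # Step 1: draw a bootstrap sample with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        bx, by = [xs[i] for i in idx], [ys[i] for i in idx]
        # Step 2: rebuild the model in the bootstrap sample.
        model = fit_line(bx, by)
        boot_perf = r_squared(model, bx, by)
        # Step 3: apply that model to the original sample.
        test_perf = r_squared(model, xs, ys)
        # Step 4: optimism for this replicate.
        total_optimism += boot_perf - test_perf
    optimism = total_optimism / n_boot  # Step 5: average over replicates.
    return apparent, optimism, apparent - optimism
```

In a real analysis, every data-driven modelling step (variable selection, non-linear terms, tuning) would have to be repeated inside the loop where `fit_line` is called, which is why the slide warns that the procedure can be cumbersome for complex modelling.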
  • 68. Heterogeneity: test accuracy (affects 70% of MA; Willis, BMC Med Res Methodol 2011); miscalibration (Van Calster et al., MDM 2015); disease prevalence (Hilden, Stat Med 2000)
  • 71. Ultrasound-based risk model for preoperative prediction of lymph-node metastases in women with endometrial cancer DOI: 10.1002/uog.21950
  • 73. Net Benefit (NB) = (true positives – w × false positives)/n = Se × P – w × (1 – Sp) × (1 – P). [Figure: Se, Sp and P per setting; within-setting model; between-setting model; distribution of NB]
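The two forms of the Net Benefit formula can be checked against each other with a short sketch. The slide does not define w; the code assumes the usual decision-curve convention that w is the odds of the risk threshold, w = t / (1 − t). The function names and the numbers in the usage lines are hypothetical.

```python
def threshold_odds(threshold):
    """Assumed weight w: odds of the risk threshold t, i.e. t / (1 - t)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    return threshold / (1.0 - threshold)

def net_benefit_counts(true_pos, false_pos, n, threshold):
    """NB = (true positives - w * false positives) / n."""
    return (true_pos - threshold_odds(threshold) * false_pos) / n

def net_benefit_rates(se, sp, prevalence, threshold):
    """NB = Se * P - w * (1 - Sp) * (1 - P)."""
    w = threshold_odds(threshold)
    return se * prevalence - w * (1.0 - sp) * (1.0 - prevalence)

# Hypothetical setting: 1000 patients, prevalence 0.2, Se 0.8, Sp 0.9,
# so 160 true positives and 80 false positives; threshold 20%.
print(net_benefit_counts(160, 80, 1000, 0.2))
print(net_benefit_rates(0.8, 0.9, 0.2, 0.2))
```

Because the second form depends only on Se, Sp and P, Net Benefit can be recomputed for each setting from those three quantities, which is what makes it possible to look at its distribution across settings.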
  • 75. Model not fit for purpose; not validated; no impact; regulatory frameworks; not adopted in clinical practice; suboptimal for improving clinical practice
  • 76. Analysis of the impact of using the model in clinical practice. Calculate it: cost-effectiveness analysis. Run an experiment: (cluster-)randomized trials or pre-post intervention studies; SPIRIT-AI / CONSORT-AI.
  • 77. Meertens LJE et al. Fetal Diagn Ther. 2018 Jul 18:1-13.
  • 78. Before-after study: fewer adverse perinatal outcomes in nulliparous women (OR 0.56, 95% CI 0.32 to 0.94); lower mean cost per pregnant woman (-2.766 euro, 95% CI -3.700 to -1.825). doi: 10.1016/j.ajog.2020.02.036
  • 79. Figure 4: Adherence rates of discussing low-dose aspirin prophylaxis during the study period. doi: 10.1016/j.ajog.2020.02.036
  • 80. ‘No time’. Black box / new approach / do not trust model predictions / do not believe it is applicable for specific patient. Not (yet) convinced of improvement; not (yet) aware of current situation. ‘Aspirin is a medicine, thus potentially harmful’; difficulty in weighing risks.