Dichotomania
and other challenges for the collaborating (or teaching) biostatistician
A perspective on principles, responsibilities and potential solutions
Laure Wynants PhD
laure.wynants@maastrichtuniversity.nl
@laure_wynants
2018 conference invitation
“Statistics made very easy for clinicians”
“Free coffee and croissants while statisticians explain the P-value crisis”
Goodman S. A dirty dozen: twelve p-value misconceptions. Semin Hematol. 2008 Jul;45(3):135-40.
Estimated probability of causal attribution according to the null P-value, modeled using fractional polynomials with a cutpoint at P = 0.05.
A Psychometric Experiment in Causal Inference to Estimate Evidential Weights Used by Epidemiologists
Holman, C. D’Arcy J.; Arnold-Reed, Diane E.; de Klerk, Nicholas; McComb, Christine; English, Dallas R., Epidemiology 12(2):246-255, 2001.
https://xkcd.com/
A conversation between a researcher and a statistician
• R: “We need some statistical testing for these plots.”
• S: “Why? These are not your main research questions
in this paper.”
• R: “I am not fishing for significant findings. I am aware
of the dangers. These are hypotheses we investigated
in earlier work. If the tests are significant, we know this
is confirmed in our new data.”
• R: “If it is not significant, we will discuss further. We
just didn’t have enough power then.”
Reporting and publication bias
Figure: “Trim and fill” funnel plot of Ki-67 expression for overall survival in ovarian cancer patients (Qiu et al. Arch Gynecol Obstet 2019). Legend: missing studies, published studies.
Science as a disorderly mass of stray
observations, inconclusive results and
fledgling explanations.
And yet, as soon as their hypotheses were
turned into peer-reviewed papers,
researchers claimed that such facts had
always spoken for themselves.
How (not) to report results
Replication crisis
Ioannidis JAMA 2005: all original research published in 3 major journals in 1990-2003 and cited >1000 times:
49 studies, 45 claimed that the intervention was effective. Of these 45:
- 32% were not borne out by subsequent studies:
  - 16% were contradicted (no effect found)
  - 16% had estimated effects that were too strong (to the point that subsequent studies cast doubt on the effect being clinically important)
- 44% were replicated
- 24% (11 studies) had no subsequent larger or better designed replication studies
Problems were worst for small RCTs and non-randomized studies.
Begley & Ellis Nature 2012: replication of landmark preclinical cancer studies: only 11% could be reproduced.
Journal impact factor | Number of articles | Mean number of citations of non-reproduced articles | Mean number of citations of reproduced articles
>20 | 21 | 248 (range 3–800) | 231 (range 82–519)
5–19 | 32 | 169 (range 6–1,909) | 13 (range 3–24)
• SARS-CoV-2 “viral loads in the very young do not differ significantly from those of adults. Based on these results, we have to caution against an unlimited re-opening of schools and kindergartens in the present situation”
• Ill-defined research question, comparisons between all pairs of age groups (45 comparisons), age groups tested as if they were non-ordered categories.
• A reanalysis with more appropriate techniques reaches the opposite conclusion.
• https://medium.com/@d_spiegel/is-sars-cov-2-viral-load-lower-in-young-children-than-adults-8b4116d28353
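The arithmetic behind the "45 comparisons" point can be sketched as follows. This is an illustration only. It assumes 10 age groups (the number for which all pairwise comparisons total 45), a conventional α of 0.05, and independent tests, which pairwise comparisons that share groups are not. The function names are hypothetical.

```python
from math import comb


def pairwise_comparisons(n_groups: int) -> int:
    """Number of distinct pairs among n_groups groups."""
    return comb(n_groups, 2)


def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across n_tests tests.

    Assumes independent tests (an idealisation: pairwise tests that
    share groups are correlated).
    """
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    m = pairwise_comparisons(10)  # 10 groups is an assumption, see lead-in
    print(f"pairwise comparisons: {m}")
    print(f"familywise error rate, uncorrected: {familywise_error_rate(m):.2f}")
    print(f"Bonferroni per-test threshold: {bonferroni_threshold(m):.4f}")
```

Either way the all-pairs design is costly. Uncorrected, a false positive somewhere becomes likely. Corrected, each comparison needs much stronger evidence, which lowers power relative to a single analysis that respects the ordering of the age groups.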
A mistake in the operating room can
threaten the life of one patient;
A mistake in the statistical analysis or
interpretation can lead to hundreds of early
deaths.
Andrew Vickers, Biostatistician, Memorial Sloan Kettering Cancer Center
Some reactions to previous presentations
MDs
- surprise that meta-analysis can be biased
- “Our statisticians did not tell us this”
Altman BMJ 1994,
republished almost unchanged 20 years later…
“Put simply, much poor research arises
because researchers feel compelled for
career reasons to carry out research that
they are ill equipped to perform, and
nobody stops them.”
Statisticians
- “Oh no not this again”
- “We know this already”
- “P-values are not the problem”
- “It’s not us, it’s them”
An ethical statistician…
- identifies and mitigates any preferences on the part of the investigators or data providers that might predetermine or influence the analyses/results
- only supports studies that have pre-defined objectives and that are capable of producing useful results
- strives to explain any expected adverse consequences of failure to follow through on an agreed-upon sampling or analytic plan
- shall indicate the risks and possible consequences if their professional judgement is overruled
- Views or opinions based on general knowledge or belief should be clearly distinguished from views or opinions derived from the statistical analyses being reported.
Taken from RSS code and ASA ethical guidelines
An ethical statistician…
- recognizes […] research practices and standards can differ across disciplines, and statisticians do not have obligations to standards of other professions that conflict with these guidelines
- shall take personal responsibility for work bearing their name
- avoids compromising scientific validity for expediency
- should always be aware of their overriding responsibility to the public good […] A Fellow’s obligations to employers, clients and the profession can never override this; and Fellows should seek to avoid situations and not enter into undertakings which compromise this responsibility
Taken from RSS code and ASA ethical guidelines
An ethical statistician…
- conveys the findings in ways that are both honest and meaningful to the user/reader
- shall seek to conform to recognised good practice including quality standards which are in their judgement relevant, and shall encourage others to do likewise
- shall seek to advance knowledge and understanding of statistical science and advocate its use. This advocacy of statistical science should extend to employers, clients, colleagues and the general public
Taken from RSS code and ASA ethical guidelines
#AITA?
@numbersman77
What can we do better?
ATOM
- Accept uncertainty (no more ***, interpret confidence intervals)
- Be Thoughtful (research question, design, clinically relevant effect size, registered reports)
- Be Open (conflicts of interest, registration, share data, code, analysis protocols, publish all results)
- Be Modest (label exploratory, retrospective and secondary analyses as such (no HARKing); interpret studies in broader context)
Ronald L. Wasserstein, Allen L. Schirm & Nicole A. Lazar (2019) Moving to a World Beyond “p < 0.05”, The American Statistician, 73:sup1, 1-19
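As a small illustration of "accept uncertainty", the sketch below reports an estimate with its confidence interval for two hypothetical studies whose p-values fall on either side of 0.05. The study values, the normal approximation and the function name `summarize` are assumptions made for this example, not taken from the cited paper.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def summarize(estimate: float, std_error: float, level: float = 0.95):
    """Return (two-sided p-value, CI lower, CI upper) under a normal approximation."""
    z = estimate / std_error
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    z_crit = STD_NORMAL.inv_cdf(1 - (1 - level) / 2)
    return p_value, estimate - z_crit * std_error, estimate + z_crit * std_error


if __name__ == "__main__":
    # Hypothetical studies: same estimate, slightly different precision.
    for name, est, se in [("Study A", 2.0, 0.97), ("Study B", 2.0, 1.06)]:
        p, lo, hi = summarize(est, se)
        print(f"{name}: estimate {est:.2f}, 95% CI {lo:.2f} to {hi:.2f}, p = {p:.3f}")
```

The two intervals are nearly the same, yet a "significant / not significant" label would present the studies as conflicting. Reporting the interval keeps the shared message visible.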
What can we do better?
Distinguish between different applications
E.g. Cox 2020
- Two-decision situation (health screening: return tomorrow vs next year; control error)
- Subject-matter hypothesis (difference between treatments, H0: there is no difference; p-value as measure of uncertainty)
- Dividing hypothesis (at which level does the CI only contain positive/negative effects?)
- Tests of model adequacy (normality assumption, informal role, judgement required)
What can we do better?
• “How to” teaching vs understanding
Simulated data (Bishop 2020)
• Be explicit about how principles extend to observational
research
• Software
• Conceptual clarity in educational material (Greenland 2019)
“Significance level”: α or p-value?
“P-value”: observed value p or random variable P?
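One way to teach with simulated data is to let students generate many studies themselves and look at the resulting p-values, which also makes the distinction between an observed p and the random variable P concrete. The sketch below is a minimal example under stated assumptions: a two-sample z-test with known unit variance, 30 observations per group, and hypothetical function names. It is not the exercise from the cited material.

```python
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()


def simulate_p_value(rng: random.Random, n_per_group: int = 30,
                     true_diff: float = 0.0) -> float:
    """Simulate one two-group study and return its two-sided z-test p-value.

    Both groups have standard deviation 1, which the test treats as known.
    """
    a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    b = [rng.gauss(true_diff, 1.0) for _ in range(n_per_group)]
    std_error = (2 / n_per_group) ** 0.5
    z = (mean(b) - mean(a)) / std_error
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def share_below(p_values, threshold: float = 0.05) -> float:
    """Fraction of p-values below the threshold."""
    return sum(p < threshold for p in p_values) / len(p_values)


if __name__ == "__main__":
    rng = random.Random(2020)
    null_p = [simulate_p_value(rng) for _ in range(2000)]
    alt_p = [simulate_p_value(rng, true_diff=0.5) for _ in range(2000)]
    print(f"no true difference: share of p < 0.05 = {share_below(null_p):.3f}")
    print(f"true difference 0.5: share of p < 0.05 = {share_below(alt_p):.3f}")
```

With no true difference, P varies from study to study and falls below 0.05 in roughly 1 of 20 simulated studies. Changing `true_diff` or `n_per_group` shows how power changes that share.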
It won’t be easy
• Misconception fatigue in teaching
It won’t be easy
Wang et al Annals of Internal Medicine 2018
It won’t be easy
No statistician can do this alone
• A responsibility for each of us
• A role for professional organizations
• A necessity to put this on the agenda of ISCB
• Thanks to John Carlin and Jonathan Sterne - even if
there are no free croissants
Testimation bias
Steyerberg et al JCE 1999
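The mechanism behind testimation bias can be sketched with a simplified simulation: when an effect is kept only if it passes a significance test, the estimates that survive are on average too large. The single-coefficient setup, the true effect of 0.2, the standard error of 0.15 and the function name are assumptions for illustration. This is not the simulation design of the cited paper.

```python
import random
from statistics import mean


def simulate_selection(true_effect: float = 0.2, std_error: float = 0.15,
                       n_sim: int = 20000, seed: int = 1999):
    """Compare all simulated estimates with those that pass a significance filter.

    Returns (mean of all estimates, mean of selected estimates,
    fraction of estimates selected).
    """
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, std_error) for _ in range(n_sim)]
    # Keep an estimate only if its two-sided z-test gives p < 0.05.
    selected = [e for e in estimates if abs(e / std_error) > 1.96]
    return mean(estimates), mean(selected), len(selected) / n_sim


if __name__ == "__main__":
    mean_all, mean_selected, fraction = simulate_selection()
    print(f"mean of all estimates:      {mean_all:.3f}")
    print(f"mean of selected estimates: {mean_selected:.3f}")
    print(f"fraction selected:          {fraction:.2f}")
```

Across all simulated studies the estimate is unbiased. Conditioning on significance is what introduces the overestimation, and the distortion grows as power falls.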
Replicability without p-values?
• Hanson (1958) – anthropology, sociology, psychology
- Statements with “confirmation criteria” (test): >70% confirmed
- Statements without confirmation criteria: <46% confirmed
• Basic and Applied Social Psychology banned p-values
- Overinterpret / overstate descriptive results
Lakens et al 2020
