This document summarizes a presentation given by Maarten van Smeden on explainable AI in medicine. Some key points from the presentation include:
- Van Smeden discusses several questions around AI in medicine, including whether AI is truly intelligent, if it is just new statistics, if it can explain its predictions, and if it is better at predictions.
- He notes the field of AI in medicine is very heterogeneous, using different types of data and models. Prediction models aim to predict outcomes but may not explain causal relationships.
- True explanatory models that can determine cause and effect are challenging, as AI systems cannot infer causal relationships from data alone without explicit domain knowledge.
- The ability of AI
1. Maarten van Smeden, PhD
Explainable AI workshop
12 April 2021
Five questions about AI in medicine
2. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Conflicts of interest
Financially
• I do not own (any) patents or stocks, and I am not involved in the
development of any Artificial Intelligence (AI) related products
• I am not paid for this talk
• I am involved in the development of a field standard for medical
AI, commissioned by the Dutch government, for which financial
compensation was granted
Intellectually
• I am a statistician
• In interviews and on social media I have been quite sceptical
about AI (hype) in medicine
• Overall, I believe the interest in AI in medicine is net-beneficial
for someone in my position, although…
6. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Some general observations about AI in medicine
• Incredibly hot
• Incredibly heterogeneous
• Robots, data analyses, self-learning systems,…
• Types of data
• “Traditional” structured data
• Medical imaging
• Gene expression data
• Text mining electronic health records
• Analyzing social media posts (e.g. pharmacovigilance)
• Speech signal processing (e.g. )
• Incredibly opaque
• Limited information about actual use of AI in healthcare
• Almost no regulations (yet)
8. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Tech company business model
https://bit.ly/2HSp8X5; https://bit.ly/2Z0Pfop; https://bit.ly/2KIcpHG; https://bit.ly/33IJhr9
9. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Other success stories
https://go.nature.com/2VG2hS7; https://bbc.in/2Z1drXQ; https://bit.ly/2TAfRIP
10. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
IBM Watson winning Jeopardy! (2011)
https://bbc.in/2TMvV8I
11. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
IBM Watson for oncology
https://bit.ly/2LxiWGj
12. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Example: retinal disease
Gulshan et al, JAMA, 2016, doi: 10.1001/jama.2016.17216; Picture retinopathy: https://bit.ly/2kB3X2w
Diabetic retinopathy
Deep learning (= Neural network)
• 128,000 images
• Transfer learning (preinitialization)
• Sensitivity and specificity > .90
• Estimated from training data
13. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Example: lymph node metastases
Bejnordi et al, JAMA, 2017, doi: 10.1001/jama.2017.14585. See our letter to the editor for a critical discussion: https://bit.ly/2kcYS0e
Deep learning competition
But:
• 390 teams signed up, 23 submitted
• “Only” 270 images for training
• Test AUC range: 0.56 to 0.99
14. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
AI is everywhere
https://bit.ly/2ka0HLq; https://go.nature.com/33TQgO6; https://bit.ly/2kp6X23; https://bit.ly/2lZuKWt; https://bit.ly/2lI298g
16. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
“As of today, we have deployed the system in 16 hospitals, and
it is performing over 1,300 screenings per day”
MedRxiv pre-print only, 23 March 2020,
doi.org/10.1101/2020.03.19.20039354
17. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Living review (update 3)
doi: 10.1136/bmj.m1328
22. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
5 questions about AI in medicine
1. Is AI truly intelligent?
2. Is AI old statistics wine in new machine learning bottles?
3. Is AI able to explain?
4. Surely, AI is better at making predictions?
5. Will AI make healthcare better, faster and cheaper?
27. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Source: https://openai.com/blog/multimodal-neurons/
28. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
claiming that a classifier trained on
zillions of human-labelled images
containing cats and no cats, is
recognizing cats is just stupid – a
human can see a handful of cats,
including cartoons of pink panthers,
and lions and tigers and panthers, and
then can not only recognize many
other types of cats, but even if they
lose their sight, might have a pretty
good go at telling whether they are
holding their moggy or their doggy
https://bit.ly/326ghK8
Jon Crowcroft
29. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Adversarial example
https://bit.ly/2N4mQFo; https://bit.ly/2W7X9rF
30. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Skin cancer and rulers
Esteva et al., Nature, 2017, DOI: 10.1038/nature21056; https://bit.ly/2lE0vV0
31. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
https://arxiv.org/abs/2008.07371
32. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
All the impressive achievements of
deep learning amount to just curve
fitting
https://bit.ly/3t8kLfl
Judea Pearl
33. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Q2: is AI old statistics wine in
new machine learning bottles?
AI
100%
linear
models
34. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Terminology
In medical research, “artificial intelligence” usually
just means “machine learning” or “algorithm”
36. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
“Everything is an ML method”
https://bit.ly/2lEVn33
37. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
“ML methods come from computer science”
https://bit.ly/2zhbwPv; https://stanford.io/2TVp1xK; https://stanford.io/2ZfED0k
Name | Leo Breiman | Jerome H Friedman | Trevor Hastie
Known for | CART, random forest | Gradient boosting | Elements of statistical learning
Education | Physics/Math | Physics | Statistics
Job title | Professor of Statistics | Professor of Statistics | Professor of Statistics
38. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Two cultures
Breiman, Stat Sci, 2001, DOI: 10.1214/ss/1009213726
39. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Language
Statistics | Machine learning
Covariates | Features
Outcome variable | Target
Model | Network, graphs
Parameters | Weights
Model for discrete var. | Classifier
Model for continuous var. | Regression
Log-likelihood | Cross-entropy loss
Multinomial regression | Softmax
Measurement error | Noise
Subject/observation | Sample/instance
Dummy coding | One-hot encoding
Measurement invariance | Concept drift
Prediction | Supervised learning
Latent variable modeling | Unsupervised learning
Fitting | Learning
Prediction error | Error
Sensitivity | Recall
Positive predictive value | Precision
Contingency table | Confusion matrix
Measurement error model | Noise-aware ML
Structural equation model | Gaussian Bayesian network
Gold standard | Ground truth
Derivation–validation | Training–test
Experiment | A/B test
Adapted from Daniel Oberski: https://bit.ly/2YN12Xf and Robert Tibshirani: https://stanford.io/2zqEGfr
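Several pairs in the statistics versus machine learning vocabulary are the same quantity under two names. The sketch below is an illustration only, not part of the original slides: the labels, predicted probabilities and the 0.5 threshold are made-up assumptions. It computes sensitivity (recall) and positive predictive value (precision) from one contingency table (confusion matrix), and checks that the binary cross-entropy loss equals the negative Bernoulli log-likelihood.

```python
import math

# Hypothetical toy data: true labels and predicted probabilities (assumed values).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
probs = [0.9, 0.8, 0.4, 0.2, 0.1, 0.6, 0.55, 0.7]
y_pred = [1 if p >= 0.5 else 0 for p in probs]  # 0.5 threshold is an assumption

def confusion_counts(truth, pred):
    """Cells of the contingency table, also known as the confusion matrix."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

sensitivity = tp / (tp + fn)  # statistics name; "recall" in machine learning
ppv = tp / (tp + fp)          # positive predictive value; "precision" in ML

# Statistics view: negative log-likelihood of independent Bernoulli outcomes.
neg_log_likelihood = -sum(
    math.log(p) if y == 1 else math.log(1 - p) for y, p in zip(y_true, probs)
)

# Machine learning view: summed binary cross-entropy loss.
cross_entropy = -sum(
    y * math.log(p) + (1 - y) * math.log(1 - p) for y, p in zip(y_true, probs)
)

print(f"sensitivity / recall: {sensitivity:.2f}")
print(f"PPV / precision:      {ppv:.2f}")
print(f"negative log-likelihood: {neg_log_likelihood:.4f}")
print(f"cross-entropy loss:      {cross_entropy:.4f}")
```

The two loss values agree up to floating-point rounding, which is the point of the vocabulary table: the computation is identical and only the label differs.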
40. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Robert Tibshirani: https://stanford.io/2zqEGfr
Machine learning: large grant = $1,000,000
Statistics: large grant = $50,000
41. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
ML/AI refers to a culture, not to methods
Distinguishing between statistics and ML/AI
• Substantial overlap in methods
• Substantial overlap in analysis goals
• Attempts to distinguish frequently result in disagreement
42. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Beam & Kohane, JAMA, 2018, doi: 10.1001/jama.2017.18391
43. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Q3: Is AI able to explain?
BLACK BOX
INPUT EXPLANATION
45. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Explanatory models
• Theory, cause and effect
• aetiology of illness
• effect of treatment
Prediction models
• Interest in (risk) predictions of future observations
• Cause and effect not a direct concern
• prognosis and diagnosis
Descriptive models
• Capture the data structure
46. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
The Basketball thought experiment
47. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
The Basketball thought experiment
Relation of interest:
player height -> player talent (“got game”)
Third variable: professional basketball player
CONFOUNDER?
48. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Red = professional, black = amateur basketballer
49. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Red = professional, black = amateur basketballer
• The third variable professional basketball player is a collider
• An algorithm should not control for this collider (as one should
do for a confounder)
• How should an algorithm know it should ignore “professional
basketball player”?
It cannot know based on the data alone!
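The collider argument can be illustrated with a small simulation. This is a sketch under stated assumptions, not the data behind the slides: height and talent are generated as independent standard normal variables (so the true relation is zero), and a player is labelled "professional" when height plus talent exceeds a cut-off of 2.0. The variable names, the cut-off and the sample size are all hypothetical choices.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient, written out to avoid extra dependencies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(2021)  # fixed seed so the run is reproducible
n = 20000

# Assumption: height and talent are independent in the full population.
height = [rng.gauss(0, 1) for _ in range(n)]
talent = [rng.gauss(0, 1) for _ in range(n)]

# Collider: being professional is caused by BOTH height and talent.
professional = [h + t > 2.0 for h, t in zip(height, talent)]

def subset(values, keep):
    return [v for v, k in zip(values, professional) if k == keep]

r_all = pearson(height, talent)
r_pro = pearson(subset(height, True), subset(talent, True))
r_amateur = pearson(subset(height, False), subset(talent, False))

print(f"all players:        r = {r_all:+.2f}")
print(f"professionals only: r = {r_pro:+.2f}")
print(f"amateurs only:      r = {r_amateur:+.2f}")
```

In the full sample the correlation is close to zero, but within each stratum of the collider a negative association appears. Nothing in the data table itself marks "professional" as a collider rather than a confounder; that knowledge comes from how the simulation was set up, which is the slide's point.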
50. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
AI and causal inference
Small selection (see further: Kreif and DiazOrdaz; https://bit.ly/2m1eYdK)
• Superlearner (e.g. van der Laan)
• High dimensional propensity scores (e.g. Schneeweiss)
• The book of why (Pearl)
51. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
• Understanding cause and effect crucial in understanding
aetiology, effect of interventions -> explanatory modelling
• There is a large difference between explaining why the AI is
predicting what it is predicting (e.g. feature importance) and the
ability of AI to “truly explain” -> separate causes from effects
• Explanatory modelling is already challenging in structured data
52. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Q4: surely, AI is better at making predictions?
Img: https://bit.ly/3saKFO7
54. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Systematic review clinical prediction models
Christodoulou et al. Journal of Clinical Epidemiology, 2019, doi: 10.1016/j.jclinepi.2019.02.004
55. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Sources of prediction error
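$Y = f(x) + \varepsilon$ (with $E(\varepsilon) = 0$, $\mathrm{var}(\varepsilon) = \sigma^2$, values in $x$ are not random)
For a model $k$ the expected test prediction error is:
$\sigma^2 + \mathrm{bias}^2(\hat{f}_k(x)) + \mathrm{var}(\hat{f}_k(x))$
• $\sigma^2$: irreducible error ≈ what we don’t model
• $\mathrm{bias}^2(\hat{f}_k(x)) + \mathrm{var}(\hat{f}_k(x))$: mean squared prediction error ≈ how we model
See equation 2.46 in Hastie et al., The Elements of Statistical Learning, https://stanford.io/2voWjra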
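The decomposition of the expected test prediction error into irreducible error, squared bias and variance can be checked numerically. The following is a minimal sketch under assumptions that are not in the slides: the true function f(x) = x^2, the noise level, the fixed training grid, the test point and a k-nearest-neighbour average as the model are all arbitrary illustrative choices.

```python
import random

def f(x):
    """Assumed true regression function (hypothetical choice)."""
    return x * x

SIGMA = 0.5                                  # sd of the irreducible noise
X_TRAIN = [i / 10 for i in range(-10, 11)]   # fixed, non-random design points
X0 = 0.5                                     # test point
K = 9                                        # number of neighbours averaged

def knn_predict(xs, ys, x0, k):
    """Average the outcomes of the k training points closest to x0."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / k

rng = random.Random(46)
n_sim = 20000
preds = []
squared_errors = []
for _ in range(n_sim):
    # New training outcomes on the same x values, then a new test outcome at X0.
    ys = [f(x) + rng.gauss(0, SIGMA) for x in X_TRAIN]
    pred = knn_predict(X_TRAIN, ys, X0, K)
    y0 = f(X0) + rng.gauss(0, SIGMA)
    preds.append(pred)
    squared_errors.append((y0 - pred) ** 2)

mean_pred = sum(preds) / n_sim
bias_sq = (mean_pred - f(X0)) ** 2
variance = sum((p - mean_pred) ** 2 for p in preds) / n_sim
expected_error = sum(squared_errors) / n_sim
decomposition = SIGMA ** 2 + bias_sq + variance

print(f"irreducible error sigma^2: {SIGMA ** 2:.4f}")
print(f"squared bias:              {bias_sq:.4f}")
print(f"variance:                  {variance:.4f}")
print(f"sum of the three terms:    {decomposition:.4f}")
print(f"simulated test error:      {expected_error:.4f}")
```

In this setup the irreducible term dominates: no choice of K removes sigma^2, and changing K only trades bias against variance in the remaining part.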
56. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Irreducible error in medicine is often large
• Health and lack thereof complex to measure (‘no gold standard’)
• Predictors of diseases are often imperfectly and partly
measured
• We often don’t know all the causal mechanisms at play
• much easier to predict if you know the causal mechanisms!
• Predicting the future even more difficult
Understanding prediction uncertainty is key
Courtesy Cecile Janssens: https://bit.ly/2Jf5ft6
57. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Classifier Technology and the Illusion of Progress
Hand, Stat Sci, 2006, doi: 10.1214/088342306000000060
David Hand
58. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Predicting mortality – the conclusion
PlosOne, 2018, DOI: 10.1371/journal.pone.0202344
59. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Predicting mortality – the results
PlosOne, 2018, DOI: 10.1371/journal.pone.0202344
60. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Predicting mortality – the media
PlosOne, 2018, DOI: 10.1371/journal.pone.0202344; https://bit.ly/2Q6H41R; https://bit.ly/2m3RLrn
61. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Q5: will AI make healthcare faster, better, cheaper?
Img: https://bit.ly/3wOv0aH
63. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Faster?
https://dl.acm.org/doi/abs/10.1145/3313831.3376718
65. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Cheaper?
The costs of running (cloud computing) the Transformer
algorithm are estimated at 1 to 3 million dollars
https://bit.ly/33Dj38X
66. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Flexible algorithms are data hungry
From slide deck Ben van Calster: https://bit.ly/38Aqmjs
67. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
https://twitter.com/DrHughHarvey/status/1230218991026819077
69. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
When used right, AI will be able to do amazing things
… while being subject to many of the same issues as traditional
prediction modelling, including the leaky implementation pipeline
71. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Recidivism Algorithm
ProPublica (2016) https://bit.ly/1XMKh5R
72. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
• Algorithms are not designed to automatically encourage equitable
healthcare and/or fair medical decision making
• Often we seem unaware of selection mechanisms in our data,
poorly reflecting society, enlarging existing inequalities or both
All photos of scientists I used in this presentation were white men
73. Explainable AI workshop, April 12 2021 Twitter: @MaartenvSmeden
Email: M.vanSmeden@umcutrecht.nl
Twitter: @MaartenvSmeden