The document discusses a clinical decision support system called SIADe for diagnosing dementia, Alzheimer's disease, and mild cognitive impairment. It lists participating institutions and outlines the agenda which includes motivation, objectives, clinical decision modeling, achievements, and future work. The motivation discusses the aging population and high prevalence of dementia. The objective is to design a decision support system to aid in diagnosis. It will use a knowledge base, inference engine, and mobile app for physicians. Clinical decision modeling involves identifying diagnostic guidelines, preprocessing patient records, building a Bayesian network model, and evaluating model performance. The system aims to address information overload and integrate evidence-based knowledge to help physicians with clinical decision making.
2. Participating Institutions
• Center for Studies and Research on Aging (CEPE-Rio),
Vital Brazil Institute, Rio de Janeiro.
• Center for Alzheimer's Disease and Related Disorder (CDA-IPUB-UFRJ),
Institute of Psychiatry, Federal University of Rio de Janeiro.
• Institute of Computing, Federal Fluminense University (IC-UFF), Niterói.
• Midiacom Lab, Federal Fluminense University, Niterói.
• Medical Sciences College, Rio de Janeiro State University, Rio de Janeiro.
• National Laboratory for Scientific Computing (INCT), Brazil.
• Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Rio de Janeiro.
• King’s College London (KCL).
4. Motivation
• Alzheimer’s disease represents 50-80% of dementia cases.
• Dementia prevalence was 7.8% among elderly people in a local
community in São Paulo (Herrera et al., 2002).
• Another survey found 6.9% among elderly people in São Paulo;
Alzheimer’s accounted for 59% of dementia cases (Bottino et al., 2006).
• Dementia prevalence ranges from 4.6% to 9.7% among elderly people
(Rodriguez et al., 2008).
• By 2020, Brazil is projected to rank sixth worldwide in
elderly population.
5. Motivation
Decision support systems have been designed to help
physicians in clinical decision making.
Benefits:
• Ability to address the information overload that
physicians face.
• Integrating evidence-based knowledge.
6. Objective
Design and develop a clinical decision support system for
diagnosis of Dementia, Alzheimer’s Disease and Mild
Cognitive Impairment.
Why?
• World-wide population aging.
• High prevalence of Dementia among elderly.
• Early diagnosis of Alzheimer’s Disease can improve
treatment efficiency and patient quality of life, and
reduce costs for public health systems.
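7. CDSS - Principal Components
[Diagram: principal components of the clinical decision support system]
• Physician: uses a mobile application to ask for decision support for a diagnosis.
• Mobile application: exchanges HTTP messages over the Internet with the clinical decision support system.
• Clinical decision support system: communication interface, inference engine and knowledge base. It provides suggestions for possible diagnoses that match a patient’s signs and symptoms.
• Knowledge acquisition: draws on published references related to diagnosis criteria and on the clinical records of normal controls and patients.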
8. Decision Modeling Process
[Flowchart: decision modeling for a disease]
1. Identify the diagnosis guideline for the disease (output: diagnosis criteria for the disease).
2. Preprocess the clinical records of patients and normal controls (output: training database).
3. Build a Bayesian network structure.
4. Perform Bayesian parameter learning.
5. Evaluate the Bayesian learning.
6. Acceptable performance measures?
• Yes: deploy the decision model (decision model modeled).
• No: review the decision model with additional attributes and additional clinical records, then iterate.
9. Diagnosis Process for Dementia, AD and MCI
[Flowchart: diagnosis of Dementia, Alzheimer’s Disease and Mild Cognitive Impairment]
1. Patient care requested.
2. Take patient medical history and/or carry out clinical examinations for dementia screening.
3. Does the patient have possible dementia?
• No: carry out treatment for other diseases; treatment follow-up (*).
• Yes: carry out neuropsychological tests for Dementia.
4. Diagnosis of Dementia confirmed?
• Yes: carry out neuropsychological tests and exams for Dementia due to Alzheimer’s Disease.
• No: carry out psychological tests and exams for Mild Cognitive Impairment.
5. Diagnosis of Alzheimer’s Disease confirmed?
• Yes: treatment for Dementia due to Alzheimer’s Disease; follow-up (*).
• No: treatment follow-up (*).
6. Diagnosis of Mild Cognitive Impairment confirmed?
• Yes: treatment for Mild Cognitive Impairment; follow-up (*).
• No: treatment follow-up (*).
(*) A treatment should be defined by a physician.
11. Preprocessing the Health Records
[Flowchart: preprocess the patients’ health records]
1. Integrate the patients’ health records, spread across multiple spreadsheets, into one training database.
2. Database balancing.
3. Attribute selection.
4. Discretize numerical attributes.
Result: preprocessed training database.
12. Training Database
Records per diagnosis:
Diagnosis                    Positive   Negative
Alzheimer’s Disease          135        45
Dementia                     180        67
Mild Cognitive Impairment    32         35
Composed of:
• Normal controls and patients’ health records provided by the Center for Alzheimer's
Disease and Related Disorder, Institute of Psychiatry, UFRJ.
Project approved by the Research Ethics Committee (2012).
13. Data Balancing
Method:
SMOTE (Synthetic Minority Over-sampling Technique) (1)
                             Before balancing       After balancing
Diagnosis                    Positive   Negative    Positive   Negative
Alzheimer’s Disease          135        45          135        90
Dementia                     180        67          180        134
Mild Cognitive Impairment    32         35          32         35
1: Chawla, N. V.; Bowyer, K. W.; Hall, L. O.; Kegelmeyer, W. P. SMOTE: Synthetic Minority Over-Sampling Technique.
Journal of Artificial Intelligence Research, v. 16, p. 321-357, 2002.
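The interpolation idea behind SMOTE can be sketched as follows. This is a minimal illustration, not the implementation used in the project: it assumes purely numerical attributes, Euclidean distance and k = 5 neighbours, and the toy records and the function name `smote` are made up for the example.

```python
import random

def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority-class
    samples and their k nearest minority-class neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy data (hypothetical): 45 minority-class records with two numeric attributes.
toy_rng = random.Random(1)
negatives = [[toy_rng.uniform(60, 90), toy_rng.uniform(0, 30)] for _ in range(45)]
new_rows = smote(negatives, n_synthetic=45)
print(len(negatives) + len(new_rows))  # 90
```

Doubling a class of 45 records to 90 mirrors the Alzheimer’s Disease negative class in the table above; the synthetic rows always lie on segments between existing minority records.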
14. Attribute Selection
Attribute                             MD (%)   Entropy
Mini-mental state examination score   5        0.2791
Clinical Dementia rating scale        11       0.2441
Pfeffer questionnaire score           12       0.2074
Verbal fluency test score             8        0.1665
Clock drawing test scale              12       0.0881
Trail making test scale               40       0.0829
Age                                   4        0.0684
Lawton scale                          58       0.0342
IQCode score                          56       0.0324
Stroop color word test                60       0.0209
Gender                                9        0.0001
Depression                            16       0.0001
Education level                       2        0.0423
Rey Complex Figure                    78       0.0181
Cambridge Cognitive Examination       79       0.0000
Digit symbol                          81       0.0000
Neuropsychiatric inventory            56       0.0000
Cornell depression scale              62       0.0000
Timed Up and Go                       64       0.0000
POMA                                  85       0.0000
Sit-to-Stand test                     97       0.0000
Digit span test                       62       0.0000
Rey Auditory-Verbal Learning          93       0.0000
Brain anatomical structures volume    83       0.0000
Criteria:
Attributes filtered by missing data rate (MD < 60%)
AND
Information Gain (Entropy > 0.00001)
MD = missing data ratio: the number of records in which the attribute
is missing divided by the total number of records.
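The two selection criteria can be sketched in a few lines. This is only an illustration on made-up data: the attribute names, the toy records and the choice to compute information gain on the records where the attribute is observed are assumptions, not details taken from the slides.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def missing_ratio(values):
    """MD: share of records in which the attribute is missing (None)."""
    return sum(v is None for v in values) / len(values)

def information_gain(values, labels):
    """H(class) - H(class | attribute), using only records where the attribute is observed."""
    pairs = [(v, y) for v, y in zip(values, labels) if v is not None]
    if not pairs:
        return 0.0
    groups = {}
    for v, y in pairs:
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / len(pairs) * entropy(g) for g in groups.values())
    return entropy([y for _, y in pairs]) - conditional

def select_attributes(table, labels, max_md=0.60, min_gain=0.00001):
    """Keep attributes with MD < max_md AND information gain > min_gain."""
    return [name for name, values in table.items()
            if missing_ratio(values) < max_md
            and information_gain(values, labels) > min_gain]

# Toy data (hypothetical): six records, three discretized attributes.
labels = ["pos", "pos", "neg", "neg", "pos", "neg"]
table = {
    "mmse_band": ["low", "low", "high", "high", "low", "high"],  # informative
    "gait_test": [None, None, None, None, "a", "b"],             # MD = 4/6, filtered out
    "constant":  ["x", "x", "x", "x", "x", "x"],                 # zero information gain
}
print(select_attributes(table, labels))  # ['mmse_band']
```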
15. Bayes’ Rule
Bayes’ rule:
$$P(h \mid e) = \frac{P(e \mid h) \cdot P(h)}{P(e)}$$
The probability of a hypothesis h conditioned upon some evidence e is equal to its
likelihood P(e | h) times its probability prior to any evidence, P(h), normalized by
dividing by P(e).
Definition: after applying Bayes’ theorem to obtain P(h | e), adopt that as your
posterior degree of belief in h, or Bel(h) = P(h | e).
Given dichotomous random variables (each takes on one of only two possible values when
observed or measured):
$$P(h \mid e) = \frac{P(e \mid h) \cdot P(h)}{P(e \mid h) \cdot P(h) + P(e \mid \neg h) \cdot P(\neg h)}$$
17. Bayesian Network
Example:
Suppose that we have this very simple model of flu causing a high temperature, with
the following prior and conditional probability values:
Pr(Flu=True)               5%
Pr(Flu=False)              95%
Pr(Hi=True | Flu=True)     90%
Pr(Hi=False | Flu=True)    10%
Pr(Hi=True | Flu=False)    20%
Pr(Hi=False | Flu=False)   80%
If an individual has a high temperature (i.e., the evidence available is Hi=True), the
computation for this diagnostic reasoning is as follows:
$$Bel(Flu = True) = \alpha \cdot P(Hi = True \mid Flu = True) \cdot P(Flu = True) = \alpha \cdot 0.9 \cdot 0.05 = \alpha \cdot 0.045$$
$$Bel(Flu = False) = \alpha \cdot P(Hi = True \mid Flu = False) \cdot P(Flu = False) = \alpha \cdot 0.2 \cdot 0.95 = \alpha \cdot 0.19$$
18. Bayesian Network
Continuing the example with the evidence Hi=True, where Bel(Flu = T) = α ⋅ 0.045 and
Bel(Flu = F) = α ⋅ 0.19:
Bel(Flu = T) + Bel(Flu = F) = 1, given that the variable states are mutually exclusive and exhaustive.
$$\alpha \cdot 0.045 + \alpha \cdot 0.19 = 1 \;\therefore\; \alpha = \frac{1}{0.045 + 0.19}$$
$$Bel(Flu = True) = \frac{0.045}{0.19 + 0.045} \approx 0.19$$
$$Bel(Flu = False) = \frac{0.19}{0.19 + 0.045} \approx 0.81$$
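The flu calculation can be reproduced with a short function for the dichotomous form of Bayes’ rule. This is a sketch for checking the arithmetic; the function and variable names are illustrative only.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """P(h | e) for a dichotomous hypothesis h, via Bayes' rule."""
    joint_h = p_e_given_h * prior_h                  # P(e | h) * P(h)
    joint_not_h = p_e_given_not_h * (1.0 - prior_h)  # P(e | not h) * P(not h)
    return joint_h / (joint_h + joint_not_h)         # divide by P(e)

# Values from the flu example: P(Flu) = 0.05, P(Hi | Flu) = 0.9, P(Hi | not Flu) = 0.2.
bel_flu = posterior(prior_h=0.05, p_e_given_h=0.90, p_e_given_not_h=0.20)
print(round(bel_flu, 2), round(1 - bel_flu, 2))  # 0.19 0.81
```

Dividing by the sum of the two joint terms plays the role of the normalizing constant α.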
23. Discretize Numerical Attributes
Minimum Description Length (MDL) (1):
Occam’s razor: choose the shortest explanation for the observed data.
$$h_{MAP} = \arg\max_{h \in H} P(D \mid h) \cdot P(h)$$
$$h_{MAP} = \arg\max_{h \in H} \left[ \log_2 P(D \mid h) + \log_2 P(h) \right]$$
$$h_{MAP} = \arg\min_{h \in H} \left[ -\log_2 P(D \mid h) - \log_2 P(h) \right]$$
This equation can be interpreted as a statement that short hypotheses are preferred,
assuming that $L_C(i)$ is the description length of message i with respect to code C:
$$L_{C_{D \mid h}}(D \mid h) = -\log_2 P(D \mid h)$$
where $C_{D \mid h}$ is the optimal code for describing data D given hypothesis h, and
$$L_{C_H}(h) = -\log_2 P(h)$$
where $C_H$ is the optimal code for hypothesis space H.
So:
$$h_{MAP} = \arg\min_{h \in H} \left[ L_{C_{D \mid h}}(D \mid h) + L_{C_H}(h) \right]$$
1: Kononenko, I. On biases in estimating multi-valued attributes. International Joint Conference on Artificial Intelligence, 1995.
Lawrence Erlbaum Associates. p. 1034-1040.
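A small numeric check can make the equivalence concrete: the hypothesis with the highest P(D | h) ⋅ P(h) is also the one with the shortest total description length. The three hypotheses and their probabilities below are invented for illustration, and this sketch does not implement the MDL-based discretization itself.

```python
import math

# Hypothetical candidates: name -> (prior P(h), likelihood P(D | h)).
hypotheses = {
    "h1": (0.5, 0.10),
    "h2": (0.3, 0.40),
    "h3": (0.2, 0.30),
}

def posterior_score(name):
    """Unnormalized posterior P(D | h) * P(h)."""
    prior, likelihood = hypotheses[name]
    return likelihood * prior

def description_length(name):
    """Bits to encode h plus bits to encode D given h, under optimal codes."""
    prior, likelihood = hypotheses[name]
    return -math.log2(likelihood) - math.log2(prior)

h_map = max(hypotheses, key=posterior_score)
h_mdl = min(hypotheses, key=description_length)
print(h_map, h_mdl)  # h2 h2
```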
24. Bayesian Learning: EM Algorithm
Expectation-Maximization algorithm (1) (1/3):
• Finds a maximum likelihood estimate for θ when the given dataset is incomplete.
• Starts with random probability distributions.
• Alternates between two steps:
• Expectation step: “complete” the data set by using the current parameter
estimates $\hat{\theta}$ (calculate expectations for missing values).
• Maximization step: use the “complete” data set to find a new maximum
likelihood estimate $\hat{\theta}'$ for the parameters.
1: Dempster, A. P.; Laird, N. M.; Rubin, D. B. Maximum likelihood from incomplete data via the EM algorithm. Journal of the
Royal Statistical Society. Series B (Methodological), v. 39, n. 1, p. 1-38, 1977. ISSN 0035-9246.
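25. Bayesian Learning: EM Algorithm
Expectation-Maximization algorithm (2/3):
Let:
$y_i$: observable variables.
$z_i$: latent variables.
$\theta$: all parameters in the model.
Goal is to find:
$$\hat{\theta} = \arg\max_{\theta} P(\theta \mid D)$$
$$P(\theta \mid y_1 \ldots y_n) \propto P(y_1 \ldots y_n \mid \theta) \cdot P(\theta) \propto P(y_1 \ldots y_n \mid \theta)$$
where the last step assumes a uniform prior P(θ), and
$$P(y_1 \ldots y_n \mid \theta) = \int P(y_1 \ldots y_n, z_1 \ldots z_n \mid \theta) \, dz$$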
26. Bayesian Learning: EM Algorithm
Expectation-Maximization algorithm (3/3):
Using the auxiliary function:
$$Q(\theta \mid \theta_t) = \int P(z_1 \ldots z_n \mid \theta_t, y_1 \ldots y_n) \, \log P(\theta, z_1 \ldots z_n \mid y_1 \ldots y_n) \, dz$$
What the EM algorithm does is:
$$\theta_{t+1} = \arg\max_{\theta} Q(\theta \mid \theta_t)$$
with a random starting point.
E-Step: find the probabilities for $z_1 \ldots z_n$ with all parameters fixed to $\theta_t$.
M-Step: now that $P(z_1 \ldots z_n \mid \theta_t, y_1 \ldots y_n)$ is fixed, find the θ that maximizes the integral.
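The two steps can be sketched on the two-node Flu → Hi network from the earlier example, treating Flu as missing in some records. This is a minimal maximum-likelihood sketch under stated assumptions: the records are invented, the starting point is fixed rather than random so the run is reproducible, and the parameter names p = P(Flu), a = P(Hi | Flu), b = P(Hi | ¬Flu) are illustrative.

```python
import math

def e_step(records, p, a, b):
    """E-step: P(Flu=True | record, theta); 0 or 1 when Flu is observed."""
    weights = []
    for flu, hi in records:
        if flu is None:
            like_t = p * (a if hi else 1 - a)
            like_f = (1 - p) * (b if hi else 1 - b)
            weights.append(like_t / (like_t + like_f))
        else:
            weights.append(1.0 if flu else 0.0)
    return weights

def m_step(records, weights):
    """M-step: re-estimate (p, a, b) from the expected ("completed") counts."""
    n = len(records)
    w_true = sum(weights)
    w_false = n - w_true
    hi_true = sum(w for w, (_, hi) in zip(weights, records) if hi)
    hi_false = sum(1 - w for w, (_, hi) in zip(weights, records) if hi)
    return w_true / n, hi_true / w_true, hi_false / w_false

def log_likelihood(records, p, a, b):
    """Log-likelihood of the observed data, summing over Flu where it is missing."""
    total = 0.0
    for flu, hi in records:
        like_t = p * (a if hi else 1 - a)
        like_f = (1 - p) * (b if hi else 1 - b)
        if flu is None:
            total += math.log(like_t + like_f)
        else:
            total += math.log(like_t if flu else like_f)
    return total

def run_em(records, theta=(0.5, 0.6, 0.4), iterations=20):
    history = []
    for _ in range(iterations):
        weights = e_step(records, *theta)
        theta = m_step(records, weights)
        history.append(log_likelihood(records, *theta))
    return theta, history

# Toy records (hypothetical): (flu, hi), with flu=None meaning "missing".
records = ([(True, True)] * 4 + [(True, False)] * 1 + [(False, True)] * 2
           + [(False, False)] * 8 + [(None, True)] * 3 + [(None, False)] * 2)
theta, history = run_em(records)
print(theta)
```

The observed-data log-likelihood in `history` should never decrease from one iteration to the next, which is the property that makes alternating the two steps a sensible search procedure.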
27. [Figure: Bayesian network with a utility node, a decision node “Dementia?” and the chance nodes below, shown with their state probabilities]
• Diagnosis: Positive 58%, Negative 42%
• Education: >13: 6%, 0-13: 94%
• Gender: Female 82%, Male 18%
• Age: >72: 56%, 0-72: 44%
• Clock Drawing Test (CDT) scale: 5: 1%, 4: 1%, 3: 16%, 2: 21%, 1: 50%, 0: 12%
• Mini Mental State Exam (MMSE) score: 27-30: 20%, 18-26: 41%, 0-17: 39%
• Verbal Fluency Test (VFT) score: >11: 51%, 5-11: 46%, 0-4: 16%
• Clinical Dementia Rating (CDR) scale: 3-severe: 19%, 2-moderate: 15%, 1-mild: 32%, 0.5-very mild: 29%, 0-normal control: 6%
• IQCode (Informant Questionnaire on Cognitive Decline in the Elderly) score: >3.55: 72%, 0-3.55: 28%
• Lawton scale: >9: 74%, 0-9: 26%
• Stroop color word test: >15: 71%, 0-15: 29%
• Trail Making Test (TMT): >59: 72%, 17-59: 18%, 0-16: 10%
• Berg balance scale: >51: 39%, 0-51: 61%
• Pfeffer questionnaire: >2: 78%, 1-2: 8%, 0: 14%
• Depression: Presence 32%, Absence 68%
31. Bayesian Learning: Results Evaluation
1. Using cross-validation with 4 folds, we compared
Bayesian Network performance with other well-known
classifiers:
• Naïve Bayes
• Logistic Regression
• Multilayer Perceptron
• Decision Table
• Decision Stump with a boosting algorithm
• J48 Decision Tree
2. Qualitative evaluation of sensitivity analysis results.
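The 4-fold split in item 1 can be sketched as below. This is a plain (non-stratified) split with a fixed shuffle seed, both of which are assumptions; the slides do not say how the folds were built. The record count 247 is the size of the dementia training set before balancing (180 positive plus 67 negative).

```python
import random

def k_fold_indices(n_records, k=4, seed=0):
    """Yield (train, test) index lists; each record is tested exactly once."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

for train, test in k_fold_indices(247, k=4):
    # A classifier would be trained on `train` and scored on `test` here.
    print(len(train), len(test))
```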
32. Bayesian Learning: Results Evaluation
Classification performance measures:
Performance measure                      Acronym   Domain    Best score
Area under ROC curve                     AUC       [0, 1]    1
Harmonic mean of precision and recall    F1        [0, 1]    1
Mean square error                        MSE       [0, 1]    0
Mean cross-entropy                       MXE       [0, ∞)    0
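The four measures can be computed from true labels (1 = positive, 0 = negative) and predicted probabilities as sketched below. The 0.5 decision threshold for F1, the clipping constant in MXE and the toy predictions are assumptions made for the example.

```python
import math

def auc(y_true, scores):
    """Chance that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1(y_true, scores, threshold=0.5):
    """Harmonic mean of precision and recall at the given threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def mse(y_true, scores):
    """Mean squared difference between label and predicted probability."""
    return sum((y - s) ** 2 for y, s in zip(y_true, scores)) / len(y_true)

def mxe(y_true, scores, eps=1e-12):
    """Mean cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, s in zip(y_true, scores):
        s = min(max(s, eps), 1 - eps)
        total -= y * math.log(s) + (1 - y) * math.log(1 - s)
    return total / len(y_true)

# Toy predictions (hypothetical): two positives, two negatives, perfectly ranked.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.2]
print(auc(y_true, scores), f1(y_true, scores))  # 1.0 1.0
print(mse(y_true, scores), mxe(y_true, scores))
```

Even with perfect ranking (AUC = 1) the MSE and MXE stay above zero here, because they also penalize probabilities that are not close to 0 or 1.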
39. Future Work
1. Design and develop a prototype application.
http://siade.midiacom.uff.br
41. Future Work
2. Evaluate the decision support system in a real daily clinical
routine.
3. Improve the decision model with a continuous Bayesian
network learning process.
4. Extend the clinical decision model to other domains.
42. Questions
About Bayesian modeling:
1. How can we establish a continuous parameter adjustment method for Bayesian
models?
2. A higher missing data ratio may cause bias, imprecision or confounding. Is it
possible to find a model for missing data? What would be a reasonable
missing data ratio?
3. Independence between random variables with the same parent is an
assumption of Bayesian-based models. What is the best way to deal
with it? What are its effects on the Bayesian results?
43. Questions
About Dementia and other related mental disorders:
4. How could we define a health cost-effectiveness analysis for the utility node?
5. Is there any other patient database with normal controls that could be used
as a training database for Bayesian learning?
6. How could we integrate the identified decision points of the current clinical
guidelines with the decision boxes of Bayesian networks?
44. Questions
About the Decision-Support System:
7. Is there any health information system that we could integrate with our
decision-support model?
8. Depending on (7), how could we ensure semantic interoperability between
the knowledge base mapped onto the decision-support model and the health
information system?
9. Our decision-support system has focused on the clinical diagnosis process. Is
there another health care area that is relevant for designing and developing
a similar decision-support system? (e.g., patient-centered treatment
planning, health monitoring systems...)
45. Acknowledgements
This research was partially supported by:
• FAPERJ (Research Support Foundation of the State of Rio de Janeiro).
• CNPq (National Council for Scientific and Technological Development).
46. Acknowledgements
I would like to thank…
Robin Morris, Daniel Stahl (King’s College London),
Jerson Laks (Federal University of Rio de Janeiro), and
Daniel Mograbi (Pontifical Catholic University of Rio de Janeiro)
for this opportunity.
47. Acknowledgements
And I thank you for your attention!
…any questions?
seixas_flavioluiz@gmail.com