Oral presentation given in the MEDI session at the 2017 ACS meeting in DC.
Co-authors: Kimberley M. Zorn, Mary A. Lingerfelt, Jair L. de Siqueira-Neto, Alex M. Clark, Sean Ekins
Describes drug repurposing and machine learning. For more details see www.collaborationspharma.com
3. Chagas Disease
▶ Asymptomatic for ~70% of people (infected for life)
▶ Fatal cardiac, neurological, & digestive symptoms can develop up to 25 years later
▶ Curable… if caught early
▶ Current treatments (Nifurtimox, Benznidazole) are not approved in the United States
4. Epidemiology
Estimated 300-500K infected in the United States
Estimated 7-8 million infected worldwide
https://www.cdc.gov/parasites/chagas/gen_info/vectors/index.html
https://www.dndi.org/diseases-projects/chagas/
5. Machine Learning and Drug Discovery
▶ Simply put: Molecular pattern recognition of biological data
▶ Fingerprints to identify these patterns
▶ Define active and inactive features
▶ Statistic to watch for: Receiver Operating Characteristic (ROC)
▶ Used to generate predictions for drug activity at a certain target
▶ Real life example - Pyronaridine (an approved antimalarial)
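To make the ROC statistic above concrete, here is a minimal sketch of how an area under the ROC curve can be computed for a set of model predictions. It is an illustration only, not the implementation used in the talk: the function name `roc_auc` and the example labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    The value equals the probability that a randomly chosen active is
    scored higher than a randomly chosen inactive (ties count as half).
    """
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


# Hypothetical data: 1 = active, 0 = inactive; scores are model outputs.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

A value of 1.0 means every active is ranked above every inactive, while 0.5 is what random ranking would give.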
6. Pyronaridine, Repurposed
▶ Broad Institute, 4064 compounds
▶ PubChem AID 2044 (EC50)
▶ 1853 active compounds (EC50 < 1 µM)
▶ PubChem AID 2010 (Cytotoxicity)
▶ 1698 active compounds (>10 fold difference in EC50)
▶ ~ 100 compounds tested in vitro, eleven had EC50 < 10 µM
▶ Pyronaridine: 85% in vivo efficacy, EC50 = 225 nM
[Figure panels: Vehicle | Pyronaridine]
Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878
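The two activity definitions on this slide (potency below 1 µM and a more than 10-fold difference against cytotoxicity) can be sketched as a simple labeling rule. This is one plausible reading, assuming the fold difference is the ratio of the cytotoxicity EC50 to the antiparasitic EC50. The function name, parameter names, and compound values below are hypothetical and are not taken from the PubChem assays.

```python
def is_selective_active(ec50_um, cytotox_ec50_um,
                        potency_cutoff_um=1.0, min_fold=10.0):
    """Label a compound active if it is both potent and selective.

    ec50_um         -- antiparasitic EC50 in micromolar
    cytotox_ec50_um -- host-cell cytotoxicity EC50 in micromolar
    """
    if ec50_um <= 0:
        raise ValueError("EC50 must be positive")
    potent = ec50_um < potency_cutoff_um
    selective = (cytotox_ec50_um / ec50_um) > min_fold
    return potent and selective


# Made-up compounds: (antiparasitic EC50, cytotoxicity EC50), both in µM.
compounds = {
    "cmpd_A": (0.2, 10.0),   # potent, 50-fold window
    "cmpd_B": (0.5, 2.0),    # potent, only 4-fold window
    "cmpd_C": (5.0, 100.0),  # 20-fold window, but not potent
}
labels = {name: is_selective_active(*vals) for name, vals in compounds.items()}
```

Binary labels of this kind are what a classification model is then trained on.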
7. How can the everyday scientist use Machine Learning?
Private Data
Public Data
Predict Activity
10. Subvalidations in Assay Central
▶ Testing AID 2044 vs. the Ekins et al. (2015) results
▶ Defined testing/training set
▶ Threshold = 1 µM
▶ Six actives
▶ ROC = 0.72
▶ What else can we do with the Ekins results?
12. Chagas Models in Assay Central
▶ Tulahuen strains targeting specific life cycle stage
▶ Combined strains or stages
▶ Ki measurements
▶ PubChem data discussed herein
▶ Target specific models (cruzain & cruzipain)
▶ Various thresholds
▶ More to come!
13.
▶ CPI database currently contains > 150 models
▶ Molecular properties, Disease & ADME Targets
▶ Predictions for more than ten ongoing projects
▶ Compounds predicted by Assay Central are being selected for T. cruzi bioactivity testing
▶ Share models as a Java executable that runs on any computer
www.assaycentral.org
14. How would you care to collaborate?
▶ Inexpensive, fast & easy
▶ We need more data & feedback
▶ Curious about your compounds? Predict them in Assay Central!
▶ Ongoing projects for rare & neglected disease drug discovery, including Ebola & TB
More information at: www.collaborationspharma.com
16. Data Curation & Management
▶ Collect bioactivity data from public & private sources
▶ Bayesian algorithm
▶ ECFP6 descriptors
▶ GitHub to share datasets and models in-house
▶ Private server for additional data backup in-house
▶ Share executable files over Google Drive or Dropbox
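The slide names a Bayesian algorithm with ECFP6 descriptors but does not say which variant. As a rough illustration, the sketch below uses a Laplacian-corrected naive Bayes, one formulation commonly paired with circular fingerprints. It is an assumption, not a description of the Assay Central code. Fingerprints are represented as sets of integer feature IDs. Real ECFP6 features would come from a cheminformatics toolkit, so the IDs here are hypothetical stand-ins.

```python
import math
from collections import Counter


def train_bayesian(fingerprints, labels):
    """Return a per-feature log contribution (Laplacian-corrected)."""
    base_rate = sum(labels) / len(labels)  # fraction of actives overall
    seen = Counter()          # molecules containing each feature
    seen_active = Counter()   # active molecules containing each feature
    for fp, y in zip(fingerprints, labels):
        for feature in fp:
            seen[feature] += 1
            if y:
                seen_active[feature] += 1
    # The +1 terms shrink rarely seen features toward a neutral weight.
    return {
        f: math.log((seen_active[f] + 1.0) / (seen[f] * base_rate + 1.0))
        for f in seen
    }


def score(model, fp):
    """Sum contributions of known features; unseen features add 0."""
    return sum(model.get(f, 0.0) for f in fp)


# Made-up training set: feature IDs stand in for hashed ECFP6 features.
train_fps = [{1, 2, 3}, {1, 2, 4}, {5, 6}, {5, 7}]
train_labels = [1, 1, 0, 0]
model = train_bayesian(train_fps, train_labels)

active_like = score(model, {1, 2, 8})   # shares features with the actives
inactive_like = score(model, {5, 6})    # shares features with the inactives
```

The output is an unscaled score rather than a probability. Higher values indicate more overlap with features enriched among the training actives, and it is this kind of ranking that a ROC analysis evaluates.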
18. Drug Repurposing for Tuberculosis
▶ Tuberculosis (https://www.cdc.gov/tb/statistics/default.htm)
▶ 1/3 of the world's population is infected
▶ 1.8 million deaths in 2015
▶ Assay Central Models (~10)
▶ Public in vitro data & collaborator in vivo data
▶ Targeted models for PyrG & PanK
▶ Compounds predicted & sent for testing
▶ Vendor libraries + FDA approved drugs
▶ Two compounds active at either target, one at both
Work completed by Tom Lane