Open data, compound repurposing, and rare diseases (ISCB)
1. Open data, compound repurposing, and rare diseases
Andrew Su, Ph.D.
@andrewsu
asu@scripps.edu
http://sulab.org
February 16, 2017
Slides: slideshare.net/andrewsu
5. “Undiscovered public knowledge”
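[Diagram: Raynaud disease and Fish oil / EPA are the two endpoint nodes (labeled A and C), linked through intermediate nodes (each labeled B):]
• Abnormal platelet activity
• Abnormal blood viscosity
• High blood viscosity
• Elevated RBC rigidity
• Vasodilation
• Low blood triglycerides
• Increased prostacyclins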
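The A/B/C pattern in this diagram can be read as a search for intermediate concepts that two otherwise unconnected literatures share. The following is a minimal sketch of that idea, not the method used in the talk. The `articles` list is hand-written toy data (not extracted from any corpus), and the function names `neighbors` and `linking_concepts` are hypothetical.

```python
from collections import defaultdict

# Toy data, for illustration only: each set holds the concepts that one
# hypothetical article mentions together. Nothing here comes from a real corpus.
articles = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet activity"},
    {"fish oil", "blood triglycerides"},
    {"Raynaud disease", "blood viscosity"},
    {"Raynaud disease", "platelet activity"},
    {"Raynaud disease", "vasodilation"},
]


def neighbors(articles):
    """Map each concept to the set of concepts it co-occurs with."""
    graph = defaultdict(set)
    for concepts in articles:
        for concept in concepts:
            graph[concept] |= concepts - {concept}
    return graph


def linking_concepts(articles, start, end):
    """Return intermediate concepts shared by two concepts that never co-occur."""
    graph = neighbors(articles)
    if end in graph[start]:
        # The two endpoints are already directly connected, so there is
        # no indirect link to report.
        return set()
    return graph[start] & graph[end]


bridges = linking_concepts(articles, "fish oil", "Raynaud disease")
print(sorted(bridges))
```

In this toy data the two endpoints never appear in the same article, so the only path between them runs through the shared intermediate concepts.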
8. Building a Network of BioThings (then)
[Diagram: network of three nodes]
• Nodes: Eicosapentaenoic acid, Platelet aggregation, Fatty Acid
• Edge = co-mention
• x 1000s article titles
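A co-mention edge of this kind can be sketched as a count of how often two terms appear in the same title. The example below is an illustration under stated assumptions, not the pipeline from the talk. The `titles` are invented, the function name `co_mention_edges` is hypothetical, and matching is naive lowercase substring search with no synonym handling.

```python
from collections import Counter
from itertools import combinations

# The three node labels shown on the slide, lowercased for matching.
TERMS = ["eicosapentaenoic acid", "platelet aggregation", "fatty acid"]

# Invented titles, for illustration only.
titles = [
    "Effect of eicosapentaenoic acid on platelet aggregation",
    "Eicosapentaenoic acid, a dietary fatty acid",
    "Platelet aggregation and eicosapentaenoic acid intake",
    "Fatty acid metabolism in yeast",
]


def co_mention_edges(titles, terms):
    """Count, for each pair of terms, the titles that mention both."""
    edges = Counter()
    for title in titles:
        text = title.lower()
        # Sort so each pair is always stored in the same order.
        found = sorted(term for term in terms if term in text)
        for a, b in combinations(found, 2):
            edges[(a, b)] += 1
    return edges


for (a, b), weight in co_mention_edges(titles, TERMS).items():
    print(f"{a} -- {b}: {weight}")
```

The edge weight is simply the number of titles that contain both terms. A title that mentions only one term contributes no edge.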
9. Building a Network of BioThings (now)
[Diagram: the same network, updated]
• Nodes: Eicosapentaenoic acid (= PubChem:446284 = Timnodonic acid), Platelet aggregation, Fatty Acid
• Edge = decreases (replacing co-mention)
• x 26 million articles and full abstracts (replacing x 1000s article titles)
19. Paid crowdsourcing vs. Citizen Science
Paid crowdsourcing ($$$)
• F = 0.87
• 9 days
• 145 workers
• Total cost: $630.96
Citizen Science (“Help science, please”)
• F = 0.84
• 28 days
• 212 workers
• Total cost: $0
20. Does Citizen Science scale?
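\[
\frac{1{,}000{,}000\ \text{articles} \times 10\ \text{AE/article}}
     {\dfrac{10{,}275\ \text{AE} \times 365\ \text{days}}{212\ \text{annotators} \times 28\ \text{days}}}
\approx 15{,}828\ \text{volunteers needed}
\]
AE = annotation events
Numerator: number of annotation events per year
Denominator: number of annotation events per year per volunteer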
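The estimate on this slide can be reproduced as plain arithmetic. This is a sketch that assumes, as the slide layout suggests, that the 10,275 annotation events were produced by 212 annotators over 28 days, and that volunteers would keep up that rate all year. The variable names are illustrative.

```python
# Target workload, from the slide.
articles_per_year = 1_000_000
ae_per_article = 10  # AE = annotation events

# Observed citizen-science campaign, from the slide.
observed_ae = 10_275
observed_annotators = 212
observed_days = 28

# Numerator: annotation events needed per year.
ae_needed_per_year = articles_per_year * ae_per_article

# Denominator: annotation events one volunteer produces per year,
# extrapolating the observed per-annotator daily rate to 365 days.
ae_per_volunteer_per_year = observed_ae * 365 / (observed_annotators * observed_days)

volunteers_needed = round(ae_needed_per_year / ae_per_volunteer_per_year)
print(f"{volunteers_needed:,} volunteers needed")
```

Each volunteer contributes roughly 632 annotation events per year at the observed rate, which is why the required headcount lands in the tens of thousands.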
36. Ben Good, Chunlei Wu, Shirley Willis, Sebastien Lelong, Andra Waagmeester, Max Nanis, Cyrus Afrasiabi, Julia Turner, Ginger Tsueng, Louis Gioia, Toby Li, Karthik G, Kevin Xin, Jake Bruggemann, Mike Mayers, Julee Adesara, Ramya Gamini, Greg Stupp, Sebastian Burgstaller, Tim Putman, Nuria Queralt Rosinach
The Crowds
Funding