Confidential + Proprietary
Deep learning in medicine: An introduction and applications to next-generation sequencing and disease diagnostics
Allen Day, PhD, allenday@google.com, Twitter @allenday
Google/Alphabet teams involved in healthcare: Brain, DeepMind, Cloud, Healthcare, Verily, Calico
The basics of ML
Observation: programming a computer to be clever is harder than
programming a computer to learn to be clever.
Intro to machine learning and deep learning
Traditional Machine Learning...vs the new way
The old way:
Write a computer program
with explicit rules to follow
if email contains V!agrå
then mark is-spam;
if email contains …
if email contains …
The new way:
Write a computer program to
learn from examples
try to classify some emails;
change self to reduce errors;
repeat;
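The contrast above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in any Google product: the hand-written rule is the one from the slide, while the learner is a simple perceptron over bag-of-words features, and the toy emails, function names, and epoch count are all assumptions made for the example.

```python
# Old way: one explicit, hand-written rule (the example from the slide).
def rule_based_is_spam(email):
    return "v!agrå" in email.lower()


def featurize(email):
    """Represent an email as its set of lowercase words (an illustrative choice)."""
    return set(email.lower().split())


def score(weights, bias, words):
    return bias + sum(weights.get(w, 0.0) for w in words)


# New way: try to classify, change self to reduce errors, repeat.
def train_perceptron(examples, epochs=10):
    weights, bias = {}, 0.0
    for _ in range(epochs):                                  # repeat;
        for email, is_spam in examples:
            words = featurize(email)
            predicted = score(weights, bias, words) > 0      # try to classify;
            if predicted != is_spam:                         # change self to reduce errors;
                step = 1.0 if is_spam else -1.0
                bias += step
                for w in words:
                    weights[w] = weights.get(w, 0.0) + step
    return weights, bias


def learned_is_spam(model, email):
    weights, bias = model
    return score(weights, bias, featurize(email)) > 0


# Hypothetical toy training set, invented for this sketch.
EXAMPLES = [
    ("cheap pills buy now", True),
    ("win money now", True),
    ("meeting agenda for monday", False),
    ("lunch on monday", False),
]
model = train_perceptron(EXAMPLES)
```

The rule misses any spelling it was not written for (for example "V1agra"), whereas the learned model only needs more labeled examples to adapt.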
Deep Neural Networks Step 1: training
Deep Neural Networks Step 2: inference
[Tiger-Dog]: 0.9890
[Tiger] : 0.9791
[Dog] : 0.9311
[Pet] : 0.8139
[Fence] : 0.7998
…
[ゴジラ  ]: 0.0120
Key Innovation: Learns Features from the Data
HIGH LEVEL COMPLEX DETECTORS
PARTS OF OBJECTS, MORE COMPLEX
PATTERNS
PRIMITIVE FEATURES: EDGES, BLOCKS
OF COLORS, ETC.
INPUT: RAW DATA
“cat”
Deep Learning Revolution
Modern Reincarnation of Artificial Neural Networks
Collection of simple trainable mathematical units, organized in layers, that work together to solve
complicated tasks
Key Benefit
Learns features from raw, heterogeneous data
No explicit feature engineering required
What’s New
layered network architecture,
new training math, *scale*
[Figure, three panels: accuracy vs. scale (data size, model size) for neural networks and other approaches; panels marked "1980s and 1990s", then "more compute", then "Now".]
GoogLeNet (aka “Inception”) Architecture (Szegedy et al., 2014)
[Figure: network diagram labeled with the “Inception” module, auxiliary classifiers, and main classifier; example output Pr(dog).]
Image understanding is getting better than human level
ImageNet Challenge: given an image, predict one of 1000+ classes
[Figure: % errors by year.]
* Human performance based on analysis done by Andrej Karpathy.
Machine learning has transformed Google’s products
Search: Search ranking, Speech recognition
Gmail: Smart Reply, Spam classification
Photos: Photos search
Translate: text, graphic, and speech translations
Android: Keyboard & speech input
Drive: Intelligence in Apps
YouTube: Video recommendations, Better thumbnails
Cardboard: Smart stitching
Play: App recommendations, Game developer experience
Ads: Richer Text Ads, Automated Bidding
Chrome: Search by Image
Maps: Street View image parsing, Local Search
Google in Health
Medical applications of deep learning technology
● Deep learning has remarkable efficacy
○ Amazing with images: photos, search, streetview, Android cameras, …
○ And with speech, language, data centers, …
● How and where can we apply this in medicine and biotechnology?
○ Medical imaging: ophthalmology, pathology, ...
○ Genomics
○ ...
Diabetes causes
blindness
5-10% of population is diabetic
Should be screened annually for
diabetic retinopathy
Fastest growing cause of blindness
# Diabetics >> qualified graders
● 387M diabetics, 200k ophthalmologists
● Grading is highly technical
Poor adherence to care plan
● No symptoms, preventive not curative
● 30-50% screened in US
● 10% in high risk populations
● Many lost to follow up
How DR is Diagnosed: Retinal Fundus Images
Healthy Diseased
Hemorrhages
No DR Mild DR Moderate DR Severe DR Proliferative DR
Even when available, ophthalmologists are not consistent...
Consistency: intragrader ~65%, intergrader ~60%
Ophthalmologist Graders
Patient
Images
Adapt deep neural network to read fundus images
Conv Network - 26 layers
No DR
Mild DR
Moderate DR
Severe DR
Proliferative DR
Labeling tool: 54 ophthalmologists, 130k images, 880k diagnoses
F-score: Algorithm 0.95, Ophthalmologist (median) 0.91
“The study by Gulshan and colleagues truly represents the brave new world in medicine.”
Dr. Andrew Beam, Dr. Isaac Kohane, Harvard Medical School
“Google just published this paper in JAMA (impact factor 37) [...] It actually lives up to the hype.”
Dr. Luke Oakden-Rayner, University of Adelaide
Digital pathology
Example: Breast Cancer Biopsies
[Figure: rates of correct diagnosis, overdiagnosis, and underdiagnosis; correct-diagnosis values shown: 87%, 48%, 84%, 96%, 75%.]
1 in 12 breast cancer biopsies is misdiagnosed (population adjusted)
Similar for other cancer types (prostate 1 in 7, etc.)
Source: JAMA. 2015; 313(11):1122-1132
Detecting breast cancer metastases in lymph nodes
detail ←→ context
Multi scale model
resembles microscope
magnifications
● Goal: train a deep learning
model to identify cancerous
cells in pathology slide images
● Output: a map over the whole
image, indicating the probability
that each region harbors cancer
cells
● Trained on ~23M image
patches extracted from
gigapixel slide images of normal
(n=127) and cancerous (n=88)
tissues from the Camelyon16
dataset
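The patch-based setup described in the bullets above can be sketched as follows: tile a large image into fixed-size patches, score each patch, and arrange the scores into a coarse map over the slide. This is only an illustration under stated assumptions. `mean_intensity` is a hypothetical stand-in for the trained model, the 4x4 "slide" is toy data, and the patch size and stride are arbitrary choices.

```python
def tile_coordinates(width, height, patch, stride):
    """Return the top-left (x, y) corner of every full patch, row by row."""
    return [
        (x, y)
        for y in range(0, height - patch + 1, stride)
        for x in range(0, width - patch + 1, stride)
    ]


def mean_intensity(image, x, y, patch):
    """Hypothetical stand-in for a trained model: mean pixel value of the patch."""
    values = [image[row][col]
              for row in range(y, y + patch)
              for col in range(x, x + patch)]
    return sum(values) / len(values)


def probability_map(image, patch, stride, score_patch=mean_intensity):
    """Score every patch; keys are (column, row) positions in the coarse map."""
    height, width = len(image), len(image[0])
    heatmap = {}
    for x, y in tile_coordinates(width, height, patch, stride):
        heatmap[(x // stride, y // stride)] = score_patch(image, x, y, patch)
    return heatmap


# Toy 4x4 "slide" with values in [0, 1]; a real slide is gigapixel scale.
SLIDE = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
heatmap = probability_map(SLIDE, patch=2, stride=2)
```

In a real pipeline the scoring function would be the trained network, and overlapping strides would give a finer map at higher compute cost.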
Metastatic cell detection results are encouraging
Tumor localization score (FROC) of 0.89 vs. 0.73 for a pathologist with unlimited time
(92% sensitivity with 8 false positives per slide vs. 73% sensitivity with 0 false positives per slide)
Slide-level classification AUC of 0.96 (on par with pathologist)
[Figure panels: Original Slide, Ground truth Mask, Predicted Regions; cancer cells highlighted.]
Read more at https://arxiv.org/abs/1703.02442
Deep learning in genomics
New application area
Example papers: Alipanahi et al (2015),
Park Y, Kellis M (2015); Xiong et al
(2015); Zhou, Troyanskaya (2015);
Angermueller et al (2016)
Deep learning to call variants
Goals: (1) replace statistical machinery
with single deep learning model; (2)
state-of-the-art or better performance;
(3) generalize to new technologies.
Start with human germline
Use the germline case to figure out
deep learning data representation and
models. Extend the approach to
somatic mutations, non-human, etc.
Variant calling
Key challenge in genomics due to
complex errors of NGS technologies.
Current error rates vary from <1% for
germline SNPs to >25% for somatic indels.
Where should we get started applying deep
learning to genetics and genomics problems?
Must-haves for deep learning
● Lots of data: >50k examples, >1M ideal.
● High-quality input data and labels for training.
● The mapping from data=>label is unknown but certainly exists.
● High-quality previous efforts so we know that deep learning is key.
○ i.e., hard to solve with classical statistical/ML approaches.
SNP and indel calling from NGS data
Figuring out the true genome sequence from NGS data is
a computational and statistical challenge
.......... cttgggttga tattgtcttg gaacatggag gttgtgtcac cgtaatggca caggacaaac cgactgtcga
catagagctg gttacaacaa cagtcagcaa catggcggag gtaagatcct actgctatga ggcatcaata tcagacatgg
cttcggacag ..........
True genome sequence: 3 billion bases
in 23 contiguous chunks (chromosomes)
Actual sequencer output: ~1 billion ~100
basepair long DNA reads (30x coverage)
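As a rough check on the figures above, mean coverage is the total number of sequenced bases divided by the genome size. The sketch below uses the slide's approximate numbers; the function names are made up for this example.

```python
import math


def mean_coverage(num_reads, read_length, genome_size):
    """Average number of reads covering each base of the genome."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Number of reads required to reach a target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


# Approximate figures from the slide: ~1 billion reads, ~100 bp each, 3 billion bases.
coverage = mean_coverage(num_reads=1_000_000_000, read_length=100,
                         genome_size=3_000_000_000)
reads_for_30x = reads_needed(30, read_length=100, genome_size=3_000_000_000)
```

With these rounded inputs the result is about 33x, in line with the slide's "30x"; exactly 30x would correspond to roughly 900 million 100 bp reads.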
Reference: ...ttgtcttggaacatggaggttgtgtcaccgtaatggcacaggacaaacc...
Read1:     ...ttgtcttggaacatggaggttgtgtgaccgtaatggcacaggacaaacc
Read2:     ...ttgtcttggaacatggaggttgtgtgaccgtaatggcacaggacaaacc...
Read3:              tggaacatggaggttgtgtgaccgtaatggcacaggacaaacc...
Align reads to a
reference genome
Infer the true genomic
sequence(s)*
Step 1 Step 2
Read1: cttgggttgatattgtcttggaacatggaggttgtgtcaccgtaatggcacaggacaaacc
Read2: gatattgtcttggaacatggaggttgtgtcaccgtaatggcacaggacaaaccgactgtcg
Read3: tggaacatggaggttgtgtcaccgtaatggcacaggacaaaccgactgtcgacatagagct
Read4: ggttgtgtcaccgtaatggcacaggacaaaccgactgtcgacatagagctggttactgtcg
....
Read 1,000,000,000: ....caactgtcgacatagagctggttactgtcgacatagagctggtt
Reads aligned to a reference genome
Same as reference Same as reference
A complex error process makes it difficult to
call variants accurately in NGS data
Errors come from many
uncontrollable sources
Quality of the sample DNA
Protocol used to prepare
sample for the sequencer
From physical properties of
instrument itself
Data processing artifacts
Errors are correlated among
the reads
The most accurate variant
callers, such as the GATK,
use multiple techniques, e.g.
● Logistic regression
● Hidden Markov Models
● Bayesian inference
● Gaussian mixture
models
All make approximations
known to be invalid
Existing statistical techniques
work okay...
...but have well-known
drawbacks
Rely on hand-crafted features
Hand optimized parameters
Require years of work by
domain experts
Specialized to specific prep,
sequencer, tool chain, etc
Hard to generalize to new
technologies
Pipeline: Find candidate variants → Create pileup images → Evaluate image and call variants
Pileup (reference on the first row, reads below):
ACGTGCCCCAAACGTGATGATC
ACGTGCCCCAACC---------
--GTGCCCCAAACGT-------
----GCCCCAAACGTGA-----
-------CCAACCGTGATG---
--------CAAACGTGATGATC
----------ACCGTGATGATC
Pileup image encodes: Ref, Read bases, Qualities, Other features
Candidate site column (reference and reads): A, A, A, C, C, C, A
CNN output, genotype likelihoods: hom ref = 0.01, het = 0.95, hom alt = 0.04 → Heterozygous variant call
DeepVariant
Recasting variant calling for deep learning
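The first and last stages of this pipeline can be sketched without a neural network. The example below scans the pileup shown on the slide for positions where enough reads disagree with the reference, then picks the most likely genotype from a likelihood vector. It is a simplified sketch, not DeepVariant's implementation: the thresholds and function names are assumptions, and the likelihoods are the slide's example values rather than the output of a trained CNN.

```python
from collections import Counter

REFERENCE = "ACGTGCCCCAAACGTGATGATC"
READS = [
    "ACGTGCCCCAACC---------",
    "--GTGCCCCAAACGT-------",
    "----GCCCCAAACGTGA-----",
    "-------CCAACCGTGATG---",
    "--------CAAACGTGATGATC",
    "----------ACCGTGATGATC",
]


def find_candidates(reference, reads, min_alt_fraction=0.2, min_alt_count=2):
    """Return (position, ref_base, alt_base, alt_count, depth) for each candidate site.

    The two thresholds are illustrative assumptions, not values from the talk.
    """
    candidates = []
    for pos, ref_base in enumerate(reference):
        column = [read[pos] for read in reads if read[pos] != "-"]
        alt_counts = Counter(base for base in column if base != ref_base)
        for alt_base, count in alt_counts.items():
            if count >= min_alt_count and count / len(column) >= min_alt_fraction:
                candidates.append((pos, ref_base, alt_base, count, len(column)))
    return candidates


GENOTYPES = ("hom_ref", "het", "hom_alt")


def call_genotype(likelihoods):
    """Pick the genotype with the highest likelihood."""
    best = max(range(len(GENOTYPES)), key=lambda i: likelihoods[i])
    return GENOTYPES[best], likelihoods[best]


candidates = find_candidates(REFERENCE, READS)
# Likelihoods copied from the slide; in the real pipeline a CNN would produce them.
call = call_genotype([0.01, 0.95, 0.04])
```

On this pileup the only candidate is the site where three of the six covering reads carry C instead of the reference A, which is the pattern expected for a heterozygous variant.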
Recasting variant calling for deep learning
Encoding is roughly red = {A,C,G,T}; green = {quality score}; blue = {read strand};
alpha = {matches ref genome}
True
SNPs
True
Indels
False
variants
Encode reads and reference genome as images
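To make the encoding concrete, here is a minimal sketch that maps each aligned base to an (R, G, B, A) tuple following the rough scheme on the slide. The specific intensity values, the quality cap of 40, and the function names are assumptions chosen for illustration; the slide only states which property goes in which channel.

```python
# Hypothetical channel values; the slide only says "roughly" what each channel holds.
BASE_INTENSITY = {"A": 250, "C": 170, "G": 90, "T": 30}
MAX_QUALITY = 40  # assumed cap on the base quality score


def encode_base(base, quality, on_forward_strand, ref_base):
    """Encode one aligned base as an (R, G, B, A) pixel."""
    red = BASE_INTENSITY[base]                                   # which base
    green = int(255 * min(quality, MAX_QUALITY) / MAX_QUALITY)   # quality score
    blue = 255 if on_forward_strand else 0                       # read strand
    alpha = 255 if base == ref_base else 100                     # matches reference?
    return (red, green, blue, alpha)


def encode_read(bases, qualities, on_forward_strand, reference):
    """Encode one read as a row of pixels, aligned base by base to the reference."""
    return [
        encode_base(b, q, on_forward_strand, r)
        for b, q, r in zip(bases, qualities, reference)
    ]


row = encode_read("ACC", [40, 40, 10], True, "AAC")
```

Stacking one such row per read, under a row for the reference, gives an image that a standard image classifier can consume.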
Recasting variant calling for deep learning
Use inception-v3 to call variant genotype
Szegedy et al. 2015, https://arxiv.org/abs/1512.00567
Genome in a Bottle provides ground truth
human variation
● Extensive sequencing by orthogonal methods of a single human (NA12878)
● Stringent criteria identify “callable genomic regions” and true variants
○ ~3.7M regions (covering 80% of genome) identified as callable
○ ~2.8M single nucleotide polymorphisms
○ ~350k small insertions/deletions
● Train and test on biological replicates of NA12878
○ Each germline WGS dataset provides ~3.7M labeled training variants
○ 2.1M true heterozygous variants
○ 1.3M true homozygous variants
○ 215k false positive variants
Zook et al. 2014
DeepVariant works well in our in-house evaluations
Methodology: train model on training chromosomes; call variants; evaluate on held-out chromosomes
Outperforms GATK on human data
DeepVariant learns an accurate model of the likelihood function P(genotype | reads)
[Figure: observed P(error) vs. estimated P(error) [Phred-scaled, -10 log10(P(error))]; curves for DeepVariant and GATK against the perfect calibration line.]
This is the calibration for heterozygous SNPs, but other variant types and genotype states are similar.
DeepVariant learns an accurate model of the
likelihood function P(genotype | reads)
● Variants should be
correct at the
assigned
confidence rate to
be well-calibrated
● Genotype
likelihoods are the
critical input to
genomic analyses
such as imputation,
de-novo mutation
and association
Most callers are overconfident in their likelihoods
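The calibration idea can be illustrated with the Phred scale used on the previous slide, Q = -10 log10(P(error)). A caller is well calibrated when the fraction of wrong calls among those assigned quality Q is close to 10^(-Q/10). The sketch below is a toy check with made-up calls; the function names and binning are assumptions.

```python
import math


def phred(p_error):
    """Convert an error probability to a Phred-scaled quality."""
    return -10 * math.log10(p_error)


def error_probability(quality):
    """Convert a Phred-scaled quality back to an error probability."""
    return 10 ** (-quality / 10)


def observed_error_rate(calls, q_min, q_max):
    """Fraction of wrong calls among those with estimated quality in [q_min, q_max).

    `calls` is a list of (estimated_quality, is_correct) pairs.
    """
    in_bin = [correct for quality, correct in calls if q_min <= quality < q_max]
    if not in_bin:
        return None
    return sum(1 for correct in in_bin if not correct) / len(in_bin)


# Toy data: ten calls per caller, one of them wrong.
well_calibrated = [(10, True)] * 9 + [(10, False)]   # claims Q10 (10% error)
overconfident = [(20, True)] * 9 + [(20, False)]     # claims Q20 (1% error)

calibrated_rate = observed_error_rate(well_calibrated, 10, 11)
overconfident_rate = observed_error_rate(overconfident, 20, 21)
```

Both toy callers are wrong 10% of the time, but only the first one says so: the second claims a 1% error rate, which is the kind of overconfidence the slide describes.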
After lots of internal testing, we entered into the public
FDA-sponsored PrecisionFDA competition in April 2016
Unblinded training
sample
Blinded evaluation
sample
99.85
98.91
DeepVariant won an award at the 2016
PrecisionFDA competition
v2 => v3 truth set
for unblinded
sample
Unblinded =>
blinded sample
with v3 truth set
F-measure is the harmonic mean of precision and recall.
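For reference, the F-measure can be computed from counts of true positives, false positives, and false negatives. The counts below are invented for illustration and are unrelated to the competition results.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Return (precision, recall, F1) from variant-call counts."""
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts: 90 correct calls, 10 spurious calls, 30 missed variants.
precision, recall, f1 = precision_recall_f1(true_pos=90, false_pos=10, false_neg=30)
```

Because it is a harmonic mean, F1 is pulled toward the lower of the two values, so a caller cannot score well by trading away recall for precision or the reverse.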
A trained DeepVariant model encodes everything needed
to call variants, enabling us to apply it in novel contexts
You can train on one genome build and call variants on another
Training data    Evaluation data    F1
b37 chr1-19      b38 chr20-22       99.45%
b38 chr1-19      b38 chr20-22       99.47%
Call variants on b38 using a model trained on either b37 or b38 with effectively identical quality.
Means we can call on a genome build without needing all of the metadata mapped to that build.

You can train on human data and call mouse data
Training data    Evaluation data    F1
Human chr1-19    Mouse chr18-19     98.29%
Mouse chr1-17    Mouse chr18-19     97.84%
Robust to protocol differences; human: 50x 2x148bp HiSeq 2500; mouse: 27x 2x100bp GAII.
Leverage the larger and better truth data on humans (e.g., ~5M in humans vs. ~700K in mouse) to call variants in other organisms.

F1 is the harmonic mean of precision and recall.
DeepVariant can learn to call variants in many sequencing technologies
Dataset                     DeepVariant (F1)   Comparator (F1)   Comparator caller
10X Chromium 75x WGS        99.3%              98.2%             Long Ranger
Ion AmpliSeq exome          96.9%              97.3% [1]         TVC
PacBio raw reads 40x WGS    92.7%              56.1% [2]         samtools
SOLID SE 85x WGS            86.4%              78.8% [3]         GATK
Illumina TruSeq exome       96.1%              95.4%             ensemble
[1] Uses four lanes of data vs. one for DeepVariant.
[2] No standard caller exists for this technology for human samples.
[3] Old technology without any maintained variant callers.
DeepVariant can learn to call variants at a range of input sequence depths
[Figure, two panels: sensitivity vs. sequencing depth and precision vs. sequencing depth; curves for GATK, DV 35-45x, DV 4-45x, and DV 15-25x.]
DeepVariant outperforms GATK on low-coverage samples
Training on chromosomes 1-19
Evaluation on chromosomes 20-22
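Splitting by chromosome keeps the evaluation data disjoint from the training data. A minimal sketch of that split follows; the "chrN" naming and the tuple layout of each example are assumptions made for the illustration.

```python
TRAIN_CHROMS = {f"chr{i}" for i in range(1, 20)}   # chr1 to chr19
EVAL_CHROMS = {f"chr{i}" for i in range(20, 23)}   # chr20 to chr22


def split_by_chromosome(examples):
    """Split labeled examples into training and held-out sets by chromosome.

    Each example is assumed to be a tuple whose first field is the chromosome name.
    Examples on other chromosomes (e.g. chrX) are left out of both sets.
    """
    train, held_out = [], []
    for example in examples:
        chrom = example[0]
        if chrom in TRAIN_CHROMS:
            train.append(example)
        elif chrom in EVAL_CHROMS:
            held_out.append(example)
    return train, held_out


# Hypothetical labeled candidates: (chromosome, position, genotype label).
EXAMPLES = [
    ("chr1", 12345, "het"),
    ("chr20", 999, "hom_alt"),
    ("chrX", 5, "het"),
    ("chr19", 77, "hom_ref"),
]
train, held_out = split_by_chromosome(EXAMPLES)
```

Holding out whole chromosomes, rather than random sites, avoids evaluating on positions adjacent to ones the model saw in training.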
DeepVariant conclusions
● Deep Learning is a remarkably powerful and flexible technology.
● Example of how to apply deep learning to a genomics problem.
● Equivalent or better performance than current variant calling tools.
● Works for many (any?) sequencing technologies.
● Run now at https://cloud.google.com/genomics/v1alpha2/deepvariant
● Open-sourced version coming soon!
● Read more in our BioRxiv paper https://doi.org/10.1101/092890.
Google confidential │ Do not distribute
Google’s Data Research...
[Timeline, 2002-2016: GFS, MapReduce, BigTable, Dremel, Colossus, Flume, Megastore, Spanner, Millwheel, PubSub, F1, TensorFlow]
...are the technologies used in DeepVariant...
... which are available to you today on GCP
[Timeline, 2002-2016: Cloud Storage, DataProc, BigTable, BigQuery, DataFlow, DataStore, PubSub, ML]
Sharing our tools with researchers and developers
around the world
#1 repository for “machine learning” category on GitHub
TensorFlow released in Nov. 2015
Build What’s Next
Thank You!
Allen Day, PhD // Science Advocate // @allenday // #genomics #ml #datascience
Brain DeepMind
Cloud
Healthcare
Verily Calico

More Related Content

What's hot

Deep learning health care
Deep learning health care  Deep learning health care
Deep learning health care Meenakshi Sood
 
Intro to deep learning
Intro to deep learning Intro to deep learning
Intro to deep learning David Voyles
 
Machine Learning vs. Deep Learning
Machine Learning vs. Deep LearningMachine Learning vs. Deep Learning
Machine Learning vs. Deep LearningBelatrix Software
 
AI in Healthcare: From Hype to Impact (updated)
AI in Healthcare: From Hype to Impact (updated)AI in Healthcare: From Hype to Impact (updated)
AI in Healthcare: From Hype to Impact (updated)Mei Chen, PhD
 
Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...
Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...
Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...SlideTeam
 
An Introduction to Artificial Intelligence for the Everyday Radiologist
An Introduction to Artificial Intelligence for the Everyday RadiologistAn Introduction to Artificial Intelligence for the Everyday Radiologist
An Introduction to Artificial Intelligence for the Everyday RadiologistBrian Wells, MD, MS, MPH
 
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)Krishnaram Kenthapadi
 
The Deep Learning Glossary
The Deep Learning GlossaryThe Deep Learning Glossary
The Deep Learning GlossaryNVIDIA
 
Responsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons LearnedResponsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons LearnedKrishnaram Kenthapadi
 
AI Governance – The Responsible Use of AI
AI Governance – The Responsible Use of AIAI Governance – The Responsible Use of AI
AI Governance – The Responsible Use of AINUS-ISS
 
Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...
Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...
Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...SlideTeam
 
Deep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceDeep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceLukas Masuch
 
Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018Grigory Sapunov
 
Artificial intelligence - A human revolution
Artificial intelligence - A human revolutionArtificial intelligence - A human revolution
Artificial intelligence - A human revolutionAccenture BeLux
 
Jay Yagnik at AI Frontiers : A History Lesson on AI
Jay Yagnik at AI Frontiers : A History Lesson on AIJay Yagnik at AI Frontiers : A History Lesson on AI
Jay Yagnik at AI Frontiers : A History Lesson on AIAI Frontiers
 
Artificial intelligence in medical image processing
Artificial intelligence in medical image processingArtificial intelligence in medical image processing
Artificial intelligence in medical image processingFarzad Jahedi
 

What's hot (20)

Deep learning health care
Deep learning health care  Deep learning health care
Deep learning health care
 
AI Presentation 1
AI Presentation 1AI Presentation 1
AI Presentation 1
 
Intro to deep learning
Intro to deep learning Intro to deep learning
Intro to deep learning
 
Machine Learning vs. Deep Learning
Machine Learning vs. Deep LearningMachine Learning vs. Deep Learning
Machine Learning vs. Deep Learning
 
AI in Healthcare: From Hype to Impact (updated)
AI in Healthcare: From Hype to Impact (updated)AI in Healthcare: From Hype to Impact (updated)
AI in Healthcare: From Hype to Impact (updated)
 
Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...
Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...
Artificial Intelligence Machine Learning Deep Learning Ppt Powerpoint Present...
 
An Introduction to Artificial Intelligence for the Everyday Radiologist
An Introduction to Artificial Intelligence for the Everyday RadiologistAn Introduction to Artificial Intelligence for the Everyday Radiologist
An Introduction to Artificial Intelligence for the Everyday Radiologist
 
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)
 
Machine learning
Machine learningMachine learning
Machine learning
 
The Deep Learning Glossary
The Deep Learning GlossaryThe Deep Learning Glossary
The Deep Learning Glossary
 
Responsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons LearnedResponsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons Learned
 
AI Governance – The Responsible Use of AI
AI Governance – The Responsible Use of AIAI Governance – The Responsible Use of AI
AI Governance – The Responsible Use of AI
 
Machine Learning in Healthcare and Life Science
Machine Learning in Healthcare and Life ScienceMachine Learning in Healthcare and Life Science
Machine Learning in Healthcare and Life Science
 
Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...
Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...
Artificial Intelligence High Technology PowerPoint Presentation Slides Comple...
 
Deep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceDeep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial Intelligence
 
Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018
 
Artificial intelligence - A human revolution
Artificial intelligence - A human revolutionArtificial intelligence - A human revolution
Artificial intelligence - A human revolution
 
Jay Yagnik at AI Frontiers : A History Lesson on AI
Jay Yagnik at AI Frontiers : A History Lesson on AIJay Yagnik at AI Frontiers : A History Lesson on AI
Jay Yagnik at AI Frontiers : A History Lesson on AI
 
Artificial intelligence in medical image processing
Artificial intelligence in medical image processingArtificial intelligence in medical image processing
Artificial intelligence in medical image processing
 
Medical image analysis
Medical image analysisMedical image analysis
Medical image analysis
 

Similar to Deep learning in medicine: An introduction and applications to next-generation sequencing and disease diagnostics

20170402 Crop Innovation and Business - Amsterdam
20170402 Crop Innovation and Business - Amsterdam20170402 Crop Innovation and Business - Amsterdam
20170402 Crop Innovation and Business - AmsterdamAllen Day, PhD
 
Cloud Accelerated Genomics by Allen Day of Google
Cloud Accelerated Genomics by Allen Day of GoogleCloud Accelerated Genomics by Allen Day of Google
Cloud Accelerated Genomics by Allen Day of GoogleData Con LA
 
20170428 - Look to Precision Agriculture to Bootstrap Precision Medicine - Cu...
20170428 - Look to Precision Agriculture to Bootstrap Precision Medicine - Cu...20170428 - Look to Precision Agriculture to Bootstrap Precision Medicine - Cu...
20170428 - Look to Precision Agriculture to Bootstrap Precision Medicine - Cu...Allen Day, PhD
 
20170426 - Deep Learning Applications in Genomics - Vancouver - Simon Fraser ...
20170426 - Deep Learning Applications in Genomics - Vancouver - Simon Fraser ...20170426 - Deep Learning Applications in Genomics - Vancouver - Simon Fraser ...
20170426 - Deep Learning Applications in Genomics - Vancouver - Simon Fraser ...Allen Day, PhD
 
II-SDV 2017: The Next Era: Deep Learning for Biomedical Research
II-SDV 2017: The Next Era: Deep Learning for Biomedical ResearchII-SDV 2017: The Next Era: Deep Learning for Biomedical Research
II-SDV 2017: The Next Era: Deep Learning for Biomedical ResearchDr. Haxel Consult
 
Mammography with Inception
Mammography with InceptionMammography with Inception
Mammography with InceptionSeldon
 
Mammography with Inception
Mammography with Inception Mammography with Inception
Mammography with Inception Stephen Morrell
 
Slima explainable deep learning using fuzzy logic human ist u fribourg ver 17...
Slima explainable deep learning using fuzzy logic human ist u fribourg ver 17...Slima explainable deep learning using fuzzy logic human ist u fribourg ver 17...
Slima explainable deep learning using fuzzy logic human ist u fribourg ver 17...Servio Fernando Lima Reina
 
Deep Learning for AI (3)
Deep Learning for AI (3)Deep Learning for AI (3)
Deep Learning for AI (3)Dongheon Lee
 
Lessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsLessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsXavier Amatriain
 
Skin melanoma stage detection - CNN.pptx
Skin melanoma stage detection - CNN.pptxSkin melanoma stage detection - CNN.pptx
Skin melanoma stage detection - CNN.pptxVishalLabde
 
Humanizing bioinformatics
Humanizing bioinformaticsHumanizing bioinformatics
Humanizing bioinformaticsJan Aerts
 
Module 5: Decision Trees
Module 5: Decision TreesModule 5: Decision Trees
Module 5: Decision TreesSara Hooker
 
Lung Cancer Detection using transfer learning.pptx.pdf
Lung Cancer Detection using transfer learning.pptx.pdfLung Cancer Detection using transfer learning.pptx.pdf
Lung Cancer Detection using transfer learning.pptx.pdfjagan477830
 
A Practical Use of Artificial Intelligence in the Fight Against Cancer by Bri...
A Practical Use of Artificial Intelligence in the Fight Against Cancer by Bri...A Practical Use of Artificial Intelligence in the Fight Against Cancer by Bri...
A Practical Use of Artificial Intelligence in the Fight Against Cancer by Bri...Data Con LA
 
machine learning in the age of big data: new approaches and business applicat...
machine learning in the age of big data: new approaches and business applicat...machine learning in the age of big data: new approaches and business applicat...
machine learning in the age of big data: new approaches and business applicat...Armando Vieira
 
Lecture1.pptx
Lecture1.pptxLecture1.pptx
Lecture1.pptxSanjarBey
 
Gmo Persuasive Essay
Gmo Persuasive EssayGmo Persuasive Essay
Gmo Persuasive EssayJessica Hill
 
AI at GSK_Kim Branson_mHealth Israel
AI at GSK_Kim Branson_mHealth IsraelAI at GSK_Kim Branson_mHealth Israel
AI at GSK_Kim Branson_mHealth IsraelLevi Shapiro
 

Similar to Deep learning in medicine: An introduction and applications to next-generation sequencing and disease diagnostics (20)

20170402 Crop Innovation and Business - Amsterdam
20170402 Crop Innovation and Business - Amsterdam20170402 Crop Innovation and Business - Amsterdam
20170402 Crop Innovation and Business - Amsterdam
 
Cloud Accelerated Genomics by Allen Day of Google
Cloud Accelerated Genomics by Allen Day of GoogleCloud Accelerated Genomics by Allen Day of Google
Cloud Accelerated Genomics by Allen Day of Google
 
20170428 - Look to Precision Agriculture to Bootstrap Precision Medicine - Cu...
20170428 - Look to Precision Agriculture to Bootstrap Precision Medicine - Cu...20170428 - Look to Precision Agriculture to Bootstrap Precision Medicine - Cu...
20170428 - Look to Precision Agriculture to Bootstrap Precision Medicine - Cu...
 
20170426 - Deep Learning Applications in Genomics - Vancouver - Simon Fraser ...
20170426 - Deep Learning Applications in Genomics - Vancouver - Simon Fraser ...20170426 - Deep Learning Applications in Genomics - Vancouver - Simon Fraser ...
20170426 - Deep Learning Applications in Genomics - Vancouver - Simon Fraser ...
 
II-SDV 2017: The Next Era: Deep Learning for Biomedical Research
II-SDV 2017: The Next Era: Deep Learning for Biomedical ResearchII-SDV 2017: The Next Era: Deep Learning for Biomedical Research
II-SDV 2017: The Next Era: Deep Learning for Biomedical Research
 
Mammography with Inception
Mammography with InceptionMammography with Inception
Mammography with Inception
 
Mammography with Inception
Mammography with Inception Mammography with Inception
Mammography with Inception
 
Slima explainable deep learning using fuzzy logic human ist u fribourg ver 17...
Slima explainable deep learning using fuzzy logic human ist u fribourg ver 17...Slima explainable deep learning using fuzzy logic human ist u fribourg ver 17...
Slima explainable deep learning using fuzzy logic human ist u fribourg ver 17...
 
Deep Learning for AI (3)
Deep Learning for AI (3)Deep Learning for AI (3)
Deep Learning for AI (3)
 
Lessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsLessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systems
 
Skin melanoma stage detection - CNN.pptx
Skin melanoma stage detection - CNN.pptxSkin melanoma stage detection - CNN.pptx
Skin melanoma stage detection - CNN.pptx
 
Humanizing bioinformatics
Humanizing bioinformaticsHumanizing bioinformatics
Humanizing bioinformatics
 
Module 5: Decision Trees
Module 5: Decision TreesModule 5: Decision Trees
Module 5: Decision Trees
 
Lung Cancer Detection using transfer learning.pptx.pdf
Lung Cancer Detection using transfer learning.pptx.pdfLung Cancer Detection using transfer learning.pptx.pdf
Lung Cancer Detection using transfer learning.pptx.pdf
 
A Practical Use of Artificial Intelligence in the Fight Against Cancer by Bri...
A Practical Use of Artificial Intelligence in the Fight Against Cancer by Bri...A Practical Use of Artificial Intelligence in the Fight Against Cancer by Bri...
A Practical Use of Artificial Intelligence in the Fight Against Cancer by Bri...
 
machine learning in the age of big data: new approaches and business applicat...
machine learning in the age of big data: new approaches and business applicat...machine learning in the age of big data: new approaches and business applicat...
machine learning in the age of big data: new approaches and business applicat...
 
Lecture1.pptx
Lecture1.pptxLecture1.pptx
Lecture1.pptx
 
Gmo Persuasive Essay
Gmo Persuasive EssayGmo Persuasive Essay
Gmo Persuasive Essay
 
T1 2018 bioinformatics
T1 2018 bioinformaticsT1 2018 bioinformatics
T1 2018 bioinformatics
 
AI at GSK_Kim Branson_mHealth Israel
AI at GSK_Kim Branson_mHealth IsraelAI at GSK_Kim Branson_mHealth Israel
AI at GSK_Kim Branson_mHealth Israel
 

Deep learning in medicine: An introduction and applications to next-generation sequencing and disease diagnostics

  • 1. Confidential + Proprietary Deep learning in medicine: An introduction and applications to next-generation sequencing and disease diagnostics Allen Day, PhD, allenday@google.com, Twitter @allenday
  • 4. Intro to machine learning and deep learning. Observation: programming a computer to be clever is harder than programming a computer to learn to be clever.
  • 5. Traditional Machine Learning... vs. the new way. The old way: write a computer program with explicit rules to follow (if email contains V!agrå then mark is-spam; if email contains …; if email contains …). The new way: write a computer program to learn from examples (try to classify some emails; change self to reduce errors; repeat).
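
The contrast on slide 5 can be sketched in a few lines of Python. This is an illustration only, not code from the talk: the keyword list, the toy emails, and the choice of a perceptron as the learner are all assumptions made to keep the example small.

```python
# Hypothetical toy data: (email text, is_spam). Not from the talk.
EXAMPLES = [
    ("cheap pills buy now", True),
    ("win money now", True),
    ("meeting agenda for monday", False),
    ("lunch on monday", False),
]


def rule_based_is_spam(text):
    """The old way: explicit, hand-written rules (keywords are illustrative)."""
    return any(word in ("pills", "money") for word in text.split())


def train_perceptron(examples, epochs=10):
    """The new way: try to classify, change weights to reduce errors, repeat."""
    weights = {}
    bias = 0.0
    for _ in range(epochs):
        for text, is_spam in examples:
            words = text.split()
            score = bias + sum(weights.get(w, 0.0) for w in words)
            predicted = score > 0
            if predicted != is_spam:
                # Nudge every word weight toward the correct label.
                step = 1.0 if is_spam else -1.0
                for w in words:
                    weights[w] = weights.get(w, 0.0) + step
                bias += step
    return weights, bias


def learned_is_spam(text, weights, bias):
    """Classify with the learned weights instead of hand-written rules."""
    return bias + sum(weights.get(w, 0.0) for w in text.split()) > 0


weights, bias = train_perceptron(EXAMPLES)
```

On this toy data both functions agree on every example; the difference is that the second one never had its keywords written by hand.
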
  • 6. Deep Neural Networks Step 1: training
  • 7. Deep Neural Networks Step 2: inference
  • 8. [Tiger-Dog]: 0.9890; [Tiger]: 0.9791; [Dog]: 0.9311; [Pet]: 0.8139; [Fence]: 0.7998; …; [ゴジラ]: 0.0120
  • 9. Key Innovation: Learns Features from the Data. Hierarchy shown: high-level complex detectors; parts of objects, more complex patterns; primitive features (edges, blocks of colors, etc.); input: raw data.
  • 10. Deep Learning Revolution. Modern reincarnation of artificial neural networks: a collection of simple trainable mathematical units, organized in layers, that work together to solve complicated tasks. Key benefit: learns features from raw, heterogeneous data; no explicit feature engineering required. What’s new: layered network architecture, new training math, *scale*. [Figure: example output label “cat”.]
  • 11. [Plot: accuracy vs. scale (data size, model size); curves: neural networks, other approaches; marker: 1980s and 1990s.]
  • 12. [Plot: accuracy vs. scale (data size, model size), with more compute; curves: neural networks, other approaches; marker: 1980s and 1990s.]
  • 13. [Plot: accuracy vs. scale (data size, model size), with more compute; curves: neural networks, other approaches; marker: now.]
  • 14. GoogLeNet (aka “Inception”) architecture (Szegedy et al., 2014). [Diagram labels: “Inception” module, auxiliary classifiers, main classifier, output Pr(dog).]
  • 15. Image understanding is getting better than human level. ImageNet Challenge: given an image, predict one of 1000+ classes. [Plot: % errors by year.] * Human performance based on analysis done by Andrej Karpathy.
  • 16. Machine learning has transformed Google’s products. Search: search ranking, speech recognition. Gmail: Smart Reply, spam classification. Photos: photos search. Translate: text, graphic, and speech translations. Android: keyboard & speech input. Drive: intelligence in apps. YouTube: video recommendations, better thumbnails. Cardboard: smart stitching. Play: app recommendations, game developer experience. Ads: richer text ads, automated bidding. Chrome: search by image. Maps: Street View image parsing, local search.
  • 18. Medical applications of deep learning technology ● Deep learning has remarkable efficacy ○ Amazing with images: photos, search, Street View, Android cameras, … ○ And with speech, language, data centers, … ● How and where can we apply this in medicine and biotechnology? ○ Medical imaging: ophthalmology, pathology, ... ○ Genomics ○ ...
  • 19. Diabetes causes blindness. 5-10% of the population is diabetic and should be screened annually for diabetic retinopathy, the fastest growing cause of blindness. # Diabetics >> qualified graders ● 387M diabetics, 200k ophthalmologists ● Grading is highly technical. Poor adherence to care plan ● No symptoms, preventive not curative ● 30-50% screened in US ● 10% in high risk populations ● Many lost to follow up
  • 20. How DR is Diagnosed: Retinal Fundus Images. [Image labels: healthy; diseased (hemorrhages).] Grades: No DR, Mild DR, Moderate DR, Severe DR, Proliferative DR.
  • 21. Even when available, ophthalmologists are not consistent... Consistency: intragrader ~65%, intergrader ~60%. [Figure: grid of ophthalmologist graders vs. patient images.]
  • 22. Adapt deep neural network to read fundus images. Conv network, 26 layers; output grades: No DR, Mild DR, Moderate DR, Severe DR, Proliferative DR. Labeling tool: 54 ophthalmologists, 130k images, 880k diagnoses.
  • 23. F-score: algorithm 0.95, ophthalmologist (median) 0.91. “The study by Gulshan and colleagues truly represents the brave new world in medicine.” (Dr. Andrew Beam, Dr. Isaac Kohane, Harvard Medical School) “Google just published this paper in JAMA (impact factor 37) [...] It actually lives up to the hype.” (Dr. Luke Oakden-Rayner, University of Adelaide)
  • 24. Digital pathology. Example: breast cancer biopsies (JAMA. 2015; 313(11):1122-1132). [Chart labels: correct diagnosis, overdiagnosis, underdiagnosis; values shown: 87%, 48%, 84%, 96%, 75%.] 1 in 12 breast cancer biopsies is misdiagnosed (population adjusted). Similar for other cancer types (prostate 1 in 7, etc.).
  • 25. Detecting breast cancer metastases in lymph nodes. Multi-scale model (detail ←→ context) resembles microscope magnifications ● Goal: train a deep learning model to identify cancerous cells in pathology slide images ● Output: a map over the whole image, indicating the probability that each region harbors cancer cells ● Trained on ~23M image patches extracted from gigapixel slide images of normal (n=127) and cancerous (n=88) tissues from the Camelyon16 dataset
  • 26. Metastatic cell detection results are encouraging. Tumor localization score (FROC) of 0.89 vs. 0.73 for a pathologist with unlimited time (92% sensitivity with 8 false positives per slide vs. 73% sensitivity with 0 false positives per slide). Slide-level classification AUC of 0.96 (on par with pathologist). [Figure panels: original slide, ground truth mask, predicted regions; cancer cells highlighted.] Read more at https://arxiv.org/abs/1703.02442
  • 27. Deep learning in genomics. New application area: example papers are Alipanahi et al (2015); Park Y, Kellis M (2015); Xiong et al (2015); Zhou, Troyanskaya (2015); Angermueller et al (2016). Variant calling: a key challenge in genomics due to the complex errors of NGS technologies; current error rates vary from <1% for germline SNPs to >25% for somatic indels. Deep learning to call variants, goals: (1) replace statistical machinery with a single deep learning model; (2) state-of-the-art or better performance; (3) generalize to new technologies. Start with human germline: use the germline case to figure out the deep learning data representation and models, then extend the approach to somatic mutations, non-human, etc.
  • 28. Where should we get started applying deep learning to genetics and genomics problems? Must-haves for deep learning ● Lots of data: >50k examples, >1M ideal. ● High-quality input data and labels for training. ● The mapping from data=>label is unknown but certainly exists. ● High-quality previous efforts so we know that deep learning is key. ○ i.e., hard to solve with classical statistical/ML approaches. Chosen problem: SNP and indel calling from NGS data.
  • 29. Figuring out the true genome sequence from NGS data is a computational and statistical challenge. True genome sequence (3 billion bases in 23 contiguous chunks, i.e. chromosomes): .......... cttgggttga tattgtcttg gaacatggag gttgtgtcac cgtaatggca caggacaaac cgactgtcga catagagctg gttacaacaa cagtcagcaa catggcggag gtaagatcct actgctatga ggcatcaata tcagacatgg cttcggacag .......... Actual sequencer output (~1 billion ~100 basepair long DNA reads, 30x coverage): Read1: cttgggttgatattgtcttggaacatggaggttgtgtcaccgtaatggcacaggacaaacc; Read2: gatattgtcttggaacatggaggttgtgtcaccgtaatggcacaggacaaaccgactgtcg; Read3: tggaacatggaggttgtgtcaccgtaatggcacaggacaaaccgactgtcgacatagagct; Read4: ggttgtgtcaccgtaatggcacaggacaaaccgactgtcgacatagagctggttactgtcg; ....; Read 1,000,000,000: ....caactgtcgacatagagctggttactgtcgacatagagctggtt. Step 1: align reads to a reference genome. Step 2: infer the true genomic sequence(s)*. Reads aligned to a reference genome: Reference: ...ttgtcttggaacatggaggttgtgtcaccgtaatggcacaggacaaacc...; Read1: ...ttgtcttggaacatggaggttgtgtgaccgtaatggcacaggacaaacc; Read2: ...ttgtcttggaacatggaggttgtgtgaccgtaatggcacaggacaaacc...; Read3: tggaacatggaggttgtgtgaccgtaatggcacaggacaaacc... [Figure callouts: “Same as reference”, shown twice.]
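
The "30x coverage" figure on slide 29 follows from the round numbers given there. A minimal arithmetic sketch (the function name is ours, not from the talk):

```python
def mean_coverage(num_reads, read_length_bp, genome_size_bp):
    """Average number of reads overlapping each base of the genome."""
    return num_reads * read_length_bp / genome_size_bp


# Round numbers from the slide: ~1 billion reads of ~100 bp, 3 billion bases.
coverage = mean_coverage(1_000_000_000, 100, 3_000_000_000)
print(f"{coverage:.1f}x")  # prints "33.3x"
```

The slide's "30x" is this value rounded, which is reasonable given that both the read count and the read length are approximate.
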
  • 30. A complex error process makes it difficult to call variants accurately in NGS data. Errors come from many uncontrollable sources: quality of the sample DNA; protocol used to prepare the sample for the sequencer; physical properties of the instrument itself; data processing artifacts. Errors are correlated among the reads. The most accurate variant callers, such as the GATK, use multiple techniques, e.g. ● Logistic regression ● Hidden Markov Models ● Bayesian inference ● Gaussian mixture models. All make approximations known to be invalid. Existing statistical techniques work okay... but have well-known drawbacks: rely on hand-crafted features; hand-optimized parameters; require years of work by domain experts; specialized to specific prep, sequencer, tool chain, etc.; hard to generalize to new technologies.
  • 31. Recasting variant calling for deep learning. DeepVariant pipeline: find candidate variants; create pileup images; evaluate image and call variants. Example pileup: Ref ACGTGCCCCAAACGTGATGATC; reads ACGTGCCCCAACC---------, --GTGCCCCAAACGT-------, ----GCCCCAAACGTGA-----, -------CCAACCGTGATG---, --------CAAACGTGATGATC, ----------ACCGTGATGATC. [Figure labels: reference, reads, candidate site (column A A A C C C A); pileup image channels: read bases, qualities, other features.] CNN output, genotype likelihoods: hom ref 0.01, het 0.95, hom alt 0.04, giving a heterozygous variant call.
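
The first and last steps of slide 31 can be illustrated on the slide's own pileup. This is a sketch under stated assumptions, not DeepVariant's actual candidate finder: the `min_alt_reads` threshold and both function names are hypothetical, and the genotype call is simply the highest-likelihood state.

```python
from collections import Counter

# Reference and aligned reads copied from the slide ("-" = not covered).
REF = "ACGTGCCCCAAACGTGATGATC"
READS = [
    "ACGTGCCCCAACC---------",
    "--GTGCCCCAAACGT-------",
    "----GCCCCAAACGTGA-----",
    "-------CCAACCGTGATG---",
    "--------CAAACGTGATGATC",
    "----------ACCGTGATGATC",
]


def find_candidates(ref, reads, min_alt_reads=2):
    """Return (0-based position, ref base, alt counts) for variant-like columns.

    min_alt_reads is an assumed threshold, chosen only for illustration.
    """
    candidates = []
    for pos, ref_base in enumerate(ref):
        column = [read[pos] for read in reads if read[pos] != "-"]
        alt_counts = Counter(base for base in column if base != ref_base)
        if alt_counts and max(alt_counts.values()) >= min_alt_reads:
            candidates.append((pos, ref_base, dict(alt_counts)))
    return candidates


def call_genotype(likelihoods):
    """Pick the genotype state with the highest likelihood."""
    return max(likelihoods, key=likelihoods.get)


candidates = find_candidates(REF, READS)
# Likelihood values from the slide's CNN output.
genotype = call_genotype({"hom_ref": 0.01, "het": 0.95, "hom_alt": 0.04})
```

Here `candidates` contains a single site, position 11 (0-based), where three reads carry C against a reference A, and `genotype` is `"het"`, matching the heterozygous call on the slide.
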
  • 32. Recasting variant calling for deep learning: encode reads and reference genome as images. Encoding is roughly red = {A,C,G,T}; green = {quality score}; blue = {read strand}; alpha = {matches ref genome}. [Example image panels: true SNPs, true indels, false variants.]
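
A rough sketch of the channel encoding described on slide 32, in plain Python. The slide only says what each channel represents; every numeric value below (base intensities, the quality cap of 40, the alpha levels) is an assumption made for illustration and is not taken from DeepVariant.

```python
# Hypothetical per-base intensities for the red channel.
BASE_INTENSITY = {"A": 250, "C": 180, "G": 100, "T": 30}
MAX_QUALITY = 40  # assumed cap on Phred base quality


def encode_read(bases, qualities, is_reverse, ref):
    """Encode one aligned read as a row of (red, green, blue, alpha) pixels."""
    row = []
    for base, qual, ref_base in zip(bases, qualities, ref):
        if base == "-":  # read does not cover this column: empty pixel
            row.append((0, 0, 0, 0))
            continue
        red = BASE_INTENSITY[base]                      # which base
        green = min(qual, MAX_QUALITY) * 255 // MAX_QUALITY  # quality score
        blue = 255 if is_reverse else 0                 # read strand
        alpha = 255 if base == ref_base else 100        # matches reference?
        row.append((red, green, blue, alpha))
    return row


# One forward-strand read over a three-base reference window.
pixels = encode_read("AC-", [40, 20, 0], is_reverse=False, ref="AAC")
```

Stacking one such row per read, with the reference on top, gives an image-shaped array that an image classifier can consume.
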
  • 33. Recasting variant calling for deep learning: use Inception-v3 to call the variant genotype. Szegedy et al. 2015, https://arxiv.org/abs/1512.00567
  • 34. Genome in a Bottle provides ground truth human variation (Zook et al. 2014) ● Extensive sequencing by orthogonal methods of a single human (NA12878) ● Stringent criteria identify “callable genomic regions” and true variants ○ ~3.7M regions (covering 80% of genome) identified as callable ○ ~2.8M single nucleotide polymorphisms ○ ~350k small insertions/deletions ● Train and test on biological replicates of NA12878 ○ Each germline WGS dataset provides ~3.7M labeled training variants ○ 2.1M true heterozygous variants ○ 1.3M true homozygous variants ○ 215k false positive variants
  • 35. DeepVariant works well in our in-house evaluations. Methodology: train model on training chromosomes; evaluate on held-out chromosomes; call variants. Result: outperforms GATK on human data.
  • 36. DeepVariant learns an accurate model of the likelihood function P(genotype | reads). [Calibration plot: observed P(error) vs. estimated P(error), Phred-scaled as -10 log10(P(error)); series: DeepVariant, GATK, perfect calibration line.] This is the calibration for heterozygous SNPs but other variant types and genotype states are similar.
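
The Phred scale used on the calibration plot's axes is the standard transform Q = -10 log10(P(error)). A small sketch of the conversion and of what "well calibrated" means; the call counts in the example are hypothetical, not results from the talk.

```python
import math


def phred(p_error):
    """Phred-scaled quality: Q = -10 * log10(P(error))."""
    return -10.0 * math.log10(p_error)


def error_probability(q):
    """Inverse transform: P(error) = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


def observed_error_rate(num_wrong, num_calls):
    """Fraction of calls in a confidence bin that turned out wrong."""
    return num_wrong / num_calls


# Hypothetical bin: 1000 calls that the caller assigned Q20 (1% error).
estimated_q = 20.0
observed_q = phred(observed_error_rate(10, 1000))
# A well-calibrated caller has observed_q close to estimated_q;
# an overconfident caller has observed_q below estimated_q.
```
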
  • 37. DeepVariant learns an accurate model of the likelihood function P(genotype | reads). Most callers are overconfident in their likelihoods. ● Variants should be correct at the assigned confidence rate to be well-calibrated ● Genotype likelihoods are the critical input to genomic analyses such as imputation, de-novo mutation and association
  • 38. After lots of internal testing, we entered the public FDA-sponsored PrecisionFDA competition in April 2016. [Figure labels: unblinded training sample, blinded evaluation sample.]
  • 39. DeepVariant won an award at the 2016 PrecisionFDA competition. [Chart values shown: 99.85, 98.91; annotations: v2 => v3 truth set for unblinded sample; unblinded => blinded sample with v3 truth set.] F-measure is the harmonic mean of precision and recall.
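
The F-measure definition on slide 39 (also reported as F1 on later slides) can be written out directly. The counts in the example are hypothetical and are not competition results.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 90 true variants called, 10 false calls, 30 missed.
precision, recall = precision_recall(90, 10, 30)
f1 = f_measure(precision, recall)
```

With these counts precision is 0.90 and recall is 0.75, so the harmonic mean (about 0.818) sits closer to the weaker of the two, which is why the measure penalizes a caller that trades one for the other.
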
  • 40. A trained DeepVariant model encodes everything needed to call variants, enabling us to apply it in novel contexts. F1 is the harmonic mean of precision and recall.
    You can train on one genome build and call variants on another:
      Training data | Evaluation data | F1
      b37 chr1-19 | b38 chr20-22 | 99.45%
      b38 chr1-19 | b38 chr20-22 | 99.47%
    Call variants on b38 using a model trained on either b37 or b38 with effectively identical quality. This means we can call on a genome build without needing all of the metadata mapped to that build.
    You can train on human data and call mouse data:
      Training data | Evaluation data | F1
      Human chr1-19 | Mouse chr18-19 | 98.29%
      Mouse chr1-17 | Mouse chr18-19 | 97.84%
    Robust to protocol differences; human: 50x 2x148bp HiSeq 2500; mouse: 27x 2x100bp GAII. Leverage the larger and better truth data on humans (e.g., ~5M in humans vs. ~700K in mouse) to call variants in other organisms.
  • 41. DeepVariant can learn to call variants in many sequencing technologies.
      Dataset | DeepVariant (F1 metric) | Comparator (F1 metric) | Comparator caller
      10X Chromium 75x WGS | 99.3% | 98.2% | Long Ranger
      Ion AmpliSeq exome | 96.9% | 97.3% (1) | TVC
      PacBio raw reads 40x WGS | 92.7% | 56.1% (2) | samtools
      SOLID SE 85x WGS | 86.4% | 78.8% (3) | GATK
      Illumina TruSeq exome | 96.1% | 95.4% | ensemble
    (1) Uses four lanes of data vs. one for DeepVariant; (2) No standard caller exists for this technology for human samples; (3) Old technology without any maintained variant callers.
  • 42. DeepVariant can learn to call variants at a range of input sequence depths. [Plots: sensitivity vs. sequencing depth and precision vs. sequencing depth; series in each: GATK, DV 35-45x, DV 4-45x, DV 15-25x.]
  • 43. DeepVariant outperforms GATK on low-coverage samples. Training on chromosomes 1-19; evaluation on chromosomes 20-22.
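
The chromosome-level hold-out on slide 43 keeps every evaluation site out of training. A minimal sketch of such a split; the record layout (`chrom`, `pos`, `label`) and the `chrN` naming are hypothetical and chosen only for illustration.

```python
TRAIN_CHROMS = {f"chr{i}" for i in range(1, 20)}   # chromosomes 1-19
EVAL_CHROMS = {f"chr{i}" for i in range(20, 23)}   # chromosomes 20-22


def split_by_chromosome(examples):
    """Partition labeled examples into training and held-out sets by chromosome."""
    train, held_out = [], []
    for example in examples:
        if example["chrom"] in TRAIN_CHROMS:
            train.append(example)
        elif example["chrom"] in EVAL_CHROMS:
            held_out.append(example)
        # Anything else (e.g. chrX) is left out of both sets in this sketch.
    return train, held_out


# Hypothetical labeled candidate sites.
examples = [
    {"chrom": "chr1", "pos": 12345, "label": "het"},
    {"chrom": "chr19", "pos": 777, "label": "hom_alt"},
    {"chrom": "chr20", "pos": 4242, "label": "hom_ref"},
    {"chrom": "chrX", "pos": 99, "label": "het"},
]
train, held_out = split_by_chromosome(examples)
```

Splitting by whole chromosome rather than by random site avoids nearby, correlated sites landing on both sides of the split.
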
  • 44. DeepVariant conclusions ● Deep learning is a remarkably powerful and flexible technology. ● Example of how to apply deep learning to a genomics problem. ● Equivalent or better performance than current variant calling tools. ● Works for many (any?) sequencing technologies. ● Run now at https://cloud.google.com/genomics/v1alpha2/deepvariant ● Open-sourced version coming soon! ● Read more in our bioRxiv paper https://doi.org/10.1101/092890.
  • 45. Google confidential │ Do not distribute. Google’s Data Research... [Timeline 2002-2016, labels: GFS, MapReduce, TensorFlow, BigTable, Dremel, Colossus, Flume, Megastore, Spanner, Millwheel, PubSub, F1]
  • 46. ...are the technologies used in DeepVariant... [Timeline 2002-2016, labels: GFS, MapReduce, TensorFlow, BigTable, Dremel, Colossus, Flume, Megastore, Spanner, Millwheel, PubSub, F1]
  • 47. ...which are available to you today on GCP. [Timeline 2002-2016, labels: ML, PubSub, DataFlow, DataStore, DataFlow, Cloud Storage, BigQuery, BigTable, DataProc, Cloud Storage]
  • 48. Sharing our tools with researchers and developers around the world. TensorFlow, released in Nov. 2015: #1 repository for “machine learning” category on GitHub.
  • 49. Build What’s Next. Thank You! Allen Day, PhD // Science Advocate // @allenday // #genomics #ml #datascience. Teams: Brain, DeepMind, Cloud, Healthcare, Verily, Calico.