Open Source Bioinformatics for Data Scientists
Amanda Schierz
Recent Projects
! Druggability prediction
! 3D structure
! Protein Sequence
! Predict a protein’s druggability based on its position in the
protein-protein interaction network
! Drug Resistance
! Therapeutic opportunities
! Identification of new gene targets for cancer
! Are they Druggable?
! Candidate Compounds
! Compounds more likely to be a hit for a bioassay
Drug Discovery Process
Early-stage: Discovery
• Target Evaluation
• Compound Screening
Optimisation
• Computational Chemistry
• Structure-based Drug Design
ADMET
• Absorption, Distribution, Metabolism, Excretion, Toxicity
Clinical Trials
• Patient Stratification
• Protocol
Paperwork
• Drug Approval
Biology 101
! There is a many to many relationship between Gene and Protein
! A Protein is a large molecule; a Drug is a small molecule
! Gene Expression data
! The amount of gene product (RNA transcript) produced. Epigenetics.
! highly / lowly / over / under – fold change
! Warning: Platforms and preprocessing
! Gene Copy Number
! Loss / gain of a gene
! On one strand or 2?
! There are only approx. 400 genetic targets of approved
pharmaceuticals
! Only from a handful of Protein Families
! Desperate need for diversity
! TCGGTCAGGCTAGCCGTTACAGGG
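The "over / under – fold change" vocabulary above can be made concrete with a small sketch. It assumes expression values that have already been normalised on one platform; the gene names, the values, the pseudocount of 1 and the two-fold threshold are all illustrative assumptions, not values from the talk.

```python
import math

def log2_fold_change(tumour, normal, pseudocount=1.0):
    """log2 ratio of tumour to normal expression; the pseudocount avoids log(0)."""
    return math.log2((tumour + pseudocount) / (normal + pseudocount))

def call_expression(log2fc, threshold=1.0):
    """Label a gene as over / under / unchanged; a threshold of 1.0 means two-fold."""
    if log2fc >= threshold:
        return "over"
    if log2fc <= -threshold:
        return "under"
    return "unchanged"

# Hypothetical (tumour, normal) expression values in arbitrary units.
samples = {
    "GENE_A": (79.0, 9.0),
    "GENE_B": (4.0, 19.0),
    "GENE_C": (10.0, 11.0),
}

calls = {
    gene: call_expression(log2_fold_change(tumour, normal))
    for gene, (tumour, normal) in samples.items()
}
print(calls)
```

Working on the log2 scale keeps over- and under-expression symmetric around zero, which is one reason fold changes are usually reported that way.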
Target Identification
! Prediction of disease-associated genes
! patient level
! gene / protein level
! network
! Prediction of mechanisms of disease
! Epigenetic targets – meta-targets
! Prediction of protein function – from sequence / structure / network
! multi-class; multi-label
! Prediction of 3D structure
! Prediction of protein binding
! New immune targets
Druggability Prediction
! Drugs – FDA approved (~350): very strict – known
therapeutic benefit
! Drugbank – loose – binds but no therapeutic benefit
! Tractable or Druggable
! Rule of 5 compliant
! Precedence-based
- Druggable families / Homology
- Ligand-based scoring
- UniProt, bioassays – EBI and PubChem BioAssay
- Statistical analysis
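"Rule of 5 compliant" refers to Lipinski's rule of five for small molecules: molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors and at most 10 hydrogen-bond acceptors. A minimal sketch of the check follows, assuming the descriptors have already been computed by a cheminformatics tool. The `Compound` class, the compound names and the descriptor values are hypothetical, and allowing one violation is a common convention rather than something stated in the talk.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    name: str
    mol_weight: float       # Daltons
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int

def rule_of_five_violations(c: Compound) -> int:
    """Count how many of Lipinski's four criteria the compound breaks."""
    return sum([
        c.mol_weight > 500.0,
        c.logp > 5.0,
        c.h_bond_donors > 5,
        c.h_bond_acceptors > 10,
    ])

def is_rule_of_five_compliant(c: Compound, max_violations: int = 1) -> bool:
    """Assumed convention: tolerate up to max_violations broken criteria."""
    return rule_of_five_violations(c) <= max_violations

# Hypothetical compounds with made-up descriptor values.
compounds = [
    Compound("hypothetical_small", 320.4, 2.1, 2, 5),
    Compound("hypothetical_large", 812.9, 6.3, 7, 14),
]

for c in compounds:
    print(c.name, rule_of_five_violations(c), is_rule_of_five_compliant(c))
```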
Druggability Prediction
! Sequence Analysis
- Amino Acid motifs and composition
- Physicochemical descriptors
- infinite amount – very wide data set
- Supervised classification
! FASTA - can download all human sequences from UniProt
>seq0
FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTD
! R protr; R Bioconductor
! species,mhc,peptide_length,A,R,N,D,C,E,Q,G,H,I,L,K,M,F,P,S,T,W,Y,V,
scl1.lag1,scl2.lag1,scl1.lag2,scl2.lag2,scl1.2.lag1,scl2.1.lag1,
scl1.2.lag2,scl2.1.lag2,AA,RA,NA,DA,CA,EA,QA,GA,HA,IA,LA,KA,MA,FA,
PA,SA,TA,WA,YA,VA,AR,RR,NR,DR,CR ..... ,Schneider.Xr.K,
Schneider.Xr.M,Schneider.Xr.F, Grantham.Xr.A,Grantham.Xr.R,
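The single-letter and two-letter columns in the header above look like amino acid composition and dipeptide composition features. The sketch below shows how such columns could be computed from a FASTA record using only the Python standard library. It is not the protr implementation; the function names are hypothetical, and the column ordering (first letter varying fastest) is inferred from the header on the slide.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ARNDCEQGHILKMFPSTWYV"  # column order shown on the slide

def parse_fasta(text):
    """Return {header: sequence} from FASTA-formatted text."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records

def amino_acid_composition(seq):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in AMINO_ACIDS}

def dipeptide_composition(seq):
    """Fraction of each of the 400 ordered residue pairs among adjacent pairs."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = len(seq) - 1
    # Keys are emitted as AA, RA, NA, ... to mirror the header on the slide.
    return {a + b: pairs[a + b] / total for b, a in product(AMINO_ACIDS, repeat=2)}

fasta_text = """>seq0
FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTD
"""

seq = parse_fasta(fasta_text)["seq0"]
features = {**amino_acid_composition(seq), **dipeptide_composition(seq)}
print(len(seq), len(features))
```

Even these two descriptor families give 420 columns per protein, which illustrates why the resulting data set is described as very wide.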
Druggability Prediction
! 3D structure
- Pockets, surface area
- Ligand interaction fingerprints
- Supervised classification
3D Structure
! PDB, ProtDCal, PockDrug
Druggability Prediction
! Interaction Network
! Many use cases
! Data from EBI and Y2H
! List of binary interactions
! Be careful 1: Data is inherently biased
! Be careful 2: Complex interactions
! R igraph; Gephi for visualisation
! Topological properties
! Community analysis
! Subgraph analysis
! Statistical analysis, network analysis and supervised
classification
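To illustrate how topological properties can be derived from a list of binary interactions, here is a minimal standard-library sketch that computes degree and local clustering coefficient per protein. The talk itself points to R igraph; the protein identifiers and the interaction list below are hypothetical, and the function names are illustrative.

```python
from collections import defaultdict
from itertools import combinations

def build_graph(pairs):
    """Build an undirected adjacency map from binary interactions."""
    adj = defaultdict(set)
    for a, b in pairs:
        if a == b:
            continue  # self-interactions are skipped in this simple sketch
        adj[a].add(b)
        adj[b].add(a)
    return adj

def degree(adj, node):
    """Number of distinct interaction partners."""
    return len(adj[node])

def clustering_coefficient(adj, node):
    """Fraction of a node's neighbour pairs that also interact with each other."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))

# Hypothetical binary interactions between made-up proteins.
interactions = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5")]
graph = build_graph(interactions)

for protein in sorted(graph):
    print(protein, degree(graph, protein), round(clustering_coefficient(graph, protein), 3))
```

Per-protein values like these can then serve as input features for the supervised classification step, alongside the sequence and structure descriptors.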
Drug Resistance
Compound Bioactivity
! Brute force mass screening
! 1000s compounds screened in batches
! Primary Assays; Secondary / confirmatory assays
! Can be binary classification or regression
! The IC50 is the concentration of a compound needed to inhibit a biological process by 50%; a lower IC50 means a more potent compound.
! Active / inactive : IC50 threshold
! Goal is also to identify diverse compound structures
! Scaffold Hopping
! Same kind of method as Protein Sequence conversion
! Pharmacophore fingerprints
! https://www.chemaxon.com/free-software/
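The two framings mentioned above (binary classification against an IC50 threshold, or regression) could be prepared from raw assay values as sketched below. The 1000 nM (1 µM) activity threshold, the compound identifiers and the IC50 values are assumptions for illustration; real assays define their own cut-offs. pIC50, the negative log10 of the IC50 in molar units, is a commonly used regression target.

```python
import math

def pic50(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    return -math.log10(ic50_nm * 1e-9)

def label_activity(ic50_nm, threshold_nm=1000.0):
    """Binary label: active if the IC50 is at or below the assumed threshold."""
    return "active" if ic50_nm <= threshold_nm else "inactive"

# Hypothetical assay results: compound id -> IC50 in nM.
assay = {"cmpd_1": 12.0, "cmpd_2": 1000.0, "cmpd_3": 25000.0}

for compound, ic50 in assay.items():
    print(compound, label_activity(ic50), round(pic50(ic50), 2))
```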
Compound ADMET
! Many use cases
! ADMET of hits
! Absorption
! Distribution
! Metabolism
! Excretion
! Toxicity
! Mutagenicity
! Protein binding
General Resources
! EBI European Bioinformatics Institute / PubChem
! API
! Integrates several downloadable Data Sources (expression, Copy
Number, Bioassays, network, disease-specific)
! Baseline data (Normal not diseased)
! Protein Data Bank – 3D Structures
! DrugBank
! Cancer – The Cancer Genome Atlas (TCGA) and International
Cancer Genome Consortium (ICGC)
! Coding Tools – R Bioconductor, BioPerl, Biopython
! https://docs.chemaxon.com/display/docs/Documentation
General Resources
! canSAR database
! Integration of biological, pharmacological, chemical, structural
biology and protein network data
Beware 101
! Non-standard Gene names
! Some experiments report genes, some report proteins
! We need new Drug Targets, different from established ones.
! Keep in mind when analysing results
! Cancer is difficult
! Drug resistance
! Data is not keeping up with the science
! Tumour Heterogeneity
! Wide data = random patterns
! Different expression / sequencing platforms
Therapeutic Opportunities
! Only approximately 350-400 protein targets
! DNA damage response (DDR) is essential for maintaining
the genomic integrity of the cell
! Currently targeted by chemotherapy and radiation. Goal is for
small molecule targeting
! TCGA Patient Analysis: Expression, Copy Number Variation
and Mutation data.
! 15 cancer disease types
! Telegraph March 2015
! New drugs to tackle cancer cell weak spots could end
'scattergun' chemotherapy
Laurence H. Pearl, Amanda C. Schierz, Simon E. Ward, Bissan Al-Lazikani, Frances M. G.
Pearl. Therapeutic opportunities within the DNA Damage Response. Nature Reviews Cancer, 2015.
Therapeutic Opportunities
! Statistical analysis of DDR deregulation in patients compared
to a random set of genes
! Druggability prediction of deregulated DDR genes
! Synthetic Lethality analysis of Yeast DDR orthologues
! Two genes are synthetic lethal if mutation of either alone is fine
but mutation of both leads to cell death. Targeting a gene that is
synthetic lethal to a cancer-relevant mutation theoretically will
kill only cancer cells.
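The comparison of DDR deregulation against random sets of genes, listed on this slide, can be sketched as an empirical p-value: count how many genes in the set of interest are deregulated, then see how often a random gene set of the same size does at least as well. This is one plausible reading of the bullet, not the published analysis; the gene identifiers, set sizes and function name are all hypothetical.

```python
import random

def empirical_p_value(gene_set, deregulated, all_genes, n_random=2000, seed=0):
    """Compare deregulation in gene_set against same-sized random gene sets.

    Returns (observed count, empirical p-value). The +1 terms keep the
    estimate away from an exact zero.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    genes = set(gene_set)
    observed = len(genes & deregulated)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(all_genes, len(genes))
        if sum(g in deregulated for g in sample) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_random + 1)

# Hypothetical genome of 200 genes, 20 of which are deregulated in patients.
all_genes = [f"G{i}" for i in range(200)]
deregulated = set(all_genes[:20])

# Hypothetical pathway of 10 genes, 8 of which are deregulated.
pathway = all_genes[:8] + all_genes[100:102]

observed, p_value = empirical_p_value(pathway, deregulated, all_genes)
print(observed, p_value)
```

Sampling random sets of the same size keeps the comparison fair, since a larger gene set would contain more deregulated genes by chance alone.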
Therapeutic Opportunities
DDR Pathway Signatures

More Related Content

What's hot

Pre-clinical drug prioritization via prognosis-guided genetic interaction net...
Pre-clinical drug prioritization via prognosis-guided genetic interaction net...Pre-clinical drug prioritization via prognosis-guided genetic interaction net...
Pre-clinical drug prioritization via prognosis-guided genetic interaction net...laserxiong
 
Single-Cell Sequencing for Drug Discovery: Applications and Challenges
Single-Cell Sequencing for Drug Discovery: Applications and ChallengesSingle-Cell Sequencing for Drug Discovery: Applications and Challenges
Single-Cell Sequencing for Drug Discovery: Applications and Challengesinside-BigData.com
 
Big Data and Genomic Medicine by Corey Nislow
Big Data and Genomic Medicine by Corey NislowBig Data and Genomic Medicine by Corey Nislow
Big Data and Genomic Medicine by Corey NislowKnome_Inc
 
Next Generation Sequencing application in virology
Next Generation Sequencing application in virologyNext Generation Sequencing application in virology
Next Generation Sequencing application in virologyEben Titus
 
A New Generation Of Mechanism-Based Biomarkers For The Clinic
A New Generation Of Mechanism-Based Biomarkers For The ClinicA New Generation Of Mechanism-Based Biomarkers For The Clinic
A New Generation Of Mechanism-Based Biomarkers For The ClinicJoaquin Dopazo
 
BE Retreat 2015 Poster
BE Retreat 2015 PosterBE Retreat 2015 Poster
BE Retreat 2015 PosterEric Ma
 
Robert Pesich_PAVA_Stanford Resume v. 8_22_16
Robert Pesich_PAVA_Stanford Resume v. 8_22_16Robert Pesich_PAVA_Stanford Resume v. 8_22_16
Robert Pesich_PAVA_Stanford Resume v. 8_22_16Robert Pesich
 
Bioinformatics as a tool for understanding carcinogenesis
Bioinformatics as a tool for understanding carcinogenesisBioinformatics as a tool for understanding carcinogenesis
Bioinformatics as a tool for understanding carcinogenesisDespoina Kalfakakou
 
APPLICATION OF NEXT GENERATION SEQUENCING (NGS) IN CANCER TREATMENT
APPLICATION OF  NEXT GENERATION SEQUENCING (NGS)  IN CANCER TREATMENTAPPLICATION OF  NEXT GENERATION SEQUENCING (NGS)  IN CANCER TREATMENT
APPLICATION OF NEXT GENERATION SEQUENCING (NGS) IN CANCER TREATMENTDinie Fariz
 
The Monarch Initiative: A semantic phenomics approach to disease discovery
The Monarch Initiative: A semantic phenomics approach to disease discoveryThe Monarch Initiative: A semantic phenomics approach to disease discovery
The Monarch Initiative: A semantic phenomics approach to disease discoverymhaendel
 
Next generation sequencing in cancer treatment
Next generation sequencing in cancer treatment  Next generation sequencing in cancer treatment
Next generation sequencing in cancer treatment MarliaGan
 
Bioinformatics as a tool for understanding clinically significant variations ...
Bioinformatics as a tool for understanding clinically significant variations ...Bioinformatics as a tool for understanding clinically significant variations ...
Bioinformatics as a tool for understanding clinically significant variations ...Despoina Kalfakakou
 
The Application of Next Generation Sequencing (NGS) in cancer treatment
The Application of Next Generation Sequencing (NGS) in cancer treatmentThe Application of Next Generation Sequencing (NGS) in cancer treatment
The Application of Next Generation Sequencing (NGS) in cancer treatmentPremadarshini Sai
 
The Monarch Initiative: From Model Organism to Precision Medicine
The Monarch Initiative: From Model Organism to Precision MedicineThe Monarch Initiative: From Model Organism to Precision Medicine
The Monarch Initiative: From Model Organism to Precision Medicinemhaendel
 
High-Throughput Sequencing
High-Throughput SequencingHigh-Throughput Sequencing
High-Throughput SequencingMark Pallen
 
Chimeric Antigen Receptors (paper with corresponding power point)
Chimeric Antigen Receptors (paper with corresponding power point)Chimeric Antigen Receptors (paper with corresponding power point)
Chimeric Antigen Receptors (paper with corresponding power point)Kevin B Hugins
 

What's hot (19)

Pre-clinical drug prioritization via prognosis-guided genetic interaction net...
Pre-clinical drug prioritization via prognosis-guided genetic interaction net...Pre-clinical drug prioritization via prognosis-guided genetic interaction net...
Pre-clinical drug prioritization via prognosis-guided genetic interaction net...
 
Single-Cell Sequencing for Drug Discovery: Applications and Challenges
Single-Cell Sequencing for Drug Discovery: Applications and ChallengesSingle-Cell Sequencing for Drug Discovery: Applications and Challenges
Single-Cell Sequencing for Drug Discovery: Applications and Challenges
 
Big Data and Genomic Medicine by Corey Nislow
Big Data and Genomic Medicine by Corey NislowBig Data and Genomic Medicine by Corey Nislow
Big Data and Genomic Medicine by Corey Nislow
 
Next Generation Sequencing application in virology
Next Generation Sequencing application in virologyNext Generation Sequencing application in virology
Next Generation Sequencing application in virology
 
A New Generation Of Mechanism-Based Biomarkers For The Clinic
A New Generation Of Mechanism-Based Biomarkers For The ClinicA New Generation Of Mechanism-Based Biomarkers For The Clinic
A New Generation Of Mechanism-Based Biomarkers For The Clinic
 
Gamida cell ppt_english_5-5-2010
Gamida cell ppt_english_5-5-2010Gamida cell ppt_english_5-5-2010
Gamida cell ppt_english_5-5-2010
 
BE Retreat 2015 Poster
BE Retreat 2015 PosterBE Retreat 2015 Poster
BE Retreat 2015 Poster
 
Robert Pesich_PAVA_Stanford Resume v. 8_22_16
Robert Pesich_PAVA_Stanford Resume v. 8_22_16Robert Pesich_PAVA_Stanford Resume v. 8_22_16
Robert Pesich_PAVA_Stanford Resume v. 8_22_16
 
Single cell pcr
Single cell pcrSingle cell pcr
Single cell pcr
 
Bioinformatics as a tool for understanding carcinogenesis
Bioinformatics as a tool for understanding carcinogenesisBioinformatics as a tool for understanding carcinogenesis
Bioinformatics as a tool for understanding carcinogenesis
 
APPLICATION OF NEXT GENERATION SEQUENCING (NGS) IN CANCER TREATMENT
APPLICATION OF  NEXT GENERATION SEQUENCING (NGS)  IN CANCER TREATMENTAPPLICATION OF  NEXT GENERATION SEQUENCING (NGS)  IN CANCER TREATMENT
APPLICATION OF NEXT GENERATION SEQUENCING (NGS) IN CANCER TREATMENT
 
The Monarch Initiative: A semantic phenomics approach to disease discovery
The Monarch Initiative: A semantic phenomics approach to disease discoveryThe Monarch Initiative: A semantic phenomics approach to disease discovery
The Monarch Initiative: A semantic phenomics approach to disease discovery
 
Next generation sequencing in cancer treatment
Next generation sequencing in cancer treatment  Next generation sequencing in cancer treatment
Next generation sequencing in cancer treatment
 
Bioinformatics as a tool for understanding clinically significant variations ...
Bioinformatics as a tool for understanding clinically significant variations ...Bioinformatics as a tool for understanding clinically significant variations ...
Bioinformatics as a tool for understanding clinically significant variations ...
 
The Application of Next Generation Sequencing (NGS) in cancer treatment
The Application of Next Generation Sequencing (NGS) in cancer treatmentThe Application of Next Generation Sequencing (NGS) in cancer treatment
The Application of Next Generation Sequencing (NGS) in cancer treatment
 
The Monarch Initiative: From Model Organism to Precision Medicine
The Monarch Initiative: From Model Organism to Precision MedicineThe Monarch Initiative: From Model Organism to Precision Medicine
The Monarch Initiative: From Model Organism to Precision Medicine
 
High-Throughput Sequencing
High-Throughput SequencingHigh-Throughput Sequencing
High-Throughput Sequencing
 
Chimeric Antigen Receptors (paper with corresponding power point)
Chimeric Antigen Receptors (paper with corresponding power point)Chimeric Antigen Receptors (paper with corresponding power point)
Chimeric Antigen Receptors (paper with corresponding power point)
 
Rossen eccmid2015v1.5
Rossen eccmid2015v1.5Rossen eccmid2015v1.5
Rossen eccmid2015v1.5
 

Viewers also liked

Biology Endocrine Powerpoint
Biology Endocrine PowerpointBiology Endocrine Powerpoint
Biology Endocrine Powerpointjulie92
 
Pure White 2008 Cdr
Pure White 2008 CdrPure White 2008 Cdr
Pure White 2008 CdrLilian Koh
 
Soluble Lectin-Like Oxidized LDL Receptor-1 and High-Sensitivity Troponin T a...
Soluble Lectin-Like Oxidized LDL Receptor-1 and High-Sensitivity Troponin T a...Soluble Lectin-Like Oxidized LDL Receptor-1 and High-Sensitivity Troponin T a...
Soluble Lectin-Like Oxidized LDL Receptor-1 and High-Sensitivity Troponin T a...Nagi Abdalla
 
Role of Leptin in Obesity
Role of Leptin in Obesity Role of Leptin in Obesity
Role of Leptin in Obesity Rajat Chaudhary
 
molecular docking
molecular dockingmolecular docking
molecular dockingKOUSHIK DEB
 
Human endocrine system
Human endocrine systemHuman endocrine system
Human endocrine systemGotov .kz
 
Ab Initio Protein Structure Prediction
Ab Initio Protein Structure PredictionAb Initio Protein Structure Prediction
Ab Initio Protein Structure PredictionArindam Ghosh
 
Protein-Ligand Docking
Protein-Ligand DockingProtein-Ligand Docking
Protein-Ligand Dockingbaoilleach
 
Protein 3D structure and classification database
Protein 3D structure and classification database Protein 3D structure and classification database
Protein 3D structure and classification database nadeem akhter
 
methods for protein structure prediction
methods for protein structure predictionmethods for protein structure prediction
methods for protein structure predictionkaramveer prajapat
 
Disorders of pigmentation
Disorders of pigmentationDisorders of pigmentation
Disorders of pigmentationdrangelosmith
 
protein sturcture prediction and molecular modelling
protein sturcture prediction and molecular modellingprotein sturcture prediction and molecular modelling
protein sturcture prediction and molecular modellingDileep Paruchuru
 
Molecular docking
Molecular dockingMolecular docking
Molecular dockingpalliyath91
 
Molecular docking and_virtual_screening
Molecular docking and_virtual_screeningMolecular docking and_virtual_screening
Molecular docking and_virtual_screeningFlorent Barbault
 
Chou fasman algorithm for protein structure prediction
Chou fasman algorithm for protein structure predictionChou fasman algorithm for protein structure prediction
Chou fasman algorithm for protein structure predictionRoshan Karunarathna
 

Viewers also liked (18)

Biology Endocrine Powerpoint
Biology Endocrine PowerpointBiology Endocrine Powerpoint
Biology Endocrine Powerpoint
 
Abhishek seminar
Abhishek seminarAbhishek seminar
Abhishek seminar
 
Pure White 2008 Cdr
Pure White 2008 CdrPure White 2008 Cdr
Pure White 2008 Cdr
 
Soluble Lectin-Like Oxidized LDL Receptor-1 and High-Sensitivity Troponin T a...
Soluble Lectin-Like Oxidized LDL Receptor-1 and High-Sensitivity Troponin T a...Soluble Lectin-Like Oxidized LDL Receptor-1 and High-Sensitivity Troponin T a...
Soluble Lectin-Like Oxidized LDL Receptor-1 and High-Sensitivity Troponin T a...
 
Role of Leptin in Obesity
Role of Leptin in Obesity Role of Leptin in Obesity
Role of Leptin in Obesity
 
molecular docking
molecular dockingmolecular docking
molecular docking
 
Human endocrine system
Human endocrine systemHuman endocrine system
Human endocrine system
 
Ab Initio Protein Structure Prediction
Ab Initio Protein Structure PredictionAb Initio Protein Structure Prediction
Ab Initio Protein Structure Prediction
 
Melanin synthesis
Melanin synthesisMelanin synthesis
Melanin synthesis
 
Protein-Ligand Docking
Protein-Ligand DockingProtein-Ligand Docking
Protein-Ligand Docking
 
Protein 3D structure and classification database
Protein 3D structure and classification database Protein 3D structure and classification database
Protein 3D structure and classification database
 
Homology modelling
Homology modellingHomology modelling
Homology modelling
 
methods for protein structure prediction
methods for protein structure predictionmethods for protein structure prediction
methods for protein structure prediction
 
Disorders of pigmentation
Disorders of pigmentationDisorders of pigmentation
Disorders of pigmentation
 
protein sturcture prediction and molecular modelling
protein sturcture prediction and molecular modellingprotein sturcture prediction and molecular modelling
protein sturcture prediction and molecular modelling
 
Molecular docking
Molecular dockingMolecular docking
Molecular docking
 
Molecular docking and_virtual_screening
Molecular docking and_virtual_screeningMolecular docking and_virtual_screening
Molecular docking and_virtual_screening
 
Chou fasman algorithm for protein structure prediction
Chou fasman algorithm for protein structure predictionChou fasman algorithm for protein structure prediction
Chou fasman algorithm for protein structure prediction
 

Similar to Open-Source Bioinformatics for Data Scientists with Amanda Schierz

Molecular techniques for pathology research - MDX .pdf
Molecular techniques for pathology research - MDX .pdfMolecular techniques for pathology research - MDX .pdf
Molecular techniques for pathology research - MDX .pdfsabyabby
 
Genomics and proteomics in drug discovery and development
Genomics and proteomics in drug discovery and developmentGenomics and proteomics in drug discovery and development
Genomics and proteomics in drug discovery and developmentSuchittaU
 
2014 11-27 ODDP 2014 course, Amsterdam, Alain van Gool
2014 11-27 ODDP 2014 course, Amsterdam, Alain van Gool2014 11-27 ODDP 2014 course, Amsterdam, Alain van Gool
2014 11-27 ODDP 2014 course, Amsterdam, Alain van GoolAlain van Gool
 
TRANSPARENT AI/ML TO DISCOVER NOVEL THERAPEUTICS FOR RNA SPLICING-MEDIATED DI...
TRANSPARENT AI/ML TO DISCOVER NOVEL THERAPEUTICS FOR RNA SPLICING-MEDIATED DI...TRANSPARENT AI/ML TO DISCOVER NOVEL THERAPEUTICS FOR RNA SPLICING-MEDIATED DI...
TRANSPARENT AI/ML TO DISCOVER NOVEL THERAPEUTICS FOR RNA SPLICING-MEDIATED DI...iQHub
 
Pistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier DatathonPistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier DatathonPistoia Alliance
 
Quantifying the content of biomedical semantic resources as a core for drug d...
Quantifying the content of biomedical semantic resources as a core for drug d...Quantifying the content of biomedical semantic resources as a core for drug d...
Quantifying the content of biomedical semantic resources as a core for drug d...Syed Muhammad Ali Hasnain
 
PadminiNarayanan-Intro-2018.pptx
PadminiNarayanan-Intro-2018.pptxPadminiNarayanan-Intro-2018.pptx
PadminiNarayanan-Intro-2018.pptxDESMONDEZIEKE1
 
2017 molecular profiling_wim_vancriekinge
2017 molecular profiling_wim_vancriekinge2017 molecular profiling_wim_vancriekinge
2017 molecular profiling_wim_vancriekingeProf. Wim Van Criekinge
 
Sundaram et al. 2018 Presentation
Sundaram et al. 2018 PresentationSundaram et al. 2018 Presentation
Sundaram et al. 2018 PresentationBrianSchilder
 
Cell Authentication By STR Profiling
Cell Authentication By STR ProfilingCell Authentication By STR Profiling
Cell Authentication By STR ProfilingCreative-Bioarray
 
Introduction to the drug discovery process
Introduction to the drug discovery processIntroduction to the drug discovery process
Introduction to the drug discovery processThanh Truong
 
Role of bioinformatics of drug designing
Role of bioinformatics of drug designingRole of bioinformatics of drug designing
Role of bioinformatics of drug designingDr NEETHU ASOKAN
 
Solutions for Personalized Medicine brochure
Solutions for Personalized Medicine brochureSolutions for Personalized Medicine brochure
Solutions for Personalized Medicine brochureAffymetrix
 
Drug metabolism and toxicity 2013
Drug metabolism and toxicity 2013Drug metabolism and toxicity 2013
Drug metabolism and toxicity 2013Elsa von Licy
 

Similar to Open-Source Bioinformatics for Data Scientists with Amanda Schierz (20)

Molecular techniques for pathology research - MDX .pdf
Molecular techniques for pathology research - MDX .pdfMolecular techniques for pathology research - MDX .pdf
Molecular techniques for pathology research - MDX .pdf
 
Genomics and proteomics in drug discovery and development
Genomics and proteomics in drug discovery and developmentGenomics and proteomics in drug discovery and development
Genomics and proteomics in drug discovery and development
 
2014 11-27 ODDP 2014 course, Amsterdam, Alain van Gool
2014 11-27 ODDP 2014 course, Amsterdam, Alain van Gool2014 11-27 ODDP 2014 course, Amsterdam, Alain van Gool
2014 11-27 ODDP 2014 course, Amsterdam, Alain van Gool
 
TRANSPARENT AI/ML TO DISCOVER NOVEL THERAPEUTICS FOR RNA SPLICING-MEDIATED DI...
TRANSPARENT AI/ML TO DISCOVER NOVEL THERAPEUTICS FOR RNA SPLICING-MEDIATED DI...TRANSPARENT AI/ML TO DISCOVER NOVEL THERAPEUTICS FOR RNA SPLICING-MEDIATED DI...
TRANSPARENT AI/ML TO DISCOVER NOVEL THERAPEUTICS FOR RNA SPLICING-MEDIATED DI...
 
Pistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier DatathonPistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier Datathon
 
Quantifying the content of biomedical semantic resources as a core for drug d...
Quantifying the content of biomedical semantic resources as a core for drug d...Quantifying the content of biomedical semantic resources as a core for drug d...
Quantifying the content of biomedical semantic resources as a core for drug d...
 
Genomics
GenomicsGenomics
Genomics
 
Biomarkers & Clinical Research
Biomarkers & Clinical ResearchBiomarkers & Clinical Research
Biomarkers & Clinical Research
 
PadminiNarayanan-Intro-2018.pptx
PadminiNarayanan-Intro-2018.pptxPadminiNarayanan-Intro-2018.pptx
PadminiNarayanan-Intro-2018.pptx
 
Genomics and proteomics
Genomics and proteomicsGenomics and proteomics
Genomics and proteomics
 
2017 molecular profiling_wim_vancriekinge
2017 molecular profiling_wim_vancriekinge2017 molecular profiling_wim_vancriekinge
2017 molecular profiling_wim_vancriekinge
 
EID_lec3_Bishai.pdf
EID_lec3_Bishai.pdfEID_lec3_Bishai.pdf
EID_lec3_Bishai.pdf
 
Sundaram et al. 2018 Presentation
Sundaram et al. 2018 PresentationSundaram et al. 2018 Presentation
Sundaram et al. 2018 Presentation
 
Marsh pers strat-mednov2014
Marsh pers strat-mednov2014Marsh pers strat-mednov2014
Marsh pers strat-mednov2014
 
MLGG_for_linkedIn
MLGG_for_linkedInMLGG_for_linkedIn
MLGG_for_linkedIn
 
Cell Authentication By STR Profiling
Cell Authentication By STR ProfilingCell Authentication By STR Profiling
Cell Authentication By STR Profiling
 
Introduction to the drug discovery process
Introduction to the drug discovery processIntroduction to the drug discovery process
Introduction to the drug discovery process
 
Role of bioinformatics of drug designing
Role of bioinformatics of drug designingRole of bioinformatics of drug designing
Role of bioinformatics of drug designing
 
Solutions for Personalized Medicine brochure
Solutions for Personalized Medicine brochureSolutions for Personalized Medicine brochure
Solutions for Personalized Medicine brochure
 
Drug metabolism and toxicity 2013
Drug metabolism and toxicity 2013Drug metabolism and toxicity 2013
Drug metabolism and toxicity 2013
 

More from Jessica Willis

ODSC Hackathon for Health October 2016
ODSC Hackathon for Health October 2016ODSC Hackathon for Health October 2016
ODSC Hackathon for Health October 2016Jessica Willis
 
Jon Sedar topic modelling presentation #odsc 2016
Jon Sedar topic modelling presentation #odsc 2016Jon Sedar topic modelling presentation #odsc 2016
Jon Sedar topic modelling presentation #odsc 2016Jessica Willis
 
Knime customer intelligence on social media odsc london
Knime customer intelligence on social media odsc london   Knime customer intelligence on social media odsc london
Knime customer intelligence on social media odsc london Jessica Willis
 
Deep learning frameworks v0.40
Deep learning frameworks v0.40Deep learning frameworks v0.40
Deep learning frameworks v0.40Jessica Willis
 
Ian huston getting started with cloud foundry
Ian huston   getting started with cloud foundryIan huston   getting started with cloud foundry
Ian huston getting started with cloud foundryJessica Willis
 
Iot analytics in wearables
Iot analytics in wearables Iot analytics in wearables
Iot analytics in wearables Jessica Willis
 
Data Science for Internet of Things with Ajit Jaokar
Data Science for Internet of Things with Ajit JaokarData Science for Internet of Things with Ajit Jaokar
Data Science for Internet of Things with Ajit JaokarJessica Willis
 

More from Jessica Willis (7)

ODSC Hackathon for Health October 2016
ODSC Hackathon for Health October 2016ODSC Hackathon for Health October 2016
ODSC Hackathon for Health October 2016
 
Jon Sedar topic modelling presentation #odsc 2016
Jon Sedar topic modelling presentation #odsc 2016Jon Sedar topic modelling presentation #odsc 2016
Jon Sedar topic modelling presentation #odsc 2016
 
Knime customer intelligence on social media odsc london
Knime customer intelligence on social media odsc london   Knime customer intelligence on social media odsc london
Knime customer intelligence on social media odsc london
 
Deep learning frameworks v0.40
Deep learning frameworks v0.40Deep learning frameworks v0.40
Deep learning frameworks v0.40
 
Ian huston getting started with cloud foundry
Ian huston   getting started with cloud foundryIan huston   getting started with cloud foundry
Ian huston getting started with cloud foundry
 
Iot analytics in wearables
Iot analytics in wearables Iot analytics in wearables
Iot analytics in wearables
 
Data Science for Internet of Things with Ajit Jaokar
Data Science for Internet of Things with Ajit JaokarData Science for Internet of Things with Ajit Jaokar
Data Science for Internet of Things with Ajit Jaokar
 

Recently uploaded

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Open-Source Bioinformatics for Data Scientists with Amanda Schierz

  • 1. Open Source Bioinformatics for Data Scientists, Amanda Schierz
  • 2. Recent Projects ! Druggability prediction ! 3D structure ! Protein Sequence ! Predict a protein’s druggability based on its position in the protein-protein interaction network ! Drug Resistance ! Therapeutic opportunities ! Identification of new gene targets for cancer ! Are they Druggable? ! Candidate Compounds ! Compounds more likely to be a hit for a bioassay
  • 3. Drug Discovery Process Early-stage: Discovery Optimisation ADMET Clinical Trials Paperwork • Target Evaluation • Compound Screening • Computational Chemistry • Structure- based Drug Design • Absorption Distribution Metabolism Excretion Toxicity • Patient Stratification • Protocol • Drug Approval
  • 4. Biology 101 ! There is a many-to-many relationship between Gene and Protein ! A Protein is a large molecule; a Drug is a small molecule ! Gene Expression data ! The amount of a gene's product that is made. Epigenetics. ! highly / lowly / over / under – fold change ! Warning: Platforms and preprocessing ! Gene Copy Number ! Loss / Gain of a gene ! On one strand or 2? ! There are only approx. 400 genetic targets of approved pharmaceuticals ! Only from a handful of Protein Families ! Desperate need for diversity
  • 6. Target Identification ! Prediction of disease-associated genes ! patient level ! gene / protein level ! network ! Prediction of mechanisms of disease ! Epigenetic targets – meta-targets ! Prediction of protein function – from sequence / structure / network ! multi-class; multi-label ! Prediction of 3D structure ! Prediction of protein binding ! New immune targets
  • 7. Druggability Prediction ! Drugs – FDA Approved ~350 Very strict – know therapeutic benefit ! Drugbank – loose – binds but no therapeutic benefit ! Tractable or Druggable ! Rule of 5 compliant ! Precedence-based - Druggable families / Homology - Ligand-based scoring - Uniprot, bioassays – EBI and Pubchem bioassay - Statistical analysis
  • 8. Druggability Prediction ! Sequence Analysis - Amino Acid motifs and composition - Physicochemical descriptors - infinite amount – very wide data set - Supervised classification ! FASTA - can download all human sequences from Uniprot >seq0 FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTD ! R ProtR ; R Bioconductor ! species,mhc,peptide_length,A,R,N,D,C,E,Q,G,H,I,L,K,M,F,P,S,T,W,Y,V,scl1.lag1,scl2.lag1,scl1.lag2,scl2.lag2,scl1.2.lag1,scl2.1.lag1,scl1.2.lag2,scl2.1.lag2,AA,RA,NA,DA,CA,EA,QA,GA,HA,IA,LA,KA,MA,FA,PA,SA,TA,WA,YA,VA,AR,RR,NR,DR,CR ..... ,Schneider.Xr.K,Schneider.Xr.M,Schneider.Xr.F, Grantham.Xr.A,Grantham.Xr.R,
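The sequence-to-feature conversion on slide 8 can be illustrated with a minimal Python sketch. It is not the ProtR implementation: it only computes the amino acid and dipeptide composition columns, and the function names (`parse_fasta`, `aa_composition`, `dipeptide_composition`) are hypothetical helpers introduced for this example.

```python
from collections import Counter

# The 20 standard amino acids, in the order used by the column header on the slide.
AMINO_ACIDS = "ARNDCEQGHILKMFPSTWYV"


def parse_fasta(text):
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first token of the header as the identifier.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def aa_composition(seq):
    """Fraction of each standard amino acid in the sequence (20 features)."""
    counts = Counter(seq)
    n = len(seq) or 1  # avoid dividing by zero on an empty sequence
    return {aa: counts.get(aa, 0) / n for aa in AMINO_ACIDS}


def dipeptide_composition(seq):
    """Fraction of each ordered residue pair in the sequence (400 features)."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = max(len(seq) - 1, 1)
    return {a + b: pairs.get(a + b, 0) / total
            for a in AMINO_ACIDS for b in AMINO_ACIDS}


if __name__ == "__main__":
    fasta = ">seq0\nFQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTD\n"
    for name, seq in parse_fasta(fasta):
        features = {**aa_composition(seq), **dipeptide_composition(seq)}
        print(name, len(seq), len(features))
```

Even these two descriptor families give 420 columns per protein, which is why the slide describes the resulting data set as very wide.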
  • 9. Druggability Prediction ! 3D structure - Pockets, surface area - Ligand interaction fingerprints - Supervised classification
  • 10. 3D Structure ! PDB, ProtDCal, PockDrug
  • 11. Druggability Prediction ! Interaction Network ! Many use cases ! Data from EBI and Y2H ! List of binary interactions ! Be careful 1: Data is inherently biased ! Be careful 2: Complex interactions ! R igraph; Gephi for visualisation ! Topological properties ! Community analysis ! Subgraph analysis ! Statistical analysis, network analysis and supervised classification
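As a rough illustration of the topological properties mentioned on slide 11, the sketch below derives two per-protein features (degree and local clustering coefficient) from a list of binary interactions. It uses plain Python rather than igraph, and the protein identifiers in the usage example are made up.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(pairs):
    """Build an undirected adjacency map from binary interaction pairs."""
    adj = defaultdict(set)
    for a, b in pairs:
        if a == b:
            continue  # assumption: self-interactions are ignored
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def degree(adj):
    """Number of distinct interaction partners per protein."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def clustering(adj, node):
    """Local clustering coefficient: share of a protein's partner pairs
    that also interact with each other."""
    neighbours = adj.get(node, set())
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))


if __name__ == "__main__":
    # Hypothetical interactions, for illustration only.
    interactions = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")]
    graph = build_graph(interactions)
    for protein, k in sorted(degree(graph).items()):
        print(protein, k, round(clustering(graph, protein), 3))
```

Features like these can feed the supervised classification step, with the slide's caveat in mind: well-studied proteins tend to have more recorded interactions, so degree partly reflects research attention.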
  • 15. Compound Bioactivity ! Brute force mass screening ! 1000s of compounds screened in batches ! Primary Assays; Secondary / confirmatory assays ! Can be binary classification or regression ! The IC50 is the concentration of a compound needed to inhibit the measured activity by 50%; a lower IC50 means a more potent compound. ! Active / inactive : IC50 threshold ! Goal is also to identify diverse compound structures ! Scaffold Hopping ! Same kind of method as Protein Sequence conversion ! Pharmacophore fingerprints ! https://www.chemaxon.com/free-software/
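The active / inactive labelling on slide 15 can be sketched as follows. The 1000 nM (1 µM) cut-off is an assumption chosen for illustration, not a value given in the slides; the pIC50 conversion is the standard negative log10 of the molar IC50 and suits the regression setting.

```python
import math


def pic50(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def is_active(ic50_nm, threshold_nm=1000.0):
    """Binary label: active when the IC50 is at or below the threshold.

    threshold_nm is an assumed cut-off; real assays define their own.
    """
    return ic50_nm <= threshold_nm


if __name__ == "__main__":
    # Hypothetical compounds and measurements, for illustration only.
    measurements = {"cmpd_1": 50.0, "cmpd_2": 1000.0, "cmpd_3": 25000.0}
    for name, ic50 in measurements.items():
        print(name, round(pic50(ic50), 2), is_active(ic50))
```

Working with pIC50 keeps values on a compact scale where larger means more potent, which is often more convenient for regression than raw concentrations.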
  • 17. Compound ADMET ! Many use cases ! ADMET of hits ! Absorption ! Distribution ! Metabolism ! Excretion ! Toxicity ! Mutagenicity ! Protein binding
  • 18. General Resources ! EBI European Bioinformatics Institute / PubChem ! API ! Integrates several downloadable Data Sources (expression, Copy Number, Bioassays, network, disease-specific) ! Baseline data (Normal, not diseased) ! Protein Data Bank – 3D Structures ! DrugBank ! Cancer – The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) ! Coding Tools – R Bioconductor, BioPerl, Biopython ! https://docs.chemaxon.com/display/docs/Documentation
  • 19. General Resources ! canSAR database ! Integration of biological, pharmacological, chemical, structural biology and protein network data
  • 20. Beware 101 ! Non-standard Gene names ! Some experiments report Genes, some report Proteins ! We need new Drug Targets, different from established ones. ! Keep in mind when analysing results ! Cancer is difficult ! Drug resistance ! Data is not keeping up with the science ! Tumour Heterogeneity ! Wide data = random patterns ! Different expression / sequencing platforms
  • 21. Therapeutic Opportunities ! Only approximately 350 - 400 protein targets ! DNA damage response (DDR) is essential for maintaining the genomic integrity of the cell ! Currently targeted by chemotherapy and radiation. Goal is for small molecule targeting ! TCGA Patient Analysis: Expression, Copy Number Variation and Mutation data. ! 15 cancer disease types ! Telegraph March 2015: New drugs to tackle cancer cell weak spots could end 'scattergun' chemotherapy ! Laurence H. Pearl, Amanda C. Schierz, Simon E. Ward, Bissan Al-Lazikani, Frances M. G. Pearl. Therapeutic opportunities within the DNA Damage Response. Nature Reviews Cancer
  • 22. Therapeutic Opportunities ! Statistical analysis of DDR deregulation in patients compared to a random set of genes ! Druggability prediction of deregulated DDR genes ! Synthetic Lethality analysis of Yeast DDR orthologues ! Two genes are synthetic lethal if mutation of either alone is fine but mutation of both leads to cell death. Targeting a gene that is synthetic lethal to a cancer-relevant mutation theoretically will kill only cancer cells.
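The synthetic lethality definition on slide 22 translates directly into a small check. This is a toy sketch over hand-made viability observations, not the analysis used in the talk; the gene names and the `synthetic_lethal_pairs` helper are hypothetical.

```python
from itertools import combinations


def synthetic_lethal_pairs(viability):
    """Return gene pairs where each single mutant is viable but the
    double mutant is not.

    viability maps a frozenset of mutated genes to True (cells survive)
    or False (cells die). Pairs with any missing observation are skipped.
    """
    genes = sorted({gene for combo in viability for gene in combo})
    hits = []
    for a, b in combinations(genes, 2):
        single_a = viability.get(frozenset([a]))
        single_b = viability.get(frozenset([b]))
        double = viability.get(frozenset([a, b]))
        if single_a is True and single_b is True and double is False:
            hits.append((a, b))
    return hits


if __name__ == "__main__":
    # Hypothetical observations, for illustration only.
    observations = {
        frozenset(["geneA"]): True,
        frozenset(["geneB"]): True,
        frozenset(["geneC"]): True,
        frozenset(["geneA", "geneB"]): False,
        frozenset(["geneA", "geneC"]): True,
    }
    print(synthetic_lethal_pairs(observations))
```

If geneA is already mutated in a tumour, geneB becomes the candidate drug target: inhibiting it should kill only the cells carrying the geneA mutation.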