Mohamed A. Khamis, Walid Gomaa, Walaa F. Ahmed,
Machine learning in computational docking, Artificial Intelligence In Medicine (2015),
http://dx.doi.org/10.1016/j.artmed.2015.02.002
Objective
• The objective of this paper is to highlight the state-of-the-art
machine learning (ML) techniques in computational docking.
• The use of smart computational methods in the life cycle of
drug design is a relatively recent development that has
gained much popularity and interest over the last few years.
• Computational docking is the process of predicting the best
pose (orientation + conformation) of a small molecule (drug
candidate) when bound to a larger target receptor molecule
(protein) in order to form a stable complex molecule.
Background
• Background for protein-ligand interactions:
  - Physical, chemical, and biological
• Molecular data formats:
  - e.g., .mol, .pdb, .sdf, etc.
• Docking software programs:
  - e.g., AutoDock, eHiTS, iDock, etc.
• Molecular databases:
  - Containing data on proteins with their possible ligands
  - e.g., PDB, PDBbind, BindingDB, DUD, etc.
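To make the .pdb format mentioned above more concrete, here is a minimal sketch of reading one fixed-width ATOM record. The column positions follow the published PDB file format; the names `Atom` and `parse_atom_line` are hypothetical, and the sample record is illustrative rather than taken from the slides.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float
    element: str


def parse_atom_line(line: str) -> Atom:
    """Parse one fixed-width ATOM/HETATM record of a .pdb file."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain=line[21],                # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
        element=line[76:78].strip(),   # columns 77-78
    )


# Illustrative record, built field by field so the column widths stay exact.
SAMPLE_LINE = (
    f"{'ATOM':<6}{1:>5} {' N':<4} {'PRO':>3} {'A'}{1:>4}{'':4}"
    f"{29.361:8.3f}{39.686:8.3f}{5.862:8.3f}{1.00:6.2f}{38.10:6.2f}{'':10}{'N':>2}"
)

atom = parse_atom_line(SAMPLE_LINE)
print(atom.name, atom.res_name, atom.chain, atom.res_seq, (atom.x, atom.y, atom.z))
```

Slicing by column rather than splitting on whitespace matters here because adjacent fields in a .pdb record can touch when values are wide.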
Drug Design: Docking of Ligand with Target Protein
[Figure: "Fitting Puzzle Pieces". A ligand (small drug molecule) fits the binding site of a large protein molecule, forming a stable complex molecule.]
Protein: HIV-1 protease (hsg1.pdb)
Ligand (drug): Indinavir (ind.pdb)
Formula: C36H47N5O4
Indinavir (IDV; trade name Crixivan, manufactured by Merck) is a
protease inhibitor used to treat HIV infection and AIDS.
Complex molecule: Indinavir fitted into the binding pocket of
the receptor protein HIV-1 protease
Traditional Drug Design Methods
• Traditional drug design techniques, such as random screening and
chance discovery, are essentially trial-and-error methods.
• They are therefore very time-consuming (10-15 years) and very
expensive ($300M), with extremely low yield.
• For instance, over the last 50 years, 500,000 compounds have been
tested for anti-cancer activity;
  - Only 25 are in wide use today [1].
• On the other hand, computer-aided drug design (CADD) is target-specific,
structure-based, automatic, fast, and very low-cost, with a high success rate.
[1] Denny, William A., New Zealand Institute of Chemistry, The design and
development of anti-cancer drugs. Available at
http://nzic.org.nz/ChemProcesses/biotech/12J.pdf.
Scoring Function
• A scoring function is a mathematical predictive model that produces a
score representing the binding free energy, and hence the
stability, of the resulting complex molecule.
• Generally, such a function should produce a set of credible
ligands ranked according to their binding stability,
along with their binding poses.
X-Score: Wang R, Lai L, Wang S. Further development and validation of
empirical scoring functions for structure-based binding affinity
prediction. J Computer-Aided Molecular Design 2002;16:11–26.
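As a numerical illustration of "a score that represents binding free energy": under standard thermodynamics, ΔG = RT ln(Kd), so an affinity reported as pKd = -log10(Kd) maps directly to a free energy. The sketch below is not from the slides; the function name is hypothetical, and it assumes a temperature of 298.15 K and units of kcal/mol.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def pkd_to_delta_g(pkd: float, temperature_k: float = 298.15) -> float:
    """Binding free energy (kcal/mol) from pKd = -log10(Kd), Kd in mol/L."""
    # dG = RT ln(Kd) = -RT ln(10) * pKd
    return -R_KCAL * temperature_k * math.log(10) * pkd


# Millimolar, micromolar, and nanomolar binders.
for pkd in (3.0, 6.0, 9.0):
    print(f"pKd = {pkd:.1f} -> dG = {pkd_to_delta_g(pkd):6.2f} kcal/mol")
```

A more negative ΔG corresponds to tighter binding, which is why a higher pKd indicates a more stable complex.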
Powers of Scoring Functions
• Scoring power: score a protein-ligand complex; measured by the
correlation coefficient between predicted and
experimentally determined binding affinities.
• Ranking power: rank different ligands bound to the
same target protein; measured by the percentage of successful rankings.
• Docking power: identify the native binding pose among
computer-generated decoys.
• Screening power: classification; distinguish true binders from
negative binders (random molecules).
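The first two powers can be sketched in a few lines. This is a minimal illustration, not the benchmark's official protocol: it assumes scoring power is the Pearson correlation over all complexes, and that ranking power is reported both as the fraction of targets whose ligands are ordered entirely correctly and as the fraction whose top binder is identified. The function names and the affinity values are made up for the example.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranking_power(records):
    """records: iterable of (target, predicted_affinity, measured_affinity).

    Returns (fraction of targets fully ranked correctly,
             fraction of targets whose top binder is identified).
    """
    by_target = defaultdict(list)
    for target, pred, meas in records:
        by_target[target].append((pred, meas))
    full = top = 0
    for pairs in by_target.values():
        idx = range(len(pairs))
        by_pred = sorted(idx, key=lambda i: pairs[i][0], reverse=True)
        by_meas = sorted(idx, key=lambda i: pairs[i][1], reverse=True)
        full += by_pred == by_meas
        top += by_pred[0] == by_meas[0]
    n = len(by_target)
    return full / n, top / n


# Made-up affinities: three ligands for each of two hypothetical targets.
records = [
    ("T1", 6.1, 5.8), ("T1", 7.4, 7.9), ("T1", 4.9, 4.2),
    ("T2", 5.0, 6.5), ("T2", 6.2, 5.1), ("T2", 8.0, 8.8),
]

scoring = pearson([p for _, p, _ in records], [m for _, _, m in records])
print("scoring power (Pearson r):", round(scoring, 3))
print("ranking power (full, top):", ranking_power(records))
```

With this toy data, target T1 is ranked entirely correctly while T2 only has its top binder right, so the ranking power comes out as (0.5, 1.0).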
Classical Scoring Functions
• Classical scoring functions, e.g., X-Score, rely only on a fixed set of
molecular features (e.g., energy terms),
• summed in a linear weighted manner that fails to model non-linear
relationships among the individual energy terms.
• In addition, the weights of those individual energy terms are calibrated
on a specific protein family (using linear regression).
• Hence, classical scoring functions are more prone to over-fitting.
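The limitation of a linear weighted sum can be seen on a toy problem. The sketch below fits weights by ordinary least squares to two hypothetical energy terms whose true contribution is their product, a non-linear interaction. It does not reproduce X-Score; the function names and data are invented for illustration.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear(features, targets):
    """Ordinary least squares via the normal equations; bias is weights[0]."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def linear_score(weights, terms):
    """Classical-style score: bias plus a weighted sum of the terms."""
    return weights[0] + sum(w * t for w, t in zip(weights[1:], terms))


# Two hypothetical terms; the "true" affinity is their product.
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
targets = [a * b for a, b in features]

weights = fit_linear(features, targets)
residuals = [y - linear_score(weights, f) for f, y in zip(features, targets)]
print("weights:", weights)
print("residuals:", residuals)
```

Even the best-fitting weights leave a residual on every sample here, because no weighted sum of the two terms can represent their interaction.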
Machine Learning-based Scoring Functions
[Figure, annotated: "Value to be predicted using regression techniques".]
Figure source: Ballester PJ. Machine learning approaches to predicting protein-ligand
binding. Presentation; Cambridge Computational Biology Institute - European
Molecular Biology Laboratory EMBL-EBI; Cambridge, United Kingdom; 2013.
Training & Testing sets of PDBbind v. 2007
Figure source: Ballester PJ. Machine learning approaches to predicting protein-ligand
binding. Presentation; Cambridge Computational Biology Institute - European
Molecular Biology Laboratory EMBL-EBI; Cambridge, United Kingdom; 2013.
Results
• We survey this paradigm shift, elaborating on the main building
components of ML approaches used in molecular docking.
• For instance, on PDBbind v2007 the best random forest (RF)-based
scoring function (Li, 2014) achieves a Pearson correlation
coefficient of 0.803 between the predicted and experimentally
determined binding affinities, while the best classical
scoring function achieves 0.644 (Cheng, 2009).
• In terms of ranking power, the best RF-based scoring function
(Ashtawy, 2012) ranks the ligands correctly based on their
experimentally determined binding affinities with accuracy 62.5%
and identifies the top binding ligand with accuracy 78.1%.
Conclusion
• Machine learning techniques make it possible to use as
many relevant molecular features (e.g., geometric
features, pharmacophore features, etc.) as possible.
• In particular, ensemble-based machine learning
approaches (e.g., random forest, boosted
regression trees, etc.) are resilient to overfitting.
• They yield good results not only on training complexes but on
testing complexes as well.
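The averaging mechanism behind ensemble methods can be sketched with bagging: each weak model is trained on a bootstrap resample and their predictions are averaged. The example below bags one-split regression trees ("stumps") on a single made-up feature. It is a simplification, not a random forest, which would also subsample features and grow deeper trees. All names and data are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree; returns (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not right:  # splitting at the maximum leaves nothing on the right
            continue
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:  # all feature values identical: predict the mean everywhere
        mean = sum(ys) / len(ys)
        return xs[0], mean, mean
    return best[1:]


def fit_bagged_stumps(xs, ys, n_trees=50, seed=0):
    """Train each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees


def predict(trees, x):
    """Average the predictions of all stumps."""
    return sum(ml if x <= t else mr for t, ml, mr in trees) / len(trees)


# Made-up step-shaped data: low response below 5, high response from 5 upward.
xs = list(range(10))
ys = [0.0] * 5 + [1.0] * 5

trees = fit_bagged_stumps(xs, ys)
print(predict(trees, 0), predict(trees, 9))
```

Because every stump sees a slightly different resample, individual quirks tend to average out in the ensemble prediction; this sketch shows the mechanism only and does not by itself demonstrate the resilience claimed above.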
Acknowledgement
This work is supported:
• Mainly by the Information Technology Industry
Development Agency (ITIDA) under ITAC Program
grant number CFP#58
• In part by an E-JUST Research Fellowship
Publications
• Mohamed A. Khamis, Walid Gomaa, 2015,
Comparative Assessment of Scoring and Ranking
Powers of Machine-Learning-Based Scoring
Functions on an Updated Benchmark PDBbind
2013, Engineering Applications of Artificial Intelligence,
Elsevier. (submitted)
• Mohamed A. Khamis, Walid Gomaa, Basem Galal, 2015,
Deep Learning Competes Random Forest in
Computational Docking, IEEE/ACM Transactions on
Computational Biology and Bioinformatics. (submitted)
Questions
E-mail:
  - mohamed.khamis@ejust.edu.eg
  - mohamed.abdelaziz.khamis@gmail.com
