SVM-based approach to assess the reliability of protein-protein interactions

Meher Preethi Boorgula, Ronak Shah, Neerja Katiyar
George Mason University

Motivation:
Protein interactions play a key role in many cellular processes.
Distortion of protein interfaces may lead to the development of many diseases.
Reliable protein-protein interactions (PPIs) that are conserved among different species and involved in diseases would be very helpful for researchers.

Problem Statement:
Protein-protein interactions (PPIs) are very helpful in the functional annotation of proteins, so it is important that the PPI data is reliable.
Thus, we try to predict the reliability of PPIs for a disease-causing bacterium.

Objective:
To create a prediction model based on a kernel method (SVM) to assess the reliability of PPIs in Treponema pallidum obtained from the yeast two-hybrid (Y2H) system.
To classify the interactions as reliable or not reliable.

Introduction:
Protein-protein interactions can be identified with high-throughput techniques such as yeast two-hybrid (Y2H) and mass spectrometry (MS).
The main disadvantage of these existing techniques is the number of false positives in the data obtained.
So, assessing the reliability of PPIs is necessary.

Methodology:
1) Prepare the data sets
2) Extract the attributes
3) Create & test the model using SVM Light
4) Evaluate the performance of the model
5) Analyze the reliability of the PPI data sets

Datasets:
Raw interaction data was obtained from Y2H experiments performed at the J. Craig Venter Institute.
This data was then organized into train and test sets with an equal number of positive and negative examples.
Positive: high-confidence data
Negative: low-confidence data

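One way to build train and test sets with equal numbers of positive and negative examples is sketched below. This is only an illustration: the slides do not say how the split was actually made, and the function name `balanced_split`, the downsampling step, the half-and-half split, and the protein identifiers are all assumptions.

```python
import random

def balanced_split(positives, negatives, seed=0):
    """Downsample both classes to the same size, then split each class in half.

    Hypothetical helper: returns (train, test) lists of (pair, label) tuples,
    with label +1 for high-confidence and -1 for low-confidence pairs.
    """
    rng = random.Random(seed)
    n = min(len(positives), len(negatives))
    pos = rng.sample(positives, n)  # sampling without replacement
    neg = rng.sample(negatives, n)
    half = n // 2
    train = [(p, +1) for p in pos[:half]] + [(q, -1) for q in neg[:half]]
    test = [(p, +1) for p in pos[half:]] + [(q, -1) for q in neg[half:]]
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test

# Made-up bait/prey identifiers, used only to exercise the function.
high_conf = [("P1", "P2"), ("P3", "P4"), ("P5", "P6"), ("P7", "P8")]
low_conf = [("N%d" % i, "M%d" % i) for i in range(6)]
train, test = balanced_split(high_conf, low_conf)
```

Both returned sets contain the same number of +1 and -1 examples, so a classifier that always predicts one class scores about 50% accuracy on them.
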
Dataset (Contd…)
All interactions = 2993
High confidence = 721
Common interactions = 66
Total (excluding common) = 2993 + 721 - 66 = 3648
Train & test datasets were made by taking 1824 interactions (half of the total).

Extracting Attributes:
Attributes chosen include:
- Sequence based:
   i. Occurrence of 5-mers in the sequence data
  ii. Hydrophobicity
- Non-sequence based:
   i. Jaccard coefficient
  ii. GO annotation

Hydrophobicity:
Protein interaction depends on the nature of the active/binding site.
A hydrophobicity profile was used to capture this feature.
Average hydropathy was calculated for each sequence from the hydrophobicity of each amino acid residue.
This was obtained using the tool “Protein GRAVY”.

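The average hydropathy (GRAVY) of a sequence is the mean of the per-residue hydropathy values. The sketch below assumes the Kyte-Doolittle scale, which is the scale commonly used for GRAVY scores. The function name `gravy` is illustrative, and the slides do not state how the bait and prey values were combined into a feature for a pair.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(sequence):
    """Return the mean Kyte-Doolittle hydropathy of a protein sequence.

    Raises ValueError on an empty sequence and KeyError on a residue
    outside the 20 standard amino acids.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

# Isoleucine (4.5) and arginine (-4.5) cancel out exactly.
score = gravy("IR")
```

Positive scores indicate a more hydrophobic sequence overall and negative scores a more hydrophilic one.
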
Jaccard coefficient:
In a PPI network, the neighbors of interacting proteins also tend to interact.
Jaccard coefficient:
    J(u, v) = |N(u) ∩ N(v)| / |N(u) ∪ N(v)|
where u, v are the interacting proteins and N(X) is the set of neighbors of protein X in the PPI network.

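The Jaccard coefficient of two proteins is the size of the intersection of their neighbor sets divided by the size of their union, so it lies between 0 and 1. A minimal sketch follows. The helper names `build_neighbors` and `jaccard`, the toy edge list, and the choice to return 0.0 when both neighbor sets are empty are assumptions. The slides also do not say whether u and v themselves were excluded from each other's neighbor sets; they are kept here.

```python
from collections import defaultdict

def build_neighbors(edges):
    """Map each protein to the set of its interaction partners."""
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs

def jaccard(nbrs, u, v):
    """Shared neighbors of u and v divided by all neighbors of u or v."""
    nu = nbrs.get(u, set())
    nv = nbrs.get(v, set())
    union = nu | nv
    if not union:
        return 0.0  # convention chosen for this sketch
    return len(nu & nv) / len(union)

# Toy network: N(A) = {B, C}, N(B) = {A, C, D}.
network = build_neighbors([("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")])
jc_ab = jaccard(network, "A", "B")  # 1 shared neighbor out of 4 -> 0.25
```
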
GO Annotations:
Proteins that are present in the same cellular component or that participate in the same biological process are more likely to interact.
This was captured by extracting identical GO IDs for the interacting proteins.
Interacting protein pairs with at least one common GO ID were considered reliable.

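The "at least one common GO ID" rule can be expressed as a set intersection. The sketch below assumes the annotations are available as a mapping from protein to a set of GO IDs. The function name, the protein names, and the assignment of GO IDs to those proteins are made up for illustration.

```python
def shares_go_term(go_annotations, protein_a, protein_b):
    """Return 1 if the two proteins have at least one GO ID in common, else 0.

    Proteins without any annotation are treated as sharing nothing.
    """
    terms_a = go_annotations.get(protein_a, set())
    terms_b = go_annotations.get(protein_b, set())
    return 1 if terms_a & terms_b else 0

# Hypothetical annotation table (illustrative assignments only).
annotations = {
    "protA": {"GO:0005737", "GO:0006412"},
    "protB": {"GO:0005737"},
    "protC": {"GO:0005886"},
}
feature_ab = shares_go_term(annotations, "protA", "protB")  # share GO:0005737
feature_ac = shares_go_term(annotations, "protA", "protC")  # nothing shared
```

Treating unannotated proteins as non-matching matters for a sparsely annotated organism, because a missing annotation then looks the same as a genuine mismatch.
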
Occurrence of 5-mers:
The spectrum kernel models a sequence in the space of all k-mers (here, 5-mers).
All possible 5-mers in the protein sequences were obtained for the data.
The number of times each 5-mer appears in the sequence data was extracted for both bait and prey proteins.

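Counting 5-mers amounts to sliding a window of length 5 along the sequence, and the spectrum kernel between two sequences is the dot product of their k-mer count vectors. The sketch below shows both steps. The function names are illustrative, and it does not show how bait and prey counts were combined into a feature vector for a pair, which the slides leave open.

```python
from collections import Counter

def kmer_counts(sequence, k=5):
    """Count every overlapping k-mer in a sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(seq_a, seq_b, k=5):
    """Dot product of the k-mer count vectors of two sequences."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # A Counter returns 0 for k-mers it has not seen.
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())

counts = kmer_counts("MKTAYI")              # two 5-mers: MKTAY and KTAYI
similarity = spectrum_kernel("AAAAAA", "AAAAA")
```

With 20 amino acids there are 20^5 = 3,200,000 possible 5-mers, so storing only the observed counts, as `Counter` does, keeps the representation sparse.
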
Creating & Testing Model:
SVM Light was used to create classification models based on linear and sigmoid kernels.
The test data was then classified with the model.
The performance of the model was evaluated based on accuracy, precision and recall values.

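SVM Light reads training and test examples from a text file with one example per line: the target label followed by `index:value` pairs in increasing index order. The sketch below formats one example that way. The function name and the feature numbering (1 = hydrophobicity, 2 = Jaccard coefficient) are assumptions made for illustration, not the authors' actual encoding.

```python
def to_svmlight_line(label, features):
    """Format one example as an SVM Light input line.

    label    -- +1 (reliable) or -1 (not reliable)
    features -- dict mapping a feature index (int >= 1) to a numeric value
    """
    parts = [f"{label:+d}"]
    for index in sorted(features):  # indices must appear in increasing order
        value = features[index]
        if value != 0:              # zero-valued features can be omitted
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)

# Hypothetical pair: feature 1 = hydrophobicity, feature 2 = Jaccard coefficient.
line = to_svmlight_line(+1, {2: 0.25, 1: -0.4})
```

According to the SVM Light documentation, `svm_learn -t 0` trains with a linear kernel and `svm_learn -t 3` with a sigmoid kernel, and `svm_classify` applies a saved model to a test file.
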
Experiments Performed:
1) Model generated using the hydrophobicity attribute.
2) Model generated using the Jaccard coefficient (JC) attribute.
3) Model generated using both of these attributes.
4) Model generated using both of these attributes on a different data set (equal number of positive and negative examples).

Results for Linear Kernel:
                Exp-1    Exp-2    Exp-3    Exp-4
Accuracy (%)    79.99    79.99    79.88    51.23
Precision (%)     -        -        -        -
Recall (%)       0.0      0.0      0.0      0.0

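A recall of 0.0 together with an undefined precision is the pattern one would expect if a classifier labels every test example as negative; accuracy then simply equals the fraction of negatives in the test set. That reading is an interpretation, not something the slides state. The sketch below computes the three metrics for a made-up set of 8 negatives and 2 positives to show the effect.

```python
def evaluate(y_true, y_pred):
    """Return (accuracy, precision, recall) for labels in {+1, -1}.

    Precision or recall is None when its denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return accuracy, precision, recall

# Made-up test set: 8 negatives, 2 positives, everything predicted negative.
truth = [-1] * 8 + [1] * 2
predicted = [-1] * 10
accuracy, precision, recall = evaluate(truth, predicted)
```

Here accuracy is 0.8, precision is undefined (no positive predictions), and recall is 0.0, which is why accuracy alone can be misleading on an imbalanced set.
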
Results for Sigmoid Kernel:
                Exp-1    Exp-2    Exp-3    Exp-4
Accuracy (%)      -        -      79.88    57.26
Precision (%)     -        -       0.0     57.80
Recall (%)        -        -       0.0     45.79

Observation:
The results obtained were not reliable, as the model was built using only two attributes.
Two attributes are not sufficient to discriminate between the positive and negative examples.
Also, it was observed that the positive examples had no significant effect while creating the model.

To Be Done:
Extract the attribute “occurrence of 5-mers” for the protein pairs and perform all the experiments.
Obtain data from the IntAct database to increase the number of positive examples and to reduce the number of false positives in the data, since it comes from Y2H experiments.
Compare the performance with the existing model based on logistic regression.

Problems:
The major problem in extracting attributes that depend on annotation was that Treponema is not fully annotated.
The interaction data for Treponema is also not reliable.

Future Work:
We would like to apply this model to Streptococcus pneumoniae.
Using PSSM scores obtained by running PSI-BLAST would be helpful.
Analyze the biological relevance of our predictions and then experimentally test the interactions predicted to be reliable by the model.

References:
Dr. Peter Uetz et al. (J. Craig Venter Institute)
Asa Ben-Hur & William Stafford Noble, Kernel methods for predicting protein–protein interactions
SVM Light: http://svmlight.joachims.org/
Protein GRAVY: http://www.bioinformatics.org/sms2/protein_gravy.html
PIR: http://pir.georgetown.edu/

Protein-Protein Interaction using SVM based kernel,Jacob Coefficient and Gene Ontology
