From Sequence to Knowledge
Assembly, Annotation, and Analysis
of Phage genomes from Genomic
and Metagenomic Data Sets
A helping hand through
The Annotation Bottleneck
Ramy K. Aziz
Workshop presenters
6 Aug 2017 Phage Genomics - Evergreen 2017
Alejandro Reyes (AR)
Ramy Aziz (RA)
Jason Gill (JG)
PRELUDE
A bit of history…
• Since 2009, the Genomics Workshop has
become an essential part of the Evergreen
phage meeting
• The challenge always is: how to meet
needs/expectations that are so many and
so diverse, in ~4 hours
• The answer is:
…….
The 2011 workshop
The 2013 workshop
The 2015 workshop
MOTIVATION
“The analysis bottleneck”
• Observation:
– We generate more data than we can analyze.
– We generate sequence data faster than
we can analyze them.
• Opinion:
– Not all bottlenecks are
created equal!
– It is important to define the question(s)
before working on the answer(s)!
“The analysis bottleneck”
• The Lavigne paradox (2013)
“The analysis bottleneck”
• The Lavigne paradox (2015)
AUDIENCE
Workshop audience
• Who (how many) among you have:
– annotated at least one phage genome?
– worked on a viral metagenome?
– used the command line (Unix, Linux, Mac
Terminal) for sequence analysis?
• We actually ran an online survey,
and here is what we found …
Quick group activity
Defining the question(s):
• Introduce yourself, your institution, and your
favorite phage
• Do you have a genome sequenced? Planning to?
– Why have you sequenced your phage genome?
– Why do you want to sequence your phage genome?
• What is the single most pressing question you
want to have answered from genome analysis?
DEFINING THE QUESTION(S)
What you want …... is
- complete
- accurate
From genome / from metagenome: incomplete, frameshift, faulty assembly
Credit: Andrew Kropinski, Bas Dutilh
A process of reconstruction
• Experimentally
• Computationally
DNA
“Any phage one can get!”
“eDNA”
TGATTGTGTGTTTGCGCAATGCG
ATGTGTATATATAGTGAGCTTGCCC
GTCTCTCTNNNTCTCTTG
TGATTGGTCTNNNTCTCTTGCGCAATGCG
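As a rough illustration of the computational side of reconstruction, overlapping reads can be merged wherever the end of one read matches the start of the next. The sketch below is a toy example, not the algorithm of any assembler covered in the workshop. The two reads, the function names `overlap` and `merge`, and the `min_len` threshold are all assumptions made for illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    (Hypothetical helper, exact matching only.)
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def merge(a, b, min_len=3):
    """Merge read `b` onto the end of read `a` if they overlap, else return None."""
    k = overlap(a, b, min_len)
    return a + b[k:] if k else None


# Two made-up reads sharing the 4-base overlap "TGTG".
read1 = "TGATTGTGTG"
read2 = "TGTGTTTGCG"
print(merge(read1, read2))  # TGATTGTGTGTTTGCG
```

Real assemblers also have to cope with sequencing errors, ambiguous bases (N), reads from the reverse strand, and repeats, none of which this exact-match sketch handles.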
From Sequence to Knowledge
From raw sequence data to genome submission/publication
• Assembly
• Gene finding/ORF calling
• tRNA calling
• Annotation (assigning functions)
• Orienting
• Validation
• Fixing frameshifts
• Introns and inteins
• Subsystem assignment
• Refinement/secondary annotation loop
• Special purpose: toxins, morons, integrases, lifestyle prediction
• Regulatory elements (promoters, terminators)
• Output: files and graphics
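To make the gene finding/ORF calling step more concrete, here is a minimal sketch of a naive open reading frame scan over all six frames. It is not the method used by RAST, PhAnToMe, or any other tool named in this workshop. The function names, the `min_codons` cutoff, and the restriction to ATG starts are assumptions chosen to keep the example short.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Naive six-frame ORF scan (illustrative assumption, ATG starts only).

    Returns (strand, start, end, frame) tuples. Coordinates are 0-based,
    end-exclusive, and relative to the strand that was scanned.
    `min_codons` counts the stop codon.
    """
    orfs = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, frame))
                    start = None  # look for the next ORF in this frame
    return orfs


# Tiny made-up sequence: ATG AAA TTT GGG TAA in forward frame 2.
print(find_orfs("CCATGAAATTTGGGTAACC", min_codons=3))  # [('+', 2, 17, 2)]
```

Dedicated gene callers go well beyond this: they consider alternative start codons such as GTG and TTG, score coding potential, and resolve overlapping candidates, which is part of why the validation and refinement steps listed above still matter.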
Classification
• The phage sequence space (Lima-Mendez et al.)
• The phage proteomic tree (Edwards & Rohwer)
• New: VIP tree http://www.genome.jp/viptree
Countless tools
This workshop: outline
1. Annotation overview
2. Automated tools for genome annotation:
– PhAnToMe/RAST related tools
– Galaxy/ Apollo
3. Tools for metagenome-based analyses
– Assembly
– Functional prediction via protein families
Where to go from here?
• Part I:
General introduction to genome annotation
• Part II:
Two levels
– Level 1: Novices and beginners:
Automated annotation tools
– Level 2: Intermediate to advanced users:
Command-line based tools
Online resources/ Slideshare
• Data & links:
– http://egybio.net/tutorial
• Slides
– http://bit.ly/annotation2016
– http://bit.ly/phantome4
– Old tutorials (more detailed, but missing the latest):
• Evergreen 2011: http://slidesha.re/phantome1
• http://slidesha.re/phiRAST1 (by Karin Holmfeldt)
• Evergreen 2013: http://bit.ly/phantome2
• Evergreen 2015: http://bit.ly/phantome3
From Sequence to Knowledge (Phage Genomics Workshop Intro at the 22nd Biennial Evergreen Phage Meeting)
