Yeast Genome Sequencing

ISF College of Pharmacy, Moga
Ghal Kalan, GT Road, Moga- 142001,
Punjab, INDIA
Internal Quality Assurance Cell - (IQAC)
Yeast Genome
Ruchika Sharma
Assistant Professor
Dept. of BIOTECHNOLOGY
ISF COLLEGE OF
PHARMACY
Website: - www.isfcp.org

INTRODUCTION
Genome: The entire chromosomal genetic material of
an organism.
Sequencing a genome: Determining the identity and
order of nucleotides in the genetic material – usually
DNA, sometimes RNA, of an organism.
Gene (DNA) → mRNA → Protein

 Genomics is a discipline in genetics concerned with the
study of the genomes of organisms.
 The field includes efforts to determine the entire DNA
sequence of organisms, genetic mapping, and the study of
interactions between loci and alleles within the genome.
 The yeast Saccharomyces cerevisiae (“baker’s yeast”) is
probably the ideal eukaryotic microorganism for biological
studies.
Classified in the kingdom Fungi.
Yeasts account for about 1% of all fungal species.

History
 The first genetic map of S. cerevisiae was published in 1949.
 In 1989, it was decided to initiate a yeast sequencing project
within the frame of the European Union biotechnology
programmes.
 Based on a network approach, some 35 European
laboratories were initially involved in this enterprise.

 For the first time, in May 1992, the
complete nucleotide sequence (315 kb)
of an entire chromosome, namely
yeast chromosome III, was
published by 35 European
laboratories.
 In 1994, the sequence of two more
chromosomes was published:
chromosome II of 820 kb and
chromosome XI of 666 kb.

Continued
 By the end of 1995, more than 50% of the
yeast genome had been sequenced
under the European Union project, and in
1996 the entire sequence of the
yeast genome was completed by an
international joint effort.

Basic problem
 Genomes are large (typically
millions or billions of base pairs)
 Current technology can only
reliably ‘read’ a short stretch –
typically hundreds of base pairs

Elements of a solution
 Automation: over the past decade, the
amount of hand labour in the ‘reads’ has
been steadily and dramatically reduced.
 Assembly: the ‘reads’ (sequences) are
joined by algorithmic, computational
methods.
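
As an illustration of what computational assembly of reads means, here is a minimal sketch of a greedy overlap merge. It is not the software used by the yeast sequencing laboratories; the function names, the `min_len` threshold and the toy reads are assumptions made for this example, and real assemblers must also handle sequencing errors, repeats and reverse-strand reads.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 when the best overlap is shorter than min_len.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two reads with the longest suffix/prefix overlap.

    Hypothetical helper for illustration: error-free reads, one strand only.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


reads = ["ATGGCGT", "GCGTACC", "TACCTTGA"]
print(greedy_assemble(reads))
```

With the three toy reads above, the overlaps GCGT and TACC join them into a single contig. Greedy merging is the simplest possible strategy and can misassemble repeated regions, which is one reason ordered clone maps were built first.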

Procedure
 Sequencing of a chromosome started
from a collection of overlapping plasmid or
phage lambda clones, which the DNA
co-ordinator distributed to the contracting
laboratories.
 However, it soon became evident that
ordered cosmid libraries were much better
suited to large-scale sequencing.

 The low number of cosmid clones needed
was an advantage in setting up ordered
yeast cosmid libraries and in sorting
out and mapping chromosome-specific
sublibraries.
 For example, a chromosome XI-specific
sublibrary of 138 clones was sorted out
from an unordered cosmid library
by colony hybridization, using
chromosome XI DNA purified
by pulsed-field gel electrophoresis
as the probe. The 'nested
chromosomal fragmentation' approach
was then applied to sort these
clones rapidly.
Nested chromosomal fragmentation approach.

 To facilitate sequencing and assembly of the sequences,
contigs of overlapping cosmids and fine-resolution physical
maps of the respective chromosomes were constructed first,
by applying classical mapping methods (fingerprints,
cross-hybridization) or novel methods developed for this
programme, such as site-specific chromosome fragmentation.

Genetic and physical map of yeast chromosome II.

Sequencing Strategies
 Two principal approaches were used to prepare
subclones for sequencing:
(i) Generation of sublibraries by the use of a series of
appropriate restriction enzymes or from nested
deletions of appropriate subfragments made by
exonuclease III;
(ii) Generation of shotgun libraries from whole cosmids
or subcloned fragments by random shearing of the
DNA.
 Sequencing was carried out by the Sanger technique.
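
Because a shotgun library consists of randomly sheared fragments, enough reads must be collected to cover the whole clone. The sketch below uses the standard Lander-Waterman idealization (reads placed uniformly at random): coverage is c = N × L / G and the expected uncovered fraction is about e^(-c). The 40 kb insert size, 400 bp read length and 8-fold coverage are illustrative assumptions, not figures from the project.

```python
import math


def coverage(num_reads: int, read_len: int, target_len: int) -> float:
    """Average number of times each base is read (c = N * L / G)."""
    return num_reads * read_len / target_len


def expected_fraction_covered(c: float) -> float:
    """Expected fraction of the target hit by at least one read.

    Lander-Waterman idealization: uniformly random read starts.
    """
    return 1.0 - math.exp(-c)


def reads_needed(target_len: int, read_len: int, c: float) -> int:
    """Number of reads required to reach average coverage c."""
    return math.ceil(c * target_len / read_len)


# Assumed example values: a 40 kb cosmid insert and 400 bp Sanger reads.
n = reads_needed(target_len=40_000, read_len=400, c=8)
print(n, expected_fraction_covered(coverage(n, 400, 40_000)))
```

Under these assumptions, 8-fold coverage needs 800 reads and leaves well under 0.1% of the insert unread on average. Directed strategies such as nested deletions trade this redundancy for more cloning work.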

Sequence Analysis
 As the individual laboratories submitted
their data, and again once the
complete sequences were available, the
sequences were analysed by various
algorithms.

The sequences were interpreted
using the following principles:
(i) All intron splice site pairs were detected by using specially defined
patterns.
(ii) All open reading frames (ORF) containing at least 100
contiguous sense codons and not contained entirely in a longer
ORF on either DNA strand were listed (this included partially
overlapping ORFs).

(iii) The two lists were merged and all intron splice site pairs
occurring inside an ORF but in opposite orientation were
disregarded.
(iv) Centromere and telomere regions were sought by
comparison with previously characterized datasets of such
elements, including the database entries provided in a
continuously updated library.
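
Principle (ii) can be sketched in a few lines. The following is an illustrative ORF scan, not the project's actual software: it assumes an ORF runs from an ATG to the next in-frame stop codon, checks all six reading frames, applies the minimum of 100 sense codons, and drops ORFs lying entirely inside a longer ORF on either strand. Intron handling (principles i and iii) is left out, and all names are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq: str, min_codons: int):
    """Yield (start, end) of ATG...stop ORFs, 0-based half-open, stop included."""
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOPS:
                if (pos - start) // 3 >= min_codons:  # sense codons only
                    yield start, pos + 3
                start = None


def find_orfs(seq: str, min_codons: int = 100):
    """Return (start, end, strand) tuples in forward-strand coordinates."""
    seq = seq.upper()
    n = len(seq)
    found = [(s, e, "+") for s, e in orfs_in_strand(seq, min_codons)]
    found += [(n - e, n - s, "-")
              for s, e in orfs_in_strand(reverse_complement(seq), min_codons)]
    # Discard ORFs contained entirely in a longer ORF on either strand.
    kept = [o for o in found
            if not any(p is not o and p[0] <= o[0] and o[1] <= p[1]
                       and p[1] - p[0] > o[1] - o[0] for p in found)]
    return sorted(kept)


# Toy sequence: ATG followed by five GCT codons and a TAA stop.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
print(find_orfs(toy, min_codons=5))
```

The toy call uses a threshold of 5 codons only so that the example stays short; with the default of 100 the same sequence yields no ORF. Partially overlapping ORFs are kept, as in the rule above.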

 Searches for similarity of proteins to entries in the
databanks were performed with FASTA and
FLASH, in combination with the Protein
Sequence Database of PIR-International and
other public databases.
 Protein signatures were detected by using the
PROSITE dictionary as well as BLOCKS and
PRODOM domains whenever relevant for the
interpretation of the query sequence.

Compositional analyses of the
chromosomes (base composition;
nucleotide pattern frequencies, GC
profiles; ORF distribution profiles,
etc.) were performed by using GCG
programmes. For calculations of GC
content of ORFs the algorithm
CODONS was used.
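
To make the compositional analysis concrete, here is a small sketch of a base-composition and GC-profile calculation. It does not reproduce the GCG programmes or the CODONS algorithm named above; the function names and the window and step sizes are assumptions chosen for illustration.

```python
def gc_content(seq: str) -> float:
    """Percentage of G+C among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def gc_profile(seq: str, window: int, step: int):
    """GC percentage in sliding windows, as (window_start, gc_percent) pairs."""
    return [(start, gc_content(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]


# Toy sequence: an AT-only stretch followed by a GC-only stretch.
print(gc_content("ATGC"))
print(gc_profile("AAAAGGGG", window=4, step=4))
```

Applied to a whole chromosome with windows of a few kilobases, a profile like this shows where GC-rich and GC-poor regions lie; applied to individual ORFs, `gc_content` gives the per-gene values.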

This information was then
compiled at the end of the
sequencing project to annotate
all genetic elements in the yeast
genome.

Cloning and sequencing of yeast chromosome II.

Result
 In 1996 the Saccharomyces Genome Project
revealed the presence of more than 6000 open reading
frames (ORFs) in the S. cerevisiae genome.
 The goal of the Saccharomyces Genome Deletion
Project was to generate as complete a set as possible
of yeast deletion strains with the overall goal of
assigning function to the ORFs through phenotypic
analysis of the mutants.

Continued
 The average ORF size is 1450 bp. The sizes of the majority
of the open reading frames (ORFs) in yeast vary between
100 and 4000 codons.
 Fewer than 1% of the ORFs are estimated to be shorter than 100
codons.
 14.8% of the total base pairs are homologues among gene of
unknown function', sometimes called ‘orphans”

Continued
 Five different types of Ty elements that exhibit
substantial homology to retroviruses and
retrotransposons from plants and animals are
present in the yeast genome.
 The average base composition of yeast DNA is
38.4% (G+C).
 The protein coding regions have a higher GC
content on average (40.2%) than the non-
coding regions (35.1%).

Continued
 The genome is composed of about
12,069,313 base pairs and
6,275 genes, compactly organized on
16 chromosomes. Only about 5,800
of these are believed to be true
functional genes.
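
The figures quoted in these slides can be combined into a rough estimate of how compact the genome is. The sketch below multiplies gene counts by the average ORF size of 1450 bp; that is an approximation which ignores introns, overlapping ORFs and RNA genes, and the function names are hypothetical.

```python
GENOME_BP = 12_069_313      # genome size quoted in the slides
ANNOTATED_GENES = 6_275     # all annotated genes
LIKELY_FUNCTIONAL = 5_800   # genes believed to be functional
AVERAGE_ORF_BP = 1_450      # average ORF size quoted in the slides


def bp_per_gene(genome_bp: int, genes: int) -> float:
    """Average genome length available per gene."""
    return genome_bp / genes


def coding_fraction(genome_bp: int, genes: int, avg_orf_bp: int) -> float:
    """Rough share of the genome occupied by ORFs (genes * average size)."""
    return genes * avg_orf_bp / genome_bp


print(round(bp_per_gene(GENOME_BP, ANNOTATED_GENES)))
print(round(coding_fraction(GENOME_BP, LIKELY_FUNCTIONAL, AVERAGE_ORF_BP), 3))
print(round(coding_fraction(GENOME_BP, ANNOTATED_GENES, AVERAGE_ORF_BP), 3))
```

This gives roughly one gene per 1.9 kb, and a coding share of about 70% when the 5,800 likely functional genes are used, or about 75% with all 6,275 annotated genes.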

Completely sequenced yeast genomes

Year  Name                       Size [exact length]  Publication                       Predicted genes    Sequence [GenBank #]
2002  Schizosaccharomyces pombe  13.8 Mb [exact]      Nature 415(6874):871-880 (2002)   4824? [confirmed]  Sanger [UK], NCBI [USA]
1996  Saccharomyces cerevisiae   12 Mb [12,069,313]   Nature 387, 5-105 (suppl) (1997)  5800? [confirmed]  Sanger, NCBI [USA]

For each chromosome

Year  Chromosome  Size     [Exact length]  Publication [submitted date]
1995  1           0.23 Mb  [230,203]       Bussey et al. Proc. Natl. Acad. Sci. 92:3809-3813 (1995)
1994  2           0.81 Mb  [813,139]       Feldmann et al. EMBO J, 13:5795-5809 (1994)
1992  3           0.32 Mb  [316,613]       Oliver et al. Nature, 357:38-46 (1992)
1997  4           1.5 Mb   [1,531,929]     Jacq et al. Nature (suppl), 387:75-78 (1997)
1997  5           0.58 Mb  [576,869]       Dietrich et al. Nature (suppl), 387:78-81 (1997)
1995  6           0.27 Mb  [270,148]       Murakami et al. Nature Genet., 10:261-268 (July 1995)
1997  7           1.1 Mb   [1,090,937]     Tettelin et al. Nature (suppl), 387:81-84 (1997)
1994  8           0.56 Mb  [562,639]       Johnston et al. Science, 265:2077-2082 (Sept 30 1994)
1997  9           0.44 Mb  [439,885]       Churcher et al. Nature (suppl), 387:84-87 (1997)
1996  10          0.75 Mb  [745,444]       Galibert et al. EMBO J, 15:2031-2049 (1996)
1994  11          0.67 Mb  [666,445]       Dujon et al. Nature, 369:371-378 (June 2, 1994)
1997  12          1.1 Mb   [1,078,173]     Johnston et al. Nature (suppl), 387:87-90 (1997)
1997  13          0.92 Mb  [924,430]       Bowman et al. Nature (suppl), 387:90-93 (1997)
1997  14          0.78 Mb  [784,328]       Philippsen et al. Nature (suppl), 387:93-98 (1997)
1997  15          1.1 Mb   [1,091,284]     Dujon et al. Nature (suppl), 387:98-102 (1997)
1997  16          0.95 Mb  [948,061]       Bussey et al. Nature (suppl), 387:103-105 (1997)

Consortia involved in the yeast genome sequencing project

Classification of yeast genes

Continued
 With the completion of the yeast
genome sequence, for the first
time, it became possible to
define the proteome of a
eukaryotic cell.
 The term 'proteome' has been
coined to describe the complete
set of proteins synthesized by a
living cell.

Comparison of the Yeast Genome with
Other Genomes
 The Human-Yeast Connection: It
is estimated that more than 30% of
the yeast genes have homologues
among the human genes.

Comparison of homologous genes from
different species.

Conclusion
 Sequence completed in April 1996.
 12 megabases on 16 chromosomes.
 About 6000 open reading frames.
 Few introns (4%).
 70% of the genome encodes proteins.
 75-80% of genes are expressed.
 43% of genes are functionally
characterized.
