The document discusses the National Center for Biotechnology Information (NCBI), which maintains biological databases and provides bioinformatics tools. NCBI houses both primary databases directly submitted by researchers and secondary databases compiled from primary sources. Major databases include GenBank (nucleotide sequences), PubMed Central (biomedical literature), and reference sequence databases. Tools like BLAST, Entrez, and ORFfinder allow users to search and analyze sequence data. NCBI aims to make biomedical research data freely accessible worldwide.
2. NCBI
What is NCBI?
National Center for Biotechnology Information
Established in 1988
Part of the National Library of Medicine (NLM) at the National Institutes of Health (NIH)
Major aim: public databases
Development of software tools for sequence analysis and dissemination of biomedical information
3. Roles of NCBI
1) Maintains biological databases, both primary and secondary, including GenBank
2) Provides data retrieval systems such as Entrez
3) Provides computational resources for the analysis of GenBank data and other biological data
4. Kinds of databases
Primary databases:
Original submissions by the experimentalists who carried out the research
Content is controlled by the submitters
Examples include GenBank, SNP and GEO
Secondary databases:
Built from primary data retrieved from the primary databases
Content is controlled by a third party (NCBI)
Examples include RefSeq, RefSNP, NCBI Structure, Protein, etc.
6. NCBI TOOLS
BLAST: standard BLAST, MegaBLAST, PSI-BLAST, PHI-BLAST, RPS-BLAST, BLAST 2 Sequences
Database retrieval tool
Specialized tools: ORF Finder, e-PCR, Spidey
Sequence submission tool: BankIt
DATABASES
Nucleotide database
Literature database
Protein database
Expression database
Structure database
7. Retrieval tool - Entrez
Integrated database search and retrieval system
Provides extensive links between and within database records
Cross-references records across the different databases
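As an illustration of searching one database and then following cross-references into another, the sketch below builds request URLs for NCBI's E-utilities, the programmatic interface to Entrez (not mentioned on the slide). The helper names `esearch_url` and `elink_url` are hypothetical, the query and record ID are placeholders, and no request is actually sent.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities, the programmatic interface to Entrez.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=5):
    """Build an ESearch URL: look up record IDs in one database for a query."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def elink_url(dbfrom, db, uid):
    """Build an ELink URL: follow cross-references from one database to another."""
    params = {"dbfrom": dbfrom, "db": db, "id": uid}
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Example query; the search term is only an illustration.
    print(esearch_url("nucleotide", "BRCA1[Gene Name] AND human[Organism]"))
    # "12345" is a placeholder ID, not a real record of interest.
    print(elink_url("nucleotide", "protein", "12345"))
```

Fetching either URL (for example with `urllib.request.urlopen`) would return the matching IDs or linked records; that step is left out so the sketch runs without network access.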
8. Sequence submission to NCBI
Databases are constantly updated with new sequences submitted through sequence submission tools such as:
BankIt
Sequin
9. BankIt
Web-based sequence submission tool
Connect to the NCBI home page
Follow the GenBank link in the sidebar at left
Tool of choice for simple submissions
Can also be used for updating previously submitted information
10. Sequin
Stand-alone sequence submission and updating tool
Handles multiple sequence submissions
Provides increased capacity for long sequence submissions
Multiple annotation
Phylogenetic and population studies
11. BLAST
Basic Local Alignment Search Tool
Sequence similarity searches against a variety of different sequence databases
UniGene, Gene, MMDB, GEO
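To make "local alignment" concrete, here is a minimal sketch of the exact Smith-Waterman local alignment score between two sequences. This is not how BLAST is implemented: BLAST uses faster heuristics to approximate this kind of search. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score between a and b.

    Scoring values are illustrative assumptions, not BLAST defaults.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    query = "TTACGTTT"
    # Hypothetical mini "database" of sequences, ranked by similarity to the query.
    database = {"seq1": "GGACGGG", "seq2": "CCCCCC", "seq3": "TTACGTTT"}
    ranked = sorted(database, key=lambda k: local_alignment_score(query, database[k]), reverse=True)
    print(ranked)
```

Ranking database entries by this score mirrors, in a very reduced form, what a similarity search reports: the best-matching sequences first.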
13. SPECIALIZED TOOLS
Several sequence analysis tools are available; the following are explained later:
1) ORF Finder
2) e-PCR
3) Spidey
14. ORF FINDER
Open Reading Frame Finder
Graphical analysis tool
Finds all open reading frames in the user's sequence or in a sequence already submitted to the databases
Uses standard and alternative genetic codes for the analysis of reading frames
Packaged with Sequin
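The idea of scanning every reading frame can be sketched as follows. This is a simplified illustration, not the ORF Finder implementation: it assumes the standard genetic code only, treats an ORF as ATG through the next in-frame stop codon, and uses hypothetical function names and a hypothetical `min_codons` cutoff.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame, min_codons):
    """Yield (start, end) for ATG...stop ORFs in one frame; end is exclusive."""
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            if (i - start) // 3 >= min_codons:  # codons before the stop
                yield start, i + 3
            start = None


def find_orfs(seq, min_codons=2):
    """Scan all six reading frames (three per strand) for ORFs."""
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame, min_codons):
                results.append((strand, frame, start, end, s[start:end]))
    return results


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTTAGGG"):
        print(orf)
```

Coordinates on the "-" strand refer to positions in the reverse-complemented sequence; supporting alternative genetic codes would mean swapping in different start and stop codon sets.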
15. e-PCR
Electronic polymerase chain reaction
Searches for sequence-tagged sites (STSs)
The whole template DNA is searched for STSs
A newer search mode compares a query sequence against a sequence database
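Searching a template for an STS amounts to finding a primer pair in the right orientation and at a plausible distance. The sketch below shows that idea with exact string matching only; it is an assumption-laden simplification rather than the e-PCR program itself, and the function names, primers and size limits are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_all(text, pattern):
    """Return every start index of pattern in text (overlaps allowed)."""
    hits, i = [], text.find(pattern)
    while i != -1:
        hits.append(i)
        i = text.find(pattern, i + 1)
    return hits


def electronic_pcr(template, fwd_primer, rev_primer, min_size, max_size):
    """Return (start, end, size) for each predicted product on the template."""
    template = template.upper()
    # The reverse primer anneals to the opposite strand, so on the given
    # strand its binding site appears as its reverse complement.
    rev_site = reverse_complement(rev_primer.upper())
    products = []
    for start in find_all(template, fwd_primer.upper()):
        for rstart in find_all(template, rev_site):
            end = rstart + len(rev_site)
            size = end - start
            if rstart >= start and min_size <= size <= max_size:
                products.append((start, end, size))
    return products


if __name__ == "__main__":
    # Made-up template with one primer pair flanking an 8-base spacer.
    template = "GGGG" + "ACGTAC" + "TTTTTTTT" + "GATTCA" + "CCCC"
    print(electronic_pcr(template, "ACGTAC", "TGAATC", min_size=15, max_size=30))
```

A practical tool would also tolerate mismatches and gaps in the primer sites, which this sketch does not attempt.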
16. Spidey
An mRNA-to-genome alignment tool
Searches via BLAST
Takes as input a single genomic sequence and a set of mRNA sequences in FASTA format
Pseudogenes and paralogues are eliminated in this search and the true gene is selected.
18. Nucleotide database - GenBank
NCBI's primary sequence database
Comprehensive public database of nucleotide sequences
Supporting bibliographic information
Built from authors' submissions to GenBank, including EST data
GenBank, EMBL and DDBJ make up the International Nucleotide Sequence Database Collaboration (INSDC)
The collaborators exchange data daily
19. HomoloGene
Automated detection of homologues among the genes of completely sequenced eukaryotic genomes
Analyses the proteins of the input organisms
blastp
Taxonomic trees are built
Statistical analysis of each match is done, and orthologs and paralogs are identified
20. dbSNP
Database of single nucleotide polymorphisms
Also covers short deletion and insertion polymorphisms
SNPs can be related to 3D structures via Cn3D and MMDB
Functional variants can be matched with OMIM
21. Literature database - PMC
PubMed Central
Digital archive of peer-reviewed life sciences journals
Holds a very large collection of full-text journal articles
Full text is available immediately or within 12 months of publication
22. Protein database
Entrez Protein: NCBI's protein sequence database
Databases are cross-searched, e.g. PDB and Swiss-Prot
Taxonomic relations
CDD: Conserved Domain Database
23. Gene expression database
Distribution and regulation of transcriptional products
In normal and abnormal cell types
Many techniques have been developed for surveying genome-wide transcript expression
24. SAGEmap
Serial Analysis of Gene Expression map
Gene expression data analysis
Tag-to-gene mapping function
Maps SAGE tags to gene clusters or to a single gene
A reciprocal gene-to-tag SAGE map is also available
Updated weekly
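The relationship between a tag-to-gene map and its reciprocal gene-to-tag map can be sketched with plain dictionaries. The tags, gene names and the `invert` helper below are invented for illustration and do not come from SAGEmap data.

```python
from collections import defaultdict

# Hypothetical tag-to-gene assignments, invented purely for illustration.
TAG_TO_GENES = {
    "AAACCCGGGT": ["GENE_A"],
    "GGGTTTAAAC": ["GENE_B", "GENE_C"],  # one tag matching a cluster of genes
    "TTTGGGCCCA": ["GENE_A"],
}


def invert(tag_to_genes):
    """Build the reciprocal gene-to-tag map from a tag-to-gene map."""
    gene_to_tags = defaultdict(list)
    for tag, genes in tag_to_genes.items():
        for gene in genes:
            gene_to_tags[gene].append(tag)
    return dict(gene_to_tags)


if __name__ == "__main__":
    for gene, tags in invert(TAG_TO_GENES).items():
        print(gene, tags)
```

A tag that maps to several genes shows up under each of them in the reciprocal map, which is why both directions are useful when interpreting expression counts.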
25. Structural database - MMDB
Molecular Modeling Database (MMDB)
3D macromolecular structures
X-ray diffraction (XRD) and NMR are used for experimental structure determination
Evolutionary history and function
Relationships between macromolecules
30. Chemical database - PubChem
Database of chemical molecules
Freely accessible through a web user interface
Chemical structures
Diagnostic and therapeutic agents
Molecular mass below 2000 u
Bridge between macromolecular genomics and the small organic molecules of cellular metabolism