Data retrieval tools
Tools dedicated to giving molecular biologists access to database information.
The most widely used are:
1. Entrez
2. DBGET
3. SRS
Each of these allows:
- Text-based searching of a number of linked databases (DBs).
- Sequence searching.
They differ in:
- The DBs they cover
- How the retrieved information is accessed and presented.
Entrez
- WWW-based data retrieval system.
- Developed by NCBI (National Center for Biotechnology Information).
- Integrates information held in different DBs.
Databases covered by Entrez are:
Nucleic acid - GenBank, RefSeq, PDB
Protein sequences - SWISS-PROT, PIR
3D structures - MMDB
Genomes - many sources
PopSet - from GenBank
OMIM - OMIM
Taxonomy - NCBI Taxonomy database
Books - Bookshelf
ProbeSet - GEO (Gene Expression Omnibus)
Literature - PubMed
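As a rough illustration of text-based searching across several Entrez databases, the sketch below builds search URLs for the `esearch` endpoint of NCBI's E-utilities, the programmatic interface to Entrez. The helper name `build_esearch_url` and the example query are assumptions made for illustration. The code only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities "esearch" service.
ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return an esearch URL for a text query against one Entrez database.

    build_esearch_url is a hypothetical helper, not part of any NCBI library.
    """
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{ESEARCH_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    # The same query text sent to two different Entrez databases.
    for db in ("pubmed", "protein"):
        print(build_esearch_url(db, "hemoglobin AND human"))
```

Only the `db` parameter changes between the two calls, which reflects how Entrez offers one query style over many underlying databases.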
SRS (Sequence Retrieval System)
- Data retrieval tool developed by EBI (European Bioinformatics Institute)
- Integrates 80 molecular biology DBs
- Open-source software (can be installed locally)
SRS has an associated scripting language called Icarus.
Central resource for molecular biology data
- More than 250 databanks have been indexed.
- More than 35 SRS servers are available worldwide over the WWW.
Data analysis applications server
- 11 protein applications
- 6 nucleic acid applications
- Uniform query interface on the web
History of SRS
1990
– Development started at EMBL, Heidelberg (main author: Dr. Thure Etzold)
1997
– Moved to EBI in Cambridge. Development work was supported by various grants, among others from EMBnet.
1998
– Etzold and his group joined LION bioscience
Information retrieval
– Easy way to retrieve information from sequence and sequence-related databases
– Possibility to search for multiple words/other criteria
Linkage between different databases
– E.g. Find all primary structures with known three-dimensional structure.
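The linking idea can be sketched as following cross-references from entries in one database to entries in another. The example below is a minimal illustration in Python, not SRS or Icarus code. The record layout, the function name `link`, and every identifier in the toy data are invented placeholders.

```python
# Toy protein sequence entries. Each record lists cross-references to other
# databases. All identifiers here are invented placeholders, not real IDs.
PROTEIN_ENTRIES = {
    "PROT_A": {"xrefs": {"PDB": ["STRUCT_1"], "EMBL": ["NUC_7"]}},
    "PROT_B": {"xrefs": {"EMBL": ["NUC_8"]}},
    "PROT_C": {"xrefs": {"PDB": ["STRUCT_2", "STRUCT_3"]}},
}

# Toy set of entries present in the 3D structure database.
STRUCTURE_ENTRIES = {"STRUCT_1", "STRUCT_2", "STRUCT_3"}


def link(entries, target_db, target_ids):
    """Return sorted IDs of entries with a cross-reference into target_ids."""
    linked = []
    for entry_id, record in entries.items():
        refs = record["xrefs"].get(target_db, [])
        if any(ref in target_ids for ref in refs):
            linked.append(entry_id)
    return sorted(linked)


if __name__ == "__main__":
    # "Find all primary structures with known three-dimensional structure."
    print(link(PROTEIN_ENTRIES, "PDB", STRUCTURE_ENTRIES))
```

With this toy data, only the sequence entries that carry a PDB cross-reference are returned.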
Different types of database in SRS
Sequence & structure
– DNA, protein, three-dimensional structures
Sequence-related
Gene-related
– Genome, mapping, mutations, transcription factors
– SNP
Bibliographic
– Medline, ENZYME
User-defined
SRS main toolbar tabs:
Top Page: displays databases in different database groups
Query: displays either the standard or extended query form
Results or “the query manager”: maintains a history of all the results obtained during a session
Projects or “the project manager”: maintains a history of all queries and views used during a session
Views: allows a user to define a user-specific view for one or more databases
Databanks: contains a list of, and some facts about, the databases available in the system
Search terms in SRS
SRS indexed fields can be searched using any of the following:
– Single word search
– Multiple word phrases
– Numbers and dates
– Regular expressions
– Wildcards
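The difference between these kinds of search term can be illustrated with a small Python sketch that applies each one to the text of an indexed field. This is not SRS query syntax. The function names and the sample description are assumptions chosen for illustration, and the wildcard matching uses Python's `fnmatch` conventions.

```python
import re
from fnmatch import fnmatchcase


def _words(value):
    """Split a field value into lowercase word tokens."""
    return re.findall(r"\w+", value.lower())


def match_word(value, word):
    """Single word search: the word must appear as a whole token."""
    return word.lower() in _words(value)


def match_phrase(value, phrase):
    """Multiple word phrase: the words must appear together, in order."""
    return phrase.lower() in value.lower()


def match_wildcard(value, pattern):
    """Wildcard search: '*' and '?' are matched against each token."""
    return any(fnmatchcase(tok, pattern.lower()) for tok in _words(value))


def match_regex(value, pattern):
    """Regular expression search over the whole field value."""
    return re.search(pattern, value, flags=re.IGNORECASE) is not None


def match_range(value, low, high):
    """Number or date search: works for any comparable values."""
    return low <= value <= high


if __name__ == "__main__":
    description = "Cytochrome c oxidase subunit 1"  # sample field text
    print(match_word(description, "oxidase"))
    print(match_phrase(description, "c oxidase"))
    print(match_wildcard(description, "cyto*"))
    print(match_regex(description, r"subunit \d+"))
    print(match_range(512, 100, 600))
```

A single word search here needs a whole-token match, so a fragment such as `oxid` matches only when written as the wildcard `oxid*`.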