Dr. Sven Nahnsen,
Quantitative Biology Center (QBiC)
Data Management for Quantitative
Biology
Lecture 1: Introduction and overview
Overview
• Administrative stuff (credits, requirements)
• Motivation/quick review of relevant contents
(Bioinformatics I and II)
• Introduction to this lecture series
• Semester overview
2
Course requirements
To pass this course you must:
•  regularly and actively participate in the weekly problem sessions,
•  pass the final exam, assignments and projects
•  You have to work on assignments alone
•  You will work in small groups for the problem-orientated research
project
3
Course credits and grading
•  Credits
-  MSc Bioinfo: 4 LP, module “Wahlpflichtbereich Bioinformatik”
-  MSc Info: 4 LP, area “Wahlpflichtbereich Informatik”
•  Grade
-  30% assignments
-  20% project
-  50% finals
•  Finals: oral exam (30 minutes) covering the contents of the whole
lecture, the assignments and the project
•  Finals will be scheduled at the end of the semester (Thu,
30/07/2015)
4
Recommended literature
•  We will point to relevant papers during the course of the lecture
•  Important overview papers:
§  Hastings et al., 2005, Quantitative Bioscience for the 21st century. BioScience. Vol
55 No. 6
§  Cohen JE (2004) Mathematics Is Biology's Next Microscope, Only Better; Biology
Is Mathematics' Next Physics, Only Better. PLoS Biol 2(12): e439.
•  Books
§  Free E-Book: Data management in Bioinformatics (
http://en.wikibooks.org/wiki/Data_Management_in_Bioinformatics)
§  Lacroix, Z.; Critchlow, T. (eds): Bioinformatics: Managing Scientific Data. Morgan
Kaufmann Publishers, San Francisco 2003
§  Michael E. Wall, Quantitative Biology: From Molecular to Cellular Systems. 2012.
Chapman & Hall
§  Pierre Bonnet. Enterprise Data Governance: Reference and Master Data
Management Semantic Modeling. 2013. Wiley
•  Web resources
§  http://www.ariadne.ac.uk: Ariadne, Web Magazine for Information Professionals
§  http://www.dama.org: THE GLOBAL DATA MANAGEMENT COMMUNITY
§  H.D. Ehrich: http://www.ifis.cs.tu-bs.de/node/2855
5
Recommended Software
•  These software tools, frameworks and web servers will be used
during the problem sessions
http://www.cisd.ethz.ch/software/openBIS https://usegalaxy.org
https://vaadin.com/home https://www.knime.org/
6
Contact and organization
•  Questions concerning the lecture/assignments
§  dmqb-ss15@informatik.uni-tuebingen.de
•  Website
§  abi.inf.uni-tuebingen.de/Teaching/ws-2013-14/CPM
•  Christopher Mohr (Sand 14, C322) , Andreas Friedrich (Sand 14, C 304)
•  Dr. Sven Nahnsen (Quantitative Biology Center, Auf der Morgenstelle 10,
C2P43, please send e-mail first)
•  Course material will be available on the website (see above), through
social media channels and (if wished) as a hard copy during the lecture
facebook.com/qbic.tuebingen twitter.com/qbic_tue
7
Who am I
•  Most information about me and our work can be found here:
www.qbic.uni-tuebingen.de
8
Contents of this lecture
Date | Lecturer | Lecture (8-10 AM)
Thursday, 16 April 15 | Nahnsen | Introduction and overview
Thursday, 23 April 15 | Nahnsen | Biological Data Management
Thursday, 30 April 15 | Czemmel | Data sources ("Next-generation" technologies)
Dr. Stefan Czemmel
9
Contents of this lecture
Date | Lecturer | Lecture (8-10 AM)
Thursday, 7 May 15 | Codrea | Database systems (mySQL, noSQL, etc.)
Thursday, 14 May 15 | - | Ascension Day (Himmelfahrt)
Thursday, 21 May 15 | Czemmel | LIMS and E-Lab books
Thursday, 28 May 15 | Kenar | Experimental Design
Dr. Marius Codrea, Erhan Kenar
10
Contents of this lecture
Date | Lecturer | Lecture (8-10 AM)
Thursday, 4 June 15 | - | Corpus Christi Day (Fronleichnam)
Thursday, 11 June 15 | Nahnsen | Data analysis workflows (I)
Thursday, 18 June 15 | Nahnsen | Data analysis workflows (II)
Thursday, 25 June 15 | Nahnsen | Standardization
Thursday, 2 July 15 | Nahnsen | Big Data
Thursday, 9 July 15 | Nahnsen | Integrated data management (OpenBIS, OpenBEB)
Thursday, 16 July 15 | Nahnsen | Applications
Thursday, 23 July 15 | Nahnsen | Exam preparation
Thursday, 30 July 15 | Nahnsen, Mohr, Friedrich | EXAMS
11
What is your background?
Ad hoc collection from the audience, Apr. 16, 2015
•  Computer Science
•  Bioinformatics (immunoinformatics; user front-end; integration,
visualization)
•  Biology
•  Drug design
•  Agricultural biology (plant breeding)
•  Bioinformatics (Tx, NGS)
•  Geoecology
•  (ecology)
•  Biochemistry; Molecular Biology
•  Structural Biology
•  Electronic business
12
Let us brainstorm
Ad hoc collection from the audience, Apr. 16, 2015
•  What is data management?
-  Rapid access to data
-  Selective access to data; database queries
-  Combine data; manipulate data efficiently
-  Big data storage/analysis
-  Curating quality
-  Data visualization
-  Make data interpretable
13
Let us brainstorm
•  What is data management?
http://zonese7en.com/wp-content/uploads/2014/04/Data-Management.jpg, accessed Apr 10,
2015, 11 AM
14
Data Management
•  The official definition provided by DAMA (Data Management
Association) International, the professional organization for those
in the data management profession, is: “Data Resource
Management is the development and execution of
architectures, policies, practices and procedures that
properly manage the full data lifecycle needs of an enterprise.”
•  Further, the DAMA Data Management Body of Knowledge
(DAMA-DMBOK) states: “Data management is the development,
execution and supervision of plans, policies, programs and
practices that control, protect, deliver and enhance the value of
data and information assets.”
Wikipedia: http://en.wikipedia.org/wiki/Data_management accessed Mar 30, 2015, 10 PM
15
10 Data Management functions According to the DAMA
Data Management Body of
Knowledge (DMBOK)
16
Data governance
•  Strategy
•  Organization and roles
•  Policies and standards
•  Projects and services
•  Issues
•  Valuation
Source: DAMA DMBOK Guide, p. 10
“Planning, supervision and control over data management and use”
http://meship.com
17
Data quality management
•  Data cleansing
•  Data integrity
•  Data enrichment
•  Data quality
•  Data quality assurance
Source: DAMA DMBOK Guide, p. 10
“defining, monitoring and improving data quality”
http://www.arcplan.com/
18
Data architecture management
•  Data architecture
•  Data analysis
•  Data design (modeling)
Source: DAMA DMBOK Guide, p. 10
atasourceconsulting.com
19
Data development
•  Analysis
•  Data modeling
•  Database design
•  Implementation
Source: DAMA DMBOK Guide, p. 10
dataone.org
20
“Data development is the process of building a data set for a specific purpose. The
process includes identifying what data are required and how feasible it is to obtain
the data. Data development includes developing or adopting data standards in
consultation with stakeholders to ensure uniform data collection and reporting, and
obtaining authoritative approval for the data set.”, A guide to data development,
Australian Institute of Health and Welfare Canberra, 2007
Database management
•  Data maintenance
•  Data administration
•  Database management
system
Source: DAMA DMBOK Guide, p. 10
21
Data Security Management
•  Standards
•  Classification
•  Administration
•  Authentication/Authorization
•  Auditing
Source: DAMA DMBOK Guide, p. 10
http://www.techieapps.com/wp-content/uploads/2013/07/2-1024x768.jpg
22
Reference and Master Data management
•  External/internal codes
•  Customer Data
•  Product Data
•  Dimension management (why do
different dimensions (entities) relate to each other)
•  Taxonomy/Ontology
Source: DAMA DMBOK Guide, p. 10
Master Reference
23
Reference data
management
Master data
Data warehousing and business intelligence management
•  Architecture
•  Implementation
•  Training and Support
•  Monitoring and Tuning
Source: DAMA DMBOK Guide, p. 10
[Figure: data warehouse containing raw data, metadata, … and summary data]
25
Data warehousing and business intelligence management
[Figure labels: input, data warehouse (raw data, metadata, …, summary data), business intelligence, report]
26
Document, record and content management
•  Acquisition and storage
•  Backup and Recovery
•  Content Management
•  Retrieval
•  Retention
Source: DAMA DMBOK Guide, p. 10
27
Metadata management
Metadata is data about data
•  Architecture
•  Integration
•  Control
•  Delivery
Source: DAMA DMBOK Guide, p. 10
28
DAMA – DMBOK
•  A broad collection of all disciplines and subtopics that are
summarized under the umbrella of data management
•  These concern many business-related issues, but many concepts
apply very well to the field of bioscience
•  We will come back to various aspects of the DAMA DMBOK during
the course
29
Data management needs in science and research
•  Survey at the University of Oregon, USA (Brian Westra. "Data Services for the Sciences: A
Needs Assessment". July 2010, Ariadne Issue 64 http://www.ariadne.ac.uk/issue64/westra/)
•  Different scientific disciplines
30
Data management in science and research
Brian Westra. "Data Services for the Sciences: A Needs Assessment". July 2010, Ariadne Issue 64
http://www.ariadne.ac.uk/issue64/westra/, accessed Apr. 10, 2015, 11 AM
[Figure: survey results for categories 1 to 11]
1 Data storage and backup
2 Making scientific data findable by others
3 Connecting data acquisition to data storage
4 Allowing or controlling access to scientific data by others
5 Documenting and tracking updates
6 Data analysis and manipulation
7 Finding and accessing related data from others
8 Connecting data storage to data analysis
9 Linking this data to publications or other assets
10 Ensuring data is secure and trustworthy
11 Others
31
Let us brainstorm
•  What is Quantitative Biology?
Ad hoc collection from the audience, Apr. 16, 2015
-  Not only yes/no answers, but assigning amounts to entities
-  Huge amounts of data
-  Quantitative methods to study biology
-  System-wide analysis; specific pathways
-  Make results human-readable and accessible
32
Quantitative Biology
•  The term quantitative biology has been coined by Hastings et al.,
2005.
•  High-throughput methods have led to a paradigm shift in
biomedical research
•  Traditionally, the focus was on one-molecule-at-a-time for most
bio(medical) research projects
•  Now, data on whole genomes, exomes, epigenomes,
transcriptomes, proteomes and metabolomes can be generated at
low cost.
•  The term quantitative biology is used to describe this paradigm
shift. Improvements in this area have been driven mainly by two
technological developments:
Hastings et al., 2005, Quantitative Bioscience for the 21st century. BioScience. Vol 55 No. 6
33
Technological innovations
•  State-of-the-art mass spectrometers coupled to high-
performance liquid chromatography through soft ionization
techniques (HPLC-ESI-MS) have quickly changed the way we do
proteomics, metabolomics, and lipidomics.
•  Next-generation sequencing has similarly changed the way we
look at genomes, epigenomes, transcriptomes, and metagenomes.
Due to advances in chemistry and imaging, sequencing reactions
have been parallelized on a very large scale. The
comprehensiveness of the data produced by high-throughput
methods makes them particularly interesting as general-purpose
analytical and diagnostic techniques.
34
Technological innovations
•  Imaging technologies can now produce high-resolution pictures
of fine-grained cellular details at a very high speed
•  Finally methods from bioinformatics and computational biology
have matured to rapidly analyze the huge raw data sets that are
generated by these high throughput technologies
35
Contents from Bioinformatics 2 (high-throughput
technologies)
•  Most of the high throughput technologies have been introduced
during the Bioinformatics II lecture
•  There are specialized lectures on “Transcriptomics” and on
“Computational Proteomics and Metabolomics”
•  We will give a short Recap on the Bioinformatics II contents that
are relevant for this lecture
•  More advanced topics on data generation methods will be
introduced in lecture 3 by Dr. Stefan Czemmel (focus on next
generation sequencing)
36
Origin of the “Central Dogma of Molecular Biology” (Francis Crick, 1956)
The central dogma of molecular biology
•  First articulation by Francis Crick in 1956
•  Published in Nature in 1970
37
The central dogma – classical view
•  In general, the classic view reflects how biology is (biological data
are) organized
•  Genomics, however, enabled a more complex view
Cox Systems Biology Lab | Research, University of Toronto, Canada
38
Reminder (Bioinformatics 2)
Oltvai-Barabasi, Science, 2002
39
Recap Bioinformatics II: Systems biology
•  Quantitative data on various levels of biological complexity form
the foundation of systems biology
•  Mathematical modeling has been based on gene expression
•  Recent important technological improvements allow the analysis of protein
and metabolite profiles to a great depth
•  Important layers for understanding biology
•  New experimental techniques pose tremendous challenges for
computational analysis
40
Recap Bioinformatics II: Aims of Systems Biology
•  Describe large-scale organization
•  Quantitative modeling
•  Describe cell as system of networks
-  Fundamental research: time-resolved quantitative
understanding of living systems
-  Medicine: enable personalized medicine (e.g., improve
treatment strategies for cancer patients)
-  Biotechnology: improve production, degradation, construction
of synthetic organisms, etc.
41
Exp. Methods – Transcriptomics
•  Extract and amplify RNA
•  Hybridization on microarray
•  Identify and quantify by fluorescence signal
•  Sequences can be mapped back to genome
Lindsay, Nature Rev. Drug Discovery, 2003, 2, 803
42
Microarray Data Analysis
•  Key problems in microarray
data analysis are
-  Data normalization
-  Clustering
-  Dimension reduction
-  Diagnostics/classification
-  Network inference
-  Visualization of results
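As a small illustration of the first problem in the list, data normalization, the following sketch log2-transforms raw intensities and centers each array on its median. The function name `median_center_log2` and the toy intensities are assumptions made for illustration; real workflows typically use more elaborate schemes such as quantile normalization.

```python
import math
from statistics import median


def median_center_log2(arrays):
    """Log2-transform each array and subtract its median.

    `arrays` maps an array (sample) name to a list of positive raw
    intensities. After centering, every array has median 0, which makes
    intensities comparable across arrays.
    """
    normalized = {}
    for name, intensities in arrays.items():
        logged = [math.log2(value) for value in intensities]
        center = median(logged)
        normalized[name] = [value - center for value in logged]
    return normalized


# Hypothetical toy data: array_B is uniformly twice as bright as array_A.
raw = {
    "array_A": [100.0, 200.0, 400.0],
    "array_B": [200.0, 400.0, 800.0],
}
normalized = median_center_log2(raw)
print(normalized)
```

After centering, the two toy arrays agree up to floating-point rounding, because a uniform brightness difference is only a constant offset on the log scale.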
Janko Dietzsch, Nils Gehlenborg and Kay Nieselt. Mayday-a microarray
data analysis workbench. Bioinformatics 2006 22(8):1010-1012
43
Genome sequencing
February 15, 2001 February 16, 2001
44
Genome sequencing
•  2001: initial publication
•  2003: 2nd draft “Human Genome”
•  > 13 years of work and > 3*10^9 $
•  2010: 8 days, 1*10^4 $
•  Today: approximately 5.5 days and < 1*10^4 $
•  Future: within 3 years, the biotech company Pacific Biosciences
expects a similar amount of data in < 15 min for < 1*10^3 $
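The scale of this drop can be made concrete with a short calculation that uses only the figures on this slide. The 365-day year used to convert 13 years into days is a simplifying assumption, and both results are lower bounds because the slide gives ">" values for the first genome.

```python
# Figures taken from the slide; the first genome is a lower bound
# (> 13 years of work, > 3*10^9 $).
first_genome_cost_usd = 3e9
first_genome_days = 13 * 365  # assumption: 365-day years

cost_2010_usd = 1e4
days_2010 = 8

cost_fold_reduction = first_genome_cost_usd / cost_2010_usd
time_fold_reduction = first_genome_days / days_2010

print(f"Cost reduced at least {cost_fold_reduction:,.0f}-fold")
print(f"Time reduced at least {time_fold_reduction:,.0f}-fold")
```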
45
Status genomics/transcriptomics
•  Dramatic drop in cost for genome sequencing
•  Number of sequenced genomes grows continuously
•  The genome is a rather static snapshot of a living system
•  Biological adaptation is rather slow; long-term information storage
•  Proteins and their reaction products (metabolites) are much closer
to reality
•  Genome and transcriptome databases are essential bases for
proteomics and metabolomics research
46
Genomics vs. Proteomics
Genomics | Proteomics
Genomes rather static | Proteomes are dynamic (age, tissue, breakfast, …)
~ 20 k genes | up to 1000 k proteins
established technology (capillary sequencer) | emerging technologies (MS, HPLC/MS, protein chips)
47
Proteomics
http://www.iamashcash.com/wp-content/uploads/2011/03/caterpillar-to-butterfly1.jpg, accessed: 14/10/2013 6 PM
Genome remains the same
Proteome changes
48
Main fields of proteomics
49
Applications of proteomics
50
Shotgun Proteomics
51
Next generation sequencing
In 2008
1st generation
Applied Biosystems 3730xl: 0.08 Mb / run, 1 Mb / 24 h
2nd generation
Illumina / Solexa Genetic Analyzer: 2000 Mb / run (96h)
Roche / 454 Genome Sequencer FLX: 400 Mb / run (8h)
Applied Biosystems SOLiD: 3000 Mb / run (120h)
300 : 1 (2008)
>3000 : 1 (2010)
Slide: Prof. Peter Bauer
52
“3rd” generation sequencing
CompleteGenomics DNB sequencing: 18x 210Gb / run
>37,000 : 1 (2010)
Drmanac et al., Science 327 (5961): 78-81 (2010)
Slide: Prof. Peter Bauer
53
High resolution imaging
“Imaging in biology may refer to >15 different technologies”
Prominent and data-intense examples include:
•  Optical (bioluminescence and fluorescence imaging)
•  Magnetic resonance imaging
•  X-ray computed tomography
•  Positron emission tomography
http://en.wikipedia.org/wiki/Biological_imaging, accessed Apr. 13, 4 PM
54
Imaging workflow
Eliceiri et al., Nature Methods 9, 697–710 (2012)
55
Database systems
•  Relational databases
-  Example MySQL
•  NoSQL databases
-  Example MongoDB
•  How to query databases
•  Entity relationship models
•  Repositories (e.g. Pride, PeptideAtlas)
-  Annotations
-  Sequences
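To make the contrast between the two database families concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for MySQL and a plain list of dictionaries as a stand-in for a MongoDB collection; both substitutions, as well as the table, column and field names, are assumptions chosen so that the example runs without a database server.

```python
import sqlite3

# Relational model: a fixed schema, queried with SQL.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE samples (id INTEGER PRIMARY KEY, organism TEXT, tissue TEXT)"
)
connection.executemany(
    "INSERT INTO samples (organism, tissue) VALUES (?, ?)",
    [("Homo sapiens", "liver"), ("Homo sapiens", "blood"), ("Mus musculus", "liver")],
)
liver_rows = connection.execute(
    "SELECT organism FROM samples WHERE tissue = ? ORDER BY organism", ("liver",)
).fetchall()
connection.close()

# Document model: schema-free records; fields may differ between documents.
documents = [
    {"organism": "Homo sapiens", "tissue": "liver", "instrument": "HPLC-ESI-MS"},
    {"organism": "Mus musculus", "tissue": "liver"},
    {"organism": "Homo sapiens", "tissue": "blood", "replicate": 2},
]
liver_documents = [doc for doc in documents if doc.get("tissue") == "liver"]

print(liver_rows)
print(len(liver_documents))
```

The relational query relies on every row having a `tissue` column, whereas the document filter has to tolerate missing fields, which is why it uses `doc.get`.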
56
Many database concepts
http://dataconomy.com/wp-content/uploads/2014/07/fig2large.jpg
57
Databases/Repositories in Genomics/Proteomics
http://www.ebi.ac.uk/ena
http://www.peptideatlas.org
http://www.ebi.ac.uk/pride/archive/
58
Large-scale study data – 1000 Genomes
•  Sample lists and sequencing progress
•  Variant Calls
•  Alignments
•  Raw sequence files
http://www.1000genomes.org/data
59
Large-scale study data – The cancer genome atlas (TCGA)
•  TCGA aims to help to diagnose, treat and prevent cancer
•  explore the entire spectrum of genomic changes involved in more
than 20 types of human cancer.
•  Approx. 2 PB of genomic raw data
http://cancergenome.nih.gov
60
Laboratory information management systems/
Electronic Lab Books
•  How to track all information that is generated in the laboratory
•  Automated annotation of all experimental parameters is essential
for reproducible science
•  Currently, most experiments are recorded manually in lab
notebooks
•  Data security (intellectual property versus open data)
61
Experimental design
•  Biological experiments are very complex
•  Statistical significance requires a high number of biological
replicates
•  Often many different conditions and time points need to be
considered
•  One study can involve many different experiments (multi omics
studies involve different omics layers, e.g. genomics +
transcriptomics + proteomics)
•  All experiments come with different meta data requirements
•  For various reasons the experimental design is not always
balanced (e.g. 5 samples are available for group A and only 3 for
group B)
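Whether a design is balanced can be checked mechanically from the sample annotation. The sketch below counts samples per group and reports whether all groups have the same size; the helper names `group_sizes` and `is_balanced` and the sample identifiers are hypothetical, and the group sizes mirror the 5-versus-3 example above.

```python
from collections import Counter


def group_sizes(sample_to_group):
    """Count how many samples are assigned to each experimental group."""
    return Counter(sample_to_group.values())


def is_balanced(sample_to_group):
    """A design is balanced when every group has the same number of samples."""
    return len(set(group_sizes(sample_to_group).values())) <= 1


# Hypothetical annotation: 5 samples in group A, 3 samples in group B.
design = {f"A{i}": "A" for i in range(1, 6)}
design.update({f"B{i}": "B" for i in range(1, 4)})

sizes = group_sizes(design)
balanced = is_balanced(design)
print(dict(sizes), "balanced" if balanced else "unbalanced")
```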
Friedrich, A., et al. Biomed Research International, April 2015 – in press.
Nahnsen, S., Drug Target, May 2015 – in press.
62
Experimental design
Friedrich, A., et al. Biomed Research International, April 2015 – in press.
Nahnsen, S., Drug Target, May 2015 – in press.
63
Data analysis workflows
•  Chain different (heterogeneous) tools
•  Parameter handling
•  Execution in high performance computing environment made easy
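The idea of chaining tools with explicit parameter handling can be sketched in a few lines. This is a toy illustration, not how workflow systems such as KNIME or Galaxy are implemented: each "tool" is an ordinary Python function, and the step names, parameters and the `run_workflow` helper are all hypothetical.

```python
def filter_low_values(values, threshold):
    """Tool 1: drop measurements below a threshold."""
    return [value for value in values if value >= threshold]


def scale(values, factor):
    """Tool 2: multiply every measurement by a constant factor."""
    return [value * factor for value in values]


def run_workflow(data, steps):
    """Run the steps in order, feeding each tool's output into the next.

    `steps` is a list of (tool, parameters) pairs, so the parameters are
    kept separate from the tools and can be stored or changed per run.
    """
    for tool, parameters in steps:
        data = tool(data, **parameters)
    return data


workflow = [
    (filter_low_values, {"threshold": 2}),
    (scale, {"factor": 10}),
]
result = run_workflow([1, 2, 3, 4], workflow)
print(result)
```

Keeping the parameters in a data structure next to the tool list means that a complete run can be recorded and repeated, which is the part of workflow systems most relevant to data management.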
64
Standardization in bioinformatics
•  Many world-wide bioinformatics initiatives need to rely on open
standards
•  Development of standards has to be a community effort
•  Standardized data formats are important to guarantee
-  Sustainability
-  Independence of instrument vendors
-  Independence of analysis software
-  Exchangeability of raw data
•  Standard formats increase the amount of data by a factor of x (x =
2-4)
•  Many people refrain from using open standards
65
http://en.wikipedia.org/wiki/Big_Data, accessed Apr 24, 2014
Big data is the term for a collection of data sets so large and complex
that it becomes difficult to process using on-hand database management
tools or traditional data processing applications. The challenges include
capture, curation, storage, search, sharing, transfer, analysis and
visualization. The trend to larger data sets is due to the additional
information derivable from analysis of a single large set of related data,
as compared to separate smaller sets with the same total amount of data,
allowing correlations to be found ……
Big data
66
Big data examples
•  European Council for Nuclear Research (CERN), Geneva,
Switzerland
25 petabytes/year at the LHC (Large Hadron Collider) (~6.2 million
DVDs)
CERN
LHC data
ep.ph.bham.ac.uk, 2014
Big data examples
•  Google processes 9.1 exabytes/year (300 million DVDs)
GOOGLE
data
Mayer-Schönberger, 2013, ititch.com, 2014
68
Biology and Big data?
•  Classical: observation of nature
and its phenomena
•  1950s: breakthrough in molecular biology
In 1956, Francis Crick formulates the central dogma of molecular biology:
DNA: carrier of genetic information
RNA: expression of particular genes
Proteins: perform the necessary functions in the cell
69
Big data
Vivien Marx, Biology: The big challenges of big data, Nature. 2013, doi:10.1038/498255a
70
Integrated data management in biology/biomedicine
71
http://media.americanlaboratory.com/m/20/Article/35231-fig1.jpg
QBiC infrastructure
72
NGS Lab
Lab
Storage
Data movers
•  Automatically moves large to huge file-based data to a remote
(central) storage
•  Uses rsync routine; easy configuration using config file
•  Data mover authentication: public/private key ssh authentication
•  Moves data to openBIS dropboxes (individual boxes and users for
each of the five member labs)
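As a rough sketch of the mechanism described above (and not the actual Data Mover implementation), the following builds an rsync command that transfers a directory over ssh using a private key. The configuration keys, host name, paths and the `build_rsync_command` helper are hypothetical; only the standard rsync options `-a`, `--partial` and `-e` are used. The command is constructed but deliberately not executed.

```python
import shlex

# Hypothetical configuration, standing in for the mover's config file.
config = {
    "source_dir": "/data/ngs_lab/outgoing/",
    "remote_user": "mover",
    "remote_host": "storage.example.org",
    "remote_dir": "/dropboxes/ngs_lab/",
    "ssh_key": "/home/mover/.ssh/id_rsa",
}


def build_rsync_command(cfg):
    """Build an rsync-over-ssh command line from a configuration dict."""
    ssh_command = f"ssh -i {shlex.quote(cfg['ssh_key'])}"
    destination = f"{cfg['remote_user']}@{cfg['remote_host']}:{cfg['remote_dir']}"
    return [
        "rsync",
        "-a",         # archive mode: recurse and preserve file attributes
        "--partial",  # keep partially transferred files so large transfers can resume
        "-e", ssh_command,  # use ssh with the given private key as transport
        cfg["source_dir"],
        destination,
    ]


command = build_rsync_command(config)
print(shlex.join(command))
```

Building the command as a list rather than a single string avoids shell quoting problems if it is later passed to `subprocess.run`.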
Data
Mover
DataMover:
•  Developed at ETH Zurich as part of OpenBIS
•  http://www.cisd.ethz.ch/software/Data_Mover
73
openBIS (meta) data store
•  Open, distributed system for managing biological
information
•  Captures different experiment types (OMICS,
imaging, screening,...)
•  Tracking, annotating and sharing of experiments,
samples and datasets for distributed research
•  Different servers for meta data and bulk raw
data
•  Underlying PostgreSQL database
•  ETL routines for extraction of meta data and
linking
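To illustrate what an ETL routine for metadata extraction might do, the sketch below parses a file name into metadata fields and links the record to the location of the bulk data. This is not the openBIS API: the naming scheme `<project>_<sample>_<type>.<extension>`, the field names and the `extract_metadata` function are assumptions invented for this example.

```python
from pathlib import PurePosixPath


def extract_metadata(path):
    """Derive a metadata record from a file path.

    Assumes the hypothetical naming scheme
    <project>_<sample>_<type>.<extension>, e.g. PRJ1_S001_proteomics.mzML.
    Raises ValueError for file names that do not follow the scheme.
    """
    file_path = PurePosixPath(path)
    parts = file_path.stem.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected file name: {file_path.name}")
    project, sample, experiment_type = parts
    return {
        "project": project,
        "sample": sample,
        "experiment_type": experiment_type,
        "file_format": file_path.suffix.lstrip("."),
        "data_location": str(file_path),  # link from metadata to the bulk data
    }


record = extract_metadata("/dropbox/incoming/PRJ1_S001_proteomics.mzML")
print(record)
```

The record holds only small, searchable fields plus a pointer to the raw file, which reflects the separation between metadata and bulk raw data named in the list above.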
74
Data organization
•  http://www.cisd.ethz.ch/software/openBIS
75
Data organization
•  http://www.cisd.ethz.ch/software/openBIS
76
Applications
•  Personalized medicine: Individualized vaccination in cancer
•  Large-scale clinical studies: example Hepatocellular carcinoma
77
Contact:
Quantitative Biology Center (QBiC)
Auf der Morgenstelle 10
72076 Tübingen · Germany
dmqb-ss15@informatik.uni-tuebingen.de
Thanks for listening – See you next week

Thesis in IT Online Grade Encoding and Inquiry System via SMS Technology
 

Data Management for Quantitative Biology - Lecture 1, Apr 16, 2015

  • 1. Dr. Sven Nahnsen, Quantitative Biology Center (QBiC) Data Management for Quantitative Biology Lecture 1: Introduction and overview
  • 2. Overview • Administrative stuff (credits, requirements) • Motivation/quick review of relevant contents (Bioinformatics I and II) • Introduction to this lecture series • Semester overview 2
  • 3. Course requirements To pass this course you must: •  regularly and actively participate in the weekly problem sessions, •  pass the final exam, assignments and projects •  You have to work on assignments alone •  You will work in small groups for the problem-orientated research project 3
  • 4. Course credits and grading •  Credits -  MSc Bioinfo: 4 LP, module “Wahlpflichtbereich Bioinformatik” -  MSc Info: 4 LP, area “Wahlpflichtbereich Informatik” •  Grade -  30% assignments -  20% project -  50% finals •  Finals: oral exam (30 minutes) covering the contents of the whole lecture, the assignments and the project •  Finals will be scheduled at the end of the semester (Thu, 30/07/2015) 4
  • 5. Recommended literature •  We will point to relevant papers during the course of the lecture •  Important overview papers: §  Hastings et al., 2005, Quantitative Bioscience for the 21st century. BioScience. Vol 55 No. 6 §  Cohen JE (2004) Mathematics Is Biology's Next Microscope, Only Better; Biology Is Mathematics' Next Physics, Only Better. PLoS Biol 2(12): e439. •  Books §  Free E-Book: Data management in Bioinformatics (http://en.wikibooks.org/wiki/Data_Management_in_Bioinformatics) §  Lacroix, Z.; Critchlow, T. (eds): Bioinformatics: Managing Scientific Data. Morgan Kaufmann Publishers, San Francisco 2003 §  Michael E. Wall, Quantitative Biology: From Molecular to Cellular Systems. 2012. Chapman & Hall §  Pierre Bonnet. Enterprise Data Governance: Reference and Master Data Management Semantic Modeling. 2013. Wiley •  Web resources §  http://www.ariadne.ac.uk: Ariadne, Web Magazine for Information Professionals §  http://www.dama.org: THE GLOBAL DATA MANAGEMENT COMMUNITY §  H.D. Ehrich: http://www.ifis.cs.tu-bs.de/node/2855 5
  • 6. Recommended Software •  These software tools, frameworks and web servers will be used during the problem sessions: http://www.cisd.ethz.ch/software/openBIS https://usegalaxy.org https://vaadin.com/home https://www.knime.org/ 6
  • 7. Contact and organization •  Questions concerning the lecture/assignments §  dmqb-ss15@informatik.uni-tuebingen.de •  Website §  abi.inf.uni-tuebingen.de/Teaching/ws-2013-14/CPM •  Christopher Mohr (Sand 14, C322) , Andreas Friedrich (Sand 14, C 304) •  Dr. Sven Nahnsen (Quantitative Biology Center, Auf der Morgenstelle 10, C2P43, please send e-mail first) •  Course material will be available on the website (see above), through social media channels and (if wished) as a hard copy during the lecture facebook.com/qbic.tuebingen twitter.com/qbic_tue 7
  • 8. Who am I •  Most information about me and our work can be found here: www.qbic.uni-tuebingen.de 8
  • 9. Contents of this lecture
      Date | Lecturer | Lecture 8-10 AM
      Thursday 16 April 15 | Nahnsen | Introduction and overview
      Thursday 23 April 15 | Nahnsen | Biological Data Management
      Thursday 30 April 15 | Czemmel | Data sources ("Next-generation" technologies)
      Dr. Stefan Czemmel 9
  • 10. Contents of this lecture
      Date | Lecturer | Lecture 8-10 AM
      Thursday 7 May 15 | Codrea | Database systems (mySQL, noSQL, etc.)
      Thursday 14 May 15 | - | Ascension Day (Himmelfahrt)
      Thursday 21 May 15 | Czemmel | LIMS and E-Lab books
      Thursday 28 May 15 | Kenar | Experimental Design
      Dr. Marius Codrea, Erhan Kenar 10
  • 11. Contents of this lecture
      Date | Lecturer | Lecture 8-10 AM
      Thursday 4 June 15 | - | Corpus Christi Day (Fronleichnam)
      Thursday 11 June 15 | Nahnsen | Data analysis workflows (I)
      Thursday 18 June 15 | Nahnsen | Data analysis workflows (II)
      Thursday 25 June 15 | Nahnsen | Standardization
      Thursday 2 July 15 | Nahnsen | Big Data
      Thursday 9 July 15 | Nahnsen | Integrated data management (OpenBIS, OpenBEB)
      Thursday 16 July 15 | Nahnsen | Applications
      Thursday 23 July 15 | Nahnsen | Exam preparation
      Thursday 30 July 15 | Nahnsen, Mohr, Friedrich | EXAMS 11
  • 12. What is your background? Ad hoc collection from the audience, Apr. 16, 2015 •  Computer Science •  Bioinformatics (immunoinformatics; user front-end; integration, visualization) •  Biology •  Drug design •  Agricultural biology (plant breeding) •  Bioinformatics (Tx, NGS) •  Geoecology •  (ecology) •  Biochemistry; Molecular Biology •  Structural Biology •  Electronic business 12
  • 13. Let us brainstorm Ad hoc collection from the audience, Apr. 16, 2015 •  What is data management? -  Rapid access to data -  Selective access to data; database queries -  Combine data; manipulate data efficiently -  Big data storage/analysis -  Curating quality -  Data visualization -  Make data interpretable 13
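The points on selective access and database queries can be illustrated with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database; the table name `samples`, its columns and the example rows are hypothetical and chosen only for illustration, not taken from the lecture.

```python
import sqlite3


def query_samples(min_conc):
    """Return (sample_id, tissue) rows whose concentration is at least min_conc."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute(
        "CREATE TABLE samples (sample_id TEXT PRIMARY KEY, tissue TEXT, conc REAL)"
    )
    # Hypothetical example rows.
    conn.executemany(
        "INSERT INTO samples VALUES (?, ?, ?)",
        [("S1", "liver", 0.8), ("S2", "brain", 2.5), ("S3", "liver", 3.1)],
    )
    # Selective access: only rows matching the condition are returned.
    rows = conn.execute(
        "SELECT sample_id, tissue FROM samples WHERE conc >= ? ORDER BY sample_id",
        (min_conc,),
    ).fetchall()
    conn.close()
    return rows


print(query_samples(2.0))
```

The parameterized query (`?` placeholders) keeps the filter value separate from the SQL text, which is the usual way to pass user input to a database.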
  • 14. Let us brainstorm •  What is data management? http://zonese7en.com/wp-content/uploads/2014/04/Data-Management.jpg, accessed Apr 10, 2015, 11 AM 14
  • 15. Data Management •  The official definition provided by DAMA (Data Management Association) International, the professional organization for those in the data management profession, is: "Data Resource Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise." •  Further, the DAMA Data Management Body of Knowledge (DAMA-DMBOK) states: "Data management is the development, execution and supervision of plans, policies, programs and practices that control, protect, deliver and enhance the value of data and information assets." Wikipedia: http://en.wikipedia.org/wiki/Data_management accessed Mar 30, 2015, 10 PM 15
  • 16. 10 Data Management functions According to the DAMA Data Management Body of Knowledge (DMBOK) 16
  • 17. Data governance •  Strategy •  Organization and roles •  Policies and standards •  Projects and services •  Issues •  Valuation Source: DAMA DMBOK Guide, p. 10 “Planning, supervision and control over data management and use” http://meship.com 17
  • 18. Data quality management •  Data cleansing •  Data integrity •  Data enrichment •  Data quality •  Data quality assurance Source: DAMA DMBOK Guide, p. 10 “defining, monitoring and improving data quality” http://www.arcplan.com/ 18
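As a rough illustration of data cleansing and integrity checking, the sketch below splits a list of records into accepted and rejected ones. The field names `sample_id` and `value` and the three rules (ID present, ID unique, value numeric and non-negative) are assumptions made for this example; real quality rules depend on the data set.

```python
def clean_records(records):
    """Split records into (clean, rejected) using simple quality rules."""
    clean, rejected = [], []
    seen_ids = set()
    for rec in records:
        # Cleansing step: normalize the identifier by trimming whitespace.
        sample_id = str(rec.get("sample_id", "")).strip()
        value = rec.get("value")
        # Integrity rule 1: the ID must be present and must not repeat.
        if not sample_id or sample_id in seen_ids:
            rejected.append(rec)
            continue
        # Integrity rule 2: the value must be a non-negative number.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or value < 0:
            rejected.append(rec)
            continue
        seen_ids.add(sample_id)
        clean.append({"sample_id": sample_id, "value": float(value)})
    return clean, rejected


records = [
    {"sample_id": " S1 ", "value": 1.5},
    {"sample_id": "S1", "value": 2.0},   # duplicate ID
    {"sample_id": "S2", "value": -1},    # negative value
    {"sample_id": "S3", "value": 4},
]
clean, rejected = clean_records(records)
print(len(clean), len(rejected))
```

Keeping the rejected records instead of silently dropping them makes it possible to monitor data quality over time.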
  • 19. Data architecture management •  Data architecture •  Data analysis •  Data design (modeling) Source: DAMA DMBOK Guide, p. 10 atasourceconsulting.com 19
  • 20. Data development •  Analysis •  Data modeling •  Database design •  Implementation Source: DAMA DMBOK Guide, p. 10 dataone.org “Data development is the process of building a data set for a specific purpose. The process includes identifying what data are required and how feasible it is to obtain the data. Data development includes developing or adopting data standards in consultation with stakeholders to ensure uniform data collection and reporting, and obtaining authoritative approval for the data set.” (A guide to data development, Australian Institute of Health and Welfare, Canberra, 2007) 20
  • 21. Database management •  Data maintenance •  Data administration •  Database management system Source: DAMA DMBOK Guide, p. 10 21
  • 22. Data Security Management •  Standards •  Classification •  Administration •  Authentication/Authorization •  Auditing Source: DAMA DMBOK Guide, p. 10 http://www.techieapps.com/wp-content/uploads/2013/07/2-1024x768.jpg 22
  • 23. Reference and Master Data management •  External/internal codes •  Customer Data •  Product Data •  Dimension management (why do different dimensions (entities) relate to each other) •  Taxonomy/Ontology Source: DAMA DMBOK Guide, p. 10 Master Reference 23 Reference data management
  • 25. Data warehousing and business intelligence management •  Architecture •  Implementation •  Training and Support •  Monitoring and Tuning Source: DAMA DMBOK Guide, p. 10 Raw data Metadata … Summary data Data warehouse 25
  • 26. Data warehousing and business intelligence management Raw data Metadata … Summary data Data warehouse Input Report Business intelligence 26
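The step from raw data to summary data in the warehouse diagram can be sketched as a simple aggregation. The example below is a minimal illustration using only the Python standard library; the (group, value) input layout and the chosen summary statistics are assumptions, not part of the DMBOK description.

```python
from collections import defaultdict
from statistics import mean


def summarize(raw_rows):
    """Aggregate raw (group, value) measurements into per-group summary rows."""
    groups = defaultdict(list)
    for group, value in raw_rows:
        groups[group].append(value)
    # One summary row per group, in alphabetical order of the group name.
    return {
        name: {"n": len(values), "mean": mean(values), "max": max(values)}
        for name, values in sorted(groups.items())
    }


raw = [("liver", 1.0), ("liver", 3.0), ("brain", 2.0)]
print(summarize(raw))
```

A report (the business intelligence side of the diagram) would typically read such summary rows rather than the raw measurements.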
  • 27. Document, record and content management •  Acquisition and storage •  Backup and Recovery •  Content Management •  Retrieval •  Retention Source: DAMA DMBOK Guide, p. 10 27
  • 28. Metadata management Metadata is data about data •  Architecture •  Integration •  Control •  Delivery Source: DAMA DMBOK Guide, p. 10 28
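To make "data about data" concrete, the following sketch builds a small metadata record for a file: its name, size, a SHA-256 checksum and a free-text description. The record layout and the function name `describe_file` are hypothetical; established metadata standards define their own fields.

```python
import hashlib
import os
import tempfile


def describe_file(path, description):
    """Build a metadata record (data about data) for one file."""
    with open(path, "rb") as handle:
        content = handle.read()
    return {
        "file_name": os.path.basename(path),
        "size_bytes": len(content),
        # The checksum allows later verification that the file is unchanged.
        "sha256": hashlib.sha256(content).hexdigest(),
        "description": description,
    }


# Usage with a temporary toy file.
with tempfile.NamedTemporaryFile(suffix=".fasta", delete=False) as tmp:
    tmp.write(b">seq1\nACGT\n")
record = describe_file(tmp.name, "toy FASTA file")
os.remove(tmp.name)
print(record["file_name"], record["size_bytes"])
```

The file is read into memory in one piece, which is fine for a sketch but would need chunked reading for large raw data files.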
  • 29. DAMA – DMBOK •  A broad collection of all disciplines and subtopics that are summarized under the umbrella of data management •  These concern many business-related issues, but many concepts are very well applicable to the field of bioscience •  We will come back to various aspects of the DAMA DMBOK during the course 29
  • 30. Data management needs in science and research •  Survey at the University of Oregon, USA (Brian Westra. "Data Services for the Sciences: A Needs Assessment". July 2010, Ariadne Issue 64 http://www.ariadne.ac.uk/issue64/westra/) •  Different scientific disciplines 30
  • 31. Data management in science and research Brian Westra. "Data Services for the Sciences: A Needs Assessment". July 2010, Ariadne Issue 64 http://www.ariadne.ac.uk/issue64/westra/, accessed Apr. 10, 2015, 11 AM
      Chart categories (1-11):
      1 Data storage and backup
      2 Making scientific data findable by others
      3 Connecting data acquisition to data storage
      4 Allowing or controlling access to scientific data by others
      5 Documenting and tracking updates
      6 Data analysis and manipulation
      7 Finding and accessing related data from others
      8 Connecting data storage to data analysis
      9 Linking this data to publications or other assets
      10 Ensuring data is secure and trustworthy
      11 Others 31
  • 32. Let us brainstorm •  What is Quantitative Biology? Ad hoc collection from the audience, Apr. 16, 2015 -  Not only yes/no, but put amounts to entities -  Huge amount of data -  Quantitative methods to study biology -  System-wide analysis; specific pathways -  Make results human readable and accessible 32
  • 33. Quantitative Biology •  The term quantitative biology has been coined by Hastings et al., 2005. •  High-throughput methods have led to a paradigm shift in biomedical research •  Traditionally, the focus was on one-molecule-at-a-time for most bio(medical) research projects •  Now, data on whole genomes, exomes, epigenomes, transcriptomes, proteomes and metabolomes can be generated at low cost. •  The term quantitative biology is used to describe this paradigm shift. Improvements in this area have been driven mainly by two technological developments: Hastings et al., 2005, Quantitative Bioscience for the 21st century. BioScience. Vol 55 No. 6 33
  • 34. Technological innovations •  State-of-the-art mass spectrometers coupled to high-performance liquid chromatography through soft ionization techniques (HPLC-ESI-MS) have quickly changed the way we do proteomics, metabolomics, and lipidomics. •  Next-generation sequencing has similarly changed the way we look at genomes, epigenomes, transcriptomes, and metagenomes. Due to advances in chemistry and imaging, sequencing reactions have been parallelized on a very large scale. The comprehensiveness of the data produced by high-throughput methods makes them particularly interesting as general-purpose analytical and diagnostic techniques. 34
  • 35. Technological innovations •  Imaging technologies can now produce high-resolution pictures of fine-grained cellular details at a very high speed •  Finally, methods from bioinformatics and computational biology have matured to rapidly analyze the huge raw data sets that are generated by these high-throughput technologies 35
  • 36. Contents from Bioinformatics II (high-throughput technologies) •  Most of the high-throughput technologies have been introduced during the Bioinformatics II lecture •  There are specialized lectures on “Transcriptomics” and on “Computational Proteomics and Metabolomics” •  We will give a short recap of the Bioinformatics II contents that are relevant for this lecture •  More advanced topics on data generation methods will be introduced in lecture 3 by Dr. Stefan Czemmel (focus on next generation sequencing) 36
  • 37. Origin of the “Central Dogma of Molecular Biology” (Francis Crick, 1956) The central dogma of molecular biology •  First articulation by Francis Crick in 1956 •  Published in Nature in 1970 37
  • 38. The central dogma – classical view •  In general, the classic view reflects how biology is (biological data are) organized •  Genomics, however, enabled a more complex view Cox Systems Biology Lab | Research, University of Toronto, Canada 38
  • 40. Recap Bioinformatics II: Systems biology •  Quantitative data on various levels of biological complexity form the foundation of systems biology •  Mathematical modeling has been based on gene expression •  Recent important technological improvements allow the analysis of protein and metabolite profiles to a great depth •  Important layers for understanding biology •  New experimental techniques pose tremendous challenges for computational analysis 40
  • 41. Recap Bioinformatics II: Aims of Systems Biology •  Describe large-scale organization •  Quantitative modeling •  Describe cell as system of networks -  Fundamental research: time-resolved quantitative understanding of living systems -  Medicine: enable personalized medicine (e.g., improve treatment strategies for cancer patients) -  Biotechnology: improve production, degradation, construction of synthetic organisms, etc. 41
  • 42. Exp. Methods – Transcriptomics •  Extract and amplify RNA •  Hybridization on microarray •  Identify and quantify by fluorescence signal •  Sequences can be mapped back to genome Lindsay, Nature Rev. Drug Discovery, 2003, 2, 803 42
  • 43. Microarray Data Analysis •  Key problems in microarray data analysis are -  Data normalization -  Clustering -  Dimension reduction -  Diagnostics/classification -  Network inference -  Visualization of results Janko Dietzsch, Nils Gehlenborg and Kay Nieselt. Mayday - a microarray data analysis workbench. Bioinformatics 2006 22(8):1010-1012 43
  • 44. Genome sequencing February 15, 2001 February 16, 2001 44
  • 45. Genome sequencing •  2001: initial publication •  2003: 2nd draft “Human Genome” •  > 13 years of work and > 3*10^9 $ •  2010: 8 days, 1*10^4 $ •  Today: approximately 5.5 days and < 1*10^4 $ •  Future: the biotech company Pacific Biosciences expects to produce a similar amount of data within 3 years in < 15 min for < 1*10^3 $ 45
  • 46. Status genomics/transcriptomics •  Dramatic drop in cost for genome sequencing •  Number of sequenced genomes grows continuously •  Genome is a very static snapshot of a living system •  Biological adaptation is rather slow; long-term information storage •  Proteins and their reaction products, metabolites, are much closer to reality •  Genome and transcriptome databases are an essential basis for proteomics and metabolomics research 46
  • 47. Genomics vs. Proteomics •  Genomics: genomes rather static; ~ 20 k genes; established technology (capillary sequencer) •  Proteomics: proteomes are dynamic (age, tissue, breakfast, …); up to 1000 k proteins; emerging technologies (MS, HPLC/MS, protein chips) 47
  • 49. Main fields of proteomics 49
  • 52. Next generation sequencing •  1st generation: Applied Biosystems 3730xl, 0.08 Mb / run, 1 Mb / 24 h •  2nd generation (in 2008): Illumina / Solexa Genome Analyzer, 2000 Mb / run (96 h); Roche / 454 Genome Sequencer FLX, 400 Mb / run (8 h); Applied Biosystems SOLiD, 3000 Mb / run (120 h) •  Throughput, 2nd vs. 1st generation: 300 : 1 (2008), >3000 : 1 (2010) Slide: Prof. Peter Bauer 52
  • 53. “3rd” generation sequencing •  Complete Genomics DNB sequencing: 18x 210 Gb / run •  >37,000 : 1 (2010) Drmanac et al., Science (2010) 5961: 78-81. Slide: Prof. Peter Bauer 53
  • 54. High resolution imaging “Imaging in biology may refer to >15 different technologies” Prominent and data-intense examples include: •  Optical (bioluminescence and fluorescence imaging) •  Magnetic resonance imaging •  X-ray computed tomography •  Positron emission tomography http://en.wikipedia.org/wiki/Biological_imaging, accessed Apr. 13, 4 PM 54
  • 55. Imaging workflow Eliceiri et al., Nature Methods 9, 697–710 (2012) 55
  • 56. Database systems •  Relational databases -  Example MySQL •  NoSQL databases -  Example MongoDB •  How to query databases •  Entity relationship models •  Repositories (e.g. Pride, PeptideAtlas) -  Annotations -  Sequences 56
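The relational and entity-relationship ideas listed on this slide can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a server such as MySQL; the table names, columns and example rows are hypothetical and not taken from the lecture.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a relational server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Two entities with a one-to-many relationship:
# one sample can have many measurements (hypothetical schema).
conn.executescript("""
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    organism  TEXT NOT NULL,
    tissue    TEXT NOT NULL
);
CREATE TABLE measurement (
    measurement_id INTEGER PRIMARY KEY,
    sample_id      INTEGER NOT NULL REFERENCES sample(sample_id),
    omics_layer    TEXT NOT NULL,
    file_name      TEXT NOT NULL
);
""")

conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [(1, "Homo sapiens", "liver"), (2, "Homo sapiens", "blood")],
)
conn.executemany(
    "INSERT INTO measurement (sample_id, omics_layer, file_name) VALUES (?, ?, ?)",
    [(1, "genomics", "s1.fastq"), (1, "proteomics", "s1.mzML"), (2, "genomics", "s2.fastq")],
)


def measurements_per_sample(connection):
    """Join both tables and count the measurements recorded for each sample."""
    return connection.execute("""
        SELECT s.sample_id, s.tissue, COUNT(m.measurement_id)
        FROM sample AS s
        LEFT JOIN measurement AS m ON m.sample_id = s.sample_id
        GROUP BY s.sample_id, s.tissue
        ORDER BY s.sample_id
    """).fetchall()


result = measurements_per_sample(conn)
```

The foreign key on `measurement.sample_id` is what encodes the relationship between the two entities; the query then follows it with a join.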
  • 59. Large-scale study data – 1000 Genomes •  Sample lists and sequencing progress •  Variant Calls •  Alignments •  Raw sequence files http://www.1000genomes.org/data 59
  • 60. Large-scale study data – The cancer genome atlas (TCGA) •  TCGA aims to help to diagnose, treat and prevent cancer •  explore the entire spectrum of genomic changes involved in more than 20 types of human cancer. •  Approx. 2 PB of genomic raw data http://cancergenome.nih.gov 60
  • 61. Laboratory information management systems / Electronic lab notebooks •  How to track all information that is generated in the laboratory •  Automated annotation of all experimental parameters is essential for reproducible science •  Currently, most experiments are recorded manually in paper lab notebooks •  Data security (intellectual property versus open data) 61
  • 62. Experimental design •  Biological experiments are very complex •  Statistical significance requires a high number of biological replicates •  Often many different conditions and time points need to be considered •  One study can involve many different experiments (multi-omics studies involve different omics layers, e.g. genomics + transcriptomics + proteomics) •  All experiments come with different metadata requirements •  For various reasons the experimental design is not always balanced (e.g. 5 samples are available for group A but only 3 for group B) Friedrich, A., et al. Biomed Research International, April 2015 – in press. Nahnsen, S., Drug Target, May 2015 – in press. 62
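The unbalanced design from the last bullet (5 samples in group A, 3 in group B) can be made concrete with a short sketch. The sample identifiers and the helper names `replicates_per_group` and `is_balanced` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def replicates_per_group(design):
    """Count how many samples were assigned to each group."""
    return Counter(group for _sample_id, group in design)


def is_balanced(design):
    """A design is balanced when every group has the same number of samples."""
    counts = replicates_per_group(design)
    return len(set(counts.values())) <= 1


# Hypothetical design: 5 samples in group A, 3 samples in group B.
design = [(f"A{i}", "A") for i in range(1, 6)] + [(f"B{i}", "B") for i in range(1, 4)]

counts = replicates_per_group(design)
balanced = is_balanced(design)
```

A check like this could run when the design is registered, so that an imbalance is recorded with the metadata before the analysis starts.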
  • 63. Experimental design Friedrich, A., et al. Biomed Research International, April 2015 – in press. Nahnsen, S., Drug Target, May 2015 – in press. 63
  • 64. Data analysis workflows •  Chain different (heterogeneous) tools •  Parameter handling •  Easy execution in high-performance computing environments 64
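As a minimal sketch of what chaining tools and handling their parameters means, the following example runs two toy processing steps in sequence. The step functions and the `run_workflow` helper are hypothetical; real workflow systems add scheduling, logging and cluster execution on top of this idea.

```python
import math


def log2_transform(values, pseudocount=1.0):
    """Toy processing step: log2-transform intensities after adding a pseudocount."""
    return [math.log2(v + pseudocount) for v in values]


def filter_min(values, threshold=0.0):
    """Toy processing step: keep only values at or above a threshold."""
    return [v for v in values if v >= threshold]


def run_workflow(data, steps):
    """Apply each (function, parameters) pair in order, feeding the output forward."""
    for func, params in steps:
        data = func(data, **params)
    return data


# The workflow is plain data: an ordered list of steps with their parameters.
workflow = [
    (log2_transform, {"pseudocount": 1.0}),
    (filter_min, {"threshold": 2.0}),
]

result = run_workflow([0.0, 3.0, 7.0, 15.0], workflow)
```

Keeping the parameters next to each step, rather than hard-coded inside the functions, makes it possible to store the exact configuration of a run alongside its results.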
  • 65. Standardization in bioinformatics •  Many world-wide bioinformatics initiatives need to rely on open standards •  Development of standards has to be a community effort •  Standardized data formats are important to guarantee -  Sustainability -  Independence of instrument vendors -  Independence of analysis software -  Exchangeability of raw data •  Standard formats increase the data volume by a factor of 2-4 •  Many people refrain from using open standards 65
  • 66. Big data •  “Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, transfer, analysis and visualization. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found …” http://en.wikipedia.org/wiki/Big_Data, accessed Apr 24, 2014 66
  • 67. Big data examples •  European Council for Nuclear Research (CERN), Geneva, Switzerland: 25 petabytes/year at the LHC (Large Hadron Collider) (~6.2 million DVDs) CERN LHC data ep.ph.bham.ac.uk, 2014 67
  • 68. Big data examples •  Google processes 9.1 exabytes/year (300 million DVDs) GOOGLE data Mayer-Schönberger, 2013, ititch.com, 2014 68
  • 69. Biology and Big data? •  Classical: observation of nature and its phenomena •  1950s: breakthrough in molecular biology •  In 1956, Francis Crick formulates the central dogma of molecular biology: DNA (carrier of genetic information), RNA (expression of specific genes), proteins (perform the necessary functions in the cell) 69
  • 70. Big data Vivien Marx, Biology: The big challenges of big data, Nature. 2013, doi:10.1038/498255a 70
  • 71. Integrated data management in biology/biomedicine 71 http://media.americanlaboratory.com/m/20/Article/35231-fig1.jpg
  • 73. Data movers •  Automatically move large to huge file-based data to a remote (central) storage •  Use an rsync routine; easy configuration via a config file •  Data mover authentication: public/private key SSH authentication •  Move data to openBIS dropboxes (individual boxes and users for each of the five member labs) •  DataMover: developed at ETH Zurich as part of openBIS, http://www.cisd.ethz.ch/software/Data_Mover [Figure labels: NGS Lab, Lab Storage, Data Mover] 73
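The transfer described on this slide, rsync over SSH with key-based authentication, can be sketched as follows. This is not the DataMover implementation; the helper `build_rsync_command`, the user, host and all paths are hypothetical. Only standard rsync options (`--archive`, `--partial`, `-e`) and the ssh option `-i` are used.

```python
import shlex


def build_rsync_command(source_dir, remote_user, remote_host, remote_dir, key_file):
    """Build an rsync call that copies a directory to a remote dropbox over SSH.

    --archive keeps permissions and timestamps, --partial keeps partially
    transferred files so that interrupted transfers of huge files can resume,
    and -e selects ssh with a private key as the transport.
    """
    ssh_transport = f"ssh -i {shlex.quote(key_file)}"
    destination = f"{remote_user}@{remote_host}:{remote_dir}"
    return ["rsync", "--archive", "--partial", "-e", ssh_transport, source_dir, destination]


# Hypothetical lab storage, account and dropbox location.
command = build_rsync_command(
    source_dir="/data/ngs_lab/outgoing/",
    remote_user="mover",
    remote_host="storage.example.org",
    remote_dir="/dropboxes/lab1/",
    key_file="/home/mover/.ssh/id_rsa",
)

# To actually start the transfer one could call:
# subprocess.run(command, check=True)
```

Building the command as a list, instead of one shell string, avoids quoting problems when directory names contain spaces.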
  • 74. openBIS (meta) data store •  Open, distributed system for managing biological information •  Captures different experiment types (OMICS, imaging, screening,...) •  Tracking, annotating and sharing of experiments, samples and datasets for distributed research •  Different servers for meta data and bulk raw data •  Underlying PostgreSQL database •  ETL routines for extraction of meta data and linking 74
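The separation of metadata from bulk raw data, and the role of an ETL routine, can be illustrated with a small sketch. It does not use the openBIS API: the file naming convention, the `dataset` table and the function names are hypothetical, and SQLite stands in for the PostgreSQL database mentioned on the slide.

```python
import re
import sqlite3

# Hypothetical naming convention: PROJECT_SAMPLE_experiment.extension
FILENAME_PATTERN = re.compile(
    r"^(?P<project>[A-Z0-9]+)_(?P<sample>[A-Z0-9]+)_(?P<experiment>[a-z]+)\.(?P<ext>\w+)$"
)


def extract_metadata(file_name):
    """Extract step: parse project, sample and experiment type from a file name."""
    match = FILENAME_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"file name does not follow the convention: {file_name}")
    return match.groupdict()


def load_metadata(connection, file_name, storage_path):
    """Load step: store the metadata and a link to the raw file, not the file itself."""
    meta = extract_metadata(file_name)
    connection.execute(
        "INSERT INTO dataset (project, sample, experiment, raw_data_path) VALUES (?, ?, ?, ?)",
        (meta["project"], meta["sample"], meta["experiment"], storage_path),
    )


# Metadata store (SQLite as a stand-in); the bulk data stays on separate storage.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dataset (project TEXT, sample TEXT, experiment TEXT, raw_data_path TEXT)"
)

load_metadata(conn, "QBIC01_S001_ngs.fastq", "/bulk/QBIC01_S001_ngs.fastq")
rows = conn.execute(
    "SELECT project, sample, experiment, raw_data_path FROM dataset"
).fetchall()
```

Only the small, searchable metadata record goes into the database; the large raw file is referenced by its path, which mirrors the split between metadata and bulk data servers.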
  • 77. Applications •  Personalized medicine: Individualized vaccination in cancer •  Large-scale clinical studies: example Hepatocellular carcinoma 77
  • 78. Contact: Quantitative Biology Center (QBiC) Auf der Morgenstelle 10 72076 Tübingen · Germany dmqb-ss15@informatik.uni-tuebingen.de Thanks for listening – See you next week