
Big data in biology

I spoke on "Big Data in Biology". The talk concentrates on how biology has affected big data and how big data has become a key player in biology. I also cover how DNA storage can address long-term archival storage.

  1. Big Data in Biology: Future Scope and Study
  3. So, how is this data produced?
       ● The amount of data produced by social media in a single minute is astounding.
       ● All this data is stored and analyzed for many obvious reasons.
  4. How is it related to biology?
       ● DNA: deoxyribonucleic acid.
       ● DNA carries all the genetic information in our body.
       ● It drives the human body.
       ● A genome is an organism's complete set of DNA.
  5. Human Genome Project
       ● It was an international scientific research project.
       ● The goal of the project was to determine the sequence of chemical base pairs that make up human DNA.
       ● The project was declared complete in 2003, with roughly 90% of the human genome sequenced.
       ● This was just the start of a new era of sequencing.
  6. Why do we need sequencing? Bacterial lights in Paris; genetically modified mosquitoes
  7. Why do we need sequencing? CRISPR/Cas9
  8. Big data parking
       Clouds are a solution, but they also throw up fresh challenges. Ironically, their proliferation can cause a bottleneck if data end up parked on several clouds and thus still need to be moved to be shared. And using clouds means entrusting valuable data to a distant service provider who may be subject to power outages or other disruptions. Scientists experiment with different constellations to suit their needs and trust levels. Clouds can be used for both data storage and computing. This reduces the overhead of transferring the data to a local machine and computing it on
  9. Databases storing this genomic data
  10. What are we concerned about?
       ● The information necessary to build and control any living organism is written in its genome, and the human genome took 13 years to decipher.
       ● A single decade later, sequencing a genome takes a few hours on a machine that fits on a tabletop.
       ● The tsunami of biological data generates new problems: it needs to be analysed properly to unearth and retrieve the exciting knowledge it contains.
       ● Getting the most from the data requires interpreting them in light of all the relevant prior knowledge.
       ● That means scientists have to store large data sets, and analyse, compare and share them, none of which are simple tasks.
  11. Data explosion
       It is estimated that by 2025, exabytes (10^18 bytes) of genomics data will be produced globally, far exceeding the data produced by Twitter and Facebook. Moreover, the volume of genomics data being produced roughly doubles every year and will require new, more precise and accurate solutions for storage, analysis and sharing. The European Bioinformatics Institute (EBI), UK, part of the European Molecular Biology Laboratory and one of the world's largest biology-data repositories, currently stores 20 petabytes (20 × 10^15 bytes) of data and back-ups about genes, proteins and small molecules. Genomic data accounts for 2 petabytes of that, a number that more than doubles every year.
  12. 5D storage of data
  13. Storing data using DNA molecules
       ● This could be the future of data storage.
       ● DNA molecules can store huge amounts of data.
       ● DNA storage is robust.
  14. How do we do it?
  15. Microsoft has already started storing some of its data using DNA. The first phase of the demonstration was successfully completed. Microsoft partnered with the startup Twist Bioscience, which produced the oligonucleotides for them, arranged in the specified sequence. One drawback of this storage method is that it cannot be commercialised.
  16. Thank you. Have a good day.
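
The growth figures on the data explosion slide (slide 11) can be turned into a rough back-of-the-envelope calculation. The sketch below is only an illustration, assuming the 2 petabytes of genomic data quoted for the EBI as a starting point and a clean doubling every year; the function names `years_to_reach` and `size_after` are made up for this example, and real growth will not follow such a smooth curve.

```python
import math

PETABYTE = 10**15  # bytes
EXABYTE = 10**18   # bytes


def years_to_reach(start_bytes: float, target_bytes: float,
                   growth_per_year: float = 2.0) -> float:
    """Years of compound growth needed for start_bytes to reach target_bytes.

    Assumes the data volume is multiplied by growth_per_year every year
    (2.0 means "doubles every year", as stated on the slide).
    """
    if start_bytes <= 0 or target_bytes <= 0 or growth_per_year <= 1:
        raise ValueError("sizes must be positive and growth must exceed 1")
    return math.log(target_bytes / start_bytes) / math.log(growth_per_year)


def size_after(start_bytes: float, years: float,
               growth_per_year: float = 2.0) -> float:
    """Projected data volume after the given number of years."""
    return start_bytes * growth_per_year ** years


# Starting point taken from the slide: 2 petabytes of genomic data.
genomic_now = 2 * PETABYTE

years = years_to_reach(genomic_now, EXABYTE)
print(f"Years to reach one exabyte: {years:.1f}")
print(f"Size after 9 doublings: {size_after(genomic_now, 9) / EXABYTE:.3f} EB")
```

Under these assumptions, 2 petabytes needs about nine doublings to pass one exabyte, since 2 PB × 2^9 = 1024 PB. This is why a repository that looks manageable today can reach the exabyte scale within a decade.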