Understanding the Big Picture of e-Science

A. Sallans. "Understanding the Big Picture of e-Science." Presented at the 2011 eScience Bootcamp at the University of Virginia's Claude Moore Health Sciences Library. 4 March 2011


  1. UNDERSTANDING THE BIG PICTURE OF E-SCIENCE. Andrew Sallans, Head of Strategic Data Initiatives, University of Virginia Library. E-Science Bootcamp, Claude Moore Health Sciences Library, University of Virginia, 4 March 2011
  2. OUTLINE: What it's all about; Examples; Implications; UVA Libraries Response (Round 1)
  3. WHAT IT'S ALL ABOUT (AROUND 1999): "e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it." "e-Science will change the dynamic of the way science is undertaken." Dr Sir John Taylor, Director General of Research Councils, Office of Science and Technology, United Kingdom. Source:
  4. WHAT MADE THIS POSSIBLE? Internet/World Wide Web; Faster networking (fiber, special research networks, advances in grids); Better storage (higher capacity, faster access, better reliability); Cheap storage (costs keep decreasing); Major funding initiatives; Broader interest in collaboration
  5. SOME COMMON TERMS: Computational science; Scientific computing; Research computing; High-performance computing; Cyberscience; Cyberinfrastructure
  6. CLIMATOLOGY RESEARCH. Sources: 1) Climate Simulation on Cray XT5 “Jaguar” supercomputer, ORNL; 2) Cray XT5 “Jaguar” supercomputer, ORNL
  7. LARGE HADRON COLLIDER AT CERN: Circumference: 26,659 meters; Magnets: 9,300; Speed: protons move at 99.9999991% of the speed of light; Collisions/second: 600 million; Data produced: equivalent to 100,000 dual layer DVDs per year; LHC Grid: tens of thousands of computers around the world used collectively to analyze data (will take 15 years). Source: CERN website
  8. BIOMEDICAL INFORMATICS GRID (caBIG): Launched as a test in 2004; Adopted by over 50 NCI-designated cancer centers; Focused on: (1) connecting scientists and practitioners through a shareable and interoperable infrastructure, (2) development of standard rules and a common language to more easily share information, (3) building or adapting tools for collecting, analyzing, integrating, and disseminating information associated with cancer research and care. Source: caBIG website, National Cancer Institute
  9. CITIZEN SCIENCE…THE SOCIAL SIDE: 34,617,406 clicks done by 82,931 users! Source: Zooniverse, Real Science Online
  10. IMPLICATIONS FOR RESEARCH: Greater emphasis on technology; Increase in interdisciplinary research and collaboration; Often bigger data, with far more complex associated issues (storage, access, expertise, funding, preservation, etc.); Need for innovative approaches and integration into education/curriculum
  11. DATA TSUNAMI: IDC estimate of about 1.7 zettabytes (1 zettabyte is 1 billion terabytes) around 2011, twice the available space. Sources: 1) The Great Wave off Kanagawa, Katsushika Hokusai. Found on Wikipedia. 2) The Diverse and Exploding Digital Universe, IDC, May 2010 ( reports/diverse-exploding-digital-universe.pdf)
  12. BUT, NOT ALL DATA IS EQUAL…. Source: Long Tail, Wikipedia
  13. CASE STUDY: UVA LIBRARIES RESPONSE (ROUND 1). Collaboration established around 2005 through discussions between ITC and Library, and impetus of Frye Institute capstones; Research Computing Support services in need of greater visibility, Library seeking ways to support changes in scientific research, collocation provides mutual benefits; In 2006, staff moved to Library locations (Research Computing Lab & Scholars' Lab) and set up new service points and services
  14. RESEARCH IN THE E-SCIENCE WORLD: Heavy use of electronic information resources; Work is predominantly done from a lab/office, not in the Library; Collaboration is fundamental, but researchers don't always know people in other domains; Grad students are usually the ones bringing new technology/methods into the team (learning more about grad students in a research study now)
  15. IDENTIFIED E-SCIENCE TRENDS: Various components (computationally intensive science; IT/software/infrastructure; collaboration; data); Often intertwined with Open Access initiatives
  16. E-SCIENCE IN OTHER LIBRARIES: Purdue University (focus on data curation; IATUL Conference, June 2010); University of Illinois, Urbana-Champaign (focus on data curation; Summer Institute on Data Curation); Cornell University (metadata consulting services); University of New Mexico (major DataONE grant)
  17. RESEARCH COMPUTING LAB RESPONSE: Aiming to provide support across the entire scientific research data lifecycle; Staff with expertise in: data; quantitative data, statistics; modeling, visualization; scientific publishing; Emphasis on consulting, not drop-off services; Partnership with traditional librarians to help ease transition to new support models
  18. RCL OUTREACH. University Community: Speaker series 2006, 2007, 2008; Research 2.0 Symposium; Partnerships with courses, other units (i.e. MLBS); Short course series each semester. Library Community: Panel at the ACCS Conference in 2007; Poster at ARL/CNI Forum in 2008; Poster at STS Section of ALA in 2009; Journal article in JLA in 2009
  19. SAMPLE RCL CONSULTATIONS. STS Undergrad Environmental Justice (2008): development of technology solutions for empowering the citizen scientist; Web 2.0 tools, data collection/management; data analysis. Economics Graduate Student (2008/2009): airline flight price modeling; screen scraping, data collection/management; data analysis. Mountain Lake Beetle Project (2009): mobile data acquisition/collection solution; database development/management, programming; data analysis. Archiving of dissertation data (2009): EVSC student, ModelMaker 4.0 data; Biology student, IDL, Matlab, R code
  20. SPECIFICS FOR MEDICAL CENTER: At least 600 RCL support requests from Medical Center from October '07 through December '09; Medical Center patrons are heavy users of computational software like Matlab, SAS, LabView; Increasing emphasis on collaboration (translational research); Greater attention to open access (NIH policy); Growing interest in areas like image integrity
  21. TAKE-AWAYS: This is the future; Rapidly growing space, lots of opportunity; Requires big investment and commitment, the biggest being training and priority alignment; Libraries and institutions need to make decisions on what to do and what not to do; It's a culture change for libraries, institutions, and researchers alike
  22. COMING LATER… (ROUND 2): “Practical Applications of e-Science” in UVA Libraries today
  23. QUESTIONS? Please feel free to contact me with questions: 434-243-2180; Twitter: asallans
  24. ADDITIONAL INFORMATION: E-Science Talking Points for ARL Deans and Directors, Elisabeth Jones, University of Washington, October 2008