Neo4j GraphDay Munich - Life & Health Sciences Intro to Graphs

Neo4j GraphDay Munich Life & Health Sciences
Bruno Ungermann, Neo4j

  1. 1. Welcome! Neo4j GraphDay: Health & Life Sciences
  2. 2. Agenda: Health & Life Sciences
      • 9.00 - 9.30: Breakfast & Networking
      • 9.30 - 12.30: Presentations
          • Introduction to Graph Databases and Neo4j (Bruno Ungermann, Neo4j)
          • The German Centre for Diabetes Research Greatly Improves Research Capabilities with Graph Technology (Dr. Alexander Jarasch, Deutsches Zentrum für Diabetesforschung)
          • Big Data in Genomics: How Neo4j Enables Personalized Therapies (Dr. Martin Preusse, Knowing Health)
          • Neo4j Bloom – Visualization & Analysis for Everyone (Michael Hunger, Neo4j)
      • 12.30: Lunch Break
      • How to Make Your Graph Project a Success with Neo4j (Stefan Kolmar, Neo4j)
      • Workshop: New Possibilities in Health & Life Sciences with Graphs (Michael Hunger, Dr. Martin Preusse)
      • 15.30: Coffee & Open Discussion
  3. 3. Complexity
  4. 4. Connectedness
  5. 5. Bootcamp
  6. 6. Domain Model Logistics Process
  7. 7. Traditional Approach: Fixed Schema, Tables
  8. 8. Graph Model: Nodes & Relationships [diagram; labels recovered from the slide, in source order: Container Load; USING ROUTE; Depart 2014-04-15; Arrive 2014-04-28; USING_CARRIER; Vessel; Physical Container; Shipment; Carrier; Emission Class; A; Shipment: ID 256787; Carrier: DHL; Route 10520km; Route: 823km; Fueling; Max Wgt 80; Type Gas; B; Town: Tokyo; Town: Hong Kong; Town: Hamburg; Parcel; Weight 15.5kg]
  9. 9. Intuitiveness
  10. 10. Flexibility: no fixed schema
  11. 11. Flexibility & Agility
  12. 12. Speed
      • “Minutes to milliseconds” performance
      • Queries up to 1000x faster than other tested database types
      • “We found Neo4j to be literally thousands of times faster than our prior MySQL solution, with queries that require 10-100 times less code. Today, Neo4j provides eBay with functionality that was previously impossible.” - Volker Pacher, Senior Developer
  13. 13. Graph Based Success
  14. 14. Neo4j - The Graph Company: Industry’s Largest Dedicated Investment in Graphs
      • Creator of the Neo4j Graph Platform
      • ~250 employees
      • HQ in Silicon Valley; other offices include London, Munich, Paris and Malmö (Sweden)
      • $160M in funding from Morgan Stanley, Fidelity, Sunstone, Conor, Creandum, and Greenbridge Capital
      • 10M+ downloads
      • 250+ enterprise subscription customers, over half of them with >$1B in revenue
      Adoption: 7/10 Top Retail Firms, 12/25 Top Financial Firms, 8/10 Top Software Vendors
      Ecosystem: Startups in program, Enterprise customers, Partners, Meet up members, Events per year (figures on the slide: 500+, 53K+, 100+, 250+, 450+)
  15. 15. Handling Large Graph Workloads for Enterprises
      Real-time promotion recommendations
        • Record “Cyber Monday” sales
        • About 35M daily transactions
        • Each transaction is 3-22 hops
        • Queries executed in 4ms or less
        • Replaced IBM WebSphere Commerce
      Marriott’s Real-time Pricing Engine
        • 300M pricing operations per day
        • 10x transaction throughput on half the hardware compared to Oracle
        • Replaced Oracle database
      Handling Package Routing in Real-Time
        • Large postal service with over 500k employees
        • Neo4j routes 10M+ packages daily at peak, with peaks of 5,000+ routing operations per second
  16. 16. Use the Right Database for the Right Job
      Neo4j is designed for data relationships
        • Discrete Data, Minimally connected data: Other NoSQL, Relational DBMS
        • Connected Data, Focused on Data Relationships: Neo4j Graph DB
      Development Benefits
        • Easy model maintenance
        • Easy query
      Deployment Benefits
        • Ultra high performance
        • Minimal resource usage
  17. 17. How Neo4j Fits: Common Architecture Patterns
      • From Disparate Silos To Cross-Silo Connections
      • From Tabular Data To Connected Data
      • From Data Lake Analytics to Real-Time Operations
  18. 18. Common Graph Technology Use Cases: Network & IT Operations; Application Management; Meta Data Management; Real-Time Recommendations; Identity & Access Management, Security; Knowledge Management; Fraud Detection, AML Compliance, GDPR
  19. 19. Biological and Medical Knowledge in heterogeneous networks
  20. 20. Biological and Medical Knowledge in heterogeneous networks
  21. 21.
  22. 22. Medical Research
      Background
        • Italian research center that analyzes cancer samples from around the world
        • Provides state-of-the-art therapeutic and diagnostic cancer services
      Business Problem
        • Develop a tool that provides cancer data insights, tracks workflows and is available to external researchers
        • Relational databases didn’t provide adequate flexibility
      Solution and Benefits
        • Easily find complex research data relationships
        • Develop complex semantics for genomic knowledge
        • Cancer research is accessible to external scientists
  23. 23. Pharmaceutical Research
      Background
        • 5-year-long drug discovery research
        • Parse & navigate over 25 million scientific papers
        • Sourced from the National Library of Medicine and tagged with “Medical Subject Headings” (MeSH tags)
      Business Problem
        • Seeking to automate phenotype, compound and protein cell behaviour research by using previously documented research more effectively
        • Text mining for research elements like DNA strings, proteins, RNA, chemicals and diseases
      Solution and Benefits
        • Found ways to identify compound interaction behaviour from millions of research documents
        • Relations between biological entities can be identified and validated by biological experts
        • Still very challenging to keep up to date, add genomics data, and find a breakthrough
  24. 24. Agriculture
      Background
        • One of the world’s largest agribusinesses
        • Founded in 1901 and based in St. Louis
        • Grew from pioneer to leader in genetically modifying plants and building related businesses
        • Among the first companies to genetically modify a plant cell (1983)
      Business Problem
        • Although the data volume was not huge (200 GB, 800 Mln nodes, Bln relationships), queries from connected data sets using traditional technology ran for long durations. In some cases, Monsanto had to stop them
        • Shorten new product development pipeline by one year through “yield testing in the lab”
        • Efficiently impute genotypes of newly bred populations from analysis of decades of genetic ancestry data
  25. 25. Large Chemical Company: R&D Knowledge Solution
      Background
        • Provide new ways to search and interact with internal R&D knowledge and published scientific information, highly connected at fact level to make knowledge actionable
        • Thousands of employees in R&D
        • Chemicals, reactions, biologicals, physical-chemical properties
      Company
        • 10,000+ employees in R&D
        • 70+ R&D locations
        • 800 new patents
        • 3,000 R&D projects
        • 2 Bln R&D budget
  26. 26. Large Pharmaceutical Company: Enterprise Search
      Background
        • Personalized search for 100,000+ employees
        • 300,000,000 docs, pptx, pdf, html
        • 1 Mln products
        • 130,000 projects
        • Sources: Exchange, SharePoint, Office 365, Oracle, HANA, Blogs, Active Directory …..
      Background
        • 150,000+ employees, 300 locations
  27. 27. White Board Session
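
The graph model on slide 8 (nodes and relationships, each carrying its own properties, with no fixed schema) can be sketched in a few lines of plain Python. This is a minimal illustration, not Neo4j's API: the `Graph`, `Node`, `Relationship` and `hops` names are hypothetical, the property values are taken from the slide labels, and the way the nodes are wired together (including the `PART_OF`, `FROM` and `TO` relationship types) is an assumption, since the transcript does not preserve the diagram's edges.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so nodes can go in sets
class Node:
    label: str
    props: dict = field(default_factory=dict)


@dataclass(eq=False)
class Relationship:
    start: Node
    rel_type: str
    end: Node
    props: dict = field(default_factory=dict)


class Graph:
    """Tiny in-memory property graph (illustrative only)."""

    def __init__(self):
        self.nodes = []
        self.rels = []

    def add_node(self, label, **props):
        node = Node(label, props)
        self.nodes.append(node)
        return node

    def relate(self, start, rel_type, end, **props):
        rel = Relationship(start, rel_type, end, props)
        self.rels.append(rel)
        return rel

    def neighbours(self, node):
        # Follow outgoing relationships only.
        return [rel.end for rel in self.rels if rel.start is node]


def hops(graph, start, goal):
    """Breadth-first search: number of relationships on the shortest path, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node is goal:
            return dist
        for nxt in graph.neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


g = Graph()
parcel = g.add_node("Parcel", weight_kg=15.5)
load = g.add_node("ContainerLoad")
shipment = g.add_node("Shipment", id=256787)
carrier = g.add_node("Carrier", name="DHL")
route = g.add_node("Route", distance_km=10520)
tokyo = g.add_node("Town", name="Tokyo")
hamburg = g.add_node("Town", name="Hamburg")

# Assumed wiring; only USING_ROUTE and USING_CARRIER appear on the slide.
g.relate(parcel, "PART_OF", load)
g.relate(load, "PART_OF", shipment)
g.relate(shipment, "USING_CARRIER", carrier)
g.relate(shipment, "USING_ROUTE", route, depart="2014-04-15", arrive="2014-04-28")
g.relate(route, "FROM", tokyo)
g.relate(route, "TO", hamburg)

# No fixed schema: a single node can gain a property without touching the others.
parcel.props["fragile"] = True

print(hops(g, parcel, hamburg))
```

Counting hops this way is the same notion used on slide 15 ("each transaction is 3-22 hops"): a query walks from one node to another along relationships rather than joining tables.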