The Connected Data Imperative: The Shifting Enterprise Data Story

View the slides from 'The Connected Data Imperative' by Jeff Morris, Director of Product Marketing at Neo4j at GraphTalk Denver.

  1. 1. The Connected Data Imperative #1 Database for Connected Data Jeff Morris Head of Product Marketing 4/4/17
  2. 2. The Connected Data Imperative: The Shifting Enterprise Data Story May 2017
  3. 3. Who We Are: The Graph Database for Connected Data
      Neo4j is an enterprise-grade native graph database that enables you to:
      • Store and query data relationships
      • Traverse any level of depth in real time
      • Add and connect new data on the fly
      Designed, built and tested natively for graphs from the start to ensure:
      • Performance
      • ACID Transactions
      • Agility
      • Developer Productivity
      • Hardware Efficiency
  4. 4. ACCOUNT ADDRESS PERSON PERSON NAME STREET BANK NAME COMPANY BANK BAHAMAS
  5. 5. [Diagram: a graph of NODES (Bank US, Account, Person A, Person B, Company, Bank Bahamas, Address) connected by RELATIONSHIPS]
  6. 6. ICIJ Pulitzer Prize Winner 2017
  7. 7. Finding a Cure for Cancer
  8. 8. PHARMA DB GENOMIC DATA PATIENT RECORDS TRIAL DATA DRUGS DB LAB DATA Data Stored in Disparate Silos
  9. 9. Johnson Space Center Houston, Texas
  10. 10. “Lessons Learned Database” A half-century of collective NASA engineering knowledge
  11. 11. Let’s Hear a Few Stories
  12. 12. Let’s Hear a Few Stories. Impact: “Neo4j saved well over two years of work and one million dollars of taxpayer funds.” (David Meza, Chief Knowledge Architect at NASA)
  13. 13. Our core belief is
  14. 14. Our core belief is — connections between data are as important as the data itself
  15. 15. Relational Database Turn of the Century Thinking
  16. 16. 21st Century Thinking Data Modelled as a Graph Graph Database
  17. 17. “Lessons Learned” Database GRAPH DB Load Data
  19. 19. CONSUMER DATA PRODUCT DATA PAYMENT DATA SOCIAL DATA SUPPLIER DATA “The next wave of competitive advantage will be all about using connections to produce actionable insights.”
  20. 20. Graphs power the transformative companies by highlighting the RELATIONSHIPS in Data
  21. 21. Graph is Top Trending Database Type
  22. 22. “Forrester estimates that over 25% of enterprises will be using graph databases by 2017.” Forrester Research, 2014
  23. 23. Introducing Neo4j Graphs Community Overview
  24. 24. Sample of Connected Graphs Organization Identity & Access Network & IT Ops
  25. 25. Networks are Graphs
  26. 26. network topology
  27. 27. [Diagram: network topology connecting gateways, mesh routers, routers, access points, base stations, CPUs and mobile devices]
  28. 28. • Sys Admins: servers, on-premise virtual machines, cloud virtual machines, etc.
      • Network Admins: switches, routers, egress points
      • App Admins: e.g. Salesforce, Marketo, SAP, Oracle Apps, Tableau, SharePoint, DBAs, etc.
      • Internal Users: HR, Sales, Marketing, Data Analysts, E-staff, etc.
      • Numerous Customers & Partners
  29. 29. [Diagram: Router, Servers, Apps, Firewall, Cloud, Switch]
      • Network Admins: switches, routers, egress points
      • Sys Admins: servers, on-premise virtual machines, cloud virtual machines, etc.
      • App Admins: e.g. Salesforce, Marketo, SAP, Oracle Apps, Tableau, SharePoint, DBAs, etc.
      • Internal Users: HR, Sales, Marketing, Data Analysts, E-staff, etc.
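The topology slides treat network devices and the applications on top of them as one graph. A minimal Python sketch of the kind of question that view answers (what is affected if one device fails), assuming a hypothetical set of component names and dependency edges that do not appear in the slides:

```python
from collections import deque

# Hypothetical topology: each key depends on the components in its list.
DEPENDS_ON = {
    "crm-app": ["server-1"],
    "bi-app": ["server-2"],
    "server-1": ["switch-a"],
    "server-2": ["switch-a"],
    "switch-a": ["router-1"],
    "router-1": ["firewall"],
    "firewall": [],
}

def affected_by(failed, depends_on=DEPENDS_ON):
    """Return every component that directly or transitively depends on `failed`."""
    # Invert the edges so we can walk from a component to its dependents.
    dependents = {}
    for component, upstreams in depends_on.items():
        for upstream in upstreams:
            dependents.setdefault(upstream, []).append(component)

    seen, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen

print(sorted(affected_by("switch-a")))
# ['bi-app', 'crm-app', 'server-1', 'server-2']
```

The traversal is a plain breadth-first search; in a graph database the same question would typically be a variable-length path query rather than application code.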
  34. 34. >50% of the Global 2000 are Using or Piloting Neo4j! As of today
  35. 35. Introducing Neo4j Native Graph Differentiation Graph Overview
  36. 36. Neo4j Invented the Labeled Property Graph Model
      Nodes
      • Can have name-value properties
      • Can have Labels to classify nodes
      Relationships
      • Relate nodes by type and direction
      • Can have name-value properties
      [Diagram: PERSON (name: “Dan”, born: May 29, 1970, twitter: “@dan”); PERSON (name: “Ann”, born: Dec 5, 1975); CAR (brand: “Volvo”, model: “V70”); relationships MARRIED TO and LIVES WITH; relationship property since: Jan 10, 2011]
      Neo4j Advantage - Agility
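To make the labeled property graph model concrete, here is a minimal Python sketch of its two building blocks. The class and function names are illustrative only, and the `DRIVES` relationship carrying the `since` property is an assumption: the transcript does not say which relationship that property belongs to.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    labels: set                                      # labels classify the node
    properties: dict = field(default_factory=dict)   # name-value properties

@dataclass
class Relationship:
    type: str                                        # relationships have a type...
    start: Node                                      # ...and a direction (start -> end)
    end: Node
    properties: dict = field(default_factory=dict)   # and their own properties

dan = Node({"Person"}, {"name": "Dan", "born": "May 29, 1970", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "Dec 5, 1975"})
car = Node({"Car"}, {"brand": "Volvo", "model": "V70"})

relationships = [
    Relationship("MARRIED_TO", dan, ann),
    Relationship("LIVES_WITH", dan, ann),
    # Assumption: relationship type and placement of `since` are hypothetical.
    Relationship("DRIVES", dan, car, {"since": "Jan 10, 2011"}),
]

def outgoing(node, rel_type):
    """Nodes reached from `node` over outgoing relationships of the given type."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]

print(outgoing(dan, "MARRIED_TO")[0].properties["name"])  # Ann
```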
  37. 37. The Largest Graph Innovation Network
      • 3,000,000+ Neo4j downloads, with 50k additional per month
      • 225+ customers, 50% from Global 2000
      • 100+ Technology and Services Partners
      • 450+ annual events & 10k attendees: Graph and Neo4j awareness and training
      • 43,000+ Neo4j Meetup Members
      • 50,000+ Online and Classroom Education Registrants
  38. 38. Users Love Neo4j
  39. 39. RDBMS Vocabulary Mapped to Graph Modeling
      Relational DB Construct | Graph DB Construct
      Entity table | Node labels
      Row | Node
      Columns | Node properties
      Technical primary keys | Replace with business primary keys
      Constraints | Unique constraints for business keys
      Indexes | Indexes on any property
      Foreign keys | Relationships
      Default values | Not required
      De-normalized or duplicated data | Create separate nodes
      Join tables | Relationships
      Join table columns | Relationship properties
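The mapping can be sketched in a few lines of Python. The tables, column names and the `WORKS_AT` relationship type below are hypothetical sample data, and the dictionaries stand in for nodes and relationships rather than any real driver API.

```python
# Hypothetical relational data: two entity tables and one join table.
persons = [{"id": 1, "name": "Dan"}, {"id": 2, "name": "Ann"}]
companies = [{"id": 10, "name": "Acme"}]
works_at = [{"person_id": 1, "company_id": 10, "since": 2015}]

def to_graph(persons, companies, works_at):
    # Entity table -> node label, row -> node, columns -> node properties.
    # Technical primary keys are used only for lookup here and then dropped.
    person_by_id = {r["id"]: {"label": "Person", "name": r["name"]} for r in persons}
    company_by_id = {r["id"]: {"label": "Company", "name": r["name"]} for r in companies}

    # Join table row -> relationship; extra join-table columns -> relationship properties.
    relationships = [
        {
            "type": "WORKS_AT",
            "start": person_by_id[row["person_id"]],
            "end": company_by_id[row["company_id"]],
            "properties": {"since": row["since"]},
        }
        for row in works_at
    ]
    nodes = list(person_by_id.values()) + list(company_by_id.values())
    return nodes, relationships

nodes, rels = to_graph(persons, companies, works_at)
print(len(nodes), rels[0]["start"]["name"], rels[0]["end"]["name"])  # 3 Dan Acme
```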
  40. 40. Good for discrete problems Insufficient for connected problems RDBMS
  41. 41. Relational DBMSs Can’t Handle Relationships Well
      • Cannot model or store data and relationships without complexity
      • Performance degrades with number and levels of relationships, and database size
      • Query complexity grows with need for JOINs
      • Adding new types of data and relationships requires schema redesign, increasing time to market
      … making traditional databases inappropriate when data relationships are valuable in real-time
      Result: slow development, poor performance, low scalability, hard to maintain
  42. 42. Relationship Queries Strain Traditional Databases
      • Queries can take non-sequential, arbitrary paths through data
      • Real-time queries need speed and consistent response times
      • Queries must run reliably with consistent results
      • A single query can touch a lot of data
  43. 43. Graph Optimized Memory & Storage: Index-free adjacency
      • At Write Time: data is connected as it is stored
      • At Read Time: lightning-fast retrieval of data and relationships via pointer chasing
  44. 44. Neo4j: Native Graph from the Start
      Native graph storage: optimized for real-time reads and ACID writes
      • Relationships stored as physical objects, eliminating need for joins and join tables
      • Nodes connected at write time, enabling scale-independent response times
      Native graph querying: memory structures and algorithms optimized for graphs
      • Index-free adjacency enables 1M+ hops per second via in-memory pointer chasing
      • Off-heap page cache improves operational robustness and scaling compared with JVM-based caches
      • “Minutes to milliseconds” performance improvement
      Neo4j Advantage - Performance
      Neo4j Advantage - ACID Transactions
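The idea behind index-free adjacency can be illustrated with a small Python sketch: each node keeps direct references to its neighbours, so a hop follows a reference instead of searching a join table. This is a conceptual illustration with made-up names, not Neo4j's actual storage format.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.outgoing = []  # direct references to neighbouring Node objects

def connect(start, end):
    # "Write time": the relationship is stored as a reference on the node itself.
    start.outgoing.append(end)

def neighbours_by_pointer(node):
    # "Read time": follow the stored references; cost depends on this node's degree.
    return [neighbour.name for neighbour in node.outgoing]

def neighbours_by_join(name, join_table):
    # Join-table alternative: scan every (source, target) row to find matches,
    # so cost grows with the size of the whole table (absent an index).
    return [target for source, target in join_table if source == name]

dan, ann, bob = Node("Dan"), Node("Ann"), Node("Bob")
connect(dan, ann)
connect(ann, bob)
join_table = [("Dan", "Ann"), ("Ann", "Bob")]

print(neighbours_by_pointer(dan))             # ['Ann']
print(neighbours_by_join("Dan", join_table))  # ['Ann']
```

Both functions return the same answer; the difference is that the pointer version never looks at relationships belonging to other nodes.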
  45. 45. Cypher: Powerful and Expressive Query Language
      MATCH (:Person { name: "Dan" })-[:MARRIED_TO]->(spouse)
      [Diagram: Dan -MARRIED_TO-> Ann, with callouts NODE, RELATIONSHIP TYPE, LABEL, PROPERTY, VARIABLE]
      Neo4j Advantage – Developer productivity
  46. 46. Productivity Gains with Graph Query Language
      The query asks: “Find all direct reports and how many people they manage, up to three levels down”
      Example HR Query in SQL [query shown on the slide]
      The Same Query using Cypher:
      MATCH (boss)-[:MANAGES*0..3]->(sub),
            (sub)-[:MANAGES*1..3]->(report)
      WHERE boss.name = "John Doe"
      RETURN sub.name AS Subordinate, count(report) AS Total
      Project Impact
      Less time writing queries:
      • More time understanding the answers
      • Leaving time to ask the next question
      Less time debugging queries:
      • More time writing the next piece of code
      • Improved quality of overall code base
      Code that’s easier to read:
      • Faster ramp-up for new project members
      • Improved maintainability & troubleshooting
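To show what the variable-length `MANAGES` patterns compute, here is a rough Python equivalent over a hypothetical org chart. It assumes the hierarchy is a tree (one manager per person), so counting reachable people matches counting paths as the Cypher query does; the names and data are invented for illustration.

```python
# Hypothetical org chart: manager -> direct reports.
MANAGES = {
    "John Doe": ["Alice", "Bob"],
    "Alice": ["Carol", "Dave"],
    "Bob": ["Erin"],
    "Carol": ["Frank"],
}

def within(start, min_hops, max_hops, manages=MANAGES):
    """People reachable from `start` in min_hops..max_hops MANAGES steps (tree assumed)."""
    found, frontier = [], [start]
    for depth in range(max_hops + 1):
        if depth >= min_hops:
            found.extend(frontier)
        frontier = [r for person in frontier for r in manages.get(person, [])]
    return found

def report_counts(boss):
    counts = {}
    for sub in within(boss, 0, 3):       # (boss)-[:MANAGES*0..3]->(sub)
        reports = within(sub, 1, 3)      # (sub)-[:MANAGES*1..3]->(report)
        if reports:                      # a plain MATCH drops subs with no reports
            counts[sub] = len(reports)
    return counts

print(report_counts("John Doe"))
# {'John Doe': 6, 'Alice': 3, 'Bob': 1, 'Carol': 1}
```

Because the first pattern starts at zero hops, the boss appears in the result alongside the subordinates, which is also how the Cypher query on the slide behaves.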
  47. 47. “Minutes to Milliseconds” Real-Time Query Performance
      [Chart: response time versus connectedness and size of data set. Relational and other NoSQL databases: 0 to 2 hops, 0 to 3 degrees, thousands of connections. Graph: tens to hundreds of hops, thousands of degrees, billions of connections. 1000x advantage.]
  48. 48. Social Recommendation Example
      Equivalent Cypher Query:
      MATCH (you)-[:BOUGHT]->(something)<-[:BOUGHT]-(other)-[:BOUGHT]->(reco)
      WHERE id(you)={id}
      RETURN reco
      Traversal Speeds on Amazon Retail Dataset
      Threads | Hops per second
      1 | 3-4 million
      10 | 17-29 million
      20 | 34-50 million
      30 | 36-60 million
      Neo4j Advantage - Performance
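The `BOUGHT` pattern is a three-hop traversal: from you to something you bought, to another buyer of it, to that buyer's other purchases. A small Python sketch of the same walk follows, using hypothetical purchase data. It adds one step the slide's query does not have, namely filtering out items you already own, and ranks results by how many paths reach them.

```python
from collections import Counter

# Hypothetical purchase data: customer -> set of items bought.
BOUGHT = {
    "you": {"book", "lamp"},
    "ann": {"book", "desk"},
    "dan": {"lamp", "desk", "chair"},
}

def recommend(customer, bought=BOUGHT):
    mine = bought[customer]
    scores = Counter()
    for other, items in bought.items():
        if other == customer:
            continue
        for something in mine & items:        # both you and `other` bought it
            for reco in items - {something}:  # the other buyer's remaining purchases
                if reco not in mine:          # extra filter, not in the slide's query
                    scores[reco] += 1
    return scores.most_common()

print(recommend("you"))  # [('desk', 2), ('chair', 1)]
```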
  49. 49. Fit for Purpose: The Right Architecture for the Right Job
      • Discrete data, minimally connected data: Other NoSQL, Relational DBMS
      • Connected data, focused on data relationships: Graph DB
      Graph databases are designed for data relationships
      • Development Benefits: easy model maintenance, easy query
      • Deployment Benefits: ultra high performance, minimal resource usage
  50. 50. Database Management Systems: Five Key Sub-Patterns (incl. SQL)
      • Tabular: RDBMS
      • Aggregate Oriented (3): Key-Value, Column-Family, Document Database
      • Graph: Graph Database
      Source: Martin Fowler, NoSQL Distilled
  51. 51. NoSQL Databases Don’t Handle Relationships • No data structures to model or store relationships • No query constructs to support data relationships • Relating data requires “JOIN logic” in the application • No ACID support for transactions … making NoSQL databases inappropriate when data relationships are valuable in real-time
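As an illustration of what “JOIN logic” in the application means, here is a minimal Python sketch against a hypothetical key-value style store, where records hold only the ids of related records and the application has to resolve them itself. The store layout and names are assumptions made for the example.

```python
# Hypothetical key-value store: each record only holds the ids of related records.
users = {
    "u1": {"name": "Dan", "friend_ids": ["u2"]},
    "u2": {"name": "Ann", "friend_ids": ["u3"]},
    "u3": {"name": "Bob", "friend_ids": []},
}

def friends_of_friends(user_id, store=users):
    """Resolve a two-hop relationship by hand, one lookup per related record."""
    names = set()
    for friend_id in store[user_id]["friend_ids"]:     # first round of lookups
        for fof_id in store[friend_id]["friend_ids"]:  # second round, per friend
            if fof_id != user_id:
                names.add(store[fof_id]["name"])       # third lookup for the name
    return names

print(friends_of_friends("u1"))  # {'Bob'}
```

Every extra hop adds another nested loop and another round of lookups, and against a remote store each lookup would usually be a network call.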
  52. 52. Data lake Good for Analytics, BI, Map Reduce Non-Operational, Slow Queries RDBMS
  53. 53. Relationship Queries on non-native Graph Architectures
      • Using Graph: UNIFIED, IN-MEMORY MAP. Lightning-fast queries due to replicated in-memory architecture and index-free adjacency
      • Using Other NoSQL to Join Data: MACHINE 1, MACHINE 2, MACHINE 3. Slow queries due to index lookups + network hops
  54. 54. Neo4j Scalability
      • Dynamic pointer compression: unlimited-sized graphs with no performance compromise
      • Index partitioning: auto-partitioning of indexes into 2GB partitions
      • Causal clustering architecture: enables unlimited read scaling with ACID writes and a choice of consistency levels
      • Multi-Data Center Support: creates HA, fault-tolerant global applications
      • Efficient processing: native graph processing and storage often requires 10x less hardware
      • Efficient storage: one-tenth the disk and memory requirements of certain alternatives
      Neo4j Advantage – Scalability
  55. 55. Neo4j Enterprise: Causal Clustering Architecture. Modern and Fault-Tolerant to Guarantee Graph Safety
      Raft-based architecture
      • Continuously available
      • Consensus commits
      • Third-generation cluster architecture
      Cluster-aware stack
      • Seamless integration among drivers, Bolt protocol and cluster
      • No need for external load balancer
      • Stateful, cluster-aware sessions with encrypted connections
      Streamlined development
      • Relieves developers from complex infrastructure concerns
      • Faster and easier to develop distributed graph applications
      Neo4j Advantage – Scalability
  56. 56. The Importance of ACID Graph Writes
      • Graph Transactions Over ACID Consistency: Maintains Integrity Over Time
      • Graph Transactions Over Non-ACID DBMSs: Eventual Consistency Becomes Corrupt Over Time
      Corruption symptoms: • Ghost vertices • Stale indexes • Half-edges • Uni-directed ghost edges
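A half-edge arises when a relationship is recorded on one endpoint but not the other. The toy Python sketch below shows how a failure between the two writes leaves one behind, and how an all-or-nothing write avoids it. The class and its snapshot-based rollback are illustrative assumptions, not how Neo4j implements transactions.

```python
import copy

class TinyGraph:
    def __init__(self):
        self.out_edges = {}  # node -> set of targets
        self.in_edges = {}   # node -> set of sources

    def add_node(self, name):
        self.out_edges.setdefault(name, set())
        self.in_edges.setdefault(name, set())

    def add_edge_unsafe(self, start, end, fail_midway=False):
        # Two separate writes; a crash between them leaves a half-edge.
        self.out_edges[start].add(end)
        if fail_midway:
            raise RuntimeError("crash between the two writes")
        self.in_edges[end].add(start)

    def add_edge_atomic(self, start, end, fail_midway=False):
        # All-or-nothing: restore the snapshot if either write fails.
        snapshot = copy.deepcopy((self.out_edges, self.in_edges))
        try:
            self.add_edge_unsafe(start, end, fail_midway)
        except RuntimeError:
            self.out_edges, self.in_edges = snapshot
            raise

    def half_edges(self):
        # Edges recorded on the start node but missing on the end node.
        return [(s, e) for s, targets in self.out_edges.items()
                for e in targets if s not in self.in_edges.get(e, set())]

for method in ("add_edge_unsafe", "add_edge_atomic"):
    g = TinyGraph()
    g.add_node("a")
    g.add_node("b")
    try:
        getattr(g, method)("a", "b", fail_midway=True)
    except RuntimeError:
        pass
    print(method, g.half_edges())
# add_edge_unsafe [('a', 'b')]
# add_edge_atomic []
```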
  57. 57. Summary of Neo4j: Built for the Enterprise
      • Native Graph Storage: designed, built, and tested for graphs
      • Native Graph Query Processing: for real-time, relationship-based apps; evaluate millions of relationships in a blink
      • Whiteboard-Friendly Data Modeling: faster projects compared to RDBMS
      • Data Integrity and Security: fully ACID transactions, causal consistency and enterprise security
      • Powerful, Expressive Query Language: improved productivity, with 10x to 100x less code than SQL
      • Scalability and High Availability: architecture provides ideal balance of performance, availability, scale for graphs
      • Built-in ETL: seamless import from other databases
      • Integration: fits easily into your IT environment, with drivers and APIs for popular languages
  58. 58. Enterprise-Class Technology: Ready for real-time enterprise applications
      Performance and Scalability
      • Clustered replication across data centers
      • Unlimited graph sizes
      • Intelligent online space reuse
      • Enterprise lock manager
      • Compiled runtime for common queries
      • Kerberos authentication add-on
      • Clustering on CAPI flash add-on
      Monitoring and Administration
      • Advanced monitoring by role
      • Cypher query tracing
      • Hot backups
      • Enterprise security
      Enterprise Schema Governance
      • Property existence constraints
      • Composite and node key constraints
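The two schema-governance items can be read as simple rules: a property existence constraint requires certain properties on every node with a given label, and a node key additionally requires the combination of key properties to be unique. A minimal Python sketch of those rules follows, with hypothetical data and function names; in the database itself these are declared as constraints rather than checked in application code.

```python
def check_constraints(nodes, label, required, key):
    """Return violation messages for nodes carrying `label`."""
    violations, seen_keys = [], set()
    for node in nodes:
        if label not in node["labels"]:
            continue
        props = node["properties"]
        # Property existence: every required property must be present.
        missing = [p for p in required if p not in props]
        if missing:
            violations.append(f"missing {missing} on {props}")
        # Node key: the combination of key properties must be unique.
        key_value = tuple(props.get(p) for p in key)
        if None not in key_value:
            if key_value in seen_keys:
                violations.append(f"duplicate key {key_value}")
            seen_keys.add(key_value)
    return violations

# Hypothetical sample data.
nodes = [
    {"labels": {"Person"}, "properties": {"first": "Dan", "last": "Smith"}},
    {"labels": {"Person"}, "properties": {"first": "Dan", "last": "Smith"}},
    {"labels": {"Person"}, "properties": {"first": "Ann"}},
    {"labels": {"Car"}, "properties": {"brand": "Volvo"}},
]

for message in check_constraints(nodes, "Person", ["first", "last"], ["first", "last"]):
    print(message)
```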
  59. 59. Enterprise-Class Expertise
      Neo4j Customer Success: expert design, development and deployment services
      • Graph and application design
      • Application deployment
      • Data center configuration
      • Developer and user training
      • World-class support with SLAs
      • Support portal and knowledge base
      Graph Innovation Network: worldwide community of Neo4j and graph database experts
      • Service providers
      • OEMs and VARs
      • Technology partners
      • Open source community
      Use Neo4j experts and join the Innovation Network. Develop your apps right the first time.
  60. 60. The Density of the Neo4j Innovation Network
      • Graph Visionaries
      • Enterprise Customers
      • Partners: System Integrators, Trainers, OEMs
      • Cloud: IaaS, PaaS, DBaaS, Marketplace
      • OSS Community: Events, Forums, Add-Ons
      • Tech Ecosystem: OEM & Tech Partners
      • Graph Solutions: Data Science, Architecture, Data Models
      • Commercial Support: Technical Support, Packaged Services, Custom Services
      • Education: Documents, Online Training, Classroom, Custom Onsite
      • Standards Initiatives: openCypher, LDBC
  61. 61. The Neo4j Innovation Network is Your Fastest Path to Your Next Great Idea
      Graph Expertise, Graph Database Platform, Innovation Network
      Enterprise-Grade Innovation Launchpad
      • Neo4j Enterprise Edition
      • HA, Causal Cluster, MDC
      • Better performance
      • Hardened product
      The Next Innovation
      • Density of the network accelerates innovation opportunity
      • Thousands of project successes
      • Partners, Service Providers, Vendors, Academics, Researchers
      Millions of Graph Hours
      • Shrink learning curve
      • Design advice
      • Contextual experience
      • Deploy & Ops support
      Neo4j Commercial Value
  62. 62. Case Studies for Knowledge Graphs and Recommendation Engines Neo4j Case Studies
  63. 63. Real-Time Recommendations Dynamic Pricing Artificial Intelligence & IoT-applications Fraud Detection Network Management Customer Engagement Supply Chain Efficiency Identity and Access Management Relationship-Driven Applications
  64. 64. Additional Case Studies
      Examples of companies that use Neo4j, the world’s leading graph database, for recommendation and personalization engines. Categories: Shopping Recommendations, Product Recommendations, Personalization, Social Network.
      • Adidas uses Neo4j to combine content and product data into a single, searchable graph database, which is used to create a personalized customer experience. “We have many different silos, many different data domains, and in order to make sense out of our data, we needed to bring those together and make them useful for us.” – Sokratis Kartelias, Adidas
      • eBay ShopBot, a personal shopping companion in FB Messenger. “ShopBot uses its Knowledge Graph to understand user requests and generate follow-up questions to refine requests before searching for the items in eBay’s inventory. In a search query for “bags” for example, purple nodes represent “categories,” green “attributes” and pink are “values” for those attributes.” – RJ Pittman Blog, eBay
      • Walmart uses Neo4j to give customers the best web experience through relevant and personal recommendations. “As the current market leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands.” – Marcos Vada, Walmart
      • LinkedIn Chitu seeks to engage Chinese jobseekers through a game-like user interface that is available on both desktop and mobile devices. “The challenge was speed,” said Dong Bin, Manager of Development at Chitu. “Due to the rate of growth we saw from our competitors in the Chinese market, we knew that we had to launch Chitu as quickly as possible.”
  65. 65. Case Study: Knowledge Graphs at eBay
  69. 69. Bags Case Study: Knowledge Graphs at eBay
  70. 70. Men’s Backpack Handbag Case Study: Knowledge Graphs at eBay
  71. 71. Case Study: Knowledge Graphs at eBay. Try it out at: https://shopbot.ebay.com/
  72. 72. Recommendations, Pricing and Routing
      Real-time Package Routing
      • Large postal service with over 500k employees
      • Neo4j routes 7M+ packages daily at peak, with peaks of 5,000+ routing operations per second
      Real-time promotion recommendations
      • Record “Cyber Monday” sales
      • About 35M daily transactions
      • Each transaction is 3-22 hops
      • Queries executed in 4ms or less
      • Replaced IBM WebSphere Commerce
      Real-time pricing engine
      • 300M pricing operations per day
      • 10x transaction throughput on half the hardware compared to Oracle
      • Replaced Oracle database
      • Presentation at http://graphconnect.com/gc2016-sf/
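For a sense of scale, the daily totals quoted on this slide can be converted to average per-second rates. The short Python calculation below uses only the slide's figures and assumes load spread evenly over a 24-hour day, which real traffic is not; the slide's own peak of 5,000+ routing operations per second shows how far bursts exceed the average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def per_second(daily_total):
    """Average rate per second if the daily total were spread evenly."""
    return daily_total / SECONDS_PER_DAY

daily_totals = [
    ("packages routed", 7_000_000),
    ("promotion transactions", 35_000_000),
    ("pricing operations", 300_000_000),
]

for label, daily in daily_totals:
    print(f"{label}: about {per_second(daily):,.0f} per second on average")
# packages routed: about 81 per second on average
# promotion transactions: about 405 per second on average
# pricing operations: about 3,472 per second on average
```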
