GraphTalk Copenhagen - Introduction to Graphs and Neo4j

Dinuke Abeysekera & Fredrik Johansson, Neo4j
Neo4j GraphTalk Copenhagen

  1. Welcome: Fredrik Johansson, Dinuke Abeysekera, Stefan Wendin
  2. Agenda
     10:00 - 12:00 Presentations
       • Introduction to the Neo4j Graph Platform: Fredrik Johansson & Dinuke Abeysekera, Neo4j
       • Killing Data Silos in the Life Sciences with Neo4j: Dave Iberson-Hurst, S-cubed
       • Fraud Detection with Graphs: Marius Hartmann, Danish Business Authority
       • Accelerate Innovation through Graph Thinking: Stefan Wendin, Neo4j
     12:00 Q&A & Networking
  3. Neo4j - The Graph Company: Industry’s Largest Dedicated Investment in Graphs
       • Creator of the Neo4j Graph Platform
       • ~300 employees
       • HQ in Silicon Valley; other offices include London, Munich, Paris and Malmö (Sweden)
       • $80M in funding from Fidelity, Sunstone, Conor, Creandum, and Greenbridge Capital
       • 10M+ downloads
       • 300+ enterprise subscription customers, over half of them with >$1B in revenue
     Adoption: 7/10 top retail firms, 12/25 top financial firms, 8/10 top software vendors
     Ecosystem (figures shown: 500+, 53K+, 100+, 250+, 450+): startups in program, enterprise customers, partners, meetup members, events per year
  4. Neo4j Graph Platform Overview
  5. What Is Different In Neo4j?
  6. What Is Different In Neo4j?
       • Traditional databases: store and retrieve data; real-time storage & retrieval; max # of hops: up to 3
  7. What Is Different In Neo4j?
       • Traditional databases: store and retrieve data; real-time storage & retrieval; max # of hops: up to 3
       • Big data technology: aggregate and filter data; long-running queries, aggregation & filtering; max # of hops: 1
  8. What Is Different In Neo4j?
       • Traditional databases: store and retrieve data; real-time storage & retrieval; max # of hops: up to 3
       • Big data technology: aggregate and filter data; long-running queries, aggregation & filtering; max # of hops: 1
       • Neo4j: connections in data; real-time connected insights; max # of hops: millions
     “Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code” (Volker Pacher, Senior Developer)
  9. What Is Different In Neo4j? Index-Free Adjacency
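The slide names index-free adjacency without spelling it out. As a rough illustration only (this is not Neo4j's storage format, and `Node`, `neighbours_via_edge_table` and `neighbours_via_adjacency` are hypothetical names), the idea is that each node holds direct references to its neighbours, so a hop follows a reference instead of searching a global structure:

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """A node that stores direct references to its neighbours."""
    name: str
    out: list = field(default_factory=list)  # adjacent Node objects


def neighbours_via_edge_table(edge_table, name):
    """Join-style lookup: scan a global (source, target) table for each hop."""
    return [target for source, target in edge_table if source == name]


def neighbours_via_adjacency(node):
    """Index-free adjacency: follow the references stored on the node itself."""
    return [n.name for n in node.out]


# Hypothetical toy data, stored both ways for comparison.
alice, bob, carol = Node("Alice"), Node("Bob"), Node("Carol")
alice.out.extend([bob, carol])
bob.out.append(carol)

edge_table = [("Alice", "Bob"), ("Alice", "Carol"), ("Bob", "Carol")]
```

Both functions return the same neighbours here. The difference is the cost per hop: the table scan grows with the total number of edges, while the adjacency walk depends only on the degree of the node being expanded.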
  10. What’s Different in Neo4j: “Minutes to Milliseconds” Real-Time Query Performance
       [Chart: response time vs. connectedness and size of data set]
       • Relational and other NoSQL databases: 0 to 2 hops, 0 to 3 degrees, thousands of connections
       • Neo4j: tens to hundreds of hops, thousands of degrees, billions of connections
       • 1000x advantage: “minutes to milliseconds”
  11. What Is Different In Neo4j? ACID Graph Writes: A Requirement for Graph Transactions
       • ACID consistency: maintains integrity over time; guaranteed graph consistency
       • Non ‘Graph-ACID’ DBMSs: become corrupt over time; not ‘good enough’ for graphs
  12. What Is Different In Neo4j? Cypher Query Language
     Example: Project Impact
         MATCH (boss)-[:MANAGES*0..3]->(sub),
               (sub)-[:MANAGES*1..3]->(report)
         WHERE boss.name = "John Doe"
         RETURN sub.name AS Subordinate, count(report) AS Total
     Less time writing queries:
       • More time understanding the answers
       • Leaving time to ask the next question
     Less time debugging queries:
       • More time writing the next piece of code
       • Improved quality of overall code base
     Code that’s easier to read:
       • Faster ramp-up for new project members
       • Improved maintainability & troubleshooting
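To make the variable-length patterns `[:MANAGES*0..3]` and `[:MANAGES*1..3]` in the query above more concrete, here is a minimal Python sketch of what they compute. It assumes a tree-shaped hierarchy (so every node is reached by exactly one path), and the org chart, `reachable` and `project_impact` are hypothetical names made up for illustration; it does not reproduce how Neo4j executes Cypher.

```python
def reachable(manages, start, min_hops, max_hops):
    """Return nodes reachable from start in min_hops..max_hops MANAGES steps.

    Assumes a tree-shaped hierarchy, so each node is reached by one path.
    """
    found = []
    frontier = [start]
    for depth in range(max_hops + 1):
        if depth >= min_hops:
            found.extend(frontier)
        # Advance one hop: collect the direct reports of the current frontier.
        frontier = [child for node in frontier for child in manages.get(node, [])]
    return found


def project_impact(manages, boss):
    """For each subordinate within 0..3 hops, count reports within 1..3 hops."""
    totals = {}
    for sub in reachable(manages, boss, 0, 3):
        reports = reachable(manages, sub, 1, 3)
        if reports:  # a MATCH yields no row when the second pattern finds nothing
            totals[sub] = len(reports)
    return totals


# Hypothetical org chart: manager -> direct reports.
manages = {
    "John Doe": ["Ann", "Raj"],
    "Ann": ["Li", "Omar"],
    "Raj": ["Eve"],
    "Li": ["Kim"],
}
impact = project_impact(manages, "John Doe")
```

The two nested traversals take about twenty lines of imperative code here, against four lines of Cypher, which is the point the slide makes about time spent writing and reading queries.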
  13. Neo4j Graph Advantage: Foundational Components
       • 1. Index-Free Adjacency: in memory and on flash/disk
       • 2. ACID Foundation: required for safe writes
       • 3. Full-Stack Clustering: causal consistency
       • 4. Language, Drivers, Tooling: developer experience, graph efficiency, type safety
       • 5. Graph Engine: cost-based optimizer, graph statistics, Cypher runtime
       • 6. Hardware Optimizations: for next-gen infrastructure
  14. Neo4j Enterprise Maturity & Robustness
       • Neo4j Security Foundation
       • Multi-Clustering Support for Global Internet Apps
       • Rolling Upgrades: Neo4j 3.4 now supports rolling upgrades (3.4 to 3.5); upgrade older instances while keeping other members stable and without requiring a restart of the environment
       • Schema Constraints
       • Concurrent/Transactional Write Performance
       • Auto Cache Reheating for Restarts, Restores and Cluster Expansion
  15. Neo4j: Enabling the Connected Enterprise
     Consumers of connected data: data scientists, business users, applications
       • AI & Graph Analytics: sentiment analysis, customer segmentation, machine learning, cognitive computing, community detection
       • Transactional Graphs: fraud detection, real-time recommendations, network and IT operations management, knowledge graphs, master data management
       • Discovery & Visualization: fraud detection, network and IT operations, product information management, risk and portfolio analysis
  16. Neo4j Graph Platform
       • Platform components: Development & Administration, Analytics Tooling, Graph Analytics, Graph Transactions, Data Integration, Discovery & Visualization, Drivers & APIs, AI, openCypher, Cloud
       • Users: business users, developers, admins, data analysts, data scientists, applications
  17. Neo4j Graph Platform: Where We Are Today
     Platform areas: Development & Administration, Analytics Tooling, Graph Analytics, Graph Transactions, Data Integration, Discovery & Visualization, Drivers & APIs, AI
       • Improved Admin Experience
         - Rolling upgrades
         - Brute force attack prevention
         - Fast, resumable backups
         - Cache warming on startup
         - Improved diagnostics
       • Multi-Cluster routing built into Bolt drivers
       • Seabolt & Go Driver
         - Other v1.7 supported drivers: Java, JavaScript, Python, .NET
         - Community drivers: Perl, PHP, Ruby, Erlang, R, Haskell, Clojure, JDBC and many others
       • SparkCypher/Morpheus (pre-EAP)
         - Spark implementation
         - Proposal for getting Cypher into Spark
       • Neo4j Bloom
         - New graph illustration and communication tool for non-technical users
         - Explore and edit graph
         - Search-based
         - Create storyboards
         - Foundation for graph data discovery
         - Integrated with graph platform
       • Graph Data Science
         - High speed graph algorithms
       • Neo4j Database 3.4 & 3.5
         - 70% faster Cypher
         - Native Graph B+Tree indexes (up to 5x faster writes)
         - Full-text search
         - Index-backed optimisation
         - 100B+ bulk importer
         - Date/Time data type
         - 3-D geospatial search
         - Secure, horizontal multi-clustering
         - Property blacklisting
         - Causal Cluster with Raft v2 protocol
         - Hostname verification, intra-cluster discovery encryption
  18. Safe Harbor Roadmap Disclaimer
     The information presented here is Neo4j, Inc. confidential and does not constitute, and should not be construed as, a promise or commitment by Neo4j to develop, market or deliver any particular product, feature or function. Neo4j reserves the right to change its product plans or roadmap at any time, without obligation to notify any person of such changes. The timing and content of Neo4j’s future product releases could differ materially from the expectations discussed herein.
  19. Neo4j 4.0 Milestone Release 2 is Out!
       • For standalone, Causal Cluster and desktop installations
       • Download MR2 as: Windows ZIP, generic tarball, Docker image, Debian and RedHat packages
       • Documentation is available
       • Features:
         - New index population algorithm
         - Increased index key size for the native index provider
         - Transactional ID management
         - Improved space reuse in store files
         - Improved cluster performance
         - New Spring Boot Starter
         - SDN/RX
         - Support for multiple databases
         - Reactive drivers with back-pressure and flow control
         - Schema-based security model
         - Role and user management
         - System database
         - neo4j:// scheme
  20. Graph Visualization Options for Neo4j
       • Neo4j Bloom (provided by Neo4j)
         - Exclusively optimized for Neo4j graphs
         - Deploys easily in Neo4j Desktop and also as web based
         - Focused on graph exploration through a code-free UI
         - Near natural language search
         - Currently caters to data analysts and graph SMEs
         - Currently for individual or small team use
       • Viz Toolkits (3rd party, e.g. vis.js, d3.js, Keylines)
         - Some offer data hooks into Neo4j, others may require custom integration
         - Offer robust APIs for flexible control of the viz output
         - Cater to developers who will create a custom solution, usually with limited interactivity
         - Departmental, enterprise or public use
       • BI Tools (3rd party, e.g. Tableau, Qlik)
         - Not optimized for graph data, may require a special connector
         - UI for dashboard and report creation with many kinds of viz, in addition to graph viz
         - Cater to business users and data analysts
         - Departmental, cross-department or enterprise use
       • Graph Viz Solutions (3rd party, e.g. Linkurious, Tom Sawyer)
         - Have to support multiple graph models and sources
         - Feature UI for exploration or APIs for customizing output and embedding/publishing
         - Solutions may cater to business users, analysts or developers
         - Small team, departmental or cross-department use
     Comparison axes: little technical expertise vs. most technically involved; exploration focused vs. publishing / consumption focused; smaller deployments vs. larger deployments
  21. Neo4j Bloom Overview
       • Perspective: business view of the graph (departmental views, hiding PII, styling)
       • Search: near-natural language search (full-text search, graph patterns, custom search phrases)
       • Visualization: GPU accelerated visualization (high performance physics & rendering)
       • Exploration: direct graph interactions (select, expand, dismiss, find paths)
       • Inspection: node + relationship details (browse from neighbor to neighbor)
       • Editing: create, connect, update (code-free graph changes)
  22. Neo4j Graph Algorithm Library
       • Pathfinding & Search: finds the optimal path or evaluates route availability and quality
       • Centrality: determines the importance of distinct nodes in the network
       • Community Detection: evaluates how a group is clustered or partitioned
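As a small illustration of the pathfinding category above, the sketch below finds a fewest-hops path with breadth-first search on an unweighted graph. It is plain Python over a hypothetical adjacency dictionary, not the Neo4j library's implementation, and `shortest_path` is a name chosen for this example.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Breadth-first search for a fewest-hops path in an unweighted graph."""
    parents = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Rebuild the path by walking parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, []):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None  # target is unreachable


# Hypothetical directed graph: node -> outgoing neighbours.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
```

Calling `shortest_path(graph, "A", "E")` returns one of the fewest-hops routes; when several routes tie, the one found first in neighbour order wins.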
  23. Neo4j Graph Algorithm Library
       - Parallel Breadth First Search & DFS
       - Shortest Path
       - Single-Source Shortest Path
       - All Pairs Shortest Path
       - Minimum Spanning Tree
       - A* Shortest Path
       - Yen’s K Shortest Path
       - K-Spanning Tree (MST)
       - Degree Centrality
       - Closeness Centrality
       - Betweenness Centrality
       - PageRank
       - Wasserman & Faust Closeness Centrality
       - Harmonic Closeness Centrality
       - Dangalchev Closeness Centrality
       - Approx. Betweenness Centrality
       - Personalized PageRank
       - Triangle Count
       - Clustering Coefficients
       - Strongly Connected Components
       - Label Propagation
       - Louvain Modularity
       - Louvain (Multi-step)
       - Balanced Triad (identification)
       - Connected Components (Union Find)
       - Euclidean Distance
       - Cosine Similarity
       - Jaccard Similarity
       - Random Walk
       - One Hot Encoding
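To give a feel for one entry in the list above, Connected Components (Union Find), here is a minimal Python sketch of the union-find idea on an undirected edge list. The function name and the sample data are hypothetical, and this is a teaching sketch, not the library's parallel implementation.

```python
def connected_components(nodes, edges):
    """Group nodes into components with a union-find (disjoint-set) structure."""
    parent = {node: node for node in nodes}

    def find(node):
        # Walk to the root, halving the path as we go.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two sets

    components = {}
    for node in nodes:
        components.setdefault(find(node), set()).add(node)
    return list(components.values())


# Hypothetical undirected graph with two separate groups.
components = connected_components(
    ["a", "b", "c", "d", "e"],
    [("a", "b"), ("b", "c"), ("d", "e")],
)
```

Each edge is processed once and never revisited, which is why union-find suits a single pass over a large relationship list.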
  24. Evolutions in Data
       • RDBMS & Aggregate-Oriented NoSQL
       • Hadoop / MapReduce
       • Phase III: Data Relationships
  25. Packaged Services
       • Project lifecycle: Graph Awareness, Technical Assessment, Solution Implementation, Roll-out / Production
       • Services: Innovation Lab, Bootcamp, Solution Design Workshop, Solution Audit, Staff Augmentation, Product Training
  26. Questions?
  27. Thank You!