
GraphTalk Helsinki - Introduction to Graphs and Neo4j

Dinuke Abeysekera & Fredrik Johansson, Neo4j
Neo4j GraphTalk Helsinki

  1. 1. Welcome
      • Fredrik Johansson, fredrik.johansson@neo4j.com
      • Dinuke Abeysekera, dinuke.abeysekera@neo4j.com
      • Stefan Wendin, stefan.wendin@neo4j.com
      • Jesus Barrasa, jesus.barrasa@neo4j.com
  2. 2. Agenda
      • 10:00 - 12:00: Presentations
        - Introduction to the Neo4j Graph Platform (Fredrik Johansson & Dinuke Abeysekera, Neo4j)
        - Building Intelligent Solutions with Graphs (Jesus Barrasa, Neo4j)
        - Accelerate Innovation through Graph Thinking (Stefan Wendin, Neo4j)
      • 12:00: Q&A & Networking
  3. 3. Neo4j - The Graph Company: Industry’s Largest Dedicated Investment in Graphs
      • Creator of the Neo4j Graph Platform
      • ~300 employees
      • HQ in Silicon Valley; other offices include London, Munich, Paris and Malmö (Sweden)
      • $80M in funding from Fidelity, Sunstone, Conor, Creandum, and Greenbridge Capital
      • Over 10M downloads
      • 300+ enterprise subscription customers, over half with >$1B in revenue
      • Adoption: 7/10 top retail firms, 12/25 top financial firms, 8/10 top software vendors
      • Ecosystem: startups in program, enterprise customers, partners, meetup members, events per year (figures shown: 500+, 53K+, 100+, 250+, 450+)
  4. 4. Intro to Graphs and Neo4j
  5. 5. Graphs are Nodes & Relationships [Diagram: three nodes connected by three relationships]
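
To make "nodes and relationships" concrete, here is a minimal in-memory sketch of a labelled property graph. It is an illustration only, not Neo4j's storage format or API; the class names, the `ann`/`bob`/`acme` data and the relationship types are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with labels and free-form properties."""
    node_id: str
    labels: frozenset
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection between two nodes."""
    start: str
    rel_type: str
    end: str
    properties: dict = field(default_factory=dict)


class Graph:
    def __init__(self):
        self.nodes = {}      # node_id -> Node
        self.outgoing = {}   # node_id -> list of Relationship

    def add_node(self, node_id, *labels, **properties):
        self.nodes[node_id] = Node(node_id, frozenset(labels), properties)
        self.outgoing.setdefault(node_id, [])

    def relate(self, start, rel_type, end, **properties):
        # Both endpoints must already exist, so a relationship never dangles.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must be added before relating them")
        self.outgoing[start].append(Relationship(start, rel_type, end, properties))

    def neighbors(self, node_id, rel_type=None):
        """Follow outgoing relationships, optionally filtered by type."""
        return [r.end for r in self.outgoing[node_id]
                if rel_type is None or r.rel_type == rel_type]


# Hypothetical example data: three nodes, three relationships.
graph = Graph()
graph.add_node("ann", "Person", name="Ann")
graph.add_node("bob", "Person", name="Bob")
graph.add_node("acme", "Company", name="Acme")
graph.relate("ann", "KNOWS", "bob", since=2019)
graph.relate("ann", "WORKS_AT", "acme")
graph.relate("bob", "WORKS_AT", "acme")

print(graph.neighbors("ann"))           # ['bob', 'acme']
print(graph.neighbors("ann", "KNOWS"))  # ['bob']
```

Relationships are stored per start node, so following a connection is a lookup on the node itself rather than a search over all relationships.
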
  6. 6. Some Famous Graphs
      • Social Graph, “People you may know”. Disruptor: Facebook. Industry: Media Ad-business. Relationships: Knows
      • People & Products, “Other people also bought”. Disruptor: Amazon. Industry: Retail. Relationships: Bought, Viewed, Returned
      • People & Content, “You might also like”. Disruptor: Netflix. Industry: Broadcasting Media. Relationships: Watched, Liked, Rated
  7. 7. [Diagram: graph with node labels ACCOUNT, ADDRESS, PERSON A, PERSON B, COMPANY, BANK and BAHAMAS, connected by relationships HAS, LIVES_AT, IS_OFFICER_OF, REGISTERED and WITH]
  8. 8. Neo4j Use Cases
      • Real Time Recommendations
      • Master Data Management
      • Fraud Detection
      • Graph Based Search
      • Network & IT-Operations
      • Identity & Access Management
  9. 9. GRAPH THINKING: Real Time Recommendations [Diagram: graph of VIEWED and BOUGHT relationships]
  10. 10. “As the current market leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands.” Marcos Wada, Software Developer, Walmart
  11. 11. GRAPH THINKING: Master Data Management [Diagram: graph labelled with MANAGES, LEADS, REGION and COLLABORATES]
  12. 12. Neo4j is the heart of Cisco HMP: used for governance and single source of truth and a one-stop shop for all of Cisco’s hierarchies. -Prem Malhotra, Director of Enterprise
  13. 13. GRAPH THINKING: Fraud Detection [Diagram: graph of OPENED_ACCOUNT, HAS, IS_ISSUED and LIVES relationships]
  14. 14. Fraud Detection with Discrete Analysis [Chart: Revolving Debt and Number of Accounts as axes; a region of normal behavior, with outlying points marked INVESTIGATE]
  15. 15. Fraud Detection with Connected Analysis [Chart: Revolving Debt and Number of Accounts as axes; a region of normal behavior and a fraudulent pattern]
  16. 16. Augmented Fraud Detection
      • Discrete analysis
        1. Endpoint-Centric: analysis of users and their end-points
        2. Navigation-Centric: analysis of navigation behavior and suspect patterns
        3. Account-Centric: analysis of anomaly behavior by channel
      • Connected analysis
        4. Cross-Channel: analysis of anomaly behavior correlated across channels
        5. Entity Linking: analysis of relationships to detect organized crime and collusion
  17. 17. Modeling a fraud ring as a graph [Diagram: nodes ACCOUNT HOLDER 1, ACCOUNT HOLDER 2 and ACCOUNT HOLDER 3]
  18. 18. Modeling a fraud ring as a graph [Diagram: the three account holders, now with CREDIT CARD, three BANK ACCOUNT nodes, PHONE NUMBER, SSN 2 and two UNSECURED LOAN nodes]
  19. 19. Modeling a fraud ring as a graph [Diagram: the three account holders with CREDIT CARD, three BANK ACCOUNT nodes, ADDRESS, two PHONE NUMBER nodes, two SSN 2 nodes and two UNSECURED LOAN nodes]
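
The fraud-ring slides suggest that account holders become suspicious when they are connected through shared identifiers such as a phone number, SSN or address. A minimal sketch of that idea, assuming a ring is simply a connected group of holders linked by at least one shared identifier; the function name, the `min_size` threshold and all sample data are hypothetical, and this is not how Neo4j itself detects rings.

```python
from collections import defaultdict


def find_rings(holder_identifiers, min_size=2):
    """Group account holders that are linked through shared identifiers."""
    # Invert the mapping: identifier -> holders that use it.
    holders_by_identifier = defaultdict(set)
    for holder, identifiers in holder_identifiers.items():
        for identifier in identifiers:
            holders_by_identifier[identifier].add(holder)

    # Two holders are linked when they share at least one identifier.
    linked = defaultdict(set)
    for holders in holders_by_identifier.values():
        for holder in holders:
            linked[holder] |= holders - {holder}

    # Collect connected components with a depth-first walk.
    seen, rings = set(), []
    for holder in holder_identifiers:
        if holder in seen:
            continue
        stack, component = [holder], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(linked[current] - component)
        seen |= component
        if len(component) >= min_size:
            rings.append(component)
    return rings


# Hypothetical data: holders 1-3 share identifiers pairwise, holder 4 is isolated.
holders = {
    "holder_1": {("phone", "555-0100"), ("address", "1 Main St")},
    "holder_2": {("phone", "555-0100"), ("ssn", "000-00-0002")},
    "holder_3": {("ssn", "000-00-0002"), ("address", "1 Main St")},
    "holder_4": {("phone", "555-0199")},
}
print(find_rings(holders))
```

No single pair of records looks unusual on its own here; the group only emerges when the shared identifiers are followed as connections, which is the point of the connected-analysis slides.
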
  20. 20. “Graph databases offer new methods of uncovering fraud rings and other sophisticated scams with a high-level of accuracy, and are capable of stopping advanced fraud scenarios in real-time.” Gorka Sadowski, Cyber Security Expert
  21. 21. GRAPH THINKING: Network & IT-Operations [Diagram: graph of BROWSES, CONNECTS, BRIDGES, ROUTES, POWERS, HOSTS and QUERIES relationships]
  22. 22. Uses Neo4j for network topology analysis for big telco service providers
  23. 23. GRAPH THINKING: Identity And Access Management [Diagram: graph of TRUSTS, ID, AUTHENTICATES, OWNS and CAN_READ relationships]
  24. 24. UBS was the recipient of the 2014 Graphie Award for “Best Identity And Access Management App”
  25. 25. Neo4j Graph Platform Overview
  26. 26. What Is Different In Neo4j?
  27. 27. What Is Different In Neo4j?
      • Traditional databases: store and retrieve data; real time storage & retrieval; max # of hops: up to 3
  28. 28. What Is Different In Neo4j?
      • Traditional databases: store and retrieve data; real time storage & retrieval; max # of hops: up to 3
      • Big data technology: aggregate and filter data; long running queries; aggregation & filtering; max # of hops: 1
  29. 29. What Is Different In Neo4j?
      • Traditional databases: store and retrieve data; real time storage & retrieval; max # of hops: up to 3
      • Big data technology: aggregate and filter data; long running queries; aggregation & filtering; max # of hops: 1
      • Neo4j: connections in data; real-time connected insights; max # of hops: millions
      • “Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code” Volker Pacher, Senior Developer
  30. 30. What Is Different In Neo4j? Index-Free Adjacency
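
Index-free adjacency is usually described as each node holding direct references to its neighbours, so a hop costs work proportional to that node's relationships rather than to the size of the whole data set. The sketch below contrasts that with a join-style scan over an edge table. It is a toy model under that assumption, not Neo4j's storage engine; every name and the sample edges are hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.out = []  # direct references to neighbouring Node objects


def build_nodes(edge_table):
    """Turn (source, target) rows into Node objects that point at each other."""
    nodes = {}
    for src, dst in edge_table:
        for name in (src, dst):
            nodes.setdefault(name, Node(name))
        nodes[src].out.append(nodes[dst])
    return nodes


def hops_by_scan(edge_table, start, hops):
    """Join-style traversal: every hop scans the whole edge table."""
    frontier = {start}
    for _ in range(hops):
        frontier = {dst for src, dst in edge_table if src in frontier}
    return frontier


def hops_by_adjacency(start_node, hops):
    """Pointer-chasing traversal: every hop touches only the frontier's own edges."""
    frontier = {start_node}
    for _ in range(hops):
        frontier = {neighbour for node in frontier for neighbour in node.out}
    return {node.name for node in frontier}


# Hypothetical edge data; ("x", "y") is unrelated to the traversal from "a".
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")]
nodes = build_nodes(edges)
print(hops_by_scan(edges, "a", 2))           # {'c'}
print(hops_by_adjacency(nodes["a"], 2))      # {'c'}
```

Both functions return the same answer. The difference is that `hops_by_scan` reads the unrelated `("x", "y")` row on every hop, while `hops_by_adjacency` never sees it.
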
  31. 31. What’s Different in Neo4j: “Minutes to Milliseconds” Real-Time Query Performance
      [Chart: response time against connectedness and size of data set]
      • Relational and other NoSQL databases: 0 to 2 hops, 0 to 3 degrees, thousands of connections
      • Neo4j: tens to hundreds of hops, thousands of degrees, billions of connections; 1000x advantage, “minutes to milliseconds”
  32. 32. What Is Different In Neo4j? ACID Graph Writes: A Requirement for Graph Transactions
      • ACID consistency: maintains integrity over time; guaranteed graph consistency
      • Non ‘Graph-ACID’ DBMSs: become corrupt over time; not ‘good enough’ for graphs
  33. 33. What Is Different In Neo4j? Cypher Query Language
      MATCH (boss)-[:MANAGES*0..3]->(sub),
            (sub)-[:MANAGES*1..3]->(report)
      WHERE boss.name = "John Doe"
      RETURN sub.name AS Subordinate, count(report) AS Total
      Project Impact
      • Less time writing queries: more time understanding the answers; leaving time to ask the next question
      • Less time debugging queries: more time writing the next piece of code; improved quality of overall code base
      • Code that’s easier to read: faster ramp-up for new project members; improved maintainability & troubleshooting
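
To show what the Cypher query on this slide computes, here is a rough Python equivalent. It is a sketch under one stated assumption: the management hierarchy is a tree (each person has at most one manager), so counting matched paths is the same as counting people. The `manages` dictionary and every name in it except "John Doe" are hypothetical.

```python
def reachable(manages, start, min_hops, max_hops):
    """People reached from `start` by following min_hops..max_hops MANAGES steps."""
    found = []
    frontier = [start]
    for hops in range(max_hops + 1):
        if hops >= min_hops:
            found.extend(frontier)
        # Advance one level down the hierarchy.
        frontier = [report
                    for person in frontier
                    for report in manages.get(person, [])]
    return found


def project_impact(manages, boss):
    """Map each subordinate (0..3 levels below boss) to their report count (1..3 levels)."""
    impact = {}
    for sub in reachable(manages, boss, 0, 3):
        total = len(reachable(manages, sub, 1, 3))
        # Like MATCH, drop subordinates for whom the second pattern finds nothing.
        if total:
            impact[sub] = total
    return impact


# Hypothetical hierarchy: manager -> direct reports.
manages = {
    "John Doe": ["Ann", "Bob"],
    "Ann": ["Cid", "Dee"],
    "Cid": ["Eve"],
}
print(project_impact(manages, "John Doe"))
# {'John Doe': 5, 'Ann': 3, 'Cid': 1}
```

The two helper functions mirror the two variable-length patterns, `*0..3` and `*1..3`. In Cypher the depth bounds are part of the pattern itself, which is where the slide's "less code" claim comes from.
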
  34. 34. Neo4j Graph Advantage: Foundational Components
      1. Index-Free Adjacency: in memory and on flash/disk
      2. ACID Foundation: required for safe writes
      3. Full-Stack Clustering: causal consistency
      4. Language, Drivers, Tooling: developer experience, graph efficiency, type safety
      5. Graph Engine: cost-based optimizer, graph statistics, Cypher runtime
      6. Hardware Optimizations: for next-gen infrastructure
  35. 35. Neo4j Enterprise Maturity & Robustness
      • Neo4j Security Foundation
      • Multi-Clustering Support for Global Internet Apps
      • Rolling Upgrades: Neo4j 3.4 now supports rolling upgrades. Upgrade older instances while keeping other members stable and without requiring a restart of the environment [Diagram labels: 3.4, 3.5]
      • Schema Constraints
      • Concurrent/Transactional Write Performance
      • Auto Cache Reheating for Restarts, Restores and Cluster Expansion
  36. 36. Neo4j: Enabling the Connected Enterprise
      • Consumers of connected data: data scientists, business users, applications
      • AI & Graph Analytics: sentiment analysis; customer segmentation; machine learning; cognitive computing; community detection
      • Transactional Graphs: fraud detection; real-time recommendations; network and IT operations management; knowledge graphs; master data management
      • Discovery & Visualization: fraud detection; network and IT operations; product information management; risk and portfolio analysis
  37. 37. Neo4j Graph Platform [Diagram: components Development & Administration, Analytics Tooling, Graph Analytics, Graph Transactions, Data Integration, Discovery & Visualization, Drivers & APIs, AI, openCypher and Cloud; users: business users, developers, admins, data analysts, data scientists and applications]
  38. 38. Neo4j Graph Platform: Where We Are Today
      • Platform components: Development & Administration, Analytics Tooling, Graph Analytics, Graph Transactions, Data Integration, Discovery & Visualization, Drivers & APIs, AI
      • Improved Admin Experience
        - Rolling upgrades
        - Brute force attack prevention
        - Fast, resumable backups
        - Cache warming on startup
        - Improved diagnostics
      • Drivers
        - Multi-Cluster routing built into Bolt drivers
        - Seabolt & Go Driver
        - Other v1.7 supported drivers: Java, JavaScript, Python, .NET
        - Community drivers: Perl, PHP, Ruby, Erlang, R, Haskell, Clojure, JDBC and many others
      • SparkCypher/Morpheus (pre-EAP)
        - Spark implementation
        - Proposal for getting Cypher into Spark
      • Neo4j Bloom
        - New graph illustration and communication tool for non-technical users
        - Explore and edit graph
        - Search-based
        - Create storyboards
        - Foundation for graph data discovery
        - Integrated with graph platform
      • Graph Data Science: high speed graph algorithms
      • Neo4j Database 3.4 & 3.5
        - 70% faster Cypher
        - Native Graph B+Tree indexes (up to 5x faster writes)
        - Full-text search
        - Index-backed optimisation
        - 100B+ bulk importer
        - Date/Time data type
        - 3-D geospatial search
        - Secure, horizontal multi-clustering
        - Property blacklisting
        - Causal Cluster with Raft v2 protocol
        - Hostname verification, intra-cluster discovery encryption
  39. 39. Safe Harbor Roadmap Disclaimer: The information presented here is Neo4j, Inc. confidential and does not constitute, and should not be construed as, a promise or commitment by Neo4j to develop, market or deliver any particular product, feature or function. Neo4j reserves the right to change its product plans or roadmap at any time, without obligation to notify any person of such changes. The timing and content of Neo4j’s future product releases could differ materially from the expectations discussed herein.
  40. 40. Neo4j 4.0 Milestone Release 2 is Out!
      • For standalone, Causal Cluster and desktop installations
      • Download MR2 from https://neo4j.com/download-center/#prerelease
        - Windows ZIP, generic tarball, Docker image, Debian and RedHat packages
      • Documentation here
      • Features:
        - New index population algorithm
        - Increased index key size for the native index provider
        - Transactional ID management
        - Improved space reuse in store files
        - Improved cluster performance
        - New Spring Boot Starter
        - SDN/RX
        - Support for multiple databases
        - Reactive drivers with back-pressure and flow control
        - Schema-based security model
        - Role and user management
        - System database
        - neo4j:// scheme
  41. 41. Graph Visualization Options for Neo4j
      • Neo4j Bloom
        - Provided by Neo4j
        - Exclusively optimized for Neo4j graphs
        - Deploys easily in Neo4j Desktop and also as web based
        - Focused on graph exploration through a code-free UI
        - Near natural language search
        - Currently caters to data analysts and graph SMEs
        - Currently for individual or small team use
      • Viz Toolkits
        - 3rd party, e.g. vis.js, d3.js, Keylines
        - Some offer data hooks into Neo4j, others may require custom integration
        - Offer robust APIs for flexible control of the viz output
        - Cater to developers who will create a custom solution, usually with limited interactivity
        - Departmental, enterprise or public use
      • BI Tools
        - 3rd party, e.g. Tableau, Qlik
        - Not optimized for graph data, may require a special connector
        - UI for dashboard and report creation with many kinds of viz, in addition to graph viz
        - Cater to business users and data analysts
        - Departmental, cross-department or enterprise use
      • Graph Viz Solutions
        - 3rd party, e.g. Linkurious, Tom Sawyer
        - Have to support multiple graph models and sources
        - Feature UI for exploration or APIs for customizing output and embedding/publishing
        - Solutions may cater to business users, analysts or developers
        - Small team, departmental or cross-department use
      • Scales shown: little technical expertise to most technically involved; exploration focused to publishing / consumption focused; smaller deployments to larger deployments
  42. 42. Neo4j Bloom Overview
      • Perspective: business view of the graph; departmental views; hiding PII; styling
      • Search: near-natural language search; full-text search; graph patterns; custom search phrases
      • Visualization: GPU accelerated visualization; high performance physics & rendering
      • Exploration: direct graph interactions; select, expand, dismiss, find paths
      • Inspection: node + relationship details; browse from neighbor to neighbor
      • Editing: create, connect, update; code-free graph changes
  43. 43. Neo4j Graph Algorithm Library
      • Pathfinding & Search: finds the optimal path or evaluates route availability and quality
      • Centrality: determines the importance of distinct nodes in the network
      • Community Detection: evaluates how a group is clustered or partitioned
  44. 44. Neo4j Graph Algorithm Library
      - Parallel Breadth First Search & DFS
      - Shortest Path
      - Single-Source Shortest Path
      - All Pairs Shortest Path
      - Minimum Spanning Tree
      - A* Shortest Path
      - Yen’s K Shortest Path
      - K-Spanning Tree (MST)
      - Degree Centrality
      - Closeness Centrality
      - Betweenness Centrality
      - PageRank
      - Wasserman & Faust Closeness Centrality
      - Harmonic Closeness Centrality
      - Dangalchev Closeness Centrality
      - Approx. Betweenness Centrality
      - Personalized PageRank
      - Triangle Count
      - Clustering Coefficients
      - Strongly Connected Components
      - Label Propagation
      - Louvain Modularity
      - Louvain (Multi-step)
      - Balanced Triad (identification)
      - Connected Components (Union Find)
      - Euclidean Distance
      - Cosine Similarity
      - Jaccard Similarity
      - Random Walk
      - One Hot Encoding
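
Two of the listed algorithms are small enough to sketch directly: an unweighted shortest path found by breadth-first search, and Jaccard similarity between two neighbour sets. These are textbook versions for illustration, not the Neo4j library's implementations or its procedure names; the sample graph and sets are hypothetical.

```python
from collections import deque


def shortest_path(adjacency, source, target):
    """Fewest-hop path from source to target via breadth-first search, or None."""
    previous = {source: None}   # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor chain back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 for two empty sets)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical directed graph as adjacency lists.
adjacency = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
print(shortest_path(adjacency, "A", "E"))   # ['A', 'B', 'D', 'E']
print(jaccard({1, 2, 3}, {2, 3, 4}))        # 0.5
```

In a recommendation setting, `jaccard` could compare the sets of products two customers bought, which ties back to the "other people also bought" use case earlier in the deck.
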
  45. 45. Evolutions in Data [Diagram: RDBMS & Aggregate-Oriented NoSQL; Hadoop / MapReduce; Phase III: Data Relationships]
  46. 46. Packaged Services
      • Project lifecycle: Graph Awareness, Technical Assessment, Solution Implementation, Roll-out / Production
      • Packaged services: Innovation Lab, Bootcamp, Solution Design Workshop, Solution Audit, Staff Augmentation, Product Training
  47. 47. Questions?
  48. 48. Thank You!
