Introduction to Neo4j

Data is both our most valuable asset and our biggest ongoing challenge. As data grows in volume, variety and complexity, across applications, clouds and siloed systems, traditional ways of working with data no longer work.

Unlike traditional databases, which arrange data in rows, columns and tables, Neo4j has a flexible structure defined by stored relationships between data records.

We'll discuss the primary use cases for graph databases, explore the properties of Neo4j that make those use cases possible, look into the visualisation of graphs, and introduce how to write queries.

Webinar, 23 July 2020


  1. An Introduction to Neo4j. Anthony Flynn, Sales Director UK & Ireland
  2. Neo4j - The Graph Company: The Industry’s Largest Dedicated Investment in Graphs • Creator of the Labeled Property Graph and the Cypher language at the core of the GQL ISO project • Thousands of customers worldwide • HQ in Silicon Valley; offices include London, Munich, Paris & Malmö • Industry leaders use Neo4j: 7/10 top retail firms, 20/25 top financial firms, 7/10 top software vendors • Any way you like it: on-prem, DB-as-a-service, in the cloud
  3. Connections in Data are as valuable as the Data itself • Networks of People (Knows): e.g., employees, customers, suppliers, partners, influencers • Transaction Networks (Bought, Viewed, Returned): e.g., risk management, supply chain, payments • Knowledge Networks (Plays, Lives_in, In_sport, Likes, Fan_of, Plays_for): e.g., enterprise content, domain-specific content, eCommerce content
  4. Harnessing Connections Drives Business Value • Enhanced Decision Making • Hyper Personalization • Massive Data Integration • Data Driven Discovery & Innovation • Transforming Industries: Product Recommendations, Personalized Health Care, Media and Advertising, Fraud Prevention, Network Analysis, Law Enforcement, Drug Discovery, Intelligence and Crime Detection, Product & Process Innovation, 360º view of customer, Compliance, Optimize Operations, Data Science, AI & ML, Fraud Prediction, Patient Journey, Customer Disambiguation
  5. Neo4j is an enterprise-grade native graph database and associated tools: • Store, reveal and query data and data relationships • Traverse and analyze data to many levels of depth in real time • Add context to AI systems and network structures to data science. Native Graph Technology • Performance • ACID Transactions • Schema-free Agility • Graph Algorithms. Designed, built and tested natively for graphs from the start for: • Developer Productivity • Hardware Efficiency • Enterprise Scale • Graph Adoption. [Diagram: Analytics Tooling, Graph Transactions, Data Integration, Dev. & Admin, Drivers & APIs, Discovery & Visualization, Graph Analytics]
  6. Handling Large Graph Workloads for Enterprises • Real-time promotion recommendations: record “Cyber Monday” sales; about 35M daily transactions; each transaction is 3-22 hops; queries executed in 4 ms or less; replaced IBM WebSphere Commerce • Marriott’s Real-time Pricing Engine: 300M pricing operations per day; 10x transaction throughput on half the hardware compared to Oracle, which Neo4j replaced • Handling Package Routing in Real-Time: large postal service with over 500k employees; Neo4j routes 7M+ packages daily at peak, with peaks of 5,000+ routing operations per second
  7. Improving Analytics, ML & AI Across Industries • Meredith, Marketing to the Anonymous: the media conglomerate Meredith uses Neo4j to turn data about its largely anonymous website visitors into customer profiles by graphing the data into billions of nodes and then applying machine learning to it • Top 10 Bank, Financial Fraud Detection & Recovery: almost 70% of credit card fraud was missed; 1B+ nodes and 1B+ relationships to analyse; graph analytics with queries & algorithms help find tens of millions of dollars of fraud in the first year • AstraZeneca, Patient Journeys: early intervention project with 3 years of visits, tests & diagnoses with tens of billions of records; finding similarities in patient journeys; graph algorithms for identifying communities & best intervention points
  8. Graph Technology
  9. The Whiteboard Model Is the Physical Model
  10. Neo4j Invented the Labeled Property Graph Model • Nodes: can have Labels to classify nodes; Labels have native indexes • Relationships: relate nodes by type and direction • Properties: attributes of Nodes & Relationships; stored as Name/Value pairs; can have indexes and composite indexes; visibility security by user/role. [Diagram: two PERSON nodes (name: “Dan”, born: May 29, 1970, twitter: “@dan”; name: “Ann”, born: Dec 5, 1975) and a CAR node (brand: “Volvo”, model: “V70”), connected by MARRIED TO, LIVES WITH, OWNS and DRIVES relationships; other properties shown: since: Jan 10, 2011; Latitude: 37.5629900°; Longitude: -122.3255300°]
  11. Cypher: Powerful & Expressive Query Language • MATCH (:Person {name: "Dan"})-[:MARRIED_TO]->(spouse) • Annotated parts of the pattern: NODE, RELATIONSHIP TYPE, LABEL, PROPERTY, VARIABLE. [Diagram: Dan, MARRIED_TO, Ann]
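
The labeled property graph on slide 10 and the pattern on slide 11 can be mimicked in a few lines of plain Python. This is only a sketch of the data model, not how Neo4j stores or matches data. The `Node`, `Rel` and `match` names are hypothetical, and which person drives or owns the car (and which relationship carries `since`) is an assumption, since the transcript does not say.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set   # labels classify the node, e.g. {"Person"}
    props: dict   # properties stored as name/value pairs


@dataclass
class Rel:
    start: Node   # relationships have a direction: start -> end
    rel_type: str
    end: Node
    props: dict = field(default_factory=dict)


dan = Node({"Person"}, {"name": "Dan", "born": "1970-05-29", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "1975-12-05"})
car = Node({"Car"}, {"brand": "Volvo", "model": "V70"})

rels = [
    Rel(dan, "MARRIED_TO", ann),
    Rel(dan, "DRIVES", car, {"since": "2011-01-10"}),  # assumed endpoints
    Rel(ann, "OWNS", car),                             # assumed endpoints
]


def match(rels, label, props, rel_type):
    """Mimic MATCH (:label {props})-[:rel_type]->(x) and return each x."""
    return [
        r.end
        for r in rels
        if r.rel_type == rel_type
        and label in r.start.labels
        and all(r.start.props.get(k) == v for k, v in props.items())
    ]


spouses = match(rels, "Person", {"name": "Dan"}, "MARRIED_TO")
print([s.props["name"] for s in spouses])  # prints ['Ann']
```

Because the relationship is directed, matching from Ann with the same pattern returns nothing in this sketch.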
  12. Express Complex Queries Easily with Cypher • Find all direct reports and how many people they manage, up to 3 levels down • Cypher query (shown beside the equivalent SQL query): MATCH (boss)-[:MANAGES*0..3]->(sub), (sub)-[:MANAGES*1..3]->(report) WHERE = "Jane Doe" RETURN AS Subordinate, count(report) AS Total
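
The variable-length patterns on slide 12 (`*0..3` and `*1..3`) bound how many MANAGES hops a match may span. Below is a minimal Python sketch of that traversal under stated assumptions: the org chart and every name other than Jane Doe are hypothetical, the chart is a tree (so counting paths equals counting people), and the filter and the returned value are taken to be the boss's and the subordinate's names, which the transcript does not show.

```python
from collections import deque

# Hypothetical org chart: manager -> direct reports (a tree).
MANAGES = {
    "Jane Doe": ["Ali", "Bea"],
    "Ali": ["Cy", "Di"],
    "Cy": ["Eve"],
}


def within_hops(graph, start, min_hops, max_hops):
    """Return people reachable from `start` in min_hops..max_hops MANAGES hops."""
    found = []
    queue = deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth >= min_hops:
            found.append(person)
        if depth < max_hops:
            for report in graph.get(person, []):
                queue.append((report, depth + 1))
    return found


def report_totals(graph, boss):
    """For each subordinate within 0..3 hops of `boss`, count reports 1..3 hops below."""
    totals = {}
    for sub in within_hops(graph, boss, 0, 3):
        count = len(within_hops(graph, sub, 1, 3))
        if count:  # a MATCH yields no row for people with no reports
            totals[sub] = count
    return totals


print(report_totals(MANAGES, "Jane Doe"))
# prints {'Jane Doe': 5, 'Ali': 3, 'Cy': 1}
```

The `*0..3` lower bound of zero is why the boss herself appears in the result alongside her subordinates.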
  13. Relational Versus Graph Models • Relational Model: tables labeled Person, Person-Friend and Friend (ANDREAS, DELIA, TOBIAS, MICA) • Graph Model: ANDREAS, TOBIAS, MICA and DELIA nodes connected by KNOWS relationships
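
The contrast on slide 13 can be illustrated with the same "who does Andreas know?" question asked of both shapes. This is a simplified sketch of the idea, not of how either kind of engine is implemented. The ids, the table layout and who knows whom are assumptions, since the transcript lists only the names.

```python
# Relational shape: a person table plus a person-friend join table of id pairs.
PERSON = {1: "ANDREAS", 2: "TOBIAS", 3: "MICA", 4: "DELIA"}
PERSON_FRIEND = [(1, 2), (1, 3), (1, 4)]  # assumed rows: (person_id, friend_id)


def friends_relational(name):
    """Look up the person's id, scan the join table, then resolve friend ids."""
    person_id = next(pid for pid, n in PERSON.items() if n == name)
    return [PERSON[fid] for pid, fid in PERSON_FRIEND if pid == person_id]


# Graph shape: each node keeps its own outgoing KNOWS relationships.
KNOWS = {
    "ANDREAS": ["TOBIAS", "MICA", "DELIA"],  # assumed relationships
    "TOBIAS": [],
    "MICA": [],
    "DELIA": [],
}


def friends_graph(name):
    """Follow the relationships stored with the node; no join table involved."""
    return KNOWS[name]


print(friends_relational("ANDREAS"))  # prints ['TOBIAS', 'MICA', 'DELIA']
print(friends_graph("ANDREAS"))       # prints ['TOBIAS', 'MICA', 'DELIA']
```

Both functions return the same answer; the difference is that the relational version resolves the connection through a separate table at query time, while the graph version reads relationships that are stored with the record itself.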
  14. Native Graph Technology for Applications & Analytics. [Diagram: Analytics Tooling, Graph Transactions, Data Integration, Dev. & Admin, Drivers & APIs, Discovery & Visualization, Graph Analytics; users: Developers, Admins, Applications, Business Users, Data Analysts, Data Scientists; Enterprise Data Hub]
  15. Neo4j Clustering - Causal Cluster. [Diagram: Application Servers; Core Servers (synced cluster) taking writes (transactions); Read Replicas (replica servers: query, view) serving reads (graph queries); async replication]
  16. The Neo4j GDS Library • Robust Graph Algorithms: run on the loaded graph to compute metrics about the topology and connectivity; highly parallelized and scale to tens of billions of nodes • Efficient & Flexible Analytics Workspace: automatically reshapes transactional graphs into an in-memory analytics graph; optimized for analytics with global traversals and aggregation; create workflows and layer algorithms. [Diagram: Mutable In-Memory Workspace, Computational Graph, Native Graph Store]
  17. 50+ Algorithms in the Neo4j GDS Library • Pathfinding & Search: Shortest Path, Single-Source Shortest Path, All Pairs Shortest Path, A* Shortest Path, Yen’s K Shortest Path, Minimum Weight Spanning Tree, K-Spanning Tree (MST), Random Walk • Centrality / Importance: Degree Centrality, Closeness Centrality, CC Variations (Harmonic, Dangalchev, Wasserman & Faust), Betweenness Centrality & Approximate, PageRank, Personalized PageRank, ArticleRank, Eigenvector Centrality • Community Detection: Triangle Count, Clustering Coefficients, Connected Components (Union Find), Strongly Connected Components, Label Propagation, Louvain Modularity, K-1 Coloring • Similarity: Euclidean Distance, Cosine Similarity, Node Similarity (Jaccard), Overlap Similarity, Pearson Similarity, Approximate KNN • Link Prediction: Adamic Adar, Common Neighbors, Preferential Attachment, Resource Allocation, Same Community, Total Neighbors • Auxiliary Functions: Random graph generation, Encoding, Distributions & metrics
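
To give a feel for two of the measures named on slide 17, here is a small pure-Python sketch of Node Similarity (Jaccard) and the Adamic Adar link-prediction score, using their textbook definitions. It does not call the GDS library and says nothing about how GDS implements them. The graph, the function names and the choice of natural logarithm are assumptions made for illustration.

```python
import math

# Hypothetical undirected graph: node -> set of neighbours.
NEIGHBORS = {
    "A": {"C", "D"},
    "B": {"C", "D", "E"},
    "C": {"A", "B"},
    "D": {"A", "B", "E"},
    "E": {"B", "D"},
}


def jaccard(graph, a, b):
    """Shared neighbours divided by all neighbours of either node."""
    union = graph[a] | graph[b]
    return len(graph[a] & graph[b]) / len(union) if union else 0.0


def adamic_adar(graph, a, b):
    """Sum 1 / ln(degree) over common neighbours; rarer hubs count for more.

    A common neighbour of two distinct nodes has degree >= 2, so ln(degree) > 0.
    """
    return sum(1.0 / math.log(len(graph[z])) for z in graph[a] & graph[b])


print(round(jaccard(NEIGHBORS, "A", "B"), 3))      # prints 0.667
print(round(adamic_adar(NEIGHBORS, "A", "B"), 3))  # prints 2.353
```

A and B share two of their three combined neighbours, which gives the Jaccard score of 2/3; the Adamic Adar score weights the shared neighbour C (degree 2) more heavily than D (degree 3).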
  18. Neo4j for Graph Data Science™ • Neo4j Graph Data Science Library: scalable graph algorithms & analytics workspace • Neo4j Database: native graph creation & persistence • Neo4j Bloom: visual graph exploration & prototyping • Practical, Integrated, Intuitive
  19. Explore & Collaborate with Neo4j Bloom • Explore Graphs Visually • Prototype Concepts Faster • Collaborate Across Teams • Perspective, Search, Visualize, Explore, Inspect, Edit
  20. Neo4j Bloom’s Intuitive User Interface • Search with type-ahead suggestions • Flexible color, size and icon schemes • Visualize, explore and discover • Pan, zoom and select • Property browser and editor
  21. Neo4j in the Cloud
  22. Neo4j Cloud offerings to suit every need • Database-as-a-service: cloud-native service; zero administration; pay-as-you-go; self-service deployment; cloud-native stack; no access to underlying infra and systems • Self-hosted: self-hosted and managed; any cloud (AWS, GCP, Azure); bring-your-own-license; self-manage software and infra in own private cloud; own data, tenant, security; >50% deploy this way • Cloud Managed Services (CMS): white-glove, fully managed service by Neo4j experts; fully customizable deployment model and service levels; operate in own data centers or Virtual Private Cloud
  23. Neo4j Aura: Built for the best developer experience. Neo4j’s open source roots, backed by the strongest graph community, help deliver the best developer experience to rapidly build rich graph-powered applications • Easy: start in minutes; automatic upgrades and patches; scale on demand instantly; zero downtime • Powerful: lightning-fast queries with native graph engine; flexible “whiteboard” data model; Cypher: expressive, efficient and easy; broad language driver support • Reliable: end-to-end encrypted; always on; globally available on world-class infrastructure; self-healing, durable; ACID compliant • Affordable: pay-as-you-go; capacity-based pricing; billing by the hour, starting as low as 9¢/hr; simple and predictable bills
  24. Neo4j Cloud Managed Services (CMS): enterprise-class, white-glove managed services for day-to-day operations, service and support of your Neo4j environment • Dedicated team, always on-call • Advanced monitoring and preventative maintenance • Enterprise-grade security and compliance • 24x7x365 remote services and support • Big Three clouds, private cloud, or on-premises • Your data in your infrastructure, fully controlled versioning
  25. The CMS Advantage • Focus on Innovation … while we manage your day-to-day infrastructure operations • Achieve Faster Time-to-value … with experts to manage your environment from day one. Minimize hiring, in-house training, and ramp-up. • Reduce your Risk … and meet your security, compliance and business continuity needs with proven best practices. • Accelerate your Cloud Journey … by enabling a fully managed enterprise cloud environment and moving your production Neo4j environment within days.
  26. Real-World Success Stories
  27. Graph Use Cases by Value Proposition • Recommendations • Dynamic Pricing • IoT-applications • Fraud Detection • Real-Time Transaction Applications • Generate and Protect Revenue • Customer Engagement • Metadata and Advanced Analytics • Data Lake Integration • Knowledge Graphs for AI • Risk Mitigation • Generate Actionable Insights • Network Management • Supply Chain Efficiency • Identity and Access Management • Internal Business Processes • Improve Efficiency and Cut Costs
  28. Dun & Bradstreet: Neo4j for Tracking Beneficial Ownership. Background ● Regulations and requirements around beneficial ownership ● Needed to let B2B clients book new business promptly via accelerated due diligence investigations. Business Problem ● Investigations call for highly trained staff, and this activity is hard to scale. A single query might tie up key people for 10-15 days, resulting in lost revenue. Solution and Benefits ● Use Neo4j to quickly query historic relationships between business owners and companies ● Query responses take milliseconds versus days of skilled manual research
  29. Adobe Behance: Social Network of 10M Graphic Artists. Background ● Social network of 10M graphic artists ● Peer-to-peer evaluation of art and works-in-progress ● Job sourcing site for creatives ● Massive: millions of updates (reads & writes) to the Activity Feed ● From 150 MongoDB instances to 48 Cassandra instances to 3 Neo4j instances. Business Problem ● Artists subscribe to, appreciate and curate “galleries” of their own works and those of other artists ● The Activity Feed is how everyone receives updates ● 1st implementation was 150 MongoDB instances ● 2nd implementation shrank to 48 Cassandra instances, but it was still too slow and required heavy IT overhead. Solution and Benefits ● 3rd implementation shrank to 3 Neo4j instances ● Saved over $500k in annual AWS fees ● Reduced data footprint from 50TB to 40GB ● Significantly easier to introduce new features like “New projects in your Network”
  30. US Army / Calibre Systems: Equipment Logistics. Background ● US IT consulting firm helped the US Army streamline equipment deployments and maintenance spending ● Saving lives by improving the operational readiness of Army equipment like tanks, radios, transports, aircraft, weaponry, etc. Business Problem ● Needed to modernize procurement, budget and logistics processes for equipment & spare parts ● Millions of connections among a tank’s bill-of-materials, for example ● Improve “what if” cost calculations when planning missions and troop deployments ● Mainframe systems required over 60 man-hours to calculate changes, so planning took too long. Solution and Benefits ● 118M nodes & 185M relationships ● Cut cost estimation times by 88% ● Improved parts delivery timing and accuracy ● DBA labor required dropped by 77% ● Equipment TCO more predictable ● Safer soldiers
  31. Caterpillar: Heavy Equipment Manufacturing. Background ● Fortune 100 heavy equipment manufacturer ● 27 million warranty & service documents parsed ● Foundation for AI-based supply chain management. Business Problem ● Improve maintenance predictability ● Need a knowledge base for 27 million warranty documents and maintenance orders ● Graphs gather context for AI to identify ‘prime examples’ of connections among parts, suppliers, customers and their mechanics, to anticipate when equipment will need servicing and by whom. Solution and Benefits ● Text to knowledge graph ● Common ontology for complaints, symptoms & parts ● Anticipates when equipment will need servicing ● Improves customer and brand satisfaction ● Maximizes lifespan and value of equipment
  32. Improving Patient Outcomes • Global pharmaceutical with $22.1 billion revenue; focus on oncology, cardiovascular, renal, metabolism & respiratory • Challenge: Better intervention for complex diseases: complex diseases develop over years with many, many doctor visits, tests and evolving diagnoses; how to identify early warnings, intervene faster & improve outcomes? No two patients are the same, so how are similarities found? • Neo4j GDS to Map & Predict Patient Journeys: kidney disease intervention project; 3 years of visits, tests & diagnoses with tens of billions of records; knowledge graph, graph queries & algorithms; community detection to help find similarities over time; finding influence points where experienced physicians may be able to guide and assist; looking forward to path-based embeddings
  33. Let’s Do Something Amazing Together… Try Neo4j today: Free training and education: Contact us: