4. Connectedness Represented in Graphs
C
C
A AA
U
S S SS S
USER_ACCESS
CONTROLLED_BY
SUBSCRIBED _BY
User
Customers
Accounts
Subscriptions
VP
Staff Staff StaffStaff
DirectorStaffDirector
Manager Manager Manager Manager
Fiber
Link
Fiber
Link
Fiber
Link
Ocean
Cable
Switch Switch
Router Router
Service
Organizational
Hierarchy
Product
Subscriptions
Network
Operations
Social
Networks
5. Who We Are: The Graph Platform
Neo4j is an enterprise-grade native graph platform that enables you to:
• Store, reveal and query data relationships
• Traverse and analyze any levels of depth in real-time
• Add context and connect new data on the fly
• Performance
• ACID Transactions
• Agility
• Graph Algorithms
5
Designed, built and tested natively
for graphs from the start for:
• Developer Productivity
• Hardware Efficiency
• Global Scale
• Graph Adoption
6. The Rise of Connections in Data
Networks of People Business Processes Knowledge Networks
E.g., Risk management, Supply
chain, Payments
E.g., Employees, Customers,
Suppliers, Partners,
Influencers
E.g., Enterprise content,
Domain specific content,
eCommerce content
Data connections are increasing as rapidly as data volumes
13. CAR
name: “Dan”
born: May 29, 1970
twitter: “@dan”
name: “Ann”
born: Dec 5, 1975
since:
Jan 10, 2011
brand: “Volvo”
model: “V70”
Neo4j Invented the Labeled Property Graph Model
Nodes
• Can have Labels to classify nodes
• Labels have native indexes
Relationships
• Relate nodes by type and direction
Properties
• Attributes of Nodes & Relationships
• Stored as Name/Value pairs
• Can have indexes and composite
indexes
MARRIED TO
LIVES WITH
PERSON PERSON
13
15. Lessons Learned: Ten Year Head Start
Native Connectedness Differentiates Neo4j
Conceive
Code
Compute
Store
Non-Native Graph DBNative Graph DB RDBMS
Optimized for graph workloads
16. 10M+
Downloads
3M+ from Neo4j Distribution
7M+ from Docker
Events
400+
Approximate Number of
Neo4j Events per Year
50k+
Meetups
Number of Meetup
Members Globally
Largest pool of graph technologists
50k+
Trained/certified Neo4j
professionals
Trained Developers
18. Business Problem
• Find relationships between people, accounts,
shell companies and offshore accounts
• Journalists are non-technical
• Biggest “Snowden-Style” document leak ever;
11.5 million documents, 2.6TB of data
Solution and Benefits
• Pulitzer Prize winning investigation resulted in
robust coverage of fraud and corruption
• PM of Iceland & Pakistan resigned, exposed
Putin, Prime Ministers, gangsters, celebrities
(Messi)
• Led to assassination of journalist in Malta
Background
• International Consortium of Investigative
Journalists (ICIJ), small team of data journalists
• International investigative team specializing in
cross-border crime, corruption and accountability
of power
• Works regularly with leaks and large datasets
ICIJ Panama Papers INVESTIGATIVE JOURNALISM
Fraud Detection / Knowledge Graph18
22. Business Problem
• Find relationships between people, corporations,
accounts, shell companies and offshore accounts
• Journalists are non-technical
• 2017 Leak from Appleby tax sheltering law firm
matched 13.4 million account records with public
business registrations data from across Caribbean
Solution and Benefits
• Exposed tax sheltering practices of Apple, Nike
• Revealed hidden connections among politicians
and nations, like Wilbur Ross & Putin’s son in law
• Triggered government tax evasion investigations in
US, UK, Europe, India, Australia, Bermuda, Canada
and Cayman Islands within 2 days.
Background
• International Consortium of Investigative
Journalists (ICIJ), Pulitzer Prize winning journalists
• Fourth blockbuster investigation using Neo4j to
reveal connections in text-based, and account-
based data leaked from offshore law firms and
government records about the “1% Elite”
• Appends Neo4j-based, “Offshore Leaks Database”
ICIJ Paradise Papers INVESTIGATIVE JOURNALISM
Fraud Detection / Knowledge Graph22
25. Common Thread: Density Drives Value In Graphs
Metcalfe’s Law of the Network (V=n2)
5 hops = Less Value
100’s of hops
More Value
26. Entity Linking
Analysis of relationships
to detect organized
crime and collusion
5.
Endpoint-Centric
Analysis of users and
their end-points
1.
Navigation Centric
Analysis of
navigation behavior
and suspect patterns
2.
Account-Centric
Analysis of anomaly
behavior by channel
3.
PC:s
Mobile Phones
IP-addresses
User ID:s
Comparing Transaction
Identity Vetting
Augment Methods by Examining Connected Data
DISCRETE ANALYSIS CONNECTED ANALYSIS
Cross Channel
Analysis of anomaly
behavior correlated
across channels
4.
27. ACCOUNT
HOLDER 2
ACCOUNT
HOLDER 1
ACCOUNT
HOLDER 3
CREDIT
CARD
BANK
ACCOUNT
BANK
ACCOUNT
BANK
ACCOUNT
PHONE
NUMBER
UNSECURED
LOAN
SSN 2
UNSECURED
LOAN
Modeling Fraud Transactions as an Organized Ring
At first glance, each
account holder
looks normal.
Each has multiple
accounts…
31. Social Graphs: Roles & Projects around Enterprise Data Hub
Data Scientists
Real-time
Graph traversal
Applications
Developers
& Prod Managers
Analysts and
Business Users
Chief Officers of …
Compliance, Data, Digital,
Information, Innovation,
Marketing, Operations, Risk &
Security…
Big Data IT &
Architecture
ID, Auth & Security
Network & IT Ops
Metadata
Management
360⁰
Marketing
Customer 360
Real-time
Cybersecurity
Account navigation
• Multiple paths through
organization
• Graphs have strong
appetite for data to add
nodes & increase density
of relationships
• Value of graph increase
according to Metcalfe’s
Law (V=n2)
• Customer applications
iterate every 3 months
32. Evolution of Graphs in Digital Transformation
32
1st Graph
Connects “like” objects
People, computer networks,
telco, etc
33. Disruptors of the SMAC Generation
Social Network Search
Social, Mobile, Analytics, & Cloud technologies are all Graph-based innovations
Maps
34. Evolution of Graphs in Digital Transformation
34
Context Paths
1st Graph
Desire for more context to
follow connections
Connects like objects
People, computer networks,
telco, etc
35. 35
Disruptors of the SMAC Generation
Social Network
Rideshare
Cloud Computing
Retail
Personal
Entertainment
Delivery
Autonomous Vehicles
Media consumption
Facial recognition
AI-Powered Investing
Travel
Real estate utility
Search
CollaborationAd Networks
Mobile Computing
Social, Mobile, Analytics, & Cloud technologies are all Graph-based innovations
Maps
39. “Lessons Learned Database”
A half-century of collective NASA engineering
knowledge
“How do we make sure the command module
doesn’t tip over and sink?”
40. Let’s Hear a Few Stories
— David Meza, Chief Knowledge
Architect at NASA
“Neo4j saved well over two
years of work and one million
dollars of taxpayer funds.”
Impact
41. Evolution of Graph Adoption
41
Context Paths
1st Graph
Cross-connect
Connect unlike objects such
as people to products,
locations
Mobile App Explosion
Recommendation Engines
Fraud Detectors
Desire for more context to
follow connections
Connects like objects
People, computer networks,
telco, etc
43. 43
• Record “Cyber Monday” sales
• About 35M daily transactions
• Each transaction is 3-22 hops
• Queries executed in 4ms or less
• Replaced IBM Websphere commerce
• 300M pricing operations per day
• 10x transaction throughput on half the
hardware compared to Oracle
• Replaced Oracle database
• Large postal service with over 500k
employees
• Neo4j routes 7M+ packages daily at peak,
with peaks of 5,000+ routing operations per
second.
Transformative Customer Experiences
Real-time promotion
recommendations
Marriott’s Real-time
Pricing Engine
Handling Package
Routing in Real-Time
44. Background
• Personal shopping assistant
• Converses with buyer via text, picture and voice
to provide real-time recommendations
• Combines AI and natural language understanding
(NLU) in Neo4j Knowledge Graph
• First of many apps in eBay's AI Platform
Business Problem
• Improve personal context in online shopping
• Transform buyer-provided context into ideal
purchase recommendations over social platforms
• "Feels like talking to a friend"
Solution and Benefits
• 3 developers, 8M nodes, 20M relationships
• Needed high-performance traversals to respond
to live customer requests
• Easy to train new algorithms and grow model
• Generating revenue since launch
eBay ShopBot ONLINE RETAIL
Knowledge Graph powers Real-Time Recommendations44
EE Customer since 2016 Q3
55. Architecture Patterns for Multiple Graphs Layers
From Disparate Silos
To Cross-Silo Connections
From Tabular Data
To Connected Data
From Data Lake Analytics
to Real-Time Operations
56. In-Q-Tel’s Mission Economy
56
• Venture Capital sponsored by
National Intelligence
• Decomposes and reassembles
technology stacks into common
“genome” vocabulary
• Matches mission problems to
technology assemblies and
vendors
• Evaluates tech across
communications, Bio tech,
robotics, software, hardware, IoT
• Faster evaluations, better
innovations
57. Evolution of Graph Adoption
57
Context Paths
Auto-Graphs
Graph Layers
1st Graph
Cross-connect
Cross-tech applications
Internet of Things operations
Transparent Neural
Networks
(How Tensorflow thinks)
Blockchain-managed
systems
Adjacent graph layers inspire
new innovations
Connect unlike objects such
as people to products,
locations
Desire for more context to
follow connections
Connects like objects
People, computer networks,
telco, etc
62. Common Thread: Graphs Are VERY Hungry for Data
Graphs’ appetite to connect more data accelerates
the ability to find adjacent innovations
Customer iteration cycles from 2 weeks to 3 months
63. • Connect more users—Technical and non
• Graph density creates innovation opportunities
• Customer Experience drives transformation initiatives
• Agility and iteration accelerates projects
• Longstanding experience & community ensure success
• The future… IoT, AI & Blockchain are graphs
63
Summary: Graphs are Digital Transformation
65. 65
Graph Platform: Connects to Many Roles in Enterprise
DEVELOPERS
ADMINS
Graph
Analytics
Graph
Transactions
DATA
ANALYSTS
DATA
SCIENTISTS
APPLICATIONS
Drivers & APIs
Data Integration
BIG DATA IT
Analytics
Tooling
BUSINESS USERS
Discovery & Visualization
Development &
Administration
68. Lesson: Maintain Agility and Iterate Quickly
68
Design Thinking
Sprints
Executive
Communication
Tools
69. Cypher: Powerful and Expressive Query Language
MATCH (:Person { name:“Dan”} ) -[:MARRIED_TO]-> (spouse)
MARRIED_TO
Dan Ann
NODE RELATIONSHIP TYPE
LABEL PROPERTY VARIABLE
70. Finds the optimal path
or evaluates route
availability and quality
• Single Source Short Path
• All-Nodes SSP
• Parallel paths
Evaluates how a
group is clustered or
partitioned
• Label Propagation
• Union Find
• Strongly Connected
Components
• Louvain
• Triangle-Count
Determines the
importance of distinct
nodes in the network
• PageRank
• Betweeness
• Closeness
• Degree
Data Science Algorithms
71. "Neo4j continues to
dominate the graph
database market.”
October, 2017
“The most widely stated
reason in the survey for
selecting Neo4j was
to drive innovation”
February, 2018
“In fact, the rapid rise of
Neo4j and other graph
technologies may signal
that data connectedness is
indeed a separate paradigm
from the model
consolidation happening
across the rest of the
NoSQL landscape.”
March, 2018
Graph is a Unique Paradigm