5. What We'll Discuss Today:
Data Ecosystems: Where it All Fits
Neo4j, the Database: How and Why it Works
Beyond the Database: Building Graph Native Organizations
Our Long-Term Vision for a Connected Enterprise
18. graph database
applications
services
products
raw data
data structure: key value store, column store, document store, relational database?
Conway's Law
Any organization that designs a system or
application is constrained to produce a
design whose structure is a copy of the
organization's (data) communication
structure.
19. graph database
applications
services
products
raw data
data structure: key value store, column store, document store, relational database
20. graph database
applications
services
products
raw data
data structure: key value store, column store, document store
21. graph database
applications
services
products
raw data
data structure: key value store, column store
22. graph database
applications
services
products
raw data
data structure: key value store
23. graph database
applications
services
products
raw data
data structure
38. On Stage
Personal Computer: Mainstream Movement Toward Client-Server
Behind the Scenes
SQL and RDBMS Consume the Market
Teeny Tiny RAM
Paper Rules
Spinning Platters
39. On Stage
Consumer Internet
Behind the Scenes
Distributed Systems Become Commonplace
GBs RAM
SSDs prohibitively expensive
Batch Compute
Commodity Hardware
40. On Stage
Mobile
Behind the Scenes
NoSQL and Cloud Native Deployment
Cloud / On Demand Hardware
Map Reduce Your Insights
OSS Rules the Roost
41. On Stage
The Internet of Me:
Mainstream AI
Behind the Scenes
Graphs
Abundant Cheap RAM TB+
FPGAs for Algos
Dynamic real world systems
45. A way of representing data
Graph Database
Good for:
• Dynamic systems: where the data topology is difficult to predict
• Dynamic requirements: that evolve with the business
• Problems where the relationships in data contribute meaning & value
Relational Database
Good for:
• Well-understood data structures that don’t change too frequently
• Known problems involving discrete parts of the data, or minimal connectivity
59. Index-Free Adjacency
At Write Time: Data is connected as it is stored
At Read Time: Lightning-fast retrieval of data and relationships via pointer chasing
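The idea of connecting data at write time and chasing pointers at read time can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, `connect`, and `friends_of_friends` are hypothetical names, and Neo4j's actual on-disk storage format is not shown here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node that holds direct references to its neighbours."""
    name: str
    neighbours: list[Node] = field(default_factory=list)

    def connect(self, other: Node) -> None:
        # Write time: the relationship is stored on the node itself.
        self.neighbours.append(other)


def friends_of_friends(start: Node) -> set[str]:
    """Read time: follow references two hops out, with no global index lookup."""
    found = set()
    for friend in start.neighbours:
        for fof in friend.neighbours:
            if fof is not start:
                found.add(fof.name)
    return found


alice, bob, carol, dave = (Node(n) for n in ("Alice", "Bob", "Carol", "Dave"))
alice.connect(bob)
bob.connect(carol)
bob.connect(dave)

print(sorted(friends_of_friends(alice)))  # ['Carol', 'Dave']
```

The cost of each hop depends on the number of neighbours of the current node, not on the total number of nodes stored, which is the property the slide attributes to index-free adjacency.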
60. How Fast is Fast?
• Sample Social Graph with roughly 1,000 persons
• On average each person has 50 friends
• pathExists(a,b) limited to depth 4
• Caches warmed up to eliminate disk I/O
64. How Fast is Fast?
DATABASE   # OF PERSONS   QUERY TIME
MySQL      1,000          2,000 ms
Neo4j      1,000          2 ms
Neo4j      10,000,000     2 ms
• Sample Social Graph with roughly 1,000 persons
• On average each person has 50 friends
• pathExists(a,b) limited to depth 4
• Caches warmed up to eliminate disk I/O
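For readers unfamiliar with the benchmark query, a depth-limited `pathExists(a,b)` can be sketched as a breadth-first search. The function name `path_exists`, the adjacency-map representation, and the toy data are assumptions for illustration; this is not the code used to produce the timings above.

```python
from __future__ import annotations

from collections import deque


def path_exists(friends: dict[str, set[str]], a: str, b: str, max_depth: int = 4) -> bool:
    """Return True if b is reachable from a in at most max_depth hops."""
    if a == b:
        return True
    seen = {a}
    frontier = deque([(a, 0)])
    while frontier:
        person, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for friend in friends.get(person, ()):
            if friend == b:
                return True
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, depth + 1))
    return False


# Toy chain a - b - c - d - e - f, where each link is a mutual friendship.
chain = ["a", "b", "c", "d", "e", "f"]
friends = {p: set() for p in chain}
for left, right in zip(chain, chain[1:]):
    friends[left].add(right)
    friends[right].add(left)

print(path_exists(friends, "a", "e"))  # True: e is 4 hops away
print(path_exists(friends, "a", "f"))  # False: f is 5 hops away
```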
68. Querying the graph
Diagram: Person -LIVES_IN-> Location (city: "San Francisco")
Pattern: Persons who live in San Francisco
(p) (loc)
69. Querying the graph
Diagram: Person -LIVES_IN-> Location (city: "San Francisco")
Pattern: Persons who live in San Francisco
(p:Person) (loc)
73. Querying the graph
Diagram: Person -LIVES_IN-> Location (city: "San Francisco")
Pattern: Persons who live in San Francisco
(p:Person)-->(loc:Location {city: "San Francisco"})
74. Querying the graph
Diagram: Person -LIVES_IN-> Location (city: "San Francisco")
Pattern: Persons who live in San Francisco
(p:Person)-[:LIVES_IN]->(loc:Location {city: "San Francisco"})
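What the finished pattern asks for can be spelled out over plain Python data. This is a rough sketch, not how Neo4j evaluates Cypher: the `nodes` and `relationships` structures, the sample names, and `persons_living_in` are all hypothetical.

```python
# Hypothetical in-memory graph: nodes keyed by id, relationships as triples.
nodes = {
    1: {"label": "Person", "name": "Ann"},
    2: {"label": "Person", "name": "Raj"},
    3: {"label": "Location", "city": "San Francisco"},
    4: {"label": "Location", "city": "Oslo"},
}
relationships = [
    (1, "LIVES_IN", 3),
    (2, "LIVES_IN", 4),
]


def persons_living_in(city):
    """Mimic (p:Person)-[:LIVES_IN]->(loc:Location {city: ...})."""
    matches = []
    for start, rel_type, end in relationships:
        p, loc = nodes[start], nodes[end]
        if (
            p["label"] == "Person"          # p:Person
            and rel_type == "LIVES_IN"      # -[:LIVES_IN]->
            and loc["label"] == "Location"  # loc:Location
            and loc.get("city") == city     # {city: ...}
        ):
            matches.append(p["name"])
    return matches


print(persons_living_in("San Francisco"))  # ['Ann']
```

Each clause of the `if` corresponds to one piece of the pattern, which is why the Cypher form reads like a drawing of the data it matches.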
77. (SELECT T.directReportees AS directReportees, sum(T.count) AS count
FROM (
SELECT manager.pid AS directReportees, 0 AS count
FROM person_reportee manager
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
UNION
SELECT manager.pid AS directReportees, count(manager.directly_manages) AS count
FROM person_reportee manager
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
UNION
SELECT manager.pid AS directReportees, count(reportee.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee reportee
ON manager.directly_manages = reportee.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
UNION
SELECT manager.pid AS directReportees, count(L2Reportees.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee L1Reportees
ON manager.directly_manages = L1Reportees.pid
JOIN person_reportee L2Reportees
ON L1Reportees.directly_manages = L2Reportees.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
) AS T
GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
FROM (
SELECT manager.directly_manages AS directReportees, 0 AS count
FROM person_reportee manager
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
UNION
SELECT reportee.pid AS directReportees, count(reportee.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee reportee
ON manager.directly_manages = reportee.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
UNION
SELECT L1Reportees.pid AS directReportees,
count(L2Reportees.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee L1Reportees
ON manager.directly_manages = L1Reportees.pid
JOIN person_reportee L2Reportees
ON L1Reportees.directly_manages = L2Reportees.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
) AS T
GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
FROM(
SELECT reportee.directly_manages AS directReportees, 0 AS count
FROM person_reportee manager
JOIN person_reportee reportee
ON manager.directly_manages = reportee.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
UNION
SELECT L2Reportees.pid AS directReportees, count(L2Reportees.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee L1Reportees
ON manager.directly_manages = L1Reportees.pid
JOIN person_reportee L2Reportees
ON L1Reportees.directly_manages = L2Reportees.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
) AS T
GROUP BY directReportees)
UNION
(SELECT L2Reportees.directly_manages AS directReportees, 0 AS count
FROM person_reportee manager
JOIN person_reportee L1Reportees
ON manager.directly_manages = L1Reportees.pid
JOIN person_reportee L2Reportees
ON L1Reportees.directly_manages = L2Reportees.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
)
80. Cypher: Key Benefits
Example HR Query in SQL
The Same Query using Cypher
MATCH (boss)-[:MANAGES*0..3]->(sub),
(sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate,
count(report) AS Total
Project Impact
Less time writing queries:
• More time understanding the answers
• Leaving time to ask the next question
Less time debugging queries:
• More time writing the next piece of code
• Improved quality of overall code base
Code that’s easier to read:
• Faster ramp-up for new project members
• Improved maintainability & troubleshooting
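To make the intent of the HR query concrete, the following Python sketch computes the same kind of result over a small in-memory hierarchy: for the boss and every subordinate up to three levels down, it counts the people one to three levels below them. The `manages` data, the names other than John Doe, and both function names are assumptions, and the sketch assumes the reporting structure is a tree.

```python
from __future__ import annotations

# Hypothetical MANAGES edges: manager -> direct reports (assumed to form a tree).
manages = {
    "John Doe": ["Ana", "Ben"],
    "Ana": ["Cy", "Di"],
    "Cy": ["Eve"],
}


def descendants(person: str, min_depth: int, max_depth: int) -> list[str]:
    """People reachable from person via min_depth..max_depth MANAGES hops."""
    result = []
    frontier = [person]
    for depth in range(max_depth + 1):
        if depth >= min_depth:
            result.extend(frontier)
        frontier = [r for p in frontier for r in manages.get(p, [])]
    return result


def report_counts(boss: str) -> dict[str, int]:
    """Count reports 1..3 levels below the boss and each subordinate 0..3 levels down."""
    counts = {}
    for sub in descendants(boss, 0, 3):
        total = len(descendants(sub, 1, 3))
        if total:  # like MATCH, skip people who have no reports at all
            counts[sub] = total
    return counts


print(report_counts("John Doe"))  # {'John Doe': 5, 'Ana': 3, 'Cy': 1}
```

The two calls to `descendants` mirror the two variable-length patterns `*0..3` and `*1..3` in the Cypher query.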
87. Common Integration Patterns
From Disparate Silos
To Cross-Silo Connections
From Tabular Data
To Connected Data
From Data Lake Analytics
To Real-Time Operations
90. Drivers and APIs
Native Language Drivers
• Java
• .NET
• Python
• JavaScript
• more to come…
• Massive Community Support (Go, Ruby, R, Perl, Clojure, C/C++…)
• Partners like GraphAware (PHP Client)
92. The Neo4j Graph Platform Vision
Diagram labels: Business Users, Developers, Data Analysts, Applications, AI, Graph Transactions, Graph Analytics, Data Integration, Discovery & Visualization, Drivers & APIs
93. Graph Discovery & Visualization
Software that allows users to realize insights by interacting directly with their data
Neo4j Browser
Custom / JS Libraries
Partner Applications
95. The Neo4j Graph Platform Vision
Diagram labels: Business Users, Developers, Admins, Data Analysts, Applications, AI, Development & Administration, Graph Transactions, Graph Analytics, Data Integration, Discovery & Visualization, Drivers & APIs
102. Neo4j Graph Algorithm Library
• Finds the optimal path or evaluates route availability and quality
• Evaluates how a group is clustered or partitioned
• Determines the importance of distinct nodes in the network
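The three kinds of question listed above can each be illustrated with a small textbook algorithm in plain Python. These are stand-ins chosen for brevity (breadth-first search, connected components, degree ranking), not the procedures of the Neo4j library itself; the `graph` data and all function names are hypothetical.

```python
from collections import deque

# Hypothetical undirected network as an adjacency map.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
    "X": {"Y"},
    "Y": {"X"},
}


def shortest_path(start, goal):
    """Pathfinding: fewest-hop path via breadth-first search, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def components():
    """Clustering: group nodes into connected components."""
    seen, groups = set(), []
    for node in sorted(graph):
        if node in seen:
            continue
        group, stack = set(), [node]
        while stack:
            current = stack.pop()
            if current not in group:
                group.add(current)
                stack.extend(graph[current] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


def degree_ranking():
    """Importance: rank nodes by number of neighbours, ties broken by name."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))


print(shortest_path("A", "D"))  # ['A', 'B', 'D']
print(components())             # [['A', 'B', 'C', 'D'], ['X', 'Y']]
print(degree_ranking())         # ['B', 'A', 'C', 'D', 'X', 'Y']
```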
106. Neo4j-Cloud: Preview
The Challenge
• Neo4j without the headache of operations
• Lower cost of entry to enterprise-ready features
• Scale cost with value
The Solution
• Neo4j as a Service on the public cloud
• Cloud-hosted database in < 5 minutes
• Automated backup, upgrades, and restores
The Preview Program for Graph Tour
• Looking for a wide variety of participants
• Join the list at http://neo4j.com/cloud
107. Why We Believe in Building a Graph Platform
• Graphs are fundamentally the most ergonomic and humane way of working with data at scale
• Ecosystems are most powerful with seamless access to adjacent technologies
• The best technology is the one that is easiest to use