Relational databases were conceived to digitize paper forms and automate well-structured business processes, and they still have their uses. But RDBMS performance often degrades as the data grows and as the number and depth of data relationships increase.
A graph database like Neo4j naturally stores, manages, analyzes, and uses data within the context of its connections. As a result, Neo4j provides faster query performance and far more flexibility in handling complex hierarchies than a SQL database.
This webinar explains why companies are shifting away from RDBMS towards graphs to unlock the business value in their data relationships.
2. Data used to be stored like this: punch tape. Or punch cards.
Horrible way to read and understand data.
Impossible to easily index, cross-reference, or eliminate inconsistencies.
3. Then we started storing data in tables, and “relational” databases.
Sometimes those tables are human-readable.
But as soon as you normalize the data to eliminate duplication and inconsistencies, many fields start referencing auto-generated
numerical foreign keys. And your data becomes difficult to understand and maintain without complicated JOIN queries.
4. [Diagram: account holders 1, 2, and 3 connected to credit cards, bank accounts, unsecured loans, an address, phone numbers, and an SSN]
Enter Graph Databases. The future is now.
Graph Databases, like Neo4j, store data in a much more logical way. A way that represents the real world, and prioritizes the
representation, discoverability and maintainability of data relationships.
9. Speed
“We found Neo4j to be literally thousands of times faster
than our prior MySQL solution, with queries that require
10-100 times less code. Today, Neo4j provides eBay with
functionality that was previously impossible.”
- Volker Pacher, Senior Developer
“Minutes to milliseconds” performance
Queries up to 1000x faster than RDBMS or other NoSQL
11. A Naturally Adaptive Model + A Query Language Designed for Connectedness = Agility
12. Cypher
Typical Complex SQL Join The Same Query using Cypher
MATCH (boss)-[:MANAGES*0..3]->(sub),
(sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate,
count(report) AS Total
Project Impact
• Less time writing queries:
  More time understanding the answers
  Leaving time to ask the next question
• Less time debugging queries:
  More time writing the next piece of code
  Improved quality of overall code base
• Code that’s easier to read:
  Faster ramp-up for new project members
  Improved maintainability & troubleshooting
13. ABOUT ME
• Developed web apps for 5 years
including e-commerce, business
workflow, more.
• Worked at Google for 8 years on
Google Apps, Cloud Platform
• Technologies: Python, Java,
BigQuery, Oracle, MySQL, OAuth
ryan@neo4j.com
@ryguyrg
14. NEO4j USE CASES
Real Time Recommendations
Master Data Management
Fraud Detection
Identity & Access Management
Graph Based Search
Network & IT-Operations
15. GRAPH THINKING:
Real Time Recommendations
[Diagram: graph with VIEWED and BOUGHT relationships]
Real-time recommendations are about finding the relationships that are relevant for recommending a product or a service...
...which is exactly why Walmart is using Neo4j.
16. “As the current market leader in graph databases,
and with enterprise features for scalability and
availability, Neo4j is the right choice to meet our
demands.” Marcos Wada
Software Developer, Walmart
17. GRAPH THINKING:
Master Data Management
[Diagram: graph with MANAGES, LEADS, REGION, and COLLABORATES relationships]
Master Data Management is about bringing together all the entities inside and outside an organization in order to understand the relationships among them.
18. Neo4j is the heart of Cisco HMP: used for governance
and single source of truth and a one-stop shop for all
of Cisco’s hierarchies.
Cisco uses Neo4j for exactly this: to power content management, resources, and knowledge-base articles for its sales teams. It also powers product recommendations, so that customers get the most out of Cisco's offerings.
Although this project is focused on sales teams, another group has used Neo4j to power all of their helpdesk content -
19. GRAPH THINKING:
Master Data Management
[Diagram: graph of Solution, Support Case, Knowledge Base Article, and Message nodes]
Neo4j is the heart of Cisco’s Helpdesk Solution too.
20. GRAPH THINKING:
Fraud Detection
[Diagram: graph with OPENED_ACCOUNT, HAS, IS_ISSUED, and LIVES relationships]
Discovering fraud is another use case that is particularly well suited to graphs, because it’s all about finding fraudulent patterns. Here we work with top banks and insurance companies, as well as many governments.
21. “Graph databases offer new methods of uncovering
fraud rings and other sophisticated scams with a
high-level of accuracy, and are capable of stopping
advanced fraud scenarios in real-time.”
Gorka Sadowski
Cyber Security Expert
22. GRAPH THINKING:
Graph Based Search
[Diagram: graph with PUBLISH, INCLUDE, CREATE, CAPTURE, IN, SOURCE, and USES relationships]
23. Uses Neo4j to manage the digital assets inside of its next
generation in-flight entertainment system.
24. GRAPH THINKING:
Network & IT-Operations
Dependency analysis
Root cause analysis
[Diagram: graph with BROWSES, CONNECTS, BRIDGES, ROUTES, POWERS, HOSTS, and QUERIES relationships]
25. Uses Neo4j for network topology analysis
for big telco service providers
26. GRAPH THINKING:
Identity And Access Management
[Diagram: graph with TRUSTS, ID, AUTHENTICATES, OWNS, and CAN_READ relationships]
Think of organizational hierarchies. No longer is it just a tree.
27. UBS was the recipient of the 2014
Graphie Award for “Best Identity And
Access Management App”
28. Neo4j Adoption by Selected Verticals
SOFTWARE, FINANCIAL SERVICES, RETAIL, MEDIA & BROADCASTING, SOCIAL NETWORKS, TELECOM, HEALTHCARE
29. AGENDA
• Use Cases
• SQL Pains
• Building a Neo4j Application
• Moving from RDBMS -> Graph Models
• Walk through an Example
• Creating Data in Graphs
• Querying Data
30. I hired this kid for all the handwriting you’ll see throughout the presentation.
So, don’t blame me.
31. SQL
Day in the Life of an RDBMS Developer
Let’s explore how your SQL developer works today.
32. They work with data in tables.
Here’s a table of people with where they're from, their hair color, and the university they attended.
This table reads fairly naturally, but it duplicates values across multiple rows. If you want to change the name of a university or a country, you have to update every row that mentions it.
33. So, instead, you’d create a separate table for the country, with an ID that identifies each country. This is your primary key.
35. Now, you use that ID to reference the country in the people table - a foreign key.
36. And you’d want to normalize the university table as well.
37. And use the university ID to reference it. Now your table is a lot less readable.
38. So, we see this set of 3 tables with arrows indicating references between primary keys and foreign keys, used in JOINs.
39. SELECT
p.name,
c.country, c.leader, p.hair,
u.name, u.pres, u.state
FROM
people p
LEFT JOIN country c ON c.ID=p.country
LEFT JOIN uni u ON p.uni=u.id
WHERE
u.state='CT'
Your SQL looks like this.
Only, this is a super simple JOIN across 3 tables. I’ve often had to work with 10+ tables being JOINed.
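To try the three-table JOIN end to end, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the query above; the sample rows (people, leaders, presidents) are invented purely for illustration and are not from the slides.

```python
import sqlite3


def build_db():
    """Create the three normalized tables in memory and load made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE country (id INTEGER PRIMARY KEY, country TEXT, leader TEXT);
        CREATE TABLE uni (id INTEGER PRIMARY KEY, name TEXT, pres TEXT, state TEXT);
        CREATE TABLE people (
            id INTEGER PRIMARY KEY, name TEXT, hair TEXT,
            country INTEGER REFERENCES country(id),  -- foreign key
            uni INTEGER REFERENCES uni(id)           -- foreign key
        );
        -- Hypothetical sample data, not from the presentation.
        INSERT INTO country VALUES (1, 'Countryland', 'Leader A'),
                                   (2, 'Otherland', 'Leader B');
        INSERT INTO uni VALUES (10, 'Uni One', 'Pres X', 'CT'),
                               (20, 'Uni Two', 'Pres Y', 'MA');
        INSERT INTO people VALUES (100, 'Alice', 'brown', 1, 10),
                                  (101, 'Bob', 'black', 2, 20),
                                  (102, 'Carol', 'red', 2, 10);
    """)
    return conn


# Same shape as the query on the slide, with an ORDER BY for stable output.
QUERY = """
    SELECT p.name, c.country, c.leader, p.hair, u.name, u.pres, u.state
    FROM people p
    LEFT JOIN country c ON c.id = p.country
    LEFT JOIN uni u ON p.uni = u.id
    WHERE u.state = 'CT'
    ORDER BY p.name
"""


def people_in_ct(conn):
    """Resolve both foreign keys through JOINs to get readable rows back."""
    return conn.execute(QUERY).fetchall()


conn = build_db()
rows = people_in_ct(conn)
for row in rows:
    print(row)
```

Note that the people table on its own only holds numeric IDs (1, 10, ...). Getting back anything human-readable already needs two JOINs.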
43. Meanwhile, it’s expensive to find data.
So we add indexes to make it easier.
But when we have to do index lookups for each and every JOIN?
And we have a dozen JOINs?
That’s expensive.
44. What’s the solution?
Denormalize! But now the data is hard to maintain and keep consistent.
45. SQL Pains
• Complex to model and store relationships
• Performance degrades with increases in data
• Queries get long and complex
• Maintenance is painful
46. Graph Gains
• Easy to model and store relationships
• Performance of relationship traversal remains constant with growth in data size
• Queries are shortened and more readable
• Adding additional properties and relationships can be done on the fly - no migrations
47. John Resig, who you may know as the creator of jQuery, loves Neo4j because it simplifies life.
48. What does this Graph look like?
So you’ve seen what tables look like. How do graphs make this better?
63. RDBMS to Graph Options
[Diagram: three options, MIGRATE ALL DATA, MIGRATE SUBSET, and DUPLICATE SUBSET, each showing how the application's queries (all, graph, or non-graph) are directed to the relational database and/or the graph database]
81. using openCypher
Declarative query language
Easy to learn for someone familiar with languages like SQL
But optimized for graphs, and quickly readable
83. Who do people report to?
MATCH
(e:Employee)<-[:REPORTS_TO]-(sub:Employee)
RETURN
*
84. Who do people report to?
Results can be returned as nodes and relationships
85. Who do people report to?
MATCH
(e:Employee)<-[:REPORTS_TO]-(sub:Employee)
RETURN
e.employeeID AS managerID,
e.firstName AS managerName,
sub.employeeID AS employeeID,
sub.firstName AS employeeName;
or alternatively as a table.
89. What is Robert’s reporting chain?
MATCH
p=(e:Employee)<-[:REPORTS_TO*]-(sub:Employee)
WHERE
sub.firstName = 'Robert'
RETURN
p
But the power of the graph is in the ability to query arbitrary-length paths.
Note the asterisk in [:REPORTS_TO*].
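To make the idea of an arbitrary-length path concrete, here is a rough Python sketch of what following REPORTS_TO "as many hops as it takes" means. It is only an illustration, not how Neo4j executes the query. It assumes each employee has at most one manager, and the REPORTS_TO mapping below is made-up sample data.

```python
def reporting_chain(reports_to, name):
    """Follow REPORTS_TO links upward from `name`; nearest manager first.

    `reports_to` maps an employee to their manager (an assumption of this
    sketch: at most one manager per employee).
    """
    chain = []
    seen = {name}  # guards against accidental cycles in the data
    current = reports_to.get(name)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = reports_to.get(current)
    return chain


# Hypothetical sample data for illustration only.
REPORTS_TO = {"Robert": "Steven", "Steven": "Andrew", "Nancy": "Andrew"}

chain = reporting_chain(REPORTS_TO, "Robert")
print(chain)
```

The loop does not need to know in advance how deep the hierarchy is, which is the same property the asterisk gives the Cypher pattern. The SQL equivalent would need one self-JOIN per level or a recursive query.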
95. (ASIDE ON GRAPH COMPUTE)
Optimized for OLTP
But can be used for Graph Compute
Either with built-in functions
Or server-side extensions
Or via exporting data to Spark / GraphX for analysis
96. Shortest Path Between Airports
MATCH
p = shortestPath(
(a:Airport {code: "SFO"})-[*0..2]->
(b:Airport {code: "MSO"}))
RETURN
p
Example using built-in algorithms.
Dijkstra also available for weighted paths
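For intuition about what a hop-limited shortest path search does, here is a small breadth-first search sketch in Python. It is a generic textbook technique and not Neo4j's implementation of shortestPath. The function name, the max_hops parameter, and the FLIGHTS data are all hypothetical and chosen only for this example.

```python
from collections import deque


def shortest_path(graph, start, goal, max_hops):
    """Breadth-first search for a fewest-hops path, up to `max_hops` hops.

    `graph` maps a node to the nodes it has outgoing relationships to.
    Returns the path as a list of nodes, or None if no path is short enough.
    """
    if start == goal:
        return [start]  # zero-hop path, like the lower bound in *0..2
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if len(path) - 1 >= max_hops:
            continue  # extending this path would exceed the hop limit
        for nxt in graph.get(path[-1], ()):
            if nxt in visited:
                continue
            if nxt == goal:
                return path + [nxt]
            visited.add(nxt)
            queue.append(path + [nxt])
    return None


# Hypothetical route data for illustration only.
FLIGHTS = {
    "SFO": ["SEA", "DEN"],
    "SEA": ["MSO"],
    "DEN": ["SLC"],
    "SLC": ["MSO"],
}

route = shortest_path(FLIGHTS, "SFO", "MSO", max_hops=2)
print(route)
```

Because breadth-first search explores all one-hop paths before any two-hop path, the first path that reaches the goal has the fewest hops. Weighted paths need a different algorithm, such as Dijkstra.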
109. 3 Steps to Creating the Graph
IMPORT NODES -> CREATE INDEXES -> IMPORT RELATIONSHIPS
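Why this order? Relationships are created by looking up both end nodes by ID, so the nodes and an index on their IDs should exist first. The following Python sketch mimics the three steps in memory with plain dictionaries. It is only an analogy for the workflow, not Neo4j code, and the inline CSV is a made-up sample using the EmployeeID, FirstName, and ReportsTo column names from the import statements.

```python
import csv
import io

# Hypothetical sample rows; the real data comes from the CSV files on GitHub.
EMPLOYEES_CSV = """EmployeeID,FirstName,ReportsTo
1,Nancy,2
2,Andrew,
5,Steven,2
7,Robert,5
"""


def load_rows(text):
    """Parse CSV text with a header row, similar to LOAD CSV WITH HEADERS."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(EMPLOYEES_CSV)

# Step 1: import nodes (one node per row, properties copied from columns).
nodes = [
    {"label": "Employee", "employeeID": r["EmployeeID"], "firstName": r["FirstName"]}
    for r in rows
]

# Step 2: create an index so nodes can be found by employeeID without a scan.
index = {node["employeeID"]: node for node in nodes}

# Step 3: import relationships by looking up both end nodes in the index.
relationships = []
for r in rows:
    employee = index.get(r["EmployeeID"])
    manager = index.get(r["ReportsTo"])  # empty for the top of the hierarchy
    if employee is not None and manager is not None:
        relationships.append((employee["firstName"], "REPORTS_TO", manager["firstName"]))

print(relationships)
```

Without step 2, every lookup in step 3 would have to scan all nodes, which is why the indexes are created before the relationships are imported.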
110. Importing Nodes
// Create customers
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/customers.csv" AS row
CREATE (:Customer {companyName: row.CompanyName, customerID: row.CustomerID, fax: row.Fax, phone: row.Phone});
// Create products
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
CREATE (:Product {productName: row.ProductName, productID: row.ProductID, unitPrice: toFloat(row.UnitPrice)});
111. Importing Nodes
// Create suppliers
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/suppliers.csv" AS row
CREATE (:Supplier {companyName: row.CompanyName, supplierID: row.SupplierID});
// Create employees
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/employees.csv" AS row
CREATE (:Employee {employeeID: row.EmployeeID, firstName: row.FirstName, lastName: row.LastName, title: row.Title});
112. Importing Nodes
// Create categories
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/categories.csv" AS row
CREATE (:Category {categoryID: row.CategoryID, categoryName: row.CategoryName, description: row.Description});
// Create orders
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MERGE (order:Order {orderID: row.OrderID}) ON CREATE SET order.shipName = row.ShipName;
113. Creating Indexes
CREATE INDEX ON :Product(productID);
CREATE INDEX ON :Product(productName);
CREATE INDEX ON :Category(categoryID);
CREATE INDEX ON :Employee(employeeID);
CREATE INDEX ON :Supplier(supplierID);
CREATE INDEX ON :Customer(customerID);
CREATE INDEX ON :Customer(companyName);
114. Creating Relationships
// Customers purchased orders
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MATCH (order:Order {orderID: row.OrderID})
MATCH (customer:Customer {customerID: row.CustomerID})
MERGE (customer)-[:PURCHASED]->(order);
// Suppliers supply products
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
MATCH (product:Product {productID: row.ProductID})
MATCH (supplier:Supplier {supplierID: row.SupplierID})
MERGE (supplier)-[:SUPPLIES]->(product);
115. Creating Relationships
// Orders include products
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MATCH (order:Order {orderID: row.OrderID})
MATCH (product:Product {productID: row.ProductID})
MERGE (order)-[pu:INCLUDES]->(product)
ON CREATE SET pu.unitPrice = toFloat(row.UnitPrice), pu.quantity = toFloat(row.Quantity);
// Employees sold orders
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MATCH (order:Order {orderID: row.OrderID})
MATCH (employee:Employee {employeeID: row.EmployeeID})
MERGE (employee)-[:SOLD]->(order);
116. Creating Relationships
// Products are part of categories
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
MATCH (product:Product {productID: row.ProductID})
MATCH (category:Category {categoryID: row.CategoryID})
MERGE (product)-[:PART_OF]->(category);
// Employees report to managers
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/employees.csv" AS row
MATCH (employee:Employee {employeeID: row.EmployeeID})
MATCH (manager:Employee {employeeID: row.ReportsTo})
MERGE (employee)-[:REPORTS_TO]->(manager);
119. “We found Neo4j to be literally thousands of times faster
than our prior MySQL solution, with queries that require
10 to 100 times less code. Today, Neo4j provides eBay
with functionality that was previously impossible.”
Volker Pacher
Senior Developer