2. Introduction to Neo4j
• The primary use cases for graph databases
• Neo4j's 'secret sauce' that makes them possible
• Visualizing graphs
• Getting started with Neo4j easily, but how?
3. Who am I … Andrew Frei
(Andrew)-[:LIVES_NEAR]->(Zurich)
(Andrew)-[:HAS_SHOESIZE]->(47)
(Andrew)-[:IS_TODAYS]->(host)
(Andrew)-[:LOVES]->(Neo4j)
(Andrew)-[:LOVES_SELLING]->(Neo4j)
Tagline: Supporting Companies to 'Connect the Dots' with Graph Databases & Analytics
Contact: linkedin.com/in/andrew-frei/ and andrew.frei@neo4j.com
9. Labeled Property Graph Model
[Diagram: nodes connected by the relationships DRIVES, MARRIED TO, LIVES WITH and OWNS]
• Nodes – Represent objects in the graph
• Relationships – Relate nodes by type and direction
10. Labeled Property Graph Model
[Diagram: nodes connected by the relationships DRIVES, MARRIED TO, LIVES WITH and OWNS, annotated with properties]
Properties shown in the diagram:
  name: “Dan”, born: May 29, 1970, twitter: “@dan”
  name: “Ann”, born: Dec 5, 1975
  since: Jan 10, 2011
  brand: “Volvo”, model: “V70”
  Latitude: 37.5629900°, Longitude: -122.3255300°
• Nodes – Represent objects in the graph
• Relationships – Relate nodes by type and direction
• Properties – Name-value pairs that can go on nodes and relationships
11. Labeled Property Graph Model
[Diagram: two PERSON nodes and one CAR node, connected by the relationships DRIVES, MARRIED TO, LIVES WITH and OWNS, annotated with properties]
Properties shown in the diagram:
  name: “Dan”, born: May 29, 1970, twitter: “@dan”
  name: “Ann”, born: Dec 5, 1975
  since: Jan 10, 2011
  brand: “Volvo”, model: “V70”
  Latitude: 37.5629900°, Longitude: -122.3255300°
• Nodes – Represent objects in the graph
• Relationships – Relate nodes by type and direction
• Properties – Name-value pairs that can go on nodes and relationships
• Labels – Group nodes and shape the domain
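The four building blocks of the model (nodes, relationships, properties, labels) can be sketched in a few lines of plain Python. This is only an illustration of the data model, not how Neo4j stores data. The class names, the `outgoing` helper, the ISO date strings, and in particular which person drives or owns the car and which relationship carries the `since` property are assumptions, since the slide text does not state them.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                      # labels group nodes, e.g. {"PERSON"}
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node                      # relationships have a direction ...
    type: str                        # ... and a type
    end: Node
    properties: dict = field(default_factory=dict)


# Nodes with labels and name-value properties taken from the slide.
dan = Node({"PERSON"}, {"name": "Dan", "born": "1970-05-29", "twitter": "@dan"})
ann = Node({"PERSON"}, {"name": "Ann", "born": "1975-12-05"})
car = Node({"CAR"}, {"brand": "Volvo", "model": "V70"})
nodes = [dan, ann, car]

# Assumption: endpoints, directions and the placement of "since" are guesses.
relationships = [
    Relationship(dan, "MARRIED_TO", ann),
    Relationship(dan, "LIVES_WITH", ann),
    Relationship(dan, "DRIVES", car, {"since": "2011-01-10"}),
    Relationship(ann, "OWNS", car),
]


def outgoing(node, rel_type):
    """Return the nodes reached from `node` via outgoing `rel_type` relationships."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


# Usage: filter by label, then follow a typed relationship.
people = [n.properties["name"] for n in nodes if "PERSON" in n.labels]
driven_by_dan = [n.properties["model"] for n in outgoing(dan, "DRIVES")]
```

Labels answer "what kind of thing is this?", while relationship types answer "how are these two things connected?"; properties carry the details on both.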
16. Adobe Behance: Social Network of 10M Graphic Artists
Background
● Social network of 10M graphic artists
● Peer-to-peer evaluation of art and works-in-progress
● Job sourcing site for creatives
● Massive scale: millions of updates (reads & writes) to the Activity Feed
● From 150 MongoDB instances to 48 Cassandra instances to 3 Neo4j instances
Business Problem
● Artists subscribe to, appreciate and curate “galleries” of their own works and works from other artists
● The Activity Feed is how everyone receives updates
● 1st implementation used 150 MongoDB instances
● 2nd implementation shrank to 48 Cassandra instances, but was still too slow and required heavy IT overhead
Solution and Benefits
● 3rd implementation shrank to 3 Neo4j instances
● Saved over $500k in annual AWS fees
● Reduced data footprint from 50TB to 40GB
● Significantly easier to introduce new features such as “New projects in your Network”
18. Dun & Bradstreet: Neo4j for Tracking Beneficial Ownership
Background
● Regulations and requirements around beneficial ownership
● Needed to let B2B clients book new business promptly via accelerated due diligence investigations
Business Problem
● Investigations call for highly trained staff, and this activity is hard to scale. A single request might tie up key people for 10-15 days, resulting in lost revenue
Solution and Benefits
● Use Neo4j to quickly query historic relationships between business owners and companies
● Query responses take milliseconds versus days of skilled manual research
19. ICIJ Panama Papers: Fraud Detection / Graph-Based Search
Background
● International investigative team specializing in cross-border crime, corruption and accountability of power
Business Problem
● Find relationships between people, accounts, shell companies and offshore accounts
● Biggest “Snowden-style” document leak ever: 11.5 million documents, 2.6TB of data
Solution and Benefits
● Pulitzer Prize-winning investigation resulted in robust coverage of fraud and corruption
● The Prime Minister of Iceland resigned; the leak exposed Putin, prime ministers, gangsters and celebrities (Messi); trials are ongoing
20. How Neo4j Fits: Common Architecture Patterns
• From Disparate Silos to Cross-Silo Connections
• From Tabular Data to Connected Data
• From Data Lake Analytics to Real-Time Operations
22. 50+ Graph Algorithms in Neo4j
Pathfinding & Search
• Shortest Path
• Single-Source Shortest Path
• All Pairs Shortest Path
• A* Shortest Path
• Yen’s K Shortest Path
• Minimum Weight Spanning Tree
• K-Spanning Tree (MST)
• Random Walk
• Breadth & Depth First Search
Centrality / Importance
• Degree Centrality
• Closeness Centrality
• Harmonic Centrality
• Betweenness Centrality & Approx.
• PageRank
• Personalized PageRank
• ArticleRank
• Eigenvector Centrality
Community Detection
• Triangle Count
• Local Clustering Coefficient
• Connected Components (Union Find)
• Strongly Connected Components
• Label Propagation
• Louvain Modularity
• K-1 Coloring
• Modularity Optimization
• Balanced Triad (identification)
Similarity
• Euclidean Distance
• Cosine Similarity
• Node Similarity (Jaccard)
• Overlap Similarity
• Pearson Similarity
• Approximate KNN
Link Prediction
• Adamic Adar
• Common Neighbors
• Preferential Attachment
• Resource Allocation
• Same Community
• Total Neighbors
Embeddings
• Node2Vec
• Random Projections
• GraphSAGE
... Auxiliary Functions:
• Random graph generation
• Graph export
• One hot encoding
• Distributions & metrics
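To give a feel for what a few of the listed algorithms compute, here is a minimal sketch in plain Python of Degree Centrality, Common Neighbors and Jaccard-based Node Similarity. It does not use Neo4j or its graph algorithm library; the toy graph and the function names are made up for illustration.

```python
# Toy undirected graph as an adjacency map (hypothetical data).
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}


def degree_centrality(g):
    """Centrality: number of direct neighbours per node."""
    return {node: len(neighbours) for node, neighbours in g.items()}


def common_neighbors(g, u, v):
    """Link prediction: how many neighbours u and v share."""
    return len(g[u] & g[v])


def jaccard_similarity(g, u, v):
    """Similarity: shared neighbours divided by all neighbours of u or v."""
    union = g[u] | g[v]
    return len(g[u] & g[v]) / len(union) if union else 0.0


degrees = degree_centrality(graph)           # "a" has the highest degree
shared = common_neighbors(graph, "b", "c")   # both are connected to "a"
similarity = jaccard_similarity(graph, "b", "d")
```

All three scores are derived purely from the connections between nodes, which is the common idea behind the categories on this slide.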
23. Mission
❏ Tom Hanks is getting old(er)...
❏ We can boost his career and put him together with a few new faces. Who do you recommend? How would you give weight to your recommendation?
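One possible way to approach the mission, sketched in plain Python rather than in a Neo4j query: recommend actors who have worked with Tom Hanks's co-actors but not yet with him, and weight each candidate by the number of co-actors who connect them. The cast data below is entirely made up ("Actor A", "Movie 1", ...), and the scoring rule is just one assumption among many reasonable ones.

```python
from collections import Counter

# Hypothetical ACTED_IN data: person -> set of movies (not real filmographies).
acted_in = {
    "Tom Hanks": {"Movie 1", "Movie 2"},
    "Actor A": {"Movie 1", "Movie 3"},
    "Actor B": {"Movie 2", "Movie 3", "Movie 4"},
    "Actor C": {"Movie 3"},
    "Actor D": {"Movie 4"},
}


def co_actors(person):
    """Everyone who shares at least one movie with `person`."""
    movies = acted_in[person]
    return {
        other
        for other, their_movies in acted_in.items()
        if other != person and movies & their_movies
    }


def recommend(person):
    """Rank co-actors of co-actors by the number of connecting colleagues."""
    known = co_actors(person)
    scores = Counter()
    for colleague in known:
        for candidate in co_actors(colleague):
            if candidate != person and candidate not in known:
                scores[candidate] += 1   # one more colleague vouches for them
    return scores.most_common()


recommendations = recommend("Tom Hanks")
```

Here the weight is the count of distinct two-hop paths through a shared colleague; other weights (shared movie count, recency, ratings) would fit the same structure.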
25. Native Graph Technology for Applications & Analytics
[Diagram: the Neo4j platform and its users]
• Components: Graph Transactions, Graph Analytics & Data Science, Analytics Tooling, Dev. & Admin, Drivers & APIs, Discovery & Visualization, Data Integration
• Users and consumers: Applications, Business Users, Data Analysts, Data Scientists, Developers, Admins
26. Neo4j - The Graph Company
The Industry’s Largest Dedicated Investment in Graphs
• Creator of the Property Graph and Cypher language at the core of the GQL ISO project
• Thousands of Customers World-Wide
• HQ in Silicon Valley, offices include London, Munich, Paris & Malmo
Industry Leaders use Neo4j
• 7/10 Top Retail Firms
• 20/25 Top Financial Firms
• 7/10 Top Software Vendors
Any Way You Like It
• On-Prem
• DB-as-a-Service
• In the Cloud