SlideShare a Scribd company logo
1 of 35
Download to read offline
A Modern Approach to
Connected Data
Practical Graph Algorithms with
Neo4j
About me
Head of Developer Relations Engineering
Follow & tweet at me @mesirii
(graphs)-[:ARE]->(everywhere)
Value from Data Relationships
Common Current Use Cases
Internal Applications
Master Data Management
Network and
IT Operations
Fraud Detection
Customer-Facing Applications
Real-Time Recommendations
Graph-Based Search
Identity and
Access Management
http://neo4j.com/use-cases
Insights from Algorithms
Improving all use-cases
Graph Algorithms
Relevance
Clustering
Structural Insights
Machine Learning
Classification, Regression
NLP, Struct/Content Pred
NN <-> Graph
Graph As Compute
http://neo4j.com/use-cases
Example: Twitter Analytics
Right Relevance
neo4j.com/blog/graph-algorithms-make-election-data-great-again
Reasoning RR
Custom Pipeline Neo4j <-> R
Goal: Simplification & Performance
● PageRank
● Clustering
● Weighted Similarities
● ... prop. IP ...
Applicable Tools
• Store Tweets in Neo4j (e.g. 300M)
• Use APOC + Graph Algorithms for processing
• Use NLP Algorithms / Procedures (e.g. from GraphAware)
• Neo4j Tableau (WDC) Connector
• Use APOC for streaming results to Gephi
Graph Algorithms
Project Goals
• high performance graph
algorithms
• user friendly (procedures)
• support graph projections
• augment OLTP with OLAP
• integrate efficiently with live
Neo4j database
(read, write, Cypher)
• common Graph API to write your
own
Kudos
Development
• Martin Knobloch (AVGL)
• Paul Horn (AVGL)
Documentation
• Tomaz Bratanic
Usage 1. Call as Cypher procedure
2. Pass in specification (Label, Prop, Query) and
configuration
3. ~.stream variant returns (a lot) of results
CALL algo.<name>.stream('Label','TYPE',{conf})
YIELD nodeId, score
4. non-stream variant writes results to graph
returns statistics
CALL algo.<name>('Label','TYPE',{conf});
5. Cypher projection:
pass in Cypher for node- and relationship-lists
CALL algo.<name>(
'MATCH ... RETURN id(n)',
'MATCH (n)-->(m) RETURN id(n), id(m)',
{graph:'cypher'})
Architecture
1. Load Data in parallel
from Neo4j
2. Store in efficient
Datastructure
3. Run Graph Algorithm
in parallel using
Graph API
4. Write data back in
parallel
Neo4j
1, 2
Algorithm
Datastructures
4
3
Graph API
TheAlgorithms
Centralities
•PageRank (baseline)
•Betweeness
•Closeness
•Degree
Example: PageRank
https://neo4j-contrib.github.io/neo4j-graph-algorithms/#_page_rank
CALL algo.pageRank.stream
('Page', 'LINKS', {iterations:20, dampingFactor:0.85})
YIELD node, score
RETURN node,score order by score desc limit 20
CALL algo.pageRank('Page', 'LINKS',
{iterations:20, dampingFactor:0.85,
write: true,writeProperty:"pagerank"})
YIELD nodes, iterations, loadMillis, computeMillis,
writeMillis, dampingFactor, write, writeProperty
Clustering
•Label Propagation
•Louvain (Phase2)
•Union Find / WCC
•Strongly Connected
Components
•Triangle-Count/Clusteri
ng CoeffTheAlgorithms
Example: UnionFind (CC)
https://neo4j-contrib.github.io/neo4j-graph-algorithms/#_connected_components
CALL algo.unionFind.stream('User', 'FRIEND', {})
YIELD nodeId,setId
RETURN setId, count(*)
ORDER BY count(*) DESC LIMIT 100;
CALL algo.unionFind('User', 'FRIEND',
{write:true, partitionProperty:'partition'})
YIELD nodes, setCount, loadMillis, computeMillis,
writeMillis
Path-Expansion / Traversal
• Single Short Path
• All-Nodes SSP
• Parallel BFS / DFS
TheAlgorithms
MATCH(start:Loc {name:'A'})
CALL algo.deltaStepping.stream(start, 'cost', 3.0)
YIELD nodeId, distance
RETURN nodeId, distance ORDER LIMIT 20
MATCH(start:Node {name:'A'})
CALL algo.deltaStepping(start, 'cost', 3.0,
{write:true, writeProperty:'ssp'})
YIELD nodeCount, loadDuration, evalDuration,
writeDuration
RETURN *
Example: UnionFind (CC)
https://neo4j-contrib.github.io/neo4j-graph-algorithms/#_all_pairs_and_single_source_shortest_path
DEMO
Datasets
● Neo4j Community Graph
○ 280k nodes, 1.4m relationships
○ Centralities, Clustering, Grouping
● DBPedia
○ 11m nodes, 116m relationships
○ Page Rank, unionFind
● Bitcoin
○ 1.7bn nodes, 2.7bn rels
○ degree distribution
○ pageRank, unionFind
Neo4j Community Graph
● Neo4j Community activity from Twitter, GitHub, StackOverflow
● Let's look at tweets
● Tweet-PageRank
● Projection -> mention-user-network
○ centralities
○ clustering
○ grouping
DBPedia
● Shallow Copy of Wikipedia
● Just (Page) -[:Link]-> (Page)
CALL algo.pageRank.stream('Page', 'Link', {iterations:5}) YIELD node, score
WITH * ORDER BY score DESC LIMIT 5
RETURN node.title, score;
+--------------------------------------+
| node.title | score |
+--------------------------------------+
| "United States" | 13349.2 |
| "Animal" | 6077.77 |
| "France" | 5025.61 |
| "List of sovereign states" | 4913.92 |
| "Germany" | 4662.32 |
+--------------------------------------+
5 rows
46247 ms
DBPedia
CALL algo.pageRank('Page', 'Link', {write:true,iterations:20});
+--------------------------------------------------------------------------------------+
| nodes | iter | loadMillis | computeMillis | writeMillis | damping | writeProperty |
+--------------------------------------------------------------------------------------+
| 11474730 | 20 | 34106 | 9712 | 1810 | 0.85 | "pagerank" |
+--------------------------------------------------------------------------------------+
1 row
47888 ms
BitCoin
● Full Copy of the BitCoin BlockChain
○ from learnmeabitcoin.com
○ thanks Greg Walker
○ (see Online Meetup)
● 1.7bn nodes, 2.7bn rels
○ 474k blocks,
○ 240m tx,
○ 280m addresses
○ 650m outputs
BitCoin
● distribution of "locked" relationships for "addresses"
○ = participation in transactions
call apoc.stats.degrees('<locked');
+--------------------------------------------------------------------------------------------------------------+
| type | direction | total | p50 | p75 | p90 | p95 | p99 | p999 | max | min | mean |
+--------------------------------------------------------------------------------------------------------------+
| "locked" | "INCOMING" | 654662356 | 0 | 0 | 1 | 1 | 2 | 28 | 1891327 | 0 | 0.37588608290716047 |
+--------------------------------------------------------------------------------------------------------------+
1 row
308619 ms
● Inferred network of addresses, via transaction and output
● (a)<-[:locked]-(o)-[:in]->(tx)-[:out]->(o2)-[:locked]->(a2)
● use cypher projections
● 1M outputs (24s) to start with, connected addresses via tx
● 10M outputs (296s)
call algo.unionFind.stream(
'match (o:output)-[:locked]->(a) with a limit 10000000
return id(a) as id',
'match (o:output)-[:locked]->(a) with o,a limit 10000000
match (o)-[:in]->(tx)-[:out]->(o2)-[:locked]->(a2)
return id(a) as source, id(a2) as target, count(tx) as value',
{graph:'cypher'}) YIELD setId, nodeId
RETURn setId, count(*) as size
ORDER BY size DESC LIMIT 10;
BitCoin
+-------------------+
| setId | size |
+-------------------+
| 5036 | 4409420 |
| 6295282 | 1999 |
| 5839746 | 1488 |
| 9356302 | 833 |
| 6560901 | 733 |
| 6370777 | 637 |
| 8101710 | 392 |
| 5945867 | 369 |
| 2489036 | 264 |
| 1703620 | 203 |
+-------------------+
10 rows
296109 ms
First release July 2017
• neo4j.com/blog/efficient-graph-algorithms-neo4j
Second Release Sept/Oct 2017
• huge graphs, additional algorithms, bugfixes
Timing
• Launch & Perf:
• Docs: neo4j-contrib.github.io/neo4j-graph-algorithms
• Tomaz Bratanic: tbgraph.wordpress.com (docs, social, tube, GoT)
• Community Graph: github.com/community-graph
• Twitter Analytics:
neo4j.com/blog/graph-algorithms-make-election-data-great-again
More examples
Reading Material
• Thanks to Amy Hodler
• 13 Top Resources on Graph Theory
• neo4j.com/blog/
top-13-resources-graph-theory-algorithms/
We need your feedback & use-cases!
neo4j.com/slack -> #neo4j-graph-algorithms
github.com/neo4j-contrib/neo4j-graph-algorithms
michael@neo4j.com
Please get in touch!!
The Sky is the Limit
Questions?
Don't forget! Subscribe & Like
youtube.com/c/neo4j

More Related Content

What's hot

Combine Spring Data Neo4j and Spring Boot to quickl
Combine Spring Data Neo4j and Spring Boot to quicklCombine Spring Data Neo4j and Spring Boot to quickl
Combine Spring Data Neo4j and Spring Boot to quicklNeo4j
 
How Graph Databases efficiently store, manage and query connected data at s...
How Graph Databases efficiently  store, manage and query  connected data at s...How Graph Databases efficiently  store, manage and query  connected data at s...
How Graph Databases efficiently store, manage and query connected data at s...jexp
 
The openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query LanguageThe openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query LanguageNeo4j
 
Streaming Trend Discovery: Real-Time Discovery in a Sea of Events with Scott ...
Streaming Trend Discovery: Real-Time Discovery in a Sea of Events with Scott ...Streaming Trend Discovery: Real-Time Discovery in a Sea of Events with Scott ...
Streaming Trend Discovery: Real-Time Discovery in a Sea of Events with Scott ...Databricks
 
Apache Spark GraphX & GraphFrame Synthetic ID Fraud Use Case
Apache Spark GraphX & GraphFrame Synthetic ID Fraud Use CaseApache Spark GraphX & GraphFrame Synthetic ID Fraud Use Case
Apache Spark GraphX & GraphFrame Synthetic ID Fraud Use CaseMo Patel
 
Building a modern Application with DataFrames
Building a modern Application with DataFramesBuilding a modern Application with DataFrames
Building a modern Application with DataFramesSpark Summit
 
Congressional PageRank: Graph Analytics of US Congress With Neo4j
Congressional PageRank: Graph Analytics of US Congress With Neo4jCongressional PageRank: Graph Analytics of US Congress With Neo4j
Congressional PageRank: Graph Analytics of US Congress With Neo4jWilliam Lyon
 
Graphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphXGraphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphXAndrea Iacono
 
New Directions for Spark in 2015 - Spark Summit East
New Directions for Spark in 2015 - Spark Summit EastNew Directions for Spark in 2015 - Spark Summit East
New Directions for Spark in 2015 - Spark Summit EastDatabricks
 
Graph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DBGraph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DBMohamed Taher Alrefaie
 
Intro to Graphs and Neo4j
Intro to Graphs and Neo4jIntro to Graphs and Neo4j
Intro to Graphs and Neo4jjexp
 
Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...
Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...
Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...Databricks
 
Automatic Detection of Web Trackers by Vasia Kalavri
Automatic Detection of Web Trackers by Vasia KalavriAutomatic Detection of Web Trackers by Vasia Kalavri
Automatic Detection of Web Trackers by Vasia KalavriFlink Forward
 
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiA Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiDatabricks
 
Multiplaform Solution for Graph Datasources
Multiplaform Solution for Graph DatasourcesMultiplaform Solution for Graph Datasources
Multiplaform Solution for Graph DatasourcesStratio
 
Multi-Label Graph Analysis and Computations Using GraphX with Qiang Zhu and Q...
Multi-Label Graph Analysis and Computations Using GraphX with Qiang Zhu and Q...Multi-Label Graph Analysis and Computations Using GraphX with Qiang Zhu and Q...
Multi-Label Graph Analysis and Computations Using GraphX with Qiang Zhu and Q...Databricks
 
Apache Spark & Cassandra use case at Telefónica Cbs by Antonio Alcacer
Apache Spark & Cassandra use case at Telefónica Cbs by Antonio AlcacerApache Spark & Cassandra use case at Telefónica Cbs by Antonio Alcacer
Apache Spark & Cassandra use case at Telefónica Cbs by Antonio AlcacerStratio
 
Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.
Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.
Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.Lucidworks
 
Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015Stratio
 

What's hot (20)

Combine Spring Data Neo4j and Spring Boot to quickl
Combine Spring Data Neo4j and Spring Boot to quicklCombine Spring Data Neo4j and Spring Boot to quickl
Combine Spring Data Neo4j and Spring Boot to quickl
 
How Graph Databases efficiently store, manage and query connected data at s...
How Graph Databases efficiently  store, manage and query  connected data at s...How Graph Databases efficiently  store, manage and query  connected data at s...
How Graph Databases efficiently store, manage and query connected data at s...
 
The openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query LanguageThe openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query Language
 
Streaming Trend Discovery: Real-Time Discovery in a Sea of Events with Scott ...
Streaming Trend Discovery: Real-Time Discovery in a Sea of Events with Scott ...Streaming Trend Discovery: Real-Time Discovery in a Sea of Events with Scott ...
Streaming Trend Discovery: Real-Time Discovery in a Sea of Events with Scott ...
 
Power of Polyglot Search
Power of Polyglot SearchPower of Polyglot Search
Power of Polyglot Search
 
Apache Spark GraphX & GraphFrame Synthetic ID Fraud Use Case
Apache Spark GraphX & GraphFrame Synthetic ID Fraud Use CaseApache Spark GraphX & GraphFrame Synthetic ID Fraud Use Case
Apache Spark GraphX & GraphFrame Synthetic ID Fraud Use Case
 
Building a modern Application with DataFrames
Building a modern Application with DataFramesBuilding a modern Application with DataFrames
Building a modern Application with DataFrames
 
Congressional PageRank: Graph Analytics of US Congress With Neo4j
Congressional PageRank: Graph Analytics of US Congress With Neo4jCongressional PageRank: Graph Analytics of US Congress With Neo4j
Congressional PageRank: Graph Analytics of US Congress With Neo4j
 
Graphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphXGraphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphX
 
New Directions for Spark in 2015 - Spark Summit East
New Directions for Spark in 2015 - Spark Summit EastNew Directions for Spark in 2015 - Spark Summit East
New Directions for Spark in 2015 - Spark Summit East
 
Graph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DBGraph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DB
 
Intro to Graphs and Neo4j
Intro to Graphs and Neo4jIntro to Graphs and Neo4j
Intro to Graphs and Neo4j
 
Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...
Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...
Neo4j Morpheus: Interweaving Documents, Tables and and Graph Data in Spark wi...
 
Automatic Detection of Web Trackers by Vasia Kalavri
Automatic Detection of Web Trackers by Vasia KalavriAutomatic Detection of Web Trackers by Vasia Kalavri
Automatic Detection of Web Trackers by Vasia Kalavri
 
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiA Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
 
Multiplaform Solution for Graph Datasources
Multiplaform Solution for Graph DatasourcesMultiplaform Solution for Graph Datasources
Multiplaform Solution for Graph Datasources
 
Multi-Label Graph Analysis and Computations Using GraphX with Qiang Zhu and Q...
Multi-Label Graph Analysis and Computations Using GraphX with Qiang Zhu and Q...Multi-Label Graph Analysis and Computations Using GraphX with Qiang Zhu and Q...
Multi-Label Graph Analysis and Computations Using GraphX with Qiang Zhu and Q...
 
Apache Spark & Cassandra use case at Telefónica Cbs by Antonio Alcacer
Apache Spark & Cassandra use case at Telefónica Cbs by Antonio AlcacerApache Spark & Cassandra use case at Telefónica Cbs by Antonio Alcacer
Apache Spark & Cassandra use case at Telefónica Cbs by Antonio Alcacer
 
Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.
Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.
Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.
 
Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015
 

Similar to Practical Graph Algorithms with Neo4j

New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...
New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...
New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...jexp
 
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...InfluxData
 
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...Gruter
 
Swift distributed tracing method and tools v2
Swift distributed tracing method and tools v2Swift distributed tracing method and tools v2
Swift distributed tracing method and tools v2zhang hua
 
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized EngineApache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized EngineDataWorks Summit
 
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...Databricks
 
AWS re:Invent 2016: How Citus Enables Scalable PostgreSQL on AWS (DAT207)
AWS re:Invent 2016: How Citus Enables Scalable PostgreSQL on AWS (DAT207)AWS re:Invent 2016: How Citus Enables Scalable PostgreSQL on AWS (DAT207)
AWS re:Invent 2016: How Citus Enables Scalable PostgreSQL on AWS (DAT207)Amazon Web Services
 
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScyllaDB
 
Adaptive Query Optimization
Adaptive Query OptimizationAdaptive Query Optimization
Adaptive Query OptimizationAnju Garg
 
Ge aviation spark application experience porting analytics into py spark ml p...
Ge aviation spark application experience porting analytics into py spark ml p...Ge aviation spark application experience porting analytics into py spark ml p...
Ge aviation spark application experience porting analytics into py spark ml p...Databricks
 
VMworld 2013: Deep Dive into vSphere Log Management with vCenter Log Insight
VMworld 2013: Deep Dive into vSphere Log Management with vCenter Log InsightVMworld 2013: Deep Dive into vSphere Log Management with vCenter Log Insight
VMworld 2013: Deep Dive into vSphere Log Management with vCenter Log InsightVMworld
 
Performance Schema for MySQL Troubleshooting
Performance Schema for MySQL TroubleshootingPerformance Schema for MySQL Troubleshooting
Performance Schema for MySQL TroubleshootingSveta Smirnova
 
Writing efficient sql
Writing efficient sqlWriting efficient sql
Writing efficient sqlj9soto
 
Dynamic Tracing of your AMP web site
Dynamic Tracing of your AMP web siteDynamic Tracing of your AMP web site
Dynamic Tracing of your AMP web siteSriram Natarajan
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for GraphsJean Ihm
 
Window functions in MySQL 8.0
Window functions in MySQL 8.0Window functions in MySQL 8.0
Window functions in MySQL 8.0Mydbops
 
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic SystemTimely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic SystemAccumulo Summit
 
How to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issuesHow to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issuesCloudera, Inc.
 
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak   CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak PROIDEA
 

Similar to Practical Graph Algorithms with Neo4j (20)

New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...
New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...
New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...
 
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
 
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
 
Swift distributed tracing method and tools v2
Swift distributed tracing method and tools v2Swift distributed tracing method and tools v2
Swift distributed tracing method and tools v2
 
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized EngineApache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
 
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...
 
AWS re:Invent 2016: How Citus Enables Scalable PostgreSQL on AWS (DAT207)
AWS re:Invent 2016: How Citus Enables Scalable PostgreSQL on AWS (DAT207)AWS re:Invent 2016: How Citus Enables Scalable PostgreSQL on AWS (DAT207)
AWS re:Invent 2016: How Citus Enables Scalable PostgreSQL on AWS (DAT207)
 
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
 
Adaptive Query Optimization
Adaptive Query OptimizationAdaptive Query Optimization
Adaptive Query Optimization
 
Ge aviation spark application experience porting analytics into py spark ml p...
Ge aviation spark application experience porting analytics into py spark ml p...Ge aviation spark application experience porting analytics into py spark ml p...
Ge aviation spark application experience porting analytics into py spark ml p...
 
VMworld 2013: Deep Dive into vSphere Log Management with vCenter Log Insight
VMworld 2013: Deep Dive into vSphere Log Management with vCenter Log InsightVMworld 2013: Deep Dive into vSphere Log Management with vCenter Log Insight
VMworld 2013: Deep Dive into vSphere Log Management with vCenter Log Insight
 
Performance Schema for MySQL Troubleshooting
Performance Schema for MySQL TroubleshootingPerformance Schema for MySQL Troubleshooting
Performance Schema for MySQL Troubleshooting
 
Presentation
PresentationPresentation
Presentation
 
Writing efficient sql
Writing efficient sqlWriting efficient sql
Writing efficient sql
 
Dynamic Tracing of your AMP web site
Dynamic Tracing of your AMP web siteDynamic Tracing of your AMP web site
Dynamic Tracing of your AMP web site
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for Graphs
 
Window functions in MySQL 8.0
Window functions in MySQL 8.0Window functions in MySQL 8.0
Window functions in MySQL 8.0
 
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic SystemTimely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
 
How to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issuesHow to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issues
 
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak   CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
 

More from jexp

Looming Marvelous - Virtual Threads in Java Javaland.pdf
Looming Marvelous - Virtual Threads in Java Javaland.pdfLooming Marvelous - Virtual Threads in Java Javaland.pdf
Looming Marvelous - Virtual Threads in Java Javaland.pdfjexp
 
Easing the daily grind with the awesome JDK command line tools
Easing the daily grind with the awesome JDK command line toolsEasing the daily grind with the awesome JDK command line tools
Easing the daily grind with the awesome JDK command line toolsjexp
 
Looming Marvelous - Virtual Threads in Java
Looming Marvelous - Virtual Threads in JavaLooming Marvelous - Virtual Threads in Java
Looming Marvelous - Virtual Threads in Javajexp
 
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptx
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptxGraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptx
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptxjexp
 
Neo4j Connector Apache Spark FiNCENFiles
Neo4j Connector Apache Spark FiNCENFilesNeo4j Connector Apache Spark FiNCENFiles
Neo4j Connector Apache Spark FiNCENFilesjexp
 
How Graphs Help Investigative Journalists to Connect the Dots
How Graphs Help Investigative Journalists to Connect the DotsHow Graphs Help Investigative Journalists to Connect the Dots
How Graphs Help Investigative Journalists to Connect the Dotsjexp
 
The Home Office. Does it really work?
The Home Office. Does it really work?The Home Office. Does it really work?
The Home Office. Does it really work?jexp
 
Polyglot Applications with GraalVM
Polyglot Applications with GraalVMPolyglot Applications with GraalVM
Polyglot Applications with GraalVMjexp
 
Neo4j Graph Streaming Services with Apache Kafka
Neo4j Graph Streaming Services with Apache KafkaNeo4j Graph Streaming Services with Apache Kafka
Neo4j Graph Streaming Services with Apache Kafkajexp
 
APOC Pearls - Whirlwind Tour Through the Neo4j APOC Procedures Library
APOC Pearls - Whirlwind Tour Through the Neo4j APOC Procedures LibraryAPOC Pearls - Whirlwind Tour Through the Neo4j APOC Procedures Library
APOC Pearls - Whirlwind Tour Through the Neo4j APOC Procedures Libraryjexp
 
Refactoring, 2nd Edition
Refactoring, 2nd EditionRefactoring, 2nd Edition
Refactoring, 2nd Editionjexp
 
A Game of Data and GraphQL
A Game of Data and GraphQLA Game of Data and GraphQL
A Game of Data and GraphQLjexp
 
Querying Graphs with GraphQL
Querying Graphs with GraphQLQuerying Graphs with GraphQL
Querying Graphs with GraphQLjexp
 
Graphs & Neo4j - Past Present Future
Graphs & Neo4j - Past Present FutureGraphs & Neo4j - Past Present Future
Graphs & Neo4j - Past Present Futurejexp
 
Class graph neo4j and software metrics
Class graph neo4j and software metricsClass graph neo4j and software metrics
Class graph neo4j and software metricsjexp
 
New Neo4j Auto HA Cluster
New Neo4j Auto HA ClusterNew Neo4j Auto HA Cluster
New Neo4j Auto HA Clusterjexp
 
Spring Data Neo4j Intro SpringOne 2012
Spring Data Neo4j Intro SpringOne 2012Spring Data Neo4j Intro SpringOne 2012
Spring Data Neo4j Intro SpringOne 2012jexp
 
Intro to Cypher
Intro to CypherIntro to Cypher
Intro to Cypherjexp
 
Geekout publish
Geekout publishGeekout publish
Geekout publishjexp
 
Intro to Neo4j presentation
Intro to Neo4j presentationIntro to Neo4j presentation
Intro to Neo4j presentationjexp
 

More from jexp (20)

Looming Marvelous - Virtual Threads in Java Javaland.pdf
Looming Marvelous - Virtual Threads in Java Javaland.pdfLooming Marvelous - Virtual Threads in Java Javaland.pdf
Looming Marvelous - Virtual Threads in Java Javaland.pdf
 
Easing the daily grind with the awesome JDK command line tools
Easing the daily grind with the awesome JDK command line toolsEasing the daily grind with the awesome JDK command line tools
Easing the daily grind with the awesome JDK command line tools
 
Looming Marvelous - Virtual Threads in Java
Looming Marvelous - Virtual Threads in JavaLooming Marvelous - Virtual Threads in Java
Looming Marvelous - Virtual Threads in Java
 
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptx
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptxGraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptx
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptx
 
Neo4j Connector Apache Spark FiNCENFiles
Neo4j Connector Apache Spark FiNCENFilesNeo4j Connector Apache Spark FiNCENFiles
Neo4j Connector Apache Spark FiNCENFiles
 
How Graphs Help Investigative Journalists to Connect the Dots
How Graphs Help Investigative Journalists to Connect the DotsHow Graphs Help Investigative Journalists to Connect the Dots
How Graphs Help Investigative Journalists to Connect the Dots
 
The Home Office. Does it really work?
The Home Office. Does it really work?The Home Office. Does it really work?
The Home Office. Does it really work?
 
Polyglot Applications with GraalVM
Polyglot Applications with GraalVMPolyglot Applications with GraalVM
Polyglot Applications with GraalVM
 
Neo4j Graph Streaming Services with Apache Kafka
Neo4j Graph Streaming Services with Apache KafkaNeo4j Graph Streaming Services with Apache Kafka
Neo4j Graph Streaming Services with Apache Kafka
 
APOC Pearls - Whirlwind Tour Through the Neo4j APOC Procedures Library
APOC Pearls - Whirlwind Tour Through the Neo4j APOC Procedures LibraryAPOC Pearls - Whirlwind Tour Through the Neo4j APOC Procedures Library
APOC Pearls - Whirlwind Tour Through the Neo4j APOC Procedures Library
 
Refactoring, 2nd Edition
Refactoring, 2nd EditionRefactoring, 2nd Edition
Refactoring, 2nd Edition
 
A Game of Data and GraphQL
A Game of Data and GraphQLA Game of Data and GraphQL
A Game of Data and GraphQL
 
Querying Graphs with GraphQL
Querying Graphs with GraphQLQuerying Graphs with GraphQL
Querying Graphs with GraphQL
 
Graphs & Neo4j - Past Present Future
Graphs & Neo4j - Past Present FutureGraphs & Neo4j - Past Present Future
Graphs & Neo4j - Past Present Future
 
Class graph neo4j and software metrics
Class graph neo4j and software metricsClass graph neo4j and software metrics
Class graph neo4j and software metrics
 
New Neo4j Auto HA Cluster
New Neo4j Auto HA ClusterNew Neo4j Auto HA Cluster
New Neo4j Auto HA Cluster
 
Spring Data Neo4j Intro SpringOne 2012
Spring Data Neo4j Intro SpringOne 2012Spring Data Neo4j Intro SpringOne 2012
Spring Data Neo4j Intro SpringOne 2012
 
Intro to Cypher
Intro to CypherIntro to Cypher
Intro to Cypher
 
Geekout publish
Geekout publishGeekout publish
Geekout publish
 
Intro to Neo4j presentation
Intro to Neo4j presentationIntro to Neo4j presentation
Intro to Neo4j presentation
 

Recently uploaded

英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Best Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfBest Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfIdiosysTechnologies1
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentationvaddepallysandeep122
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Hr365.us smith
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 

Recently uploaded (20)

英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Best Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfBest Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdf
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentation
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 

Practical Graph Algorithms with Neo4j

  • 1. A Modern Approach to Connected Data Practical Graph Algorithms with Neo4j
  • 2. About me Head of Developer Relations Engineering Follow & tweet at me @mesirii
  • 4. Value from Data Relationships Common Current Use Cases Internal Applications Master Data Management Network and IT Operations Fraud Detection Customer-Facing Applications Real-Time Recommendations Graph-Based Search Identity and Access Management http://neo4j.com/use-cases
  • 5. Insights from Algorithms Improving all use-cases Graph Algorithms Relevance Clustering Structural Insights Machine Learning Classification, Regression NLP, Struct/Content Pred NN <-> Graph Graph As Compute http://neo4j.com/use-cases
  • 6. Example: Twitter Analytics Right Relevance neo4j.com/blog/graph-algorithms-make-election-data-great-again
  • 9. Goal: Simplification & Performance ● PageRank ● Clustering ● Weighted Similarities ● ... prop. IP ...
  • 10. Applicable Tools • Store Tweets in Neo4j (e.g. 300M) • Use APOC + Graph Algorithms for processing • Use NLP Algorithms / Procedures (e.g. from GraphAware) • Neo4j Tableau (WDC) Connector • Use APOC for streaming results to Gephi
  • 12. Project Goals • high performance graph algorithms • user friendly (procedures) • support graph projections • augment OLTP with OLAP • integrate efficiently with live Neo4j database (read, write, Cypher) • common Graph API to write your own
  • 13. Kudos Development • Martin Knobloch (AVGL) • Paul Horn (AVGL) Documentation • Tomaz Bratanic
  • 14. Usage 1. Call as Cypher procedure 2. Pass in specification (Label, Prop, Query) and configuration 3. ~.stream variant returns (a lot) of results CALL algo.<name>.stream('Label','TYPE',{conf}) YIELD nodeId, score 4. non-stream variant writes results to graph returns statistics CALL algo.<name>('Label','TYPE',{conf}); 5. Cypher projection: pass in Cypher for node- and relationship-lists CALL algo.<name>( 'MATCH ... RETURN id(n)', 'MATCH (n)-->(m) RETURN id(n), id(m)', {graph:'cypher'})
  • 15. Architecture 1. Load Data in parallel from Neo4j 2. Store in efficient Datastructure 3. Run Graph Algorithm in parallel using Graph API 4. Write data back in parallel Neo4j 1, 2 Algorithm Datastructures 4 3 Graph API
  • 17. Example: PageRank https://neo4j-contrib.github.io/neo4j-graph-algorithms/#_page_rank CALL algo.pageRank.stream ('Page', 'LINKS', {iterations:20, dampingFactor:0.85}) YIELD node, score RETURN node,score order by score desc limit 20 CALL algo.pageRank('Page', 'LINKS', {iterations:20, dampingFactor:0.85, write: true,writeProperty:"pagerank"}) YIELD nodes, iterations, loadMillis, computeMillis, writeMillis, dampingFactor, write, writeProperty
  • 18. Clustering •Label Propagation •Louvain (Phase2) •Union Find / WCC •Strongly Connected Components •Triangle-Count/Clusteri ng CoeffTheAlgorithms
  • 19. Example: UnionFind (CC) https://neo4j-contrib.github.io/neo4j-graph-algorithms/#_connected_components CALL algo.unionFind.stream('User', 'FRIEND', {}) YIELD nodeId,setId RETURN setId, count(*) ORDER BY count(*) DESC LIMIT 100; CALL algo.unionFind('User', 'FRIEND', {write:true, partitionProperty:'partition'}) YIELD nodes, setCount, loadMillis, computeMillis, writeMillis
  • 20. Path-Expansion / Traversal • Single Short Path • All-Nodes SSP • Parallel BFS / DFS TheAlgorithms
  • 21. MATCH(start:Loc {name:'A'}) CALL algo.deltaStepping.stream(start, 'cost', 3.0) YIELD nodeId, distance RETURN nodeId, distance ORDER LIMIT 20 MATCH(start:Node {name:'A'}) CALL algo.deltaStepping(start, 'cost', 3.0, {write:true, writeProperty:'ssp'}) YIELD nodeCount, loadDuration, evalDuration, writeDuration RETURN * Example: UnionFind (CC) https://neo4j-contrib.github.io/neo4j-graph-algorithms/#_all_pairs_and_single_source_shortest_path
  • 22. DEMO
  • 23. Datasets ● Neo4j Community Graph ○ 280k nodes, 1.4m relationships ○ Centralities, Clustering, Grouping ● DBPedia ○ 11m nodes, 116m relationships ○ Page Rank, unionFind ● Bitcoin ○ 1.7bn nodes, 2.7bn rels ○ degree distribution ○ pageRank, unionFind
  • 24. Neo4j Community Graph ● Neo4j Community activity from Twitter, GitHub, StackOverflow ● Let's look at tweets ● Tweet-PageRank ● Projection -> mention-user-network ○ centralities ○ clustering ○ grouping
  • 25. DBPedia ● Shallow Copy of Wikipedia ● Just (Page) -[:Link]-> (Page) CALL algo.pageRank.stream('Page', 'Link', {iterations:5}) YIELD node, score WITH * ORDER BY score DESC LIMIT 5 RETURN node.title, score; +--------------------------------------+ | node.title | score | +--------------------------------------+ | "United States" | 13349.2 | | "Animal" | 6077.77 | | "France" | 5025.61 | | "List of sovereign states" | 4913.92 | | "Germany" | 4662.32 | +--------------------------------------+ 5 rows 46247 ms
  • 26. DBPedia CALL algo.pageRank('Page', 'Link', {write:true,iterations:20}); +--------------------------------------------------------------------------------------+ | nodes | iter | loadMillis | computeMillis | writeMillis | damping | writeProperty | +--------------------------------------------------------------------------------------+ | 11474730 | 20 | 34106 | 9712 | 1810 | 0.85 | "pagerank" | +--------------------------------------------------------------------------------------+ 1 row 47888 ms
  • 27. BitCoin ● Full Copy of the BitCoin BlockChain ○ from learnmeabitcoin.com ○ thanks Greg Walker ○ (see Online Meetup) ● 1.7bn nodes, 2.7bn rels ○ 474k blocks, ○ 240m tx, ○ 280m addresses ○ 650m outputs
  • 28. BitCoin ● distribution of "locked" relationships for "addresses" ○ = participation in transactions call apoc.stats.degrees('<locked'); +--------------------------------------------------------------------------------------------------------------+ | type | direction | total | p50 | p75 | p90 | p95 | p99 | p999 | max | min | mean | +--------------------------------------------------------------------------------------------------------------+ | "locked" | "INCOMING" | 654662356 | 0 | 0 | 1 | 1 | 2 | 28 | 1891327 | 0 | 0.37588608290716047 | +--------------------------------------------------------------------------------------------------------------+ 1 row 308619 ms
  • 29. ● Inferred network of addresses, via transaction and output ● (a)<-[:locked]-(o)-[:in]->(tx)-[:out]->(o2)-[:locked]->(a2) ● use cypher projections ● 1M outputs (24s) to start with, connected addresses via tx ● 10M outputs (296s) call algo.unionFind.stream( 'match (o:output)-[:locked]->(a) with a limit 10000000 return id(a) as id', 'match (o:output)-[:locked]->(a) with o,a limit 10000000 match (o)-[:in]->(tx)-[:out]->(o2)-[:locked]->(a2) return id(a) as source, id(a2) as target, count(tx) as value', {graph:'cypher'}) YIELD setId, nodeId RETURn setId, count(*) as size ORDER BY size DESC LIMIT 10; BitCoin +-------------------+ | setId | size | +-------------------+ | 5036 | 4409420 | | 6295282 | 1999 | | 5839746 | 1488 | | 9356302 | 833 | | 6560901 | 733 | | 6370777 | 637 | | 8101710 | 392 | | 5945867 | 369 | | 2489036 | 264 | | 1703620 | 203 | +-------------------+ 10 rows 296109 ms
  • 30. First release July 2017 • neo4j.com/blog/efficient-graph-algorithms-neo4j Second Release Sept/Oct 2017 • huge graphs, additional algorithms, bugfixes Timing
  • 31. • Launch & Perf: • Docs: neo4j-contrib.github.io/neo4j-graph-algorithms • Tomaz Bratanic: tbgraph.wordpress.com (docs, social, tube, GoT) • Community Graph: github.com/community-graph • Twitter Analytics: neo4j.com/blog/graph-algorithms-make-election-data-great-again More examples
  • 32. Reading Material • Thanks to Amy Hodler • 13 Top Resources on Graph Theory • neo4j.com/blog/ top-13-resources-graph-theory-algorithms/
  • 33. We need your feedback & use-cases! neo4j.com/slack -> #neo4j-graph-algorithms github.com/neo4j-contrib/neo4j-graph-algorithms michael@neo4j.com Please get in touch!!
  • 34. The Sky is the Limit Questions?
  • 35. Don't forget! Subscribe & Like youtube.com/c/neo4j