4. Neo4j Graph Algorithms Overview
• Pathfinding & Search: Finds optimal paths or evaluates route availability and quality.
• Centrality / Importance: Determines the importance of distinct nodes in the network.
• Community Detection: Detects group clustering or partition options.
• Similarity: Evaluates how alike nodes are by neighbors and relationships.
• Link Prediction: Estimates the likelihood of nodes forming a future relationship.
5. Graph and ML algorithms in Neo4j
Pathfinding & Search
• Minimum Weight Spanning Tree
• Shortest Path
• Single Source Shortest Path
• All Pairs Shortest Path
• A*
• Yen’s K-shortest Paths
• Random Walk
• Breadth First Search
• Depth First Search
Centrality / Importance
• Degree Centrality
• Closeness Centrality
• Betweenness Centrality
• PageRank
• ArticleRank
• Eigenvector Centrality
Community Detection
• Triangle Count / Clustering Coefficient
• Weakly Connected Components
• Strongly Connected Components
• Label Propagation
• Louvain Modularity
• K-1 Colouring
• Modularity Optimisation
Similarity
• Node Similarity
• Approximate Nearest Neighbours
• Cosine Similarity
• Euclidean Similarity
• Jaccard Similarity
• Overlap Similarity
• Pearson Similarity
Link Prediction
• Adamic Adar
• Common Neighbours
• Preferential Attachment
• Resource Allocations
• Same Community
• Total Neighbours
https://neo4j.com/docs/graph-data-science/1.0/
7. Some Examples of Typical Bank Data
Event Data
Product and Services Data
Customer Data
Organisational Data
3rd Party Data
Documentation
Employee Data
Processes
Systems and Databases
KPIs and Reports
Address
Personal Data
Documents
Relationships
Assets
Documentation
Processes
Product / Service Details
Product / Service Hierarchy
Pricing
Money Movements
Web / App Activity
Customer Contact
Social Media
Credit Rating Agencies
Market Data
Organisational Hierarchy
Corporate Data
12. PageRank
What: Finds important nodes based
on their relationships.
Why: Identify important or
influential Client nodes by
quantifying the flows of money
towards them.
Uses:
- Fraud detection
- Anti-money Laundering
- Inform prioritization during analysis
and investigation
14. The PageRank Algorithm
PageRank: which nodes can be considered
‘important’ in our graph based on money flows?
Inputs
Property Output: .pagerank
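To make the idea concrete, here is a minimal sketch of PageRank by power iteration in plain Python. It is not the Neo4j implementation: the client names, the payment amounts, and the choice to weight each transfer by its amount are assumptions made for illustration. The returned dictionary plays the role of the .pagerank property.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over weighted directed edges (source, target, amount)."""
    nodes = sorted({n for s, t, _ in edges for n in (s, t)})
    count = len(nodes)

    # Total amount sent by each node, used to split its rank across recipients.
    out_weight = {n: 0.0 for n in nodes}
    for source, _, amount in edges:
        out_weight[source] += amount

    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes that send nothing is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {n: base for n in nodes}
        for source, target, amount in edges:
            new_rank[target] += damping * rank[source] * amount / out_weight[source]
        rank = new_rank
    return rank


# Hypothetical money flows between Client nodes: (sender, receiver, amount).
payments = [
    ("alice", "carol", 100.0),
    ("bob", "carol", 250.0),
    ("carol", "dave", 50.0),
    ("dave", "alice", 20.0),
]
scores = pagerank(payments)
for client, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{client}: {score:.4f}")
```

In this toy graph the client that receives money from two senders ends up with the highest score, and the client that receives nothing ends up with the lowest.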
15. Betweenness Centrality
What: For each pair of nodes, the fraction of
shortest paths between them that pass through
a given node, summed over all pairs.
Why: Identify well connected nodes within
the graph, or bridges between
communities.
Uses:
- Fraud detection
- Anti-money Laundering
- Inform prioritization during analysis and
investigation
16. The Betweenness Centrality Algorithm
Betweenness Centrality: which nodes can be
considered ‘important’ in our graph based on
how many ‘shortest paths’ they are present in?
Inputs
Property Output: .centrality
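A rough sketch of how such a score can be computed, using Brandes' algorithm on an unweighted, undirected graph in plain Python. The node names and the example graph (two tightly knit groups joined by one bridge node) are made up for illustration, and the result dictionary stands in for the .centrality property.

```python
from collections import deque


def betweenness(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph given as {node: set(neighbours)}."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search from source, counting shortest paths to every node.
        dist = {source: 0}
        paths = {v: 0 for v in adjacency}
        paths[source] = 1
        preds = {v: [] for v in adjacency}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating each node's share of paths.
        dependency = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1.0 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was visited from both ends, so halve the totals.
    return {v: s / 2.0 for v, s in score.items()}


# Hypothetical graph: triangle a-b-c and triangle e-f-g, bridged by node d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("e", "g"), ("f", "g")]
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

centrality = betweenness(adjacency)
print(centrality)
```

The bridge node d lies on every shortest path between the two groups, so it receives the highest score.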
18. Weakly Connected Components
What: Finds disconnected
community subgraphs in our data.
Why: Identify communities based
on connections with shared pieces
of identity.
Uses:
- Householding
- Synthetic identities
- Stolen identities
19. The Weakly Connected Components Algorithm
Weakly Connected
Components: what
communities exist in the
data based on connections
to pieces of identity?
Inputs
Property Output: .component_id
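One way to picture the computation is a union-find pass over the relationships, ignoring their direction. The sketch below is plain Python rather than the Neo4j procedure, and the client and identity values are invented for illustration. The returned mapping stands in for the .component_id property.

```python
def weakly_connected_components(edges):
    """Assign every node a component id using union-find; edge direction is ignored."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving keeps trees shallow
            node = parent[node]
        return node

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # The root of each tree serves as the component id.
    return {node: find(node) for node in list(parent)}


# Hypothetical links between Client nodes and shared pieces of identity.
has_identity = [
    ("client_1", "phone:555-0100"),
    ("client_2", "phone:555-0100"),
    ("client_2", "email:shared@example.com"),
    ("client_3", "email:shared@example.com"),
    ("client_4", "phone:555-0199"),
]
component_id = weakly_connected_components(has_identity)
for node, component in sorted(component_id.items()):
    print(node, "->", component)
```

Clients 1, 2 and 3 land in the same component because they are chained together through a shared phone number and a shared email address, while client 4 stays on its own.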
21. Louvain Modularity
What: Finds communities of densely
connected nodes in our graph. Can
return intermediate results.
Why: Useful for identifying
communities based on transaction
behaviour rather than identity.
Uses:
- Fraud ring detection
- Anti-money Laundering
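Louvain works by greedily moving nodes between communities to increase a modularity score. The sketch below does not implement Louvain itself; it only computes that score for a given partition of an unweighted, undirected graph, so that two candidate groupings can be compared. The account names and transactions are hypothetical.

```python
def modularity(edges, community):
    """Modularity of a partition: sum over communities of L_c/m - (d_c/2m)^2.

    edges: list of undirected (u, v) pairs; community: {node: community label}.
    L_c is the number of edges inside community c, d_c the total degree of its
    nodes, and m the number of edges in the graph.
    """
    m = len(edges)
    internal = {}
    degree = {}
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())


# Hypothetical transactions: two rings of accounts joined by a single transfer.
transactions = [("a", "b"), ("b", "c"), ("a", "c"),
                ("d", "e"), ("e", "f"), ("d", "f"),
                ("c", "d")]

two_rings = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_rings}

q_two_rings = modularity(transactions, two_rings)
q_one_group = modularity(transactions, one_group)
print(q_two_rings, q_one_group)
```

Splitting the accounts into the two rings scores higher than treating them as one group, which is the kind of improvement Louvain searches for.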
24. Node Similarity
What: Measures similarity between nodes
based on shared neighbours. Writes a new
relationship to the graph.
Why: Identify similar nodes that
share common pieces of identity.
Uses:
- Entity Resolution
- Synthetic identities
- Stolen identities
25. The Node Similarity Algorithm
Node Similarity: how
similar are two Client
nodes based on the non-
Merchant accounts they
transact with?
Inputs
Relationship Output: SIMILAR, with .score property
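As an illustration, neighbour-based similarity can be sketched with the Jaccard coefficient: the number of shared neighbours divided by the number of distinct neighbours across both nodes. The plain-Python version below is a sketch under that assumption, with invented client and account names. Each returned tuple mirrors a SIMILAR relationship carrying a .score value.

```python
from itertools import combinations


def node_similarity(neighbours, cutoff=0.0):
    """Jaccard similarity for every pair of nodes, given {node: set(neighbours)}.

    Returns (node_a, "SIMILAR", node_b, score) for each pair scoring above cutoff.
    """
    results = []
    for a, b in combinations(sorted(neighbours), 2):
        union = neighbours[a] | neighbours[b]
        if not union:
            continue  # neither node has neighbours, so no score is defined
        score = len(neighbours[a] & neighbours[b]) / len(union)
        if score > cutoff:
            results.append((a, "SIMILAR", b, score))
    return results


# Hypothetical non-Merchant accounts each Client transacts with.
transacts_with = {
    "client_1": {"acct_a", "acct_b", "acct_c"},
    "client_2": {"acct_b", "acct_c", "acct_d"},
    "client_3": {"acct_z"},
}
similar = node_similarity(transacts_with)
for a, rel, b, score in similar:
    print(f"({a})-[:{rel} {{score: {score}}}]->({b})")
```

Clients 1 and 2 share two of the four distinct accounts they transact with, giving a score of 0.5; client 3 shares none, so no relationship is produced for it.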