This document provides an agenda and overview for a presentation on using graph algorithms in banking. The presentation introduces graphs and the Neo4j graph database, demonstrates sample banking data modeled as a graph, and reviews several graph algorithms that could be used for applications like fraud detection, including PageRank, weakly connected components, node similarity, and Louvain modularity. The document concludes with a demo and Q&A section.
How Graph Algorithms Answer your Business Questions in Banking and Beyond
1. Graph Algorithms in Banking
Joe Depeau
Sr. Presales Consultant, UK
15th April, 2020
@joedepeau
http://linkedin.com/in/joedepeau
2. Agenda
• Introduction to Graphs and Neo4j
• Introduction to the Neo4j Graph Data Science Library
• Demo Data Overview
• Review of Graph Algorithms for Demo
• Demo
• Q&A
7. Anatomy of a Property Graph Database
Nodes
• Represent the objects in the graph
• Can be labeled
Relationships
• Relate nodes by type and direction
Properties
• Name-value pairs that can go on nodes and relationships
[Figure: example property graph]
• Node labels: Person, Person, Car
• Relationship types: LOVES, LOVES, LIVES WITH, OWNS, DRIVES
• Node properties: name: “Dan”, born: May 29, 1970, twitter: “@dan”; name: “Ann”, born: Dec 5, 1975; brand: “Volvo”, model: “V70”
• Relationship property: since: Jan 10, 2011
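As a rough illustration of the anatomy above, the property graph can be sketched with two small Python classes. This is not how Neo4j stores data; the class names `Node` and `Relationship`, the helper `types_from`, and the choice of which person each relationship starts from are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node: one label plus name-value properties."""
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed relationship that can also carry properties."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


dan = Node("Person", {"name": "Dan", "born": "May 29, 1970", "twitter": "@dan"})
ann = Node("Person", {"name": "Ann", "born": "Dec 5, 1975"})
car = Node("Car", {"brand": "Volvo", "model": "V70"})

# Assumption: the direction and owner of each relationship is chosen
# for illustration only.
relationships = [
    Relationship(dan, "LOVES", ann),
    Relationship(ann, "LOVES", dan),
    Relationship(dan, "LIVES_WITH", ann),
    Relationship(dan, "OWNS", car),
    Relationship(ann, "DRIVES", car, {"since": "Jan 10, 2011"}),
]


def types_from(node, rels):
    """Return the types of all relationships that start at `node`."""
    return [r.type for r in rels if r.start is node]


ann_types = types_from(ann, relationships)
```

Here `ann_types` holds the relationship types leaving the Ann node, which is the kind of local lookup a graph database answers by following relationships rather than joining tables.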
9. What the heck are graph algorithms?
Graph algorithms are calculations that describe the topology and connectivity of your graph:
- Global traversals & computations
- Learning overall structure
- Typically heuristics and approximations
- Extracting new data from what you already have
What’s important? What’s similar? What are efficient traversals?
10. ...and what do I do with them?
Explore, plan, measure
Find significant patterns and plan for optimal structures.
Score outcomes and set a threshold value for a prediction.
Machine learning
Use the measures as features to train an ML model:
1st node  2nd node  Common neighbors  Preferential attachment  Label
1         2         4                 15                       1
3         4         7                 12                       1
5         6         1                 1                        0
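The two measures in the feature table can be computed directly from neighbour sets. The following is a minimal sketch, assuming an undirected graph; the node names, the `edges` list and the helper `feature_row` are hypothetical and are not the data behind the slide.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def common_neighbors(adj, a, b):
    """Number of nodes adjacent to both a and b."""
    return len(adj[a] & adj[b])


def preferential_attachment(adj, a, b):
    """Product of the two nodes' degrees."""
    return len(adj[a]) * len(adj[b])


def feature_row(adj, a, b, label):
    """One training row: node pair, the two measures, and a known label."""
    return (a, b, common_neighbors(adj, a, b),
            preferential_attachment(adj, a, b), label)


# Hypothetical toy graph: A and B are not linked but share C and D.
edges = [("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("B", "E")]
adj = build_adjacency(edges)
row = feature_row(adj, "A", "B", 1)
```

Each row produced this way has the same shape as the table: a node pair, its graph-derived measures, and a label an ML model can be trained against.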
11. Tell me more!
• Pathfinding & Search: Finds optimal paths or evaluates route availability and quality.
• Centrality / Importance: Determines the importance of distinct nodes in the network.
• Community Detection: Detects group clustering or partition options.
• Similarity: Evaluates how alike nodes are by neighbors and relationships.
• Link Prediction: Estimates the likelihood of nodes forming a future relationship.
12. Graph and ML algorithms in Neo4j
Pathfinding & Search
• Minimum Weight Spanning Tree
• Shortest Path
• Single Source Shortest Path
• All Pairs Shortest Path
• A*
• Yen’s K-shortest Paths
• Random Walk
• Breadth First Search
• Depth First Search
Centrality / Importance
• Degree Centrality
• Closeness Centrality
• Betweenness Centrality
• PageRank
• ArticleRank
• Eigenvector Centrality
Community Detection
• Triangle Count / Clustering Coefficient
• Weakly Connected Components
• Strongly Connected Components
• Label Propagation
• Louvain Modularity
• K-1 Colouring
• Modularity Optimisation
Similarity
• Node Similarity
• Approximate Nearest Neighbours
• Cosine Similarity
• Euclidean Similarity
• Jaccard Similarity
• Overlap Similarity
• Pearson Similarity
Link Prediction
• Adamic Adar
• Common Neighbours
• Preferential Attachment
• Resource Allocation
• Same Community
• Total Neighbours
https://neo4j.com/docs/graph-data-science/1.0/
14. Some Examples of Typical Bank Data
Data categories: Event Data, Product and Services Data, Customer Data, Organisational Data, 3rd Party Data
Example data types:
• Documentation
• Employee Data
• Processes
• Systems and Databases
• KPIs and Reports
• Address
• Personal Data
• Documents
• Relationships
• Assets
• Documentation
• Processes
• Product / Service Details
• Product / Service Hierarchy
• Pricing
• Money Movements
• Web / App Activity
• Customer Contact
• Social Media
• Credit Rating Agencies
• Market Data
• Organisational Hierarchy
• Corporate Data
19. PageRank
What: Finds important nodes based on their relationships.
Why: Identify important or influential Client nodes by quantifying the flows of money towards them.
Uses:
- Fraud detection
- Anti-money laundering
- Inform prioritization during analysis and investigation
21. The PageRank Algorithm
PageRank: what nodes can be considered ‘important’ in our graph based on money flows?
[Figure: inputs, with the result written as a .pagerank property output]
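To make the idea of ranking nodes by money flows concrete, here is a minimal power-iteration sketch of PageRank in plain Python. It is an illustration only, not the Neo4j Graph Data Science implementation, whose normalisation and defaults may differ; the `payments` edge list and client names are hypothetical, and each edge means "source pays destination".

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Return a dict node -> score using simple power iteration."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    count = len(nodes)
    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Score held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                # Each node shares its score equally across its out-links.
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical money flows: three clients pay C, and C pays A.
payments = [("A", "C"), ("B", "C"), ("D", "C"), ("C", "A")]
scores = pagerank(payments)
top_client = max(scores, key=scores.get)
```

In this toy graph the client receiving the most inbound flow ends up with the highest score, which mirrors how a `.pagerank` property could be used to prioritise Client nodes for investigation.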
22. Weakly Connected Components
What: Finds disconnected community subgraphs in our data.
Why: Identify communities based on connections with shared pieces of identity.
Uses:
- Householding
- Synthetic identities
- Stolen identities
24. The Weakly Connected Components Algorithm
Weakly Connected Components: what communities exist in the data based on connections to pieces of identity?
[Figure: inputs, with the result written as a .component_id property output]
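One common way to compute weakly connected components is a union-find structure, sketched below. This is a hedged illustration rather than the library's implementation; the client and identity node names are hypothetical, and using the smallest member name as the component id is a choice made here for determinism.

```python
def weakly_connected_components(nodes, edges):
    """Return a dict node -> component id, ignoring edge direction."""
    parent = {n: n for n in nodes}

    def find(n):
        # Walk up to the root, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller name as root so ids are deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return {n: find(n) for n in nodes}


# Hypothetical clients linked to the pieces of identity they use.
edges = [
    ("c1", "phone_1"),
    ("c2", "phone_1"),
    ("c2", "email_1"),
    ("c3", "email_1"),
    ("c4", "address_9"),
]
nodes = {n for edge in edges for n in edge}
component_id = weakly_connected_components(nodes, edges)
```

Clients c1, c2 and c3 end up with the same component id because they are chained together through a shared phone and a shared email, while c4 sits in a separate component.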
25. Node Similarity
What: Measures similarity between nodes based on their neighbours. Writes a new relationship to the graph.
Why: Identify similar nodes that share common pieces of identity.
Uses:
- Entity Resolution
- Synthetic identities
- Stolen identities
27. The Node Similarity Algorithm
Node Similarity: how similar are two Client nodes based on pieces of shared identity?
[Figure: inputs, with the result written as a SIMILAR relationship output with a .score property]
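A neighbour-based similarity score of this kind can be sketched with the Jaccard measure: the number of shared neighbours divided by the number of distinct neighbours across both nodes. The snippet below is a minimal illustration, not the library's code; the `identity` map, the `cutoff` parameter and the tuple layout are assumptions for the example.

```python
from itertools import combinations


def node_similarity(neighbors, cutoff=0.0):
    """Return (node, "SIMILAR", node, score) tuples with score > cutoff."""
    results = []
    for a, b in combinations(sorted(neighbors), 2):
        shared = neighbors[a] & neighbors[b]
        union = neighbors[a] | neighbors[b]
        if not union:
            continue  # neither node has neighbours, nothing to compare
        score = len(shared) / len(union)  # Jaccard similarity
        if score > cutoff:
            results.append((a, "SIMILAR", b, score))
    return results


# Hypothetical clients and the pieces of identity each one is linked to.
identity = {
    "c1": {"phone_1", "email_1"},
    "c2": {"phone_1", "email_1", "address_9"},
    "c3": {"address_7"},
}
similar = node_similarity(identity)
```

Only the c1 and c2 pair shares any identity here, so it is the only pair that would receive a SIMILAR relationship; raising `cutoff` is one way to keep weak matches out of the graph.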
28. Louvain Modularity
What: Finds communities of connected nodes in our graph. Can return intermediate results.
Why: Useful for identifying communities based on transaction behaviour rather than identity.
Uses:
- Fraud ring detection
- Anti-money laundering
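Louvain works by repeatedly moving nodes between communities to increase a score called modularity. The sketch below computes only that score for a given assignment, not the Louvain search itself, and assumes an undirected weighted graph with no self-loops; the account names and transaction edges are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity of a partition of an undirected weighted graph.

    edges: list of (node, node, weight); community: dict node -> id.
    """
    total_weight = sum(w for _, _, w in edges)
    internal = defaultdict(float)  # edge weight inside each community
    degree = defaultdict(float)    # summed node degree per community
    for a, b, w in edges:
        degree[community[a]] += w
        degree[community[b]] += w
        if community[a] == community[b]:
            internal[community[a]] += w
    return sum(
        internal[c] / total_weight - (degree[c] / (2 * total_weight)) ** 2
        for c in degree
    )


# Hypothetical transactions: two tight triangles joined by one edge.
edges = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
    ("c", "d", 1.0),
]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {n: 0 for n in split}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

The two-community assignment scores higher than putting every account in one community, which is the kind of improvement a modularity-based method looks for when grouping accounts by transaction behaviour.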