Graph Data Science DEMO for fraud analysis

Joe Depeau
Sr. Presales Consultant, UK
28th April, 2020
@joedepeau
http://linkedin.com/in/joedepeau

Agenda
• Review of the Neo4j Graph Data Science Library
• Demo Data Overview
• Review of Graph Algorithms for Demo
• Demo
• Q&A

Neo4j Graph Data Science Library

Neo4j Graph Algorithms Overview
• Pathfinding & Search: Finds optimal paths or evaluates route availability and quality.
• Centrality / Importance: Determines the importance of distinct nodes in the network.
• Community Detection: Detects group clustering or partition options.
• Similarity: Evaluates how alike nodes are by neighbors and relationships.
• Link Prediction: Estimates the likelihood of nodes forming a future relationship.

Graph and ML algorithms in Neo4j
https://neo4j.com/docs/graph-data-science/1.0/
Pathfinding & Search
• Minimum Weight Spanning Tree
• Shortest Path
• Single Source Shortest Path
• All Pairs Shortest Path
• A*
• Yen’s K-shortest Paths
• Random Walk
• Breadth First Search
• Depth First Search
Centrality / Importance
• Degree Centrality
• Closeness Centrality
• Betweenness Centrality
• PageRank
• ArticleRank
• Eigenvector Centrality
Community Detection
• Triangle Count / Clustering Coefficient
• Weakly Connected Components
• Strongly Connected Components
• Label Propagation
• Louvain Modularity
• K-1 Colouring
• Modularity Optimisation
Similarity
• Node Similarity
• Approximate Nearest Neighbours
• Cosine Similarity
• Euclidean Similarity
• Jaccard Similarity
• Overlap Similarity
• Pearson Similarity
Link Prediction
• Adamic Adar
• Common Neighbours
• Preferential Attachment
• Resource Allocations
• Same Community
• Total Neighbours

Some Examples of Typical Bank Data
Categories: Event Data, Product and Services Data, Customer Data, Organisational Data, 3rd Party Data
Examples shown on the slide:
• Documentation, Employee Data, Processes, Systems and Databases, KPIs and Reports
• Address, Personal Data, Documents, Relationships, Assets
• Documentation, Processes, Product / Service Details, Product / Service Hierarchy, Pricing
• Money Movements, Web / App Activity, Customer Contact
• Social Media, Credit Rating Agencies, Market Data
• Organisational Hierarchy, Corporate Data

Our Graph Model
Total Nodes: 3.4m
Total Relationships: 10.2m

Three ways a Client node can be Flagged
• Performed a transaction flagged as fraud
• Shares an SSN with another Client
• Has more than one SSN on file
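
The three flagging rules above can be sketched in plain Python. The data layout here is an assumption made for illustration: `client_ssns` (client id to the set of SSNs on file) and `fraud_clients` (ids of clients with a fraud-flagged transaction) are hypothetical names, not the demo's actual graph schema.

```python
from collections import defaultdict

def flag_clients(client_ssns, fraud_clients):
    """Return {client_id: set of reasons} for every flagged client.

    client_ssns   -- hypothetical mapping: client id -> set of SSNs on file
    fraud_clients -- hypothetical set of client ids that performed a
                     transaction flagged as fraud
    """
    # Invert the mapping so we can see which clients share each SSN.
    ssn_owners = defaultdict(set)
    for client, ssns in client_ssns.items():
        for ssn in ssns:
            ssn_owners[ssn].add(client)

    reasons = defaultdict(set)
    # Rule 1: performed a transaction flagged as fraud.
    for client in fraud_clients:
        reasons[client].add("fraud_transaction")
    # Rule 2: shares an SSN with another client.
    for owners in ssn_owners.values():
        if len(owners) > 1:
            for client in owners:
                reasons[client].add("shared_ssn")
    # Rule 3: has more than one SSN on file.
    for client, ssns in client_ssns.items():
        if len(ssns) > 1:
            reasons[client].add("multiple_ssns")
    return dict(reasons)

# Made-up example data.
clients = {
    "c1": {"111"},
    "c2": {"111", "222"},
    "c3": {"333"},
    "c4": {"444"},
}
flags = flag_clients(clients, fraud_clients={"c3"})
```

In this made-up data, `c1` and `c2` share an SSN, `c2` also has two SSNs on file, `c3` is flagged through a transaction, and `c4` is not flagged at all.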

Graph Algorithms for Demonstration

PageRank
What: Finds important nodes based on their relationships.
Why: Identify important or influential Client nodes by quantifying the flows of money towards them.
Uses:
- Fraud detection
- Anti-money Laundering
- Inform prioritization during analysis and investigation
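
To make the idea concrete, here is a minimal sketch of textbook PageRank by power iteration, written in plain Python. It is not the Neo4j library's implementation, and its scores are normalised to sum to 1, which may differ from what the library reports. The `payments` edge list is made-up data, and edges are unweighted (payment amounts are ignored).

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over directed (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        # Every node passes an equal share of its rank along each out-edge.
        for v in nodes:
            for target in out_links[v]:
                new_rank[target] += damping * rank[v] / len(out_links[v])
        rank = new_rank
    return rank

# Hypothetical money flows: three clients all send money to "d".
payments = [("a", "d"), ("b", "d"), ("c", "d"), ("d", "a")]
scores = pagerank(payments)
top = max(scores, key=scores.get)
```

Node `d` receives money from three senders and ends up with the highest score. Node `a` comes second because `d` passes its rank on to it.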

The PageRank Algorithm
PageRank: which nodes can be considered ‘important’ in our graph, based on money flows?

[Figure: algorithm inputs and output. Property Output: .pagerank]

Betweenness Centrality
What: For each node, the sum over all pairs of other nodes of the fraction of shortest paths between the pair that pass through the node.
Why: Identify well connected nodes within the graph, or bridges between communities.
Uses:
- Fraud detection
- Anti-money Laundering
- Inform prioritization during analysis and investigation
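
As an illustration of the definition, the sketch below computes betweenness with Brandes' algorithm for a small unweighted, undirected graph. It is a plain-Python sketch under those assumptions, not the library's implementation, and the example graph is made up.

```python
from collections import deque

def betweenness(adjacency):
    """Brandes' algorithm for unweighted, undirected graphs.

    adjacency -- dict mapping each node to a list of its neighbours
    """
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        predecessors = {v: [] for v in adjacency}
        path_count = {v: 0 for v in adjacency}
        distance = {v: -1 for v in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, accumulating path fractions.
        dependency = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                share = path_count[v] / path_count[w]
                dependency[v] += share * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}

# Made-up graph: node "x" is the only bridge between {a, b} and {c, d}.
graph = {
    "a": ["b", "x"],
    "b": ["a", "x"],
    "x": ["a", "b", "c", "d"],
    "c": ["x", "d"],
    "d": ["x", "c"],
}
centrality = betweenness(graph)
```

Every shortest path between the two groups runs through `x`, so it scores 4.0 (one for each of the four cross-group pairs) and all other nodes score 0.0.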

The Betweenness Centrality Algorithm
Betweenness Centrality: which nodes can be considered ‘important’ in our graph, based on how many ‘shortest paths’ they are present in?

[Figure: algorithm inputs and output. Property Output: .centrality]

Weakly Connected Components
What: Finds disconnected community subgraphs in our data.
Why: Identify communities based on connections with shared pieces of identity.
Uses:
- Householding
- Synthetic identities
- Stolen identities
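
One common way to compute weakly connected components is a union-find (disjoint set) structure, sketched below in plain Python. This is an illustrative sketch, not necessarily how the library does it. The node names (clients and pieces of identity) are made up.

```python
def weakly_connected_components(nodes, edges):
    """Assign a component id to every node, ignoring edge direction."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller id as the root so labels are deterministic.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a
    return {n: find(n) for n in nodes}

# Hypothetical clients linked to the pieces of identity they use.
nodes = ["client1", "client2", "client3", "ssn:111", "phone:555", "ssn:999"]
edges = [
    ("client1", "ssn:111"),
    ("client2", "ssn:111"),
    ("client2", "phone:555"),
    ("client3", "ssn:999"),
]
component_id = weakly_connected_components(nodes, edges)
```

Here `client1` and `client2` land in the same component because they share an SSN, while `client3` sits in a component of its own.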

The Weakly Connected Components Algorithm
Weakly Connected Components: what communities exist in the data, based on connections to pieces of identity?

[Figure: algorithm inputs and output. Property Output: .component_id]

Louvain Modularity
What: Finds communities of closely connected nodes in our graph. Can return intermediate results.
Why: Useful for identifying communities based on transaction behaviour rather than identity.
Uses:
- Fraud ring detection
- Anti-money Laundering
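
Louvain works by moving nodes between communities to increase a score called modularity. The sketch below computes only that score for a given partition of an undirected, unweighted graph. It is not the full Louvain algorithm, and the example graph is made up.

```python
from collections import defaultdict

def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    edges     -- list of (u, v) pairs, each undirected edge listed once
    community -- dict mapping each node to a community label
    """
    m = len(edges)
    internal_edges = defaultdict(int)  # edges with both ends in the community
    total_degree = defaultdict(int)    # sum of node degrees in the community
    for u, v in edges:
        total_degree[community[u]] += 1
        total_degree[community[v]] += 1
        if community[u] == community[v]:
            internal_edges[community[u]] += 1
    return sum(
        internal_edges[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )

# Made-up graph: two triangles joined by the single edge c-d.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}
q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

Splitting the graph into its two triangles scores 5/14 (about 0.357), while putting every node in one community scores 0. A modularity-maximising method such as Louvain would therefore prefer the split.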

The Louvain Algorithm
Louvain: which communities of nodes transact amongst themselves?

[Figure: algorithm inputs and output. Property Output: .louvain_community]

Node Similarity
What: Similarity between nodes based on neighbours. Writes a new relationship to the graph.
Why: Identify similar nodes who share common pieces of identity.
Uses:
- Entity Resolution
- Synthetic identities
- Stolen identities
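
Neighbour-based similarity is commonly scored with the Jaccard metric: the number of shared neighbours divided by the number of distinct neighbours across both nodes. The plain-Python sketch below assumes that metric. The `cutoff` parameter and the example data are hypothetical, and the function returns scores rather than writing relationships to a graph.

```python
from itertools import combinations

def node_similarity(neighbours, cutoff=0.0):
    """Jaccard similarity for every node pair scoring above `cutoff`.

    neighbours -- dict mapping each node to the set of nodes it connects to
    """
    scores = {}
    for a, b in combinations(sorted(neighbours), 2):
        union = neighbours[a] | neighbours[b]
        if not union:
            continue  # neither node has neighbours, nothing to compare
        score = len(neighbours[a] & neighbours[b]) / len(union)
        if score > cutoff:
            scores[(a, b)] = score
    return scores

# Hypothetical clients and the pieces of identity they are linked to.
identity = {
    "client1": {"ssn:111", "phone:555", "addr:1"},
    "client2": {"ssn:111", "phone:555", "addr:2"},
    "client3": {"ssn:999"},
}
similar = node_similarity(identity)
```

`client1` and `client2` share two of four distinct identity pieces, giving a score of 0.5. Pairs with nothing in common score 0 and are left out, much as no relationship would be written for them.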

The Node Similarity Algorithm
Node Similarity: how similar are two Client nodes, based on the non-Merchant accounts they transact with?

[Figure: algorithm inputs and output. Relationship Output: SIMILAR, with a .score property]
