Graph Analytics for
Fraud Detection
Using PaySim and the Neo4j Graph Data Science Library
Dave Voutila <dave.voutila@neo4j.com>
Sales Engineer
● Sales Engineer with Neo4j! 👋
● Based in Vermont, USA 🌲🍁⛰
○ Work primarily with our Canadian clients
● You can find me on…
○ ...the web: https://sisu.io
○ ...LinkedIn: https://www.linkedin.com/in/davevoutila/
○ ...GitHub: https://github.com/voutilad
○ In the hive of scum and villainy aka Twitter: @voutilad
Who am I?
● Generating realistic, synthetic financial
transactions with PaySim
● Quick rundown of the Neo4j
Graph Data Science Library
● Live Demo of using graph algorithms to
analyze PaySim for fraudulent and risky
behavior
A Tale in 3 Acts
Simulating Fraud
Meet PaySim 👋
● Simulates actors in a mobile
money network
○ Clients
○ Merchants
○ Banks
● Generate synthetic data that
is realistic in the aggregate
● Open source, customizable
● DETERMINISTIC!
Simulations are pretty cool
● Parameterized simulation of Client transactions
● Some fraud simulation, specifically money mules
PaySim v1 & v2-snapshot
● 1st Party / Synthetic Fraud
○ Reuse of identifiers (ssn, email, phone)
○ Fabrication of identifiers
● 3rd Party Fraud
○ Attacks via Merchant vectors
○ Persistence and retargeting of victims
● You can more easily build your own fraudsters
● And a bunch more knobs and dials
PaySim v2.3 (my fork)
An Aside: 1st Party & Synthetic Fraud
See: https://sisu.io/posts/paysim-part3/
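Reused identifiers link otherwise unrelated clients, so grouping clients that share any identifier is one way to surface candidate first-party fraud rings. The following is a minimal Python sketch of that grouping using union-find. The client ids, identifier values, and the function name `shared_identifier_groups` are hypothetical and are not taken from PaySim or the demo.

```python
from collections import defaultdict


def shared_identifier_groups(clients):
    """Return sets of client ids that are linked by at least one shared identifier."""
    parent = {cid: cid for cid in clients}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first client seen with each identifier and merge later ones into it.
    first_seen = {}
    for cid, identifiers in clients.items():
        for ident in identifiers:
            if ident in first_seen:
                union(first_seen[ident], cid)
            else:
                first_seen[ident] = cid

    groups = defaultdict(set)
    for cid in clients:
        groups[find(cid)].add(cid)
    # Clients that share nothing with anyone are not interesting here.
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical sample data: each client has a set of (kind, value) identifiers.
clients = {
    "c1": {("ssn", "111"), ("email", "a@example.com")},
    "c2": {("ssn", "111"), ("phone", "555-0100")},
    "c3": {("phone", "555-0100"), ("email", "b@example.com")},
    "c4": {("ssn", "222"), ("email", "d@example.com")},
}
suspicious = shared_identifier_groups(clients)
```

Here c1 and c3 share nothing directly, but both are linked through c2, so all three land in one group while c4 is left out.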
PaySim v2.3 -- Data Model
PaySim v2.3 -- Example Data
The Neo4j
Graph Data Science
Library
Graph Data Science is a
science-driven approach to gain
knowledge from the relationships
and structures in data, typically to
power predictions.
What is Graph Data Science?
Data scientists use
relationships to answer
questions.
Query (e.g. Cypher): Local Patterns
Real-time, local decisioning
and pattern matching
You know what you’re
looking for and are making a
decision
Graph Algorithms: Global Computation
Global analysis
and iterations
You’re learning the overall structure
of a network, updating data, and
predicting
Graph Algorithms & Functions in Neo4j
Pathfinding & Search
• Shortest Path
• Single-Source Shortest Path
• All Pairs Shortest Path
• A* Shortest Path
• Yen’s K Shortest Path
• Minimum Weight Spanning Tree
• K-Spanning Tree (MST)
• Random Walk
Centrality / Importance
• Degree Centrality
• Closeness Centrality
• CC Variations: Harmonic, Dangalchev, Wasserman & Faust
• Betweenness Centrality & Approximate
• PageRank
• Personalized PageRank
• ArticleRank
• Eigenvector Centrality
Community Detection
• Triangle Count
• Clustering Coefficients
• Connected Components (Union Find)
• Strongly Connected Components
• Label Propagation
• Louvain Modularity
• Balanced Triad (identification)
• K-1 Coloring
• Modularity Optimization
Similarity
• Euclidean Distance
• Cosine Similarity
• Node Similarity (Jaccard)
• Overlap Similarity
• Pearson Similarity
• Approximate KNN
Link Prediction
• Adamic Adar
• Common Neighbors
• Preferential Attachment
• Resource Allocation
• Same Community
• Total Neighbors
...and also Auxiliary Functions:
• Random graph generation
• One hot encoding
• Distributions & metrics
It’s easier than it sounds (promise)
The GDS doesn’t operate using the Neo4j kernel API
Graph Algorithms for Detecting Fraud
Graph algorithms enable reasoning
about network structure
Louvain to identify communities
that frequently interact
PageRank to measure influence
and transaction volumes
Connected components to identify
disjoint groups sharing
identifiers
Jaccard to measure account
similarity
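As a rough illustration of the Jaccard measure named above, the sketch below compares two accounts by the sets of merchants they have transacted with. The account names and merchant ids are made up for the example, and this is the textbook set formula (size of the intersection over size of the union), not the GDS Node Similarity procedure itself.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty sets have nothing in common to measure.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical data: merchants each account has paid.
merchants_by_client = {
    "alice": {"m1", "m2", "m3"},
    "bob": {"m2", "m3", "m4"},
    "carol": {"m9"},
}
sim_alice_bob = jaccard(merchants_by_client["alice"], merchants_by_client["bob"])
sim_alice_carol = jaccard(merchants_by_client["alice"], merchants_by_client["carol"])
```

Alice and Bob share two of four distinct merchants, so they score 0.5. Alice and Carol share none and score 0.0.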
Let’s (un)Do Crimes!
● Each step has a probability of
committing fraud
● If they’re feeling malicious…
○ They have a probability of
re-victimizing someone
○ Or they’ll find a new victim via a high
risk Merchant that they target
(peeking into their history)
● They perform test charges (payments)
● Subsequently may transfer balance
Meet our 3rd Party Fraudster
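The behavior described on this slide can be sketched as a toy loop in Python. This is not PaySim's actual code: the function name, the argument names, and the way the two probabilities are combined are assumptions made for illustration, and the test charges and balance transfers are left out. A seeded `random.Random` keeps the run repeatable.

```python
import random


def simulate_fraudster(steps, fraud_probability, new_victim_probability,
                       merchant_customers, seed):
    """Toy third-party fraudster: returns a list of (step, kind, victim) events."""
    rng = random.Random(seed)  # fixed seed, so the same inputs give the same events
    victims = []               # clients this fraudster has already hit
    events = []
    for step in range(steps):
        if rng.random() >= fraud_probability:
            continue  # not feeling malicious on this step
        if victims and rng.random() >= new_victim_probability:
            # ASSUMPTION: otherwise the fraudster goes back to a known victim.
            victim = rng.choice(victims)
            kind = "repeat"
        else:
            # Find someone via the targeted merchant's customer history.
            victim = rng.choice(merchant_customers)
            kind = "new"
            if victim not in victims:
                victims.append(victim)
        events.append((step, kind, victim))
    return events


# Hypothetical customers of one high-risk merchant.
customers = ["c1", "c2", "c3", "c4"]
run_a = simulate_fraudster(720, 0.05, 0.025, customers, seed=42)
run_b = simulate_fraudster(720, 0.05, 0.025, customers, seed=42)
```

Because both runs use the same seed, `run_a` and `run_b` contain exactly the same events.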
● The Goal
○ Find unreported Fraud Victims
○ Find at-risk individuals
● The Approach
○ Build a training set of clients
○ Engineer some sort of risk score for merchants (our alleged
fraud vector)
○ Use Client transaction history with Merchants to categorize
them as likely fraud victims
Our mission
● PaySim generated
○ 1.6M Transactions
○ ~10k Clients
○ 500 Merchants
● The graph
○ 1.6M nodes (98% transactions)
○ 5M relationships (we’ll be making more)
Our Playground
● Known Fraud Victims
○ Folks that reported fraudulent charges
○ In the case of PaySim, we are all-seeing
● Known Non-Victims
○ What’s a term for non-victims anyway?!
○ These are accounts that have no fraud
Our Training Set
● We’ll primarily be relating Clients to Merchants
Bipartite Graphs
Clients
Merchants
Transactions
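One way to relate Clients to Merchants is to collapse the Transactions between them into a single weighted edge per Client-Merchant pair. The Python sketch below shows that aggregation on made-up rows. The tuple layout and the `count` and `total` edge properties are assumptions for illustration, not the demo's actual projection.

```python
from collections import defaultdict


def client_merchant_edges(transactions):
    """Collapse (client, merchant, amount) rows into one weighted edge per pair."""
    edges = defaultdict(lambda: {"count": 0, "total": 0.0})
    for client, merchant, amount in transactions:
        edge = edges[(client, merchant)]
        edge["count"] += 1       # how often the client paid this merchant
        edge["total"] += amount  # how much was paid in total
    return dict(edges)


# Hypothetical transactions.
transactions = [
    ("c1", "m1", 10.0),
    ("c1", "m1", 5.0),
    ("c2", "m1", 7.5),
    ("c2", "m2", 1.0),
]
edges = client_merchant_edges(transactions)
```

The result has clients on one side and merchants on the other, with no client-to-client or merchant-to-merchant edges, which is what makes the graph bipartite.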
Exponential de-what-now?
Plot-exponential-decay.png: PeterQ, derivative work: Autopilot / CC BY-SA
(https://creativecommons.org/licenses/by-sa/2.5)
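Exponential decay means a quantity shrinks by the same fraction over every equal interval: w(t) = e^(-λt), with a half-life of ln(2)/λ. The slide does not say how the demo applies it, so the Python sketch below assumes one plausible use, weighting a transaction by its age so that older activity counts for less. The function name and the 24-unit half-life are illustrative only.

```python
import math


def decay_weight(age, half_life):
    """Weight in (0, 1] that halves every `half_life` units of `age`."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    if age < 0:
        raise ValueError("age must not be negative")
    decay_rate = math.log(2) / half_life  # lambda in w(t) = exp(-lambda * t)
    return math.exp(-decay_rate * age)


# Hypothetical ages, measured in the same unit as the half-life.
weights = [decay_weight(age, half_life=24) for age in (0, 24, 48)]
```

With a half-life of 24, the three weights come out at 1, one half, and one quarter (up to floating-point rounding).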
PageRank
What: Finds important nodes
based on their relationships
Why: Recommendations,
identifying influencers
Features:
- Tolerance
- Damping
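To make the two settings concrete, here is a textbook power-iteration PageRank in Python. It is a sketch for intuition, not the GDS implementation: scores here are normalized to sum to 1, the default values are common conventions rather than values taken from the slide, and the edge list is made up. Damping is the probability of following a relationship rather than jumping to a random node, and tolerance is the change in scores below which iteration stops.

```python
def pagerank(edges, damping=0.85, tolerance=1e-7, max_iterations=100):
    """Power-iteration PageRank over directed (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iterations):
        # Rank sitting on nodes with no outgoing links is spread over everyone.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        # Stop once the total change between iterations is below the tolerance.
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tolerance:
            break
    return rank


# Hypothetical payments from clients to merchants.
edges = [("c1", "m1"), ("c2", "m1"), ("c3", "m1"), ("c3", "m2")]
scores = pagerank(edges)
top_node = max(scores, key=scores.get)
```

In this small example `m1` receives payments from all three clients and ends up with the highest score.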
Label Propagation
What: Finds communities
Why: Useful for
recommendations, fraud
detection, finding common
co-occurrences. Very fast.
Features:
- Seeding
- Directed relationships
- Weighted relationships
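The idea behind Label Propagation can be sketched in a few lines of Python: every node repeatedly adopts the label carrying the most relationship weight among its neighbors until nothing changes. This is a simplified illustration, not the GDS procedure. It treats relationships as undirected, visits nodes in sorted order, and breaks ties by the smallest label so that the result is repeatable. The `seed_labels` argument stands in for seeding by supplying starting labels. All node names and weights are made up.

```python
from collections import defaultdict


def label_propagation(weighted_edges, seed_labels=None, max_iterations=10):
    """Weighted label propagation over undirected (a, b, weight) edges."""
    neighbors = defaultdict(dict)
    for a, b, weight in weighted_edges:
        neighbors[a][b] = neighbors[a].get(b, 0.0) + weight
        neighbors[b][a] = neighbors[b].get(a, 0.0) + weight

    seed_labels = seed_labels or {}
    # Unseeded nodes start in a community of their own.
    labels = {node: seed_labels.get(node, node) for node in neighbors}

    for _ in range(max_iterations):
        changed = False
        for node in sorted(neighbors):
            votes = defaultdict(float)
            for other, weight in neighbors[node].items():
                votes[labels[other]] += weight
            # Heaviest label wins; ties go to the smallest label.
            best = min(votes, key=lambda label: (-votes[label], label))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


# Two hypothetical triangles joined by one weak relationship.
edges = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
    ("c", "d", 0.1),
]
labels = label_propagation(edges)
seeded = label_propagation(edges, seed_labels={"a": "x", "b": "x", "c": "x"})
```

The weak relationship between `c` and `d` is outvoted on both sides, so the two triangles settle into separate communities. In the seeded run the first triangle keeps its starting label `x`.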
Demo Time!
Going further
Some recommended next steps
● Those ~650 high-risk client accounts...
○ Can similarity routines reveal
anything?
○ What if we look at additional historical
transactions?
● That suspect merchant…
○ What can we glean from their activity?
● Operationalizing our findings...
○ How can we implement mutable
graph projections?
Possible next steps in our investigation
● Make your own Fraudsters
○ https://github.com/voutilad/paysim
○ https://www.sisu.io/posts/paysim
○ Requires Java JDK 8 or newer (tested with 11)
● Integrate PaySim with Neo4j
○ https://github.com/voutilad/paysim-demo
○ Works with both Neo4j 3.5 and 4.0
Your Turn: Getting & Using PaySim
seed=time
nbSteps=720
multiplier=1
nbClients=10000
nbFraudsters=500
nbMerchants=500
nbBanks=5
firstPartyFraudProbability=0.001
thirdPartyFraudProbability=0.05
thirdPartyNewVictimProbability=0.025
thirdPartyPercentHighRiskMerchants=0.005
The PaySim Parameters
SORRY...I used TIME as a seed!
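The parameters above are plain key=value pairs, so they are easy to inspect from a script. The Python sketch below parses them and shows why a time-based seed gives up repeatability. The helper names are hypothetical, and treating `seed=time` as "seed from the wall clock" is an assumption based on the remark on this slide, not on PaySim's source.

```python
import time

PAYSIM_PROPERTIES = """\
seed=time
nbSteps=720
multiplier=1
nbClients=10000
nbFraudsters=500
nbMerchants=500
nbBanks=5
firstPartyFraudProbability=0.001
thirdPartyFraudProbability=0.05
thirdPartyNewVictimProbability=0.025
thirdPartyPercentHighRiskMerchants=0.005
"""


def parse_properties(text):
    """Parse key=value lines into a dict of strings, skipping blanks and # comments."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        params[key.strip()] = value.strip()
    return params


def resolve_seed(raw_seed, clock=time.time):
    """Turn the seed setting into an integer."""
    # ASSUMPTION: "time" asks for a seed taken from the current clock, which
    # changes on every run; any other value is used as a fixed integer seed.
    if raw_seed == "time":
        return int(clock())
    return int(raw_seed)


params = parse_properties(PAYSIM_PROPERTIES)
fixed_seed = resolve_seed("42")
clock_seed = resolve_seed(params["seed"], clock=lambda: 1234.5)
```

Swapping `seed=time` for a fixed integer is what would let two runs with the same parameters start from the same seed.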

More Related Content

What's hot

Workshop Introduction to Neo4j
Workshop Introduction to Neo4jWorkshop Introduction to Neo4j
Workshop Introduction to Neo4jNeo4j
 
Neo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j Graph Use Cases, Bruno Ungermann, Neo4jNeo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j Graph Use Cases, Bruno Ungermann, Neo4jNeo4j
 
Smarter Fraud Detection With Graph Data Science
Smarter Fraud Detection With Graph Data ScienceSmarter Fraud Detection With Graph Data Science
Smarter Fraud Detection With Graph Data ScienceNeo4j
 
How to Build a Fraud Detection Solution with Neo4j
How to Build a Fraud Detection Solution with Neo4jHow to Build a Fraud Detection Solution with Neo4j
How to Build a Fraud Detection Solution with Neo4jNeo4j
 
Tracxn - Top Business Models -Consumer Tech - Apr 2022
Tracxn - Top Business Models -Consumer Tech - Apr 2022Tracxn - Top Business Models -Consumer Tech - Apr 2022
Tracxn - Top Business Models -Consumer Tech - Apr 2022Tracxn
 
Neo4j y GenAI
Neo4j y GenAI Neo4j y GenAI
Neo4j y GenAI Neo4j
 
EY + Neo4j: Why graph technology makes sense for fraud detection and customer...
EY + Neo4j: Why graph technology makes sense for fraud detection and customer...EY + Neo4j: Why graph technology makes sense for fraud detection and customer...
EY + Neo4j: Why graph technology makes sense for fraud detection and customer...Neo4j
 
Neo4j Popular use case
Neo4j Popular use case Neo4j Popular use case
Neo4j Popular use case Neo4j
 
Fraud Detection and Neo4j
Fraud Detection and Neo4j Fraud Detection and Neo4j
Fraud Detection and Neo4j Max De Marzi
 
Training Week: Introduction to Neo4j Bloom
Training Week: Introduction to Neo4j BloomTraining Week: Introduction to Neo4j Bloom
Training Week: Introduction to Neo4j BloomNeo4j
 
Modeling Manufacturing With Graph Databases: A Journey Towards a Digital Factory
Modeling Manufacturing With Graph Databases: A Journey Towards a Digital FactoryModeling Manufacturing With Graph Databases: A Journey Towards a Digital Factory
Modeling Manufacturing With Graph Databases: A Journey Towards a Digital FactoryNeo4j
 
Workshop - Neo4j Graph Data Science
Workshop - Neo4j Graph Data ScienceWorkshop - Neo4j Graph Data Science
Workshop - Neo4j Graph Data ScienceNeo4j
 
DataDay 2023 Presentation
DataDay 2023 PresentationDataDay 2023 Presentation
DataDay 2023 PresentationMax De Marzi
 
Leveraging Graphs for Artificial Intelligence and Machine Learning - Phani Da...
Leveraging Graphs for Artificial Intelligence and Machine Learning - Phani Da...Leveraging Graphs for Artificial Intelligence and Machine Learning - Phani Da...
Leveraging Graphs for Artificial Intelligence and Machine Learning - Phani Da...Neo4j
 
Optimizing Your Supply Chain with the Neo4j Graph
Optimizing Your Supply Chain with the Neo4j GraphOptimizing Your Supply Chain with the Neo4j Graph
Optimizing Your Supply Chain with the Neo4j GraphNeo4j
 
Introduction to Cypher
Introduction to Cypher Introduction to Cypher
Introduction to Cypher Neo4j
 
Why biased matrix factorization works well?
Why biased matrix factorization works well?Why biased matrix factorization works well?
Why biased matrix factorization works well?Joonyoung Yi
 
3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine Learning3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine LearningNeo4j
 
Easily Identify Sources of Supply Chain Gridlock
Easily Identify Sources of Supply Chain GridlockEasily Identify Sources of Supply Chain Gridlock
Easily Identify Sources of Supply Chain GridlockNeo4j
 

What's hot (20)

Workshop Introduction to Neo4j
Workshop Introduction to Neo4jWorkshop Introduction to Neo4j
Workshop Introduction to Neo4j
 
Neo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j Graph Use Cases, Bruno Ungermann, Neo4jNeo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j Graph Use Cases, Bruno Ungermann, Neo4j
 
Smarter Fraud Detection With Graph Data Science
Smarter Fraud Detection With Graph Data ScienceSmarter Fraud Detection With Graph Data Science
Smarter Fraud Detection With Graph Data Science
 
How to Build a Fraud Detection Solution with Neo4j
How to Build a Fraud Detection Solution with Neo4jHow to Build a Fraud Detection Solution with Neo4j
How to Build a Fraud Detection Solution with Neo4j
 
Tracxn - Top Business Models -Consumer Tech - Apr 2022
Tracxn - Top Business Models -Consumer Tech - Apr 2022Tracxn - Top Business Models -Consumer Tech - Apr 2022
Tracxn - Top Business Models -Consumer Tech - Apr 2022
 
Neo4j y GenAI
Neo4j y GenAI Neo4j y GenAI
Neo4j y GenAI
 
EY + Neo4j: Why graph technology makes sense for fraud detection and customer...
EY + Neo4j: Why graph technology makes sense for fraud detection and customer...EY + Neo4j: Why graph technology makes sense for fraud detection and customer...
EY + Neo4j: Why graph technology makes sense for fraud detection and customer...
 
Neo4j Popular use case
Neo4j Popular use case Neo4j Popular use case
Neo4j Popular use case
 
Fraud Detection and Neo4j
Fraud Detection and Neo4j Fraud Detection and Neo4j
Fraud Detection and Neo4j
 
Training Week: Introduction to Neo4j Bloom
Training Week: Introduction to Neo4j BloomTraining Week: Introduction to Neo4j Bloom
Training Week: Introduction to Neo4j Bloom
 
Modeling Manufacturing With Graph Databases: A Journey Towards a Digital Factory
Modeling Manufacturing With Graph Databases: A Journey Towards a Digital FactoryModeling Manufacturing With Graph Databases: A Journey Towards a Digital Factory
Modeling Manufacturing With Graph Databases: A Journey Towards a Digital Factory
 
Workshop - Neo4j Graph Data Science
Workshop - Neo4j Graph Data ScienceWorkshop - Neo4j Graph Data Science
Workshop - Neo4j Graph Data Science
 
User behavior analytics
User behavior analyticsUser behavior analytics
User behavior analytics
 
DataDay 2023 Presentation
DataDay 2023 PresentationDataDay 2023 Presentation
DataDay 2023 Presentation
 
Leveraging Graphs for Artificial Intelligence and Machine Learning - Phani Da...
Leveraging Graphs for Artificial Intelligence and Machine Learning - Phani Da...Leveraging Graphs for Artificial Intelligence and Machine Learning - Phani Da...
Leveraging Graphs for Artificial Intelligence and Machine Learning - Phani Da...
 
Optimizing Your Supply Chain with the Neo4j Graph
Optimizing Your Supply Chain with the Neo4j GraphOptimizing Your Supply Chain with the Neo4j Graph
Optimizing Your Supply Chain with the Neo4j Graph
 
Introduction to Cypher
Introduction to Cypher Introduction to Cypher
Introduction to Cypher
 
Why biased matrix factorization works well?
Why biased matrix factorization works well?Why biased matrix factorization works well?
Why biased matrix factorization works well?
 
3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine Learning3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine Learning
 
Easily Identify Sources of Supply Chain Gridlock
Easily Identify Sources of Supply Chain GridlockEasily Identify Sources of Supply Chain Gridlock
Easily Identify Sources of Supply Chain Gridlock
 

Similar to Leveraging Graph Analytics for Fraud Detection in PaySim Data

Next Generation Fraud Solutions using Neo4j
Next Generation Fraud Solutions using Neo4jNext Generation Fraud Solutions using Neo4j
Next Generation Fraud Solutions using Neo4jNeo4j
 
Outlier and fraud detection using Hadoop
Outlier and fraud detection using HadoopOutlier and fraud detection using Hadoop
Outlier and fraud detection using HadoopPranab Ghosh
 
Assessing a cloud based approach to cyber security
Assessing a cloud based approach to cyber securityAssessing a cloud based approach to cyber security
Assessing a cloud based approach to cyber securityAladdin Dandis
 
State of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAMState of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAMNeo4j
 
AI, ML and Graph Algorithms: Real Life Use Cases with Neo4j
AI, ML and Graph Algorithms: Real Life Use Cases with Neo4jAI, ML and Graph Algorithms: Real Life Use Cases with Neo4j
AI, ML and Graph Algorithms: Real Life Use Cases with Neo4jIvan Zoratti
 
Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j
Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j
Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j Neo4j
 
Machine Learning: What Assurance Professionals Need to Know
Machine Learning: What Assurance Professionals Need to Know Machine Learning: What Assurance Professionals Need to Know
Machine Learning: What Assurance Professionals Need to Know Andrew Clark
 
Oxford Lectures Part 1
Oxford Lectures Part 1Oxford Lectures Part 1
Oxford Lectures Part 1Andrea Pasqua
 
201411203 goto night on graphs for fraud detection
201411203 goto night on graphs for fraud detection201411203 goto night on graphs for fraud detection
201411203 goto night on graphs for fraud detectionRik Van Bruggen
 
Knowledge graphs + Chatbots with Neo4j
Knowledge graphs + Chatbots with Neo4jKnowledge graphs + Chatbots with Neo4j
Knowledge graphs + Chatbots with Neo4jChristophe Willemsen
 
All Your Cards Are Belong To Us: Understanding Online Carding Forums
All Your Cards Are Belong To Us: Understanding Online Carding ForumsAll Your Cards Are Belong To Us: Understanding Online Carding Forums
All Your Cards Are Belong To Us: Understanding Online Carding ForumsJeremiah Onaolapo
 
Analytics in Your Enterprise
Analytics in Your EnterpriseAnalytics in Your Enterprise
Analytics in Your EnterpriseWSO2
 
Neo4j GraphTalks - Introduction to GraphDatabases and Neo4j
Neo4j GraphTalks - Introduction to GraphDatabases and Neo4jNeo4j GraphTalks - Introduction to GraphDatabases and Neo4j
Neo4j GraphTalks - Introduction to GraphDatabases and Neo4jNeo4j
 
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...Alexey Zinoviev
 
THOTCON 0x6: Going Kinetic on Electronic Crime Networks
THOTCON 0x6: Going Kinetic on Electronic Crime NetworksTHOTCON 0x6: Going Kinetic on Electronic Crime Networks
THOTCON 0x6: Going Kinetic on Electronic Crime NetworksJohn Bambenek
 
Detecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and LinkuriousDetecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and LinkuriousNeo4j
 
Automatic Image Cropping - A journey from a Master Thesis to Production
Automatic Image Cropping - A journey from a Master Thesis to ProductionAutomatic Image Cropping - A journey from a Master Thesis to Production
Automatic Image Cropping - A journey from a Master Thesis to ProductionAlexey Grigorev
 

Similar to Leveraging Graph Analytics for Fraud Detection in PaySim Data (20)

Next Generation Fraud Solutions using Neo4j
Next Generation Fraud Solutions using Neo4jNext Generation Fraud Solutions using Neo4j
Next Generation Fraud Solutions using Neo4j
 
Outlier and fraud detection using Hadoop
Outlier and fraud detection using HadoopOutlier and fraud detection using Hadoop
Outlier and fraud detection using Hadoop
 
Assessing a cloud based approach to cyber security
Assessing a cloud based approach to cyber securityAssessing a cloud based approach to cyber security
Assessing a cloud based approach to cyber security
 
State of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAMState of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAM
 
AI, ML and Graph Algorithms: Real Life Use Cases with Neo4j
AI, ML and Graph Algorithms: Real Life Use Cases with Neo4jAI, ML and Graph Algorithms: Real Life Use Cases with Neo4j
AI, ML and Graph Algorithms: Real Life Use Cases with Neo4j
 
Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j
Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j
Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j
 
Test driven relevancy
Test driven relevancyTest driven relevancy
Test driven relevancy
 
Machine Learning: What Assurance Professionals Need to Know
Machine Learning: What Assurance Professionals Need to Know Machine Learning: What Assurance Professionals Need to Know
Machine Learning: What Assurance Professionals Need to Know
 
Oxford Lectures Part 1
Oxford Lectures Part 1Oxford Lectures Part 1
Oxford Lectures Part 1
 
201411203 goto night on graphs for fraud detection
201411203 goto night on graphs for fraud detection201411203 goto night on graphs for fraud detection
201411203 goto night on graphs for fraud detection
 
Big Data in FinTech
Big Data in FinTechBig Data in FinTech
Big Data in FinTech
 
Knowledge graphs + Chatbots with Neo4j
Knowledge graphs + Chatbots with Neo4jKnowledge graphs + Chatbots with Neo4j
Knowledge graphs + Chatbots with Neo4j
 
All Your Cards Are Belong To Us: Understanding Online Carding Forums
All Your Cards Are Belong To Us: Understanding Online Carding ForumsAll Your Cards Are Belong To Us: Understanding Online Carding Forums
All Your Cards Are Belong To Us: Understanding Online Carding Forums
 
Analytics in Your Enterprise
Analytics in Your EnterpriseAnalytics in Your Enterprise
Analytics in Your Enterprise
 
Neo4j GraphTalks - Introduction to GraphDatabases and Neo4j
Neo4j GraphTalks - Introduction to GraphDatabases and Neo4jNeo4j GraphTalks - Introduction to GraphDatabases and Neo4j
Neo4j GraphTalks - Introduction to GraphDatabases and Neo4j
 
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...
 
THOTCON 0x6: Going Kinetic on Electronic Crime Networks
THOTCON 0x6: Going Kinetic on Electronic Crime NetworksTHOTCON 0x6: Going Kinetic on Electronic Crime Networks
THOTCON 0x6: Going Kinetic on Electronic Crime Networks
 
Detecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and LinkuriousDetecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and Linkurious
 
Javantura v6 - Case Study: Marketplace App with Java and Hyperledger Fabric -...
Javantura v6 - Case Study: Marketplace App with Java and Hyperledger Fabric -...Javantura v6 - Case Study: Marketplace App with Java and Hyperledger Fabric -...
Javantura v6 - Case Study: Marketplace App with Java and Hyperledger Fabric -...
 
Automatic Image Cropping - A journey from a Master Thesis to Production
Automatic Image Cropping - A journey from a Master Thesis to ProductionAutomatic Image Cropping - A journey from a Master Thesis to Production
Automatic Image Cropping - A journey from a Master Thesis to Production
 

More from Neo4j

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansQIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansNeo4j
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...Neo4j
 
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosBBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosNeo4j
 
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Neo4j
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jNeo4j
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j
 
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfRabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Neo4j
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeNeo4j
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsNeo4j
 
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j
 
Neo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j
 

More from Neo4j (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansQIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
 
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosBBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
 
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Leveraging Graph Analytics for Fraud Detection in PaySim Data

  • 1. Graph Analytics for Fraud Detection: Using PaySim and the Neo4j Graph Data Science Library. Dave Voutila <dave.voutila@neo4j.com>, Sales Engineer
  • 2. ● Sales Engineer with Neo4j! 👋 ● Based in Vermont, USA 🌲🍁⛰ ○ Work primarily with our Canadian clients ● You can find me on… ○ ...the web: https://sisu.io ○ ...LinkedIn: https://www.linkedin.com/in/davevoutila/ ○ ...GitHub: https://github.com/voutilad ○ In the hive of scum and villainy aka Twitter: @voutilad Who am I?
  • 3. ● Generating realistic, synthetic financial transactions with PaySim ● Quick rundown of the Neo4j Graph Data Science Library ● Live Demo of using graph algorithms to analyze PaySim for fraudulent and risky behavior A Tale in 3 Acts
  • 5. Meet PaySim 👋 ● Simulates actors in a mobile money network ○ Clients ○ Merchants ○ Banks ● Generate synthetic data that is realistic in the aggregate ● Open source, customizable ● DETERMINISTIC!
  • 7. ● Parameterized simulation of Client transactions ● Some fraud simulation, specifically money mules PaySim v1 & v2-snapshot
  • 8. ● 1st Party / Synthetic Fraud ○ Reuse of identifiers (ssn, email, phone) ○ Fabrication of identifiers ● 3rd Party Fraud ○ Attacks via Merchant vectors ○ Persistence and retargeting of victims ● You can more easily build your own fraudsters ● And a bunch more knobs and dials PaySim v2.3 (my fork)
  • 9. An Aside: 1st Party & Synthetic Fraud See: https://sisu.io/posts/paysim-part3/
  • 10. PaySim v2.3 -- Data Model
  • 11. PaySim v2.3 -- Example Data
  • 12. The Neo4j Graph Data Science Library
  • 13. Graph Data Science is a science-driven approach to gain knowledge from the relationships and structures in data, typically to power predictions. What is Graph data science? Data scientists use relationships to answer questions.
  • 14. Query (e.g. Cypher) Real-time, local decisioning and pattern matching Graph Algorithms Global analysis and iterations You know what you’re looking for and making a decision You’re learning the overall structure of a network, updating data, and predicting Local Patterns Global Computation
  • 15. Graph Algorithms & Functions in Neo4j ● Pathfinding & Search: Shortest Path • Single-Source Shortest Path • All Pairs Shortest Path • A* Shortest Path • Yen’s K Shortest Path • Minimum Weight Spanning Tree • K-Spanning Tree (MST) • Random Walk ● Centrality / Importance: Degree Centrality • Closeness Centrality • CC Variations: Harmonic, Dangalchev, Wasserman & Faust • Betweenness Centrality & Approximate • PageRank • Personalized PageRank • ArticleRank • Eigenvector Centrality ● Community Detection: Triangle Count • Clustering Coefficients • Connected Components (Union Find) • Strongly Connected Components • Label Propagation • Louvain Modularity • Balanced Triad (identification) • K-1 Coloring • Modularity Optimization ● Similarity: Euclidean Distance • Cosine Similarity • Node Similarity (Jaccard) • Overlap Similarity • Pearson Similarity • Approximate KNN ● Link Prediction: Adamic Adar • Common Neighbors • Preferential Attachment • Resource Allocations • Same Community • Total Neighbors ● ...and also Auxiliary Functions: Random graph generation • One hot encoding • Distributions & metrics
  • 16. It’s easier than it sounds (promise) The GDS doesn’t operate using the Neo4j kernel API
  • 17. Graph Algorithms for Detecting Fraud. Graph algorithms enable reasoning about network structure: ● Louvain to identify communities that frequently interact ● PageRank to measure influence and transaction volumes ● Connected components to identify disjoint groups sharing identifiers ● Jaccard to measure account similarity
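Two of the ideas on this slide are small enough to sketch in plain Python. The sketch below is not the Neo4j GDS implementation; it is a minimal illustration, assuming each client is described by a set of identifier strings. The client names, identifier values and function names are all hypothetical. Union-find groups clients that share any identifier (the connected-components idea), and a Jaccard function scores how much two sets overlap.

```python
from collections import defaultdict


def shared_identifier_groups(client_ids):
    """Group clients that share any identifier, using union-find.

    client_ids maps a client name to a set of identifier strings
    (hypothetical data such as "ssn:1" or "phone:7").
    """
    parent = {client: client for client in client_ids}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_owner = {}
    for client, identifiers in client_ids.items():
        for identifier in identifiers:
            if identifier in first_owner:
                # Identifier already seen: merge the two clients' groups.
                parent[find(client)] = find(first_owner[identifier])
            else:
                first_owner[identifier] = client

    groups = defaultdict(set)
    for client in client_ids:
        groups[find(client)].add(client)
    return sorted((sorted(group) for group in groups.values()), key=lambda g: g[0])


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


clients = {
    "alice": {"ssn:1", "phone:7"},
    "bob": {"ssn:2", "phone:7"},
    "carol": {"ssn:2", "email:c"},
    "dave": {"ssn:4"},
}
groups = shared_identifier_groups(clients)
similarity = jaccard({"m1", "m2", "m3"}, {"m2", "m3", "m4"})
```

Here alice, bob and carol end up in one group because they are chained together by a shared phone and a shared SSN, while dave stays alone. The Jaccard call compares two hypothetical sets of merchants visited by two accounts.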
  • 19. ● Each step has a probability of committing fraud ● If they’re feeling malicious… ○ They have a probability of re-victimizing someone ○ Or they’ll find a new victim via a high risk Merchant that they target (peeking into their history) ● They perform test charges (payments) ● Subsequently may transfer balance Meet our 3rd Party Fraudster
  • 20. ● The Goal ○ Find unreported Fraud Victims ○ Find at-risk individuals ● The Approach ○ Build a training set of clients ○ Engineer some sort of risk score for merchants (our alleged fraud vector) ○ Use Client transaction history with Merchants to categorize them as likely fraud victims Our mission
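The slide does not say how the merchant risk score is engineered. As a purely illustrative assumption, one simple option is the share of a merchant's distinct clients who are known fraud victims, with each client then scored by the riskiest merchant they have used. The function names and the sample data below are hypothetical.

```python
from collections import defaultdict


def merchant_risk(transactions, known_victims):
    """Share of each merchant's distinct clients who are known fraud victims."""
    clients_by_merchant = defaultdict(set)
    for client, merchant in transactions:
        clients_by_merchant[merchant].add(client)
    return {
        merchant: len(clients & known_victims) / len(clients)
        for merchant, clients in clients_by_merchant.items()
    }


def client_exposure(transactions, risk):
    """Highest merchant risk score among the merchants each client has used."""
    exposure = defaultdict(float)
    for client, merchant in transactions:
        exposure[client] = max(exposure[client], risk[merchant])
    return dict(exposure)


# Hypothetical (client, merchant) payment pairs.
transactions = [
    ("c1", "m1"), ("c2", "m1"), ("c3", "m1"), ("c4", "m1"),
    ("c3", "m2"), ("c5", "m2"),
]
known_victims = {"c1", "c2"}
risk = merchant_risk(transactions, known_victims)
exposure = client_exposure(transactions, risk)
```

In this toy data, clients c3 and c4 have not reported fraud but share a merchant with two known victims, so they surface as candidates for a closer look, while c5 does not.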
  • 21. ● PaySim generated ○ 1.6M Transactions ○ ~10k Clients ○ 500 Merchants ● The graph ○ 1.6M nodes (98% transactions) ○ 5M relationships (we’ll be making more) Our Playground
  • 22. ● Known Fraud Victims ○ Folks that reported fraudulent charges ○ In the case of PaySim, we are all-seeing ● Known Non-Victims ○ What’s a term for non-victims anyway?! ○ These are accounts that have no fraud Our Training Set
  • 23. ● We’ll primarily be relating Clients to Merchants Bipartite Graphs Clients Merchants Transactions
  • 24. Exponential de-what-now? Image: Plot-exponential-decay.png by PeterQ, derivative work by Autopilot / CC BY-SA (https://creativecommons.org/licenses/by-sa/2.5)
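For readers who want the "what-now" spelled out: exponential decay shrinks a quantity by the same fraction over every equal interval. The slide shows only a plot, so how decay is applied in the demo is not stated here. The sketch below assumes, hypothetically, that it is used as a weight that halves every `half_life` units of age.

```python
import math


def decay_weight(age, half_life):
    """Exponential decay weight: 1.0 at age 0, 0.5 after one half-life."""
    rate = math.log(2) / half_life  # decay constant (lambda)
    return math.exp(-rate * age)


# Weights at ages 0, 10 and 20 with a half-life of 10 (arbitrary units).
weights = [decay_weight(age, half_life=10) for age in (0, 10, 20)]
```

Each additional half-life halves the weight again, so the three values are approximately 1, 0.5 and 0.25.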
  • 25. PageRank. What: Finds important nodes based on their relationships. Why: Recommendations, identifying influencers. Features: ● Tolerance ● Damping
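To make the two features concrete, here is a textbook power-iteration PageRank in plain Python. It is a sketch for intuition, not the GDS implementation, and its scores are normalized to sum to 1, which a library may not do. `damping` is the probability of following an outgoing relationship rather than jumping to a random node, and `tolerance` is the total change in scores below which iteration stops. The function name, defaults and edge list are assumptions for illustration.

```python
def pagerank(edges, damping=0.85, tolerance=1e-8, max_iterations=100):
    """Power-iteration PageRank over directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(max_iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        for source in nodes:
            targets = out_links[source]
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        change = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if change < tolerance:
            break
    return rank


# Hypothetical payments: three clients pay merchant "m", which pays "a" back.
edges = [("a", "m"), ("b", "m"), ("c", "m"), ("m", "a")]
ranks = pagerank(edges)
top = max(ranks, key=ranks.get)
```

The node that receives relationships from everyone else ("m") gets the highest score, and "a" ranks above "b" and "c" because it is pointed to by that important node.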
  • 26. Label Propagation. What: Finds communities. Why: Useful for recommendations, fraud detection, finding common co-occurrences. Very fast. Features: ● Seeding ● Directed relationships ● Weighted relationships
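The core loop of label propagation is short: every node starts with its own label and repeatedly adopts the label most common among its neighbors until nothing changes. The plain-Python sketch below is not the GDS implementation and ignores seeding, direction and weights. Ties are broken by taking the largest label purely so the run is reproducible; the original algorithm description breaks ties at random. The function name and toy graph are hypothetical.

```python
from collections import Counter


def label_propagation(neighbors, max_iterations=20):
    """Asynchronous label propagation on an undirected graph.

    neighbors maps each node to a list of adjacent nodes.
    """
    labels = {node: node for node in neighbors}
    for _ in range(max_iterations):
        changed = False
        for node in sorted(neighbors):
            counts = Counter(labels[other] for other in neighbors[node])
            if not counts:
                continue  # isolated node keeps its own label
            top = max(counts.values())
            # Deterministic tie-break: the largest of the most common labels.
            best = max(label for label, seen in counts.items() if seen == top)
            if labels[node] != best:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


# Two triangles (1-2-3 and 4-5-6) joined by a single edge between 3 and 4.
graph = {
    1: [2, 3], 2: [1, 3], 3: [1, 2, 4],
    4: [3, 5, 6], 5: [4, 6], 6: [4, 5],
}
labels = label_propagation(graph)
```

On this toy graph the two triangles settle into two separate communities after a couple of passes, which is also why the algorithm is fast: each pass only looks at direct neighbors.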
  • 29. ● Those ~650 high-risk client accounts... ○ Can similarity routines reveal anything? ○ What if we look at additional historical transactions? ● That suspect merchant… ○ What can we glean from their activity? ● Operationalizing our findings... ○ How can we implement mutable graph projections? Possible next steps in our investigation
  • 30. ● Make your own Fraudsters ○ https://github.com/voutilad/paysim ○ https://www.sisu.io/posts/paysim ○ Requires Java JDK 8 or newer (tested with 11) ● Integrate PaySim with Neo4j ○ https://github.com/voutilad/paysim-demo ○ Works with both Neo4j 3.5 and 4.0 Your Turn: Getting & Using PaySim