SlideShare a Scribd company logo
1 of 36
Download to read offline
building intelligent data products
what actually is fraud
architecting flexible data ‘plumbing’
building solid data products on top of them
stephen whitworth
2 years at Hailo as data scientist/jack of some trades out of
university
product and marketplace analytics, agent based
modelling, data engineering, ‘ML’ services
data science/engineering at ravelin, specifically
focused on our detection capabilities
what is ravelin?
online fraud detection and prevention platform
stream application/server data to our events API
we give fraud probability + beautiful data visualisation
backed by techstars/passion/playfair/amadeus/indeed.com
founder/wonga founder amongst other great investors
fraud?
$14B
a dollar for every year the universe has existed
Same day delivery On-demand services
‘victimless crime’
police ill-equipped to handle
low barrier to entry from dark net
3D secure - conversion killer
traditional: human generated rules, born of deep expertise
order-centric view of the world
hybrid: augment expertise by learning rules from data
cards don’t commit fraud, people do
building
good
plumbing
receive firehose through API
decode arbitrary data and store
extract hundreds of features
http/slack/whatever notification to customer
in 100-300ms (ish)
run through N models and rule engine to get probability
BUZZWORDS ABOUND
go
postgres
AWS
microservices
zookeeper
NSQ python
event-driven
elasticsearch bigquery dynamodb
redis
Building Intelligent Data Products
instrumentation
different
databases
for different
needs
kudos if you get The Office reference
postgres: solid, start here
dynamodb: very high throughput, low latency data
bigquery: to answer any question you could possibly have
elasticsearch: rich querying in a reasonable amount of time
graph db: haven’t decided, recommendations?
asynchronous systemsfirehoses
nice deployment patterns
‘lambda architecture’ - the append only log
services store their own interpretation of events
services are almost entirely decoupled
asynchronous systemsfirehoses
error propagation is challenging
no guarantees of SLA - at least as slow as your queue
hard to know who or what is consuming your data
building
data
products
‘a random forest is like a room full of
experts who have seen different
cases of fraud from different
perspectives’
‘a random forest is like a room full of
experts who have seen different
cases of fraud from different
perspectives’
N
precision: of all of my predictions, what % was I correct?
recall: out of all of the fraudsters, what % did I catch?
implicit tradeoff between conversion and fraud loss
‘accuracy’ a useless metric for fraud
99.8% ACCURATE
Building Intelligent Data Products
keep model interfaces simple
hide arbitrarily complex transformations behind it
blend global and client specific models
building and training statistical models
currently batch
will combine with online
RANDOM FORESTS
‘a random forest is like a room full of
experts who have seen different
cases of fraud from different
perspectives’
RANDOM FORESTS
MONITORING
probabilistic, not deterministic
dogfood - use live robot customers
run models in ‘dark mode’ to determine performance
why not deep learning? ..yet
ability to debug random forests
had nice results with keras
serialisation and deployment: an unsolved problem
in beta and signing up clients
looking for on-demand services/marketplaces
talk to me afterwards
obligatory: we are hiring!
senior machine learning engineers/data scientists
stephen.whitworth@ravelin.com or talk to me after
@sjwhitworth
www.ravelin.com - @ravelinhq

More Related Content

Similar to Building Intelligent Data Products

Experiment
ExperimentExperiment
Experimentjbashask
 
Nasscom how can you identify fraud in fintech lending using deep learning
Nasscom how can you identify fraud in fintech lending using deep learningNasscom how can you identify fraud in fintech lending using deep learning
Nasscom how can you identify fraud in fintech lending using deep learningRatnakar Pandey
 
LSI Spring Agent Open House 2014
LSI Spring Agent Open House 2014LSI Spring Agent Open House 2014
LSI Spring Agent Open House 2014Ashlie Steele
 
Operationalize deep learning models for fraud detection with Azure Machine Le...
Operationalize deep learning models for fraud detection with Azure Machine Le...Operationalize deep learning models for fraud detection with Azure Machine Le...
Operationalize deep learning models for fraud detection with Azure Machine Le...Francesca Lazzeri, PhD
 
Mark Villinski - Top 10 Tips for Educating Employees about Cybersecurity
Mark Villinski - Top 10 Tips for Educating Employees about CybersecurityMark Villinski - Top 10 Tips for Educating Employees about Cybersecurity
Mark Villinski - Top 10 Tips for Educating Employees about Cybersecuritycentralohioissa
 
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012Nick Galbreath
 
ThreatMetrix ARRC 2016 presentation by Ted Egan
ThreatMetrix ARRC 2016 presentation by Ted EganThreatMetrix ARRC 2016 presentation by Ted Egan
ThreatMetrix ARRC 2016 presentation by Ted EganKen Lam
 
Gartner: Top 10 Technology Trends 2015
Gartner: Top 10 Technology Trends 2015Gartner: Top 10 Technology Trends 2015
Gartner: Top 10 Technology Trends 2015Den Reymer
 
shyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptx
shyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptxshyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptx
shyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptxShyamaprasadMS
 
Ai and machine learning help detect, predict and prevent fraud - IBM Watson ...
Ai and machine learning help detect, predict and prevent fraud -  IBM Watson ...Ai and machine learning help detect, predict and prevent fraud -  IBM Watson ...
Ai and machine learning help detect, predict and prevent fraud - IBM Watson ...Institute of Contemporary Sciences
 
Credit Card Fraud Detection Using ML In Databricks
Credit Card Fraud Detection Using ML In DatabricksCredit Card Fraud Detection Using ML In Databricks
Credit Card Fraud Detection Using ML In DatabricksDatabricks
 
GraphTalks Italy - Using graphs to fight financial fraud
GraphTalks Italy - Using graphs to fight financial fraudGraphTalks Italy - Using graphs to fight financial fraud
GraphTalks Italy - Using graphs to fight financial fraudNeo4j
 
GraphTalks Frankfurt - Leveraging Graph-Technology to fight financial fraud
GraphTalks Frankfurt - Leveraging Graph-Technology to fight financial fraudGraphTalks Frankfurt - Leveraging Graph-Technology to fight financial fraud
GraphTalks Frankfurt - Leveraging Graph-Technology to fight financial fraudNeo4j
 
A Practical Guide to Post-EMV Card Not Present Fraud
A Practical Guide to Post-EMV Card Not Present FraudA Practical Guide to Post-EMV Card Not Present Fraud
A Practical Guide to Post-EMV Card Not Present FraudForter
 
Data Loss Threats and Mitigations
Data Loss Threats and MitigationsData Loss Threats and Mitigations
Data Loss Threats and MitigationsApril Mardock CISSP
 
DevSecCon London 2018: How to fit threat modelling into agile development: sl...
DevSecCon London 2018: How to fit threat modelling into agile development: sl...DevSecCon London 2018: How to fit threat modelling into agile development: sl...
DevSecCon London 2018: How to fit threat modelling into agile development: sl...DevSecCon
 
Detecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and LinkuriousDetecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and LinkuriousNeo4j
 

Similar to Building Intelligent Data Products (20)

WeDo Technologies Blog 2014
WeDo Technologies Blog 2014WeDo Technologies Blog 2014
WeDo Technologies Blog 2014
 
Experiment
ExperimentExperiment
Experiment
 
Nasscom how can you identify fraud in fintech lending using deep learning
Nasscom how can you identify fraud in fintech lending using deep learningNasscom how can you identify fraud in fintech lending using deep learning
Nasscom how can you identify fraud in fintech lending using deep learning
 
LSI Spring Agent Open House 2014
LSI Spring Agent Open House 2014LSI Spring Agent Open House 2014
LSI Spring Agent Open House 2014
 
Operationalize deep learning models for fraud detection with Azure Machine Le...
Operationalize deep learning models for fraud detection with Azure Machine Le...Operationalize deep learning models for fraud detection with Azure Machine Le...
Operationalize deep learning models for fraud detection with Azure Machine Le...
 
Mark Villinski - Top 10 Tips for Educating Employees about Cybersecurity
Mark Villinski - Top 10 Tips for Educating Employees about CybersecurityMark Villinski - Top 10 Tips for Educating Employees about Cybersecurity
Mark Villinski - Top 10 Tips for Educating Employees about Cybersecurity
 
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
 
ThreatMetrix ARRC 2016 presentation by Ted Egan
ThreatMetrix ARRC 2016 presentation by Ted EganThreatMetrix ARRC 2016 presentation by Ted Egan
ThreatMetrix ARRC 2016 presentation by Ted Egan
 
Gartner: Top 10 Technology Trends 2015
Gartner: Top 10 Technology Trends 2015Gartner: Top 10 Technology Trends 2015
Gartner: Top 10 Technology Trends 2015
 
shyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptx
shyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptxshyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptx
shyampresentaaaaaaaaaaaaaaaaaaaaaaa.pptx
 
Ai and machine learning help detect, predict and prevent fraud - IBM Watson ...
Ai and machine learning help detect, predict and prevent fraud -  IBM Watson ...Ai and machine learning help detect, predict and prevent fraud -  IBM Watson ...
Ai and machine learning help detect, predict and prevent fraud - IBM Watson ...
 
Credit Card Fraud Detection Using ML In Databricks
Credit Card Fraud Detection Using ML In DatabricksCredit Card Fraud Detection Using ML In Databricks
Credit Card Fraud Detection Using ML In Databricks
 
GraphTalks Italy - Using graphs to fight financial fraud
GraphTalks Italy - Using graphs to fight financial fraudGraphTalks Italy - Using graphs to fight financial fraud
GraphTalks Italy - Using graphs to fight financial fraud
 
Falcon 012009
Falcon 012009Falcon 012009
Falcon 012009
 
GraphTalks Frankfurt - Leveraging Graph-Technology to fight financial fraud
GraphTalks Frankfurt - Leveraging Graph-Technology to fight financial fraudGraphTalks Frankfurt - Leveraging Graph-Technology to fight financial fraud
GraphTalks Frankfurt - Leveraging Graph-Technology to fight financial fraud
 
A Practical Guide to Post-EMV Card Not Present Fraud
A Practical Guide to Post-EMV Card Not Present FraudA Practical Guide to Post-EMV Card Not Present Fraud
A Practical Guide to Post-EMV Card Not Present Fraud
 
Data Loss Threats and Mitigations
Data Loss Threats and MitigationsData Loss Threats and Mitigations
Data Loss Threats and Mitigations
 
DevSecCon London 2018: How to fit threat modelling into agile development: sl...
DevSecCon London 2018: How to fit threat modelling into agile development: sl...DevSecCon London 2018: How to fit threat modelling into agile development: sl...
DevSecCon London 2018: How to fit threat modelling into agile development: sl...
 
Detecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and LinkuriousDetecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and Linkurious
 
CyberDen 2020
CyberDen 2020CyberDen 2020
CyberDen 2020
 

Recently uploaded

Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServiceRenan Moreira de Oliveira
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?SANGHEE SHIN
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
RAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AIRAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AIUdaiappa Ramachandran
 
GenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncGenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncObject Automation
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 

Recently uploaded (20)

Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
RAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AIRAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AI
 
GenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncGenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation Inc
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 

Building Intelligent Data Products