Copyright 2017 RESTRICTED CIRCULATION
USING ML TO FIND A NEEDLE IN A HAYSTACK
Dr. Nilesh Karnik 2nd Feb 2018
About 12-15% of claims made in 2016-17 could be fraudulent.*
POTENTIAL LOSSES
LIFE: ~INR 2078 CR
NON-LIFE: ~INR 12100 CR
* IRDAI Reports

Using ML and AI to fight Fraud
• PATTERN IDENTIFICATION
• CONNECT IDENTITIES
• UNDERSTAND UNSTRUCTURED
• BUILD NETWORK

Proportions of early claims or fraud are typically of the order of 1% or less of the policy portfolio.*
* Based on Aureus Customer Engagements

Machine Learning
Using data to understand an underlying process
• Process: X → Y
• Machine learning algorithm
• Model that approximates the underlying process

Prediction is difficult – especially if it is about the future.
PAST | PRESENT | FUTURE
Use knowledge gathered from the past to make an inference about the future.

Why use a predictive model?
A predictive model lets us distinguish between different subsets of
policies on the basis of the probability of the predicted event.

Randomly ordered collection of policies
Policy bucket   Policies issued   Will pay 13th month premium   % of policies paying
Top 10%         1000              600                           60%
10-20%          1000              600                           60%
20-30%          1000              600                           60%
30-40%          1000              600                           60%
40-50%          1000              600                           60%
50-60%          1000              600                           60%
60-70%          1000              600                           60%
70-80%          1000              600                           60%
80-90%          1000              600                           60%
Bottom 10%      1000              600                           60%
Total           10000             6000                          60%

Ordered using scores from a predictive model
Policy bucket   Policies issued   Will pay 13th month premium   % of policies paying
Top 10%         1000              990                           99%
10-20%          1000              880                           88%
20-30%          1000              750                           75%
30-40%          1000              700                           70%
40-50%          1000              650                           65%
50-60%          1000              630                           63%
60-70%          1000              620                           62%
70-80%          1000              480                           48%
80-90%          1000              250                           25%
Bottom 10%      1000              50                            5%
Total           10000             6000                          60%

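The bucket comparison above can be reproduced in a few lines. The following is a minimal sketch, not the method used in the deck: the function name `bucket_table` and the synthetic data are assumptions made for illustration, and the synthetic portfolio averages roughly 50% rather than the slide's 60%.

```python
import random

def bucket_table(scores, outcomes, n_buckets=10):
    """Rank policies by score (highest first), cut them into equal buckets,
    and return (bucket_size, events, event_rate) for each bucket.
    Assumes len(scores) is divisible by n_buckets; any remainder is dropped."""
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    size = len(ranked) // n_buckets
    rows = []
    for b in range(n_buckets):
        chunk = ranked[b * size:(b + 1) * size]
        events = sum(outcome for _, outcome in chunk)
        rows.append((len(chunk), events, events / len(chunk)))
    return rows

random.seed(0)
# Synthetic portfolio (hypothetical): each policy has its own payment probability.
true_prob = [random.random() for _ in range(10_000)]
paid = [1 if random.random() < p else 0 for p in true_prob]

# An informative score (here, the true probability) versus a random ordering.
informative = bucket_table(true_prob, paid)
uninformative = bucket_table([random.random() for _ in paid], paid)

for label, rows in (("model order", informative), ("random order", uninformative)):
    print(label, [f"{rate:.0%}" for _, _, rate in rows])
```

With a random ordering every bucket sits near the portfolio average; with an informative score the rate falls steadily from the top bucket to the bottom one, which is the pattern the two tables contrast.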
The life-cycle of a machine learning model
• Problem definition
• Data availability
• Training & Testing
• Business strategy to act on model output
• Implement
• Feedback
• Periodic Retraining

A Needle in the Haystack
• Unbalanced problems can be troublesome
• Additional effort required in the training phase
• Algorithm choices may become more important
• Testing across different time periods is crucial
• %s become deceptive
• Accuracy metrics become deceptive
• Customers expect more!

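One common way to test across different time periods is an out-of-time split: train on policies issued before a cutoff date and test on those issued afterwards. The sketch below is illustrative only; the record fields (`issued`, `early_claim`), the dates, and the helper names are assumptions, not details from the deck.

```python
from datetime import date

def out_of_time_split(records, cutoff):
    """Train on policies issued before `cutoff`, test on those issued on or after it."""
    train = [r for r in records if r["issued"] < cutoff]
    test = [r for r in records if r["issued"] >= cutoff]
    return train, test

def event_rate(records):
    """Share of records with the rare event (early claim) flagged."""
    return sum(r["early_claim"] for r in records) / len(records)

# Hypothetical records; field names and dates are illustrative only.
records = [
    {"issued": date(2015, 6, 1), "early_claim": 0},
    {"issued": date(2015, 9, 15), "early_claim": 1},
    {"issued": date(2016, 2, 10), "early_claim": 0},
    {"issued": date(2016, 8, 20), "early_claim": 0},
    {"issued": date(2017, 1, 5), "early_claim": 0},
    {"issued": date(2017, 4, 30), "early_claim": 1},
]

train, test = out_of_time_split(records, cutoff=date(2017, 1, 1))
print(len(train), len(test))
print(event_rate(train), event_rate(test))
```

Comparing the event rate in the two periods is a quick check on whether the rare event behaves the same way before and after the cutoff.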
A needle in the haystack … measuring model accuracy
Imagine a policy portfolio which historically has a 0.5% risk of early claims.
A predictive model which predicts that this portfolio will get no early claims will go wrong only in 0.5% of cases.
Can you imagine a model that is 99.5% accurate, but completely unusable?

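The arithmetic behind this slide can be checked directly. This is a small sketch using a hypothetical portfolio of 10,000 policies with 50 early claims (0.5%); recall is added here as one metric that exposes the problem, which is an illustration rather than something the slide names.

```python
# Hypothetical portfolio: 10,000 policies, 0.5% of which lead to an early claim.
n_policies = 10_000
n_claims = 50
actual = [1] * n_claims + [0] * (n_policies - n_claims)

# A "model" that always predicts "no early claim".
predicted = [0] * n_policies

# Accuracy: share of policies where prediction and outcome agree.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / n_policies

# Recall: share of actual early claims that the model flagged.
true_positives = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
recall = true_positives / n_claims

print(f"accuracy = {accuracy:.1%}")  # accuracy = 99.5%
print(f"recall   = {recall:.1%}")    # recall   = 0.0%
```

The model is right 99.5% of the time and still finds none of the early claims, so it gives the business nothing to act on.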
“Lift” provided by a predictive model – Scenario 1
• A policy portfolio with a 60% average probability of payment
• A payment propensity model identifies a green subset of size 25% with 90% payment probability and a red subset of size 20% with 12% payment probability

Subset            Share of portfolio   Payment probability   Versus average
Whole portfolio   100%                 60%
Green             25%                  90%                   1.5x of the average probability
Rest              55%                  64%
Red               20%                  12%                   1/5 of the average probability

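Lift here is the ratio of a subset's probability to the portfolio average. The sketch below recomputes the Scenario 1 figures from the subset sizes and probabilities on the slide; the segment labels and the function name `lift` are chosen for illustration.

```python
def lift(subset_rate, average_rate):
    """Ratio of a subset's event probability to the portfolio average."""
    return subset_rate / average_rate

# (share of portfolio, payment probability) as read from the Scenario 1 slide.
segments = {
    "green": (0.25, 0.90),
    "rest": (0.55, 0.64),
    "red": (0.20, 0.12),
}

# The portfolio average is the size-weighted mean of the segment probabilities.
average = sum(share * rate for share, rate in segments.values())
print(f"portfolio average = {average:.1%}")

for name, (share, rate) in segments.items():
    print(f"{name}: lift = {lift(rate, average):.2f}")
```

The weighted average comes out at about 60%, and the green and red lifts round to 1.5 and 0.2, consistent with the "1.5x" and "1/5" labels.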
“Lift” provided by a predictive model – Scenario 2
• A proposal portfolio with a 3 in 1,000 average probability of claim fraud
• A predictive model identifies a red subset of size 25% with 4.5 in 1,000 probability of claim fraud and a green subset of size 20% with 6 in 10,000 probability of claim fraud

Subset            Share of portfolio   Probability of claim fraud
Whole portfolio   100%                 3 in 1,000
Red               25%                  4.5 in 1,000
Rest              55%                  3.2 in 1,000
Green             20%                  6 in 10,000

Although the lift provided by this model is similar to the previous scenario, it still needs improvement because this result means that the business team still needs to sift through a quarter of the portfolio to eliminate about 38% of the risk.

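The "about 38%" figure follows from the subset size and the two probabilities. This is a rough check, with the function name `risk_captured` and the "proposals per fraud" figure added for illustration; neither appears in the deck.

```python
def risk_captured(share, subset_rate, average_rate):
    """Fraction of all expected events that fall inside the subset."""
    return share * subset_rate / average_rate

average_rate = 3 / 1000    # portfolio-wide probability of claim fraud
red_share = 0.25           # red subset is a quarter of the portfolio
red_rate = 4.5 / 1000      # probability of claim fraud inside the red subset

captured = risk_captured(red_share, red_rate, average_rate)
print(f"risk captured by red subset = {captured:.1%}")  # 37.5%, i.e. about 38%

# Expected number of red-subset proposals reviewed per fraud found.
proposals_per_fraud = 1 / red_rate
print(f"proposals reviewed per fraud = {proposals_per_fraud:.0f}")
```

Even with 1.5x lift, roughly 222 proposals in the red subset would need review for each expected fraud, which is why a similar lift is far less useful at this base rate.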
Example of results with customer portfolios – Early claims - 1
• A proposal portfolio with a 7 in 1,000 average probability of early claims
• Prediction at proposal submission

Subset             Share of portfolio   Probability of early claim   Versus average
Whole portfolio    100%                 7 in 1,000
High-risk subset   4.2%                 8.6%                         12x of the average probability
Low-risk subset    31.5%                7 in 10,000                  1/10 of the average probability

The high-risk subset captures nearly half of the risk in a small set, less than 5% of the portfolio size.

Example of results with customer portfolios – Early claims - 2
• A proposal portfolio with a 4.4 in 1,000 average probability of early claims
• Prediction at proposal submission

Subset             Share of portfolio   Probability of early claim   Versus average
Whole portfolio    100%                 4.4 in 1,000
High-risk subset   5.4%                 2.5%                         5.6x of the average probability
Low-risk subset    14%                  6 in 10,000                  1/7 of the average probability

The high-risk subset captures nearly 1/3 of the risk in a small set of about 5% of the portfolio size.

Example of results with customer portfolios – Fraudulent claims - 1
• A portfolio with a 0.9% average risk of fraud for early claims
• Prediction at claim notification

Subset             Share of early claims   Probability of fraud   Versus average
All early claims   100%                    0.9%
High-risk subset   10%                     7.2%                   8x of the average probability
Low-risk subset    90%                     0.16%                  1/5 of the average probability

The high-risk subset captures more than 80% of the risk in a small subset of about 10% of all early claims.

Example of results with customer portfolios – Fraudulent claims - 2
• A portfolio with a 3.6% average risk of fraud for early claims
• Prediction at claim notification

Subset             Share of early claims   Probability of fraud   Versus average
All early claims   100%                    3.6%
High-risk subset   6%                      30%                    8.3x of the average probability
Low-risk subset    50%                     0.56%                  1/6 of the average probability

The high-risk subset captures more than half of the risk in a small subset of size 6% of all early claims.

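The four customer results can be put side by side by computing lift and share of risk captured for each high-risk subset. This is a sketch under one assumption: that in each slide the smaller percentage is the subset size and the larger one is the subset probability, which is the reading that matches the stated lifts. The function name `summarize` is hypothetical, and the slide figures are rounded, so small differences from the slide text are expected.

```python
def summarize(share, subset_rate, average_rate):
    """Return (lift, share of total risk captured) for a high-risk subset."""
    lift = subset_rate / average_rate
    captured = share * subset_rate / average_rate
    return lift, captured

# (name, subset share, subset probability, portfolio average), as read from the slides.
examples = [
    ("Early claims 1", 0.042, 0.086, 0.007),
    ("Early claims 2", 0.054, 0.025, 0.0044),
    ("Fraudulent claims 1", 0.10, 0.072, 0.009),
    ("Fraudulent claims 2", 0.06, 0.30, 0.036),
]

results = [(name, *summarize(share, rate, avg)) for name, share, rate, avg in examples]
for name, lift, captured in results:
    print(f"{name}: lift {lift:.1f}x, risk captured {captured:.0%}")
```

Unlike Scenario 2, each of these subsets concentrates a third or more of the risk in about a tenth of the cases or fewer.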
Benefits
• Investigate high risk proposals to reduce risk of early claims
• Optimize claim investigation efforts
• Quicker issuance for low risk proposals
• Reduce settlement TAT for low risk claims
• Improve customer experience

Companies Which Use ML
• CUSTOMER EXPERIENCE
• CUSTOMER BEHAVIOUR
• FRAUD MITIGATION
• BUYING BEHAVIOUR
• TRAVEL SAFETY
• VIEWING PREDICTION
Source: Secondary Research

Make Customer Experience your Differentiator.
www.aureusanalytics.com
@AureusAnalytics
info@aureusanalytics.com
blog.aureusanalytics.com
https://www.linkedin.com/company/aureus-analytics

Using Machine Learning to Find a needle in a haystack Aureus Analytics

  • 1. Copyright 2017 RESTRICTED CIRCULATION USING ML TO FIND A NEEDLE IN A HAYSTACK Dr. Nilesh Karnik 2nd Feb 2018
  • 2. Copyright 2017 RESTRICTED CIRCULATION About 12-15% of claims made in 2016-17 could be fraudulent.* 2 LIFE ~INR 2078 CR NON – LIFE ~INR 12100 CR POTENTIAL LOSSES * IRDAI Reports
  • 3. Copyright 2017 RESTRICTED CIRCULATION Using ML and AI to fight Fraud PATTERN IDENTIFICATION CONNECT IDENTITIES UNDERSTAND UNSTRUCTURED BUILD NETWORK 3
  • 4. Copyright 2017 RESTRICTED CIRCULATION Proportions of early claims or fraud are typically of the order of 1% or less of the policy portfolio* 4* Based on Aureus Customer Engagements
  • 5. Copyright 2017 RESTRICTED CIRCULATION Machine Learning Machine learning algorithm Model that approximates the underlying process Using data to understand an underlying process Process X Y 5
  • 6. Copyright 2017 RESTRICTED CIRCULATION Prediction is difficult – especially if it is about the future. PAST FUTUREPRESENT Use knowledge gathered from the past To make an inference about the future 6
  • 7. Why use a predictive model? A predictive model lets us distinguish between different subsets of policies on the basis of the probability of the predicted event.

    Randomly ordered collection of policies:

    | Policy bucket | Policies issued | Policies that will pay their 13th month premium | % of policies paying |
    |---------------|-----------------|--------------------------------------------------|----------------------|
    | Top 10%       | 1000            | 600                                              | 60%                  |
    | 10-20%        | 1000            | 600                                              | 60%                  |
    | 20-30%        | 1000            | 600                                              | 60%                  |
    | 30-40%        | 1000            | 600                                              | 60%                  |
    | 40-50%        | 1000            | 600                                              | 60%                  |
    | 50-60%        | 1000            | 600                                              | 60%                  |
    | 60-70%        | 1000            | 600                                              | 60%                  |
    | 70-80%        | 1000            | 600                                              | 60%                  |
    | 80-90%        | 1000            | 600                                              | 60%                  |
    | Bottom 10%    | 1000            | 600                                              | 60%                  |
    | Total         | 10000           | 6000                                             | 60%                  |

    Ordered using scores from a predictive model:

    | Policy bucket | Policies issued | Policies that will pay their 13th month premium | % of policies paying |
    |---------------|-----------------|--------------------------------------------------|----------------------|
    | Top 10%       | 1000            | 990                                              | 99%                  |
    | 10-20%        | 1000            | 880                                              | 88%                  |
    | 20-30%        | 1000            | 750                                              | 75%                  |
    | 30-40%        | 1000            | 700                                              | 70%                  |
    | 40-50%        | 1000            | 650                                              | 65%                  |
    | 50-60%        | 1000            | 630                                              | 63%                  |
    | 60-70%        | 1000            | 620                                              | 62%                  |
    | 70-80%        | 1000            | 480                                              | 48%                  |
    | 80-90%        | 1000            | 250                                              | 25%                  |
    | Bottom 10%    | 1000            | 50                                               | 5%                   |
    | Total         | 10000           | 6000                                             | 60%                  |
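
The bucket comparison on slide 7 can be reproduced in outline with a few lines of Python. This is a minimal sketch on synthetic data, not the data behind the slide: the function name `bucket_table`, the portfolio of 10,000 simulated policies, and the idealised "model" whose score equals the hidden payment probability are all assumptions made for illustration.

```python
import random


def bucket_table(scores, outcomes, n_buckets=10):
    """Rank policies by score (highest first) and summarise equal-sized buckets."""
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    size = len(ranked) // n_buckets
    rows = []
    for b in range(n_buckets):
        start = b * size
        # The last bucket absorbs any remainder when the split is uneven.
        end = (b + 1) * size if b < n_buckets - 1 else len(ranked)
        bucket = ranked[start:end]
        paid = sum(outcome for _, outcome in bucket)
        rows.append({
            "bucket": b + 1,
            "policies": len(bucket),
            "paid": paid,
            "pct_paying": 100.0 * paid / len(bucket),
        })
    return rows


# Synthetic portfolio (illustrative only): each policy has a hidden payment
# probability, and the observed outcome is drawn from that probability.
rng = random.Random(0)
true_prob = [rng.random() for _ in range(10_000)]
paid_flag = [1 if rng.random() < p else 0 for p in true_prob]

# Idealised model: the score is the hidden probability itself.
model_rows = bucket_table(true_prob, paid_flag)
# Random ordering: scores that carry no information about the outcome.
random_rows = bucket_table([rng.random() for _ in paid_flag], paid_flag)

for m, r in zip(model_rows, random_rows):
    print(f"bucket {m['bucket']:>2}: model {m['pct_paying']:5.1f}%   random {r['pct_paying']:5.1f}%")
```

Both orderings contain the same policies and the same total number of payers; only the model-ordered table separates high-propensity buckets from low-propensity ones, which is the point the slide makes.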
  • 8. The life-cycle of a machine learning model: Problem definition; Data availability; Training & Testing; Business strategy to act on model output; Implement; Feedback; Periodic Retraining.
  • 9. A Needle in the Haystack. Unbalanced problems can be troublesome: additional effort is required in the training phase; algorithm choices may become more important; testing across different time periods is crucial; percentages become deceptive; accuracy metrics become deceptive; customers expect more!
  • 10. A needle in the haystack … measuring model accuracy. Imagine a policy portfolio which historically has a 0.5% risk of early claims. A predictive model which predicts that this portfolio will get no early claims will go wrong in only 0.5% of cases. Can you imagine a model that is 99.5% accurate, but completely unusable?
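
The arithmetic behind slide 10 can be checked directly. The sketch below assumes a hypothetical portfolio of 10,000 policies with exactly 0.5% early claims and a "model" that always predicts no early claim; the helper names `accuracy` and `recall` are illustrative, not taken from the deck.

```python
def accuracy(y_true, y_pred):
    """Fraction of policies where the prediction matches the outcome."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of actual early claims that the model flags."""
    positives = sum(y_true)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives if positives else 0.0


n_policies = 10_000                 # hypothetical portfolio size
n_early_claims = 50                 # 0.5% of the portfolio
y_true = [1] * n_early_claims + [0] * (n_policies - n_early_claims)
y_pred = [0] * n_policies           # model that never predicts an early claim

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy = {acc:.1%}, recall = {rec:.1%}")
```

The model is right on 9,950 of 10,000 policies, yet it finds none of the 50 early claims, which is why accuracy alone is a deceptive metric on unbalanced problems.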
  • 11. “Lift” provided by a predictive model – Scenario 1. A policy portfolio with a 60% average probability of payment. A payment propensity model identifies a green subset of size 25% with 90% payment probability (1.5x of the average probability) and a red subset of size 20% with 12% payment probability (1/5 of the average probability). The remaining 55% of the portfolio has a 64% payment probability.
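
Lift in Scenario 1 is simply each subset's probability divided by the portfolio average. A minimal sketch, assuming the slide's figures are read as (share of portfolio, payment probability) pairs and that 55% / 64% describe the remaining middle subset; the variable names are illustrative.

```python
# (share of portfolio, payment probability) per subset, as read from slide 11.
segments = {
    "green": (0.25, 0.90),
    "middle": (0.55, 0.64),
    "red": (0.20, 0.12),
}
portfolio_average = 0.60  # stated average probability of payment

# Sanity check: the share-weighted subset probabilities should recover
# the portfolio average (here 0.601, i.e. about 60%).
weighted_average = sum(share * rate for share, rate in segments.values())

# Lift = subset probability relative to the portfolio average.
lift = {name: rate / portfolio_average for name, (share, rate) in segments.items()}

print(f"weighted average = {weighted_average:.3f}")
for name, value in lift.items():
    print(f"{name:>6}: lift = {value:.2f}x")
```

The green subset comes out at 1.5x and the red subset at 0.2x (1/5) of the average, matching the figures on the slide.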
  • 12. “Lift” provided by a predictive model – Scenario 2. A proposal portfolio with a 3 in 1,000 average probability of claim fraud. A predictive model identifies a red subset of size 25% with a 4.5 in 1,000 probability of claim fraud and a green subset of size 20% with a 6 in 10,000 probability of claim fraud. The remaining 55% of the portfolio has a 3.2 in 1,000 probability. Although the lift provided by this model is similar to the previous scenario, it still needs improvement: the business team would have to sift through a quarter of the portfolio to eliminate about 38% of the risk.
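
Why the same lift is less useful in Scenario 2 shows up once the figures are turned into expected case counts. The sketch below assumes a hypothetical portfolio of 100,000 proposals and reads 55% / 3.2 in 1,000 as the middle subset; both are assumptions for illustration.

```python
portfolio_size = 100_000  # hypothetical number of proposals

# (share of portfolio, probability of claim fraud) per subset, from slide 12.
segments = {
    "red": (0.25, 4.5 / 1000),
    "middle": (0.55, 3.2 / 1000),
    "green": (0.20, 6 / 10_000),
}

# Expected fraudulent claims in each subset.
expected_fraud = {
    name: portfolio_size * share * rate for name, (share, rate) in segments.items()
}
total_fraud = sum(expected_fraud.values())

# Share of total risk sitting in the red subset.
risk_captured = expected_fraud["red"] / total_fraud

# Proposals the business team must review in the red subset per fraud found.
reviews_per_fraud = 1 / segments["red"][1]

print(f"expected fraud cases: {total_fraud:.1f}")
print(f"risk captured by red subset: {risk_captured:.1%}")
print(f"proposals reviewed per fraud found: {reviews_per_fraud:.0f}")
```

With these inputs the red subset holds roughly 37% of the expected fraud (the slide rounds to about 38%), but the team would review on the order of 220 proposals for each fraudulent one, across 25,000 proposals in total.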
  • 13. Example of results with customer portfolios – Early claims 1. A proposal portfolio with a 7 in 1,000 average probability of early claims; prediction at proposal submission. High-risk subset: 4.2% of the portfolio with an 8.6% probability, 12x of the average probability; it captures nearly half of the risk in a small set of less than 5% of the portfolio size. Low-risk subset: 31.5% of the portfolio with a 7 in 10,000 probability, 1/10 of the average probability.
  • 14. Example of results with customer portfolios – Early claims 2. A proposal portfolio with a 4.4 in 1,000 average probability of early claims; prediction at proposal submission. High-risk subset: 5.4% of the portfolio with a 2.5% probability, 5.6x of the average probability; it captures nearly 1/3 of the risk in a small set of about 5% of the portfolio size. Low-risk subset: 14% of the portfolio with a 6 in 10,000 probability, 1/7 of the average probability.
  • 15. Example of results with customer portfolios – Fraudulent claims 1. A portfolio with a 0.9% average risk of fraud for early claims; prediction at claim notification. High-risk subset: 10% of all early claims with a 7.2% risk, 8x of the average probability; it captures more than 80% of the risk in a small subset of about 10% of all early claims. Low-risk subset: 90% of all early claims with a 0.16% risk, 1/5 of the average probability.
  • 16. Example of results with customer portfolios – Fraudulent claims 2. A portfolio with a 3.6% average risk of fraud for early claims; prediction at claim notification. High-risk subset: 6% of all early claims with a 30% risk, 8.3x of the average probability; it captures more than half of the risk in a small subset of size 6% of all early claims. Low-risk subset: 50% of all early claims with a 0.56% risk, 1/6 of the average probability.
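
The lift and "risk captured" statements on slides 13 to 16 follow from three numbers per portfolio: the average rate, the size of the high-risk subset, and the rate inside it. The sketch below is one reading of the slide figures (which number is a size and which is a rate is inferred from the stated multiples), and the class name `PortfolioResult` is hypothetical. Because the slide values are rounded, the computed shares only approximately match the slide wording.

```python
from dataclasses import dataclass


@dataclass
class PortfolioResult:
    name: str
    average_rate: float     # portfolio-wide probability of the event
    high_risk_share: float  # high-risk subset as a fraction of the portfolio
    high_risk_rate: float   # probability of the event inside that subset

    @property
    def lift(self) -> float:
        """How many times riskier the subset is than the portfolio average."""
        return self.high_risk_rate / self.average_rate

    @property
    def risk_captured(self) -> float:
        """Fraction of all expected events that fall in the high-risk subset."""
        return self.high_risk_share * self.high_risk_rate / self.average_rate


# Figures as read from slides 13-16 (rounded on the slides).
results = [
    PortfolioResult("Early claims 1", 7 / 1000, 0.042, 0.086),
    PortfolioResult("Early claims 2", 4.4 / 1000, 0.054, 0.025),
    PortfolioResult("Fraudulent claims 1", 0.009, 0.10, 0.072),
    PortfolioResult("Fraudulent claims 2", 0.036, 0.06, 0.30),
]

for r in results:
    print(f"{r.name:<20} lift = {r.lift:4.1f}x   risk captured = {r.risk_captured:.0%}")
```

Compared with Scenario 2 on slide 12, these subsets are both much smaller and much richer in risk, so a far smaller investigation effort reaches a comparable or larger share of the expected early claims or fraud.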
  • 17. Benefits: investigate high risk proposals to reduce risk of early claims; optimize claim investigation efforts; quicker issuance for low risk proposals; reduce settlement TAT for low risk claims; improve customer experience.
  • 18. Companies Which Use ML: Customer Experience; Customer Behaviour; Fraud Mitigation; Buying Behaviour; Travel Safety; Viewing Prediction. Source: Secondary Research.
  • 19. Make Customer Experience your Differentiator. www.aureusanalytics.com; @AureusAnalytics; info@aureusanalytics.com; blog.aureusanalytics.com; https://www.linkedin.com/company/aureus-analytics