Using Machine Learning to Find a needle in a haystack Aureus Analytics

Copyright 2017 RESTRICTED CIRCULATION
USING ML TO FIND A NEEDLE IN A HAYSTACK
Dr. Nilesh Karnik 2nd Feb 2018

About 12-15% of claims made in 2016-17 could be fraudulent.*
POTENTIAL LOSSES
LIFE: ~INR 2078 CR
NON-LIFE: ~INR 12100 CR
* IRDAI Reports

Using ML and AI to fight Fraud
• PATTERN IDENTIFICATION
• CONNECT IDENTITIES
• UNDERSTAND UNSTRUCTURED
• BUILD NETWORK

Proportions of early claims or fraud are typically of the order of 1% or less of the policy portfolio.*
* Based on Aureus Customer Engagements

Machine Learning
Using data to understand an underlying process
Process: X → Y
Machine learning algorithm → Model that approximates the underlying process
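To make the idea of a model that approximates an underlying process concrete, here is a minimal sketch. It assumes a hypothetical hidden process `process(x)` and uses ordinary least-squares line fitting as a stand-in learning algorithm; neither comes from the slides.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to observed (x, y) pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def process(x):
    """Hypothetical underlying process; in practice it is unknown."""
    return 2.0 * x + 1.0


# Observed data: inputs X and the outputs Y the process produced for them.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [process(x) for x in xs]

# The fitted line is the "model" that approximates the process.
slope, intercept = fit_line(xs, ys)
print(f"model: y = {slope:.2f} * x + {intercept:.2f}")
```

Real insurance data is noisy and rarely linear, so the fitted model would only approximate the process rather than recover it exactly.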

Prediction is difficult, especially if it is about the future.
PAST → PRESENT → FUTURE
Use knowledge gathered from the past to make an inference about the future.

Why use a predictive model?
A predictive model lets us distinguish between different subsets of policies on the basis of the probability of the predicted event.

Randomly ordered collection of policies

Policy bucket | Policies issued | Policies that will pay their 13th month premium | % of policies paying
Top 10%    | 1000  | 600  | 60%
10-20%     | 1000  | 600  | 60%
20-30%     | 1000  | 600  | 60%
30-40%     | 1000  | 600  | 60%
40-50%     | 1000  | 600  | 60%
50-60%     | 1000  | 600  | 60%
60-70%     | 1000  | 600  | 60%
70-80%     | 1000  | 600  | 60%
80-90%     | 1000  | 600  | 60%
Bottom 10% | 1000  | 600  | 60%
Total      | 10000 | 6000 | 60%

Ordered using scores from a predictive model

Policy bucket | Policies issued | Policies that will pay their 13th month premium | % of policies paying
Top 10%    | 1000  | 990  | 99%
10-20%     | 1000  | 880  | 88%
20-30%     | 1000  | 750  | 75%
30-40%     | 1000  | 700  | 70%
40-50%     | 1000  | 650  | 65%
50-60%     | 1000  | 630  | 63%
60-70%     | 1000  | 620  | 62%
70-80%     | 1000  | 480  | 48%
80-90%     | 1000  | 250  | 25%
Bottom 10% | 1000  | 50   | 5%
Total      | 10000 | 6000 | 60%
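A table like the model-ordered one can be built by sorting policies by model score and tallying outcomes per bucket. The sketch below assumes hypothetical inputs `scores` and `paid`; the function name `bucket_table` and the toy data are illustrative only.

```python
def bucket_table(scores, paid, n_buckets=10):
    """Sort policies by descending score and summarise payment rate per bucket.

    scores: model scores, one per policy (higher = more likely to pay).
    paid:   1 if the policy paid its 13th month premium, else 0.
    Returns a list of (bucket_number, policies, payers, pay_rate) tuples.
    """
    if len(scores) != len(paid):
        raise ValueError("scores and paid must have the same length")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    rows = []
    for b in range(n_buckets):
        # Integer arithmetic keeps bucket sizes within one policy of each other.
        start, end = b * n // n_buckets, (b + 1) * n // n_buckets
        members = order[start:end]
        payers = sum(paid[i] for i in members)
        rate = payers / len(members) if members else 0.0
        rows.append((b + 1, len(members), payers, rate))
    return rows


# Toy data, made up for illustration: 20 policies whose scores loosely track payment.
scores = [0.95, 0.91, 0.88, 0.85, 0.80, 0.76, 0.71, 0.66, 0.60, 0.55,
          0.50, 0.45, 0.40, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.05]
paid = [1, 1, 1, 1, 1, 1, 0, 1, 1, 0,
        1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

rows = bucket_table(scores, paid, n_buckets=4)
for bucket, issued, payers, rate in rows:
    print(f"bucket {bucket}: {issued} policies, {payers} paying, {rate:.0%}")
```

With a useful model the payment rate falls steadily from the first bucket to the last, while the overall total stays the same as in the randomly ordered case.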

The life-cycle of a machine learning model
• Problem definition
• Data availability
• Training & Testing
• Business strategy to act on model output
• Implement
• Feedback
• Periodic Retraining

A Needle in the Haystack
• Unbalanced problems can be troublesome
• Additional effort required in the training phase
• Algorithm choices may become more important
• Testing across different time periods is crucial
• Percentages become deceptive
• Accuracy metrics become deceptive
• Customers expect more!
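One way to test across different time periods is an out-of-time split: train on policies issued before a cutoff date and test on those issued afterwards. The sketch below is an illustration under assumed field names (`issued`, `early_claim`) and made-up records, not the method used in the engagements described here.

```python
from datetime import date


def out_of_time_split(records, cutoff):
    """Train on policies issued before `cutoff`; test on those issued on or after it."""
    train = [r for r in records if r["issued"] < cutoff]
    test = [r for r in records if r["issued"] >= cutoff]
    return train, test


def event_rate(records):
    """Fraction of records flagged as an early claim (0.0 for an empty list)."""
    if not records:
        return 0.0
    return sum(r["early_claim"] for r in records) / len(records)


# Hypothetical policy records; real data would have many more rows and features.
records = [
    {"issued": date(2016, 3, 1), "early_claim": 0},
    {"issued": date(2016, 7, 15), "early_claim": 1},
    {"issued": date(2016, 11, 30), "early_claim": 0},
    {"issued": date(2017, 2, 10), "early_claim": 0},
    {"issued": date(2017, 5, 20), "early_claim": 0},
    {"issued": date(2017, 9, 5), "early_claim": 1},
]

train, test = out_of_time_split(records, cutoff=date(2017, 1, 1))
print(len(train), "training rows, event rate", round(event_rate(train), 3))
print(len(test), "test rows, event rate", round(event_rate(test), 3))
```

Comparing the event rate and model performance across the two periods helps reveal whether a model trained on older policies still holds up on newer ones.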

A needle in the haystack … measuring model accuracy
Imagine a policy portfolio which historically has a 0.5% risk of early claims.
A predictive model which predicts that this portfolio will get no early claims will go wrong in only 0.5% of cases.
Can you imagine a model that is 99.5% accurate, but completely unusable?
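The point can be checked with a few lines of arithmetic. The sketch below builds a toy portfolio of 1,000 policies with a 0.5% early-claim rate and scores a "model" that always predicts no early claim. The portfolio size and the helper name `evaluate` are assumptions made for illustration.

```python
def evaluate(actual, predicted):
    """Return (accuracy, recall) for binary labels where 1 = early claim."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    true_pos = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
    positives = sum(actual)
    accuracy = correct / len(actual)
    recall = true_pos / positives if positives else 0.0
    return accuracy, recall


# 1,000 policies with a 0.5% early-claim rate: 5 early claims, 995 without.
actual = [1] * 5 + [0] * 995
# The "model" that always predicts "no early claim".
always_no = [0] * len(actual)

accuracy, recall = evaluate(actual, always_no)
print(f"accuracy = {accuracy:.1%}, recall = {recall:.0%}")
# prints: accuracy = 99.5%, recall = 0%
```

Accuracy looks excellent, yet recall shows that not a single early claim is identified, which is why accuracy alone is a poor yardstick for such unbalanced problems.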

“Lift” provided by a predictive model – Scenario 1
• A policy portfolio with a 60% average probability of payment
• A payment propensity model identifies a green subset of size 25% with 90% payment probability and a red subset of size 20% with 12% payment probability

Segment | Share of portfolio | Payment probability | Relative to average
Whole portfolio | 100% | 60% | -
Green subset    | 25%  | 90% | 1.5x of the average probability
Remainder       | 55%  | 64% | -
Red subset      | 20%  | 12% | 1/5 of the average probability
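Lift is the event rate in a segment divided by the portfolio average. The following sketch recomputes the Scenario 1 figures; the segment name `middle` for the remaining 55% and the helper `lift` are labels chosen here for illustration.

```python
def lift(segment_rate, average_rate):
    """Lift = event rate in a segment divided by the portfolio average."""
    return segment_rate / average_rate


# (share of portfolio, payment probability) for each segment in Scenario 1.
segments = {
    "green": (0.25, 0.90),
    "middle": (0.55, 0.64),
    "red": (0.20, 0.12),
}

# The share-weighted average of the segments should reproduce the portfolio rate.
average = sum(share * rate for share, rate in segments.values())
print(f"portfolio average: {average:.1%}")

for name, (share, rate) in segments.items():
    print(f"{name}: {share:.0%} of portfolio, lift {lift(rate, average):.2f}x")
```

The weighted average comes out at about 60%, and the green and red subsets land at roughly 1.5x and 1/5 of it, matching the figures quoted for this scenario.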

“Lift” provided by a predictive model – Scenario 2
• A proposal portfolio with a 3 in 1,000 average probability of claim fraud
• A predictive model identifies a red subset of size 25% with a 4.5 in 1,000 probability of claim fraud and a green subset of size 20% with a 6 in 10,000 probability of claim fraud

Segment | Share of portfolio | Probability of claim fraud
Whole portfolio | 100% | 3 in 1,000
Red subset      | 25%  | 4.5 in 1,000
Remainder       | 55%  | 3.2 in 1,000
Green subset    | 20%  | 6 in 10,000

Although the lift provided by this model is similar to that in the previous scenario, the model still needs improvement: the business team would have to sift through a quarter of the portfolio to eliminate only about 38% of the risk.
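The "about 38% of risk" figure follows from weighting each segment's fraud probability by its share of the portfolio. A minimal sketch of that arithmetic, with the helper name `risk_share` and the segment label `middle` assumed for illustration:

```python
def risk_share(segments):
    """Return each segment's share of total expected fraud cases.

    segments maps a name to (share of portfolio, fraud probability).
    """
    expected = {name: share * rate for name, (share, rate) in segments.items()}
    total = sum(expected.values())
    return {name: value / total for name, value in expected.items()}


scenario_2 = {
    "red": (0.25, 4.5 / 1000),
    "middle": (0.55, 3.2 / 1000),
    "green": (0.20, 6 / 10000),
}

shares = risk_share(scenario_2)
for name, share in shares.items():
    print(f"{name}: {share:.1%} of expected fraud")
```

The red subset holds a little over 37% of the expected fraud, in line with the rounded figure above, so three quarters of the portfolio still carries most of the risk.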

Example of results with customer portfolios: Early claims 1
• A proposal portfolio with a 7 in 1,000 average probability of early claims
• Prediction at proposal submission

Segment | Share of portfolio | Probability of early claim | Relative to average
Whole portfolio  | 100%  | 7 in 1,000  | -
High-risk subset | 4.2%  | 8.6%        | 12x of the average probability
Low-risk subset  | 31.5% | 7 in 10,000 | 1/10 of the average probability

The high-risk subset captures nearly half of the risk in a small set, less than 5% of the portfolio size.

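Early claims 2
• A proposal portfolio with a 4.4 in 1,000 average probability of early claims
• Prediction at proposal submission

Segment | Share of portfolio | Probability of early claim | Relative to average
Whole portfolio  | 100% | 4.4 in 1,000 | -
High-risk subset | 5.4% | 2.5%         | 5.6x of the average probability
Low-risk subset  | 14%  | 6 in 10,000  | 1/7 of the average probability

The high-risk subset captures nearly 1/3 of the risk in a small set of about 5% of the portfolio size.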

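Fraudulent claims 1
• A portfolio with a 0.9% average risk of fraud for early claims
• Prediction at claim notification

Segment | Share of early claims | Probability of fraud | Relative to average
All early claims | 100% | 0.9%  | -
High-risk subset | 10%  | 7.2%  | 8x of the average probability
Low-risk subset  | 90%  | 0.16% | 1/5 of the average probability

The high-risk subset captures more than 80% of the risk in a small subset of about 10% of all early claims.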

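Fraudulent claims 2
• A portfolio with a 3.6% average risk of fraud for early claims
• Prediction at claim notification

Segment | Share of early claims | Probability of fraud | Relative to average
All early claims | 100% | 3.6%  | -
High-risk subset | 6%   | 30%   | 8.3x of the average probability
Low-risk subset  | 50%  | 0.56% | 1/6 of the average probability

The high-risk subset captures more than half of the risk in a small subset of size 6% of all early claims.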
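The lift and risk-capture figures quoted for the four customer examples can be cross-checked from three numbers each: the portfolio average, the size of the high-risk subset, and the event rate inside it. The sketch below assumes the pairing of sizes and rates as read from the slide figures; the helper name `summarise` is illustrative.

```python
# (portfolio average, high-risk subset size, high-risk subset rate), read from the slides.
examples = {
    "Early claims 1": (0.007, 0.042, 0.086),
    "Early claims 2": (0.0044, 0.054, 0.025),
    "Fraudulent claims 1": (0.009, 0.10, 0.072),
    "Fraudulent claims 2": (0.036, 0.06, 0.30),
}


def summarise(average, subset_size, subset_rate):
    """Return (lift, share of total risk captured) for a high-risk subset."""
    lift = subset_rate / average
    captured = subset_size * subset_rate / average
    return lift, captured


results = {name: summarise(*values) for name, values in examples.items()}
for name, (lift, captured) in results.items():
    print(f"{name}: lift {lift:.1f}x, captures {captured:.0%} of risk")
```

Because the slide figures are rounded, the recomputed values agree with the quoted ones only approximately, for example about 12.3x and 52% for the first early-claims portfolio.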

Benefits
• Investigate high risk proposals to reduce risk of early claims
• Optimize claim investigation efforts
• Quicker issuance for low risk proposals
• Reduce settlement TAT for low risk claims
• Improve customer experience

Companies Which Use ML
• CUSTOMER EXPERIENCE
• CUSTOMER BEHAVIOUR
• FRAUD MITIGATION
• BUYING BEHAVIOUR
• TRAVEL SAFETY
• VIEWING PREDICTION
Source: Secondary Research

Make Customer Experience your Differentiator.
www.aureusanalytics.com
@AureusAnalytics
info@aureusanalytics.com
blog.aureusanalytics.com
https://www.linkedin.com/company/aureus-analytics
