Machine learning and artificial intelligence have great potential to help insurance companies improve internal processes such as underwriting, cross-selling, and fraud and claims prevention, all of which directly affect the consumer experience.
In this presentation, Dr. Karnik shares his experience building models that predict early claims and potentially fraudulent claims, drawn from work with insurance carriers of all sizes.
Using Machine Learning to Find a needle in a haystack Aureus Analytics
1. Copyright 2017 RESTRICTED CIRCULATION
USING ML TO FIND A NEEDLE IN A HAYSTACK
Dr. Nilesh Karnik 2nd Feb 2018
2. About 12-15% of claims made in 2016-17 could be fraudulent.*
POTENTIAL LOSSES
LIFE: ~INR 2078 CR
NON-LIFE: ~INR 12100 CR
* IRDAI Reports
3. Using ML and AI to fight Fraud
PATTERN IDENTIFICATION
CONNECT IDENTITIES
UNDERSTAND UNSTRUCTURED
BUILD NETWORK
4. Proportions of early claims or fraud are typically of the order of 1% or less of the policy portfolio*
* Based on Aureus Customer Engagements
5. Machine Learning
Using data to understand an underlying process
[Diagram: X → Process → Y; a machine learning algorithm yields a model that approximates the underlying process]
6. Prediction is difficult, especially if it is about the future.
[Timeline: PAST, PRESENT, FUTURE]
Use knowledge gathered from the past to make an inference about the future
7. Why use a predictive model?
A predictive model lets us distinguish between different subsets of
policies on the basis of the probability of the predicted event.
Randomly ordered collection of policies
Policy bucket | Number of policies issued | Number of policies that will pay their 13th month premium | % of policies paying
Top 10% | 1000 | 600 | 60%
10-20% | 1000 | 600 | 60%
20-30% | 1000 | 600 | 60%
30-40% | 1000 | 600 | 60%
40-50% | 1000 | 600 | 60%
50-60% | 1000 | 600 | 60%
60-70% | 1000 | 600 | 60%
70-80% | 1000 | 600 | 60%
80-90% | 1000 | 600 | 60%
Bottom 10% | 1000 | 600 | 60%
Total | 10000 | 6000 | 60%
Ordered using scores from a predictive model
Policy bucket | Number of policies issued | Number of policies that will pay their 13th month premium | % of policies paying
Top 10% | 1000 | 990 | 99%
10-20% | 1000 | 880 | 88%
20-30% | 1000 | 750 | 75%
30-40% | 1000 | 700 | 70%
40-50% | 1000 | 650 | 65%
50-60% | 1000 | 630 | 63%
60-70% | 1000 | 620 | 62%
70-80% | 1000 | 480 | 48%
80-90% | 1000 | 250 | 25%
Bottom 10% | 1000 | 50 | 5%
Total | 10000 | 6000 | 60%
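The contrast between the two tables on this slide can be reproduced with a small sketch. It is not the speaker's code: the portfolio is synthetic, the "model" score is simply each policy's hidden payment probability, and the name `decile_table` is a hypothetical helper. The synthetic average payment rate is about 50%, not the 60% used on the slide.

```python
import random


def decile_table(scores, outcomes, n_buckets=10):
    """Rank policies by score (highest first) and summarise each bucket.

    Returns one tuple per bucket: (bucket number, policies, payers, payer rate).
    """
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    size = len(ranked) // n_buckets
    rows = []
    for b in range(n_buckets):
        start = b * size
        # The last bucket absorbs any remainder when the split is uneven.
        end = (b + 1) * size if b < n_buckets - 1 else len(ranked)
        bucket = ranked[start:end]
        payers = sum(outcome for _, outcome in bucket)
        rows.append((b + 1, len(bucket), payers, payers / len(bucket)))
    return rows


# Synthetic portfolio (assumption): each policy has a hidden payment
# probability, and the "model" score is that probability itself.
random.seed(0)
true_prob = [random.random() for _ in range(10_000)]
outcomes = [1 if random.random() < p else 0 for p in true_prob]
random_scores = [random.random() for _ in true_prob]

model_rows = decile_table(true_prob, outcomes)
random_rows = decile_table(random_scores, outcomes)

for (bucket, _, _, rate_model), (_, _, _, rate_rand) in zip(model_rows, random_rows):
    print(f"bucket {bucket:2d}: model order {rate_model:6.1%} | random order {rate_rand:6.1%}")
```

Both orderings contain the same total number of payers. Only the ranking changes, which is why the randomly ordered buckets all sit near the portfolio average while the score-ordered buckets separate.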
8. The life-cycle of a machine learning model
Problem definition
Data availability
Training & Testing
Business strategy to act on model output
Implement
Feedback
Periodic Retraining
9. A Needle in the Haystack
• Unbalanced problems can be troublesome
• Additional effort required in the training phase
• Algorithm choices may become more important
• Testing across different time periods is crucial
• Percentages become deceptive
• Accuracy metrics become deceptive
• Customers expect more!
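Two of the points on this slide, extra effort in training and testing across time periods, might look like the following in practice. This is a minimal sketch under stated assumptions, not the approach used in the engagements described: the records are synthetic, the field names `issue_year` and `early_claim` are hypothetical, and inverse-frequency class weighting is only one common way to handle unbalanced training data.

```python
from collections import Counter


def out_of_time_split(records, cutoff_year):
    """Train on policies issued up to cutoff_year, test on later ones."""
    train = [r for r in records if r["issue_year"] <= cutoff_year]
    test = [r for r in records if r["issue_year"] > cutoff_year]
    return train, test


def event_rate(records):
    """Fraction of records with an early claim."""
    return sum(r["early_claim"] for r in records) / len(records)


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


# Synthetic records (assumption): 1,000 policies per issue year,
# with a handful of early claims in each year.
claims_per_year = {2014: 5, 2015: 6, 2016: 4, 2017: 7}
records = [
    {"issue_year": year, "early_claim": 1 if i < claims else 0}
    for year, claims in claims_per_year.items()
    for i in range(1000)
]

train, test = out_of_time_split(records, cutoff_year=2016)
weights = balanced_class_weights([r["early_claim"] for r in train])

print(f"train: {len(train)} policies, early-claim rate {event_rate(train):.2%}")
print(f"test:  {len(test)} policies, early-claim rate {event_rate(test):.2%}")
print(f"class weights on the training set: {weights}")
```

Holding out the most recent period, rather than a random sample, checks whether a model trained on older policies still ranks newer ones sensibly.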
10. A needle in the haystack: measuring model accuracy
Can you imagine a model that is 99.5% accurate, but completely unusable?
Imagine a policy portfolio which historically has a 0.5% risk of early claims.
A predictive model which predicts that this portfolio will get no early claims will go wrong only in 0.5% of cases.
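The point of this slide can be checked with a few lines of arithmetic. The sketch below assumes a hypothetical portfolio of 10,000 policies with 50 early claims (0.5%) and a "model" that always predicts no early claim. The helper names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Share of policies where the prediction matches the outcome."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Share of actual early claims that the model flagged."""
    positives = sum(y_true)
    true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return true_positives / positives


# Hypothetical portfolio: 50 early claims among 10,000 policies (0.5%).
y_true = [1] * 50 + [0] * 9950
# A "model" that predicts no early claim for every policy.
y_pred = [0] * len(y_true)

print(f"accuracy = {accuracy(y_true, y_pred):.1%}")
print(f"recall   = {recall(y_true, y_pred):.1%}")
```

Accuracy comes out at 99.5% while recall is 0%: the model never identifies a single early claim, which is what makes it unusable.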
11. “Lift” provided by a predictive model: Scenario 1
• A policy portfolio with a 60% average probability of payment
• A payment propensity model identifies a green subset of size 25% with 90% payment probability and a red subset of size 20% with 12% payment probability
Segment | Share of portfolio | Payment probability | Lift
Whole portfolio | 100% | 60% |
Green subset | 25% | 90% | 1.5x of the average probability
Rest of portfolio | 55% | 64% |
Red subset | 20% | 12% | 1/5 of the average probability
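The figures in Scenario 1 are mutually consistent, which a short calculation can confirm. In this sketch, lift is taken to mean the segment probability divided by the portfolio average, and the 64% figure is read as the payment probability of the remaining 55% of the portfolio. Both readings are assumptions based on the numbers shown, and the function names are illustrative.

```python
def lift(segment_rate, average_rate):
    """How many times the portfolio average a segment's rate is."""
    return segment_rate / average_rate


def implied_remainder_rate(average_rate, segments):
    """Rate in the part of the portfolio not covered by the given segments.

    segments is a list of (share_of_portfolio, rate) pairs.
    """
    covered_share = sum(share for share, _ in segments)
    covered_events = sum(share * rate for share, rate in segments)
    return (average_rate - covered_events) / (1 - covered_share)


average = 0.60
green = (0.25, 0.90)  # (share of portfolio, payment probability)
red = (0.20, 0.12)

print(f"green lift: {lift(green[1], average):.2f}x")
print(f"red lift:   {lift(red[1], average):.2f}x")
print(f"rest of portfolio: {implied_remainder_rate(average, [green, red]):.1%}")
```

The remainder works out to roughly 64%, matching the slide, because the three segments have to average back to the 60% portfolio rate.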
12. “Lift” provided by a predictive model: Scenario 2
• A proposal portfolio with a 3 in 1,000 average probability of claim fraud
• A predictive model identifies a red subset of size 25% with 4.5 in 1,000 probability of claim fraud and a green subset of size 20% with 6 in 10,000 probability of claim fraud
Segment | Share of portfolio | Probability of claim fraud
Whole portfolio | 100% | 3 in 1,000
Red subset | 25% | 4.5 in 1,000
Rest of portfolio | 55% | 3.2 in 1,000
Green subset | 20% | 6 in 10,000
Although the lift provided by this model is similar to the previous scenario, it still needs improvement, because the business team still needs to sift through a quarter of the portfolio to eliminate about 38% of the risk.
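The "about 38% of risk" figure follows from the subset size and its fraud probability. The sketch below shows the arithmetic. The portfolio size of 100,000 proposals is a made-up number used only to turn probabilities into expected counts, and the function names are illustrative.

```python
def share_of_risk_captured(subset_share, subset_rate, average_rate):
    """Fraction of all expected events that fall inside the subset."""
    return subset_share * subset_rate / average_rate


def expected_cases(n_proposals, share, rate):
    """Expected number of events in a subset of the portfolio."""
    return n_proposals * share * rate


n_proposals = 100_000      # hypothetical portfolio size
average_rate = 3 / 1000    # portfolio-wide probability of claim fraud
red_share, red_rate = 0.25, 4.5 / 1000

captured = share_of_risk_captured(red_share, red_rate, average_rate)
red_cases = expected_cases(n_proposals, red_share, red_rate)
all_cases = expected_cases(n_proposals, 1.0, average_rate)

print(f"red subset holds {captured:.1%} of expected fraud")
print(f"{red_cases:.1f} of {all_cases:.0f} expected cases "
      f"sit in {n_proposals * red_share:,.0f} proposals")
```

With a 1.5x lift on a very rare event, a quarter of the portfolio still has to be reviewed to reach a little over a third of the expected fraud, which is the reason the slide calls for further improvement.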
13. Example of results with customer portfolios: Early claims - 1
• A proposal portfolio with a 7 in 1,000 average probability of early claims
• Prediction at proposal submission
Segment | Share of portfolio | Probability of early claim | Comment
Whole portfolio | 100% | 7 in 1,000 |
High-risk subset | 4.2% | 8.6% | 12x of the average probability. Captures nearly half of the risk in a small set of less than 5% of the portfolio size
Low-risk subset | 31.5% | 7 in 10,000 | 1/10 of the average probability
14. Example of results with customer portfolios: Early claims - 2
• A proposal portfolio with a 4.4 in 1,000 average probability of early claims
• Prediction at proposal submission
Segment | Share of portfolio | Probability of early claim | Comment
Whole portfolio | 100% | 4.4 in 1,000 |
High-risk subset | 5.4% | 2.5% | 5.6x of the average probability. Captures nearly 1/3 of the risk in a small set of about 5% of the portfolio size
Low-risk subset | 14% | 6 in 10,000 | 1/7 of the average probability
15. Example of results with customer portfolios: Fraudulent claims - 1
• A portfolio with a 0.9% average risk of fraud for early claims
• Prediction at claim notification
Segment | Share of early claims | Risk of fraud | Comment
All early claims | 100% | 0.9% |
High-risk subset | 10% | 7.2% | 8x of the average probability. Captures more than 80% of the risk in a small subset of about 10% of all early claims
Low-risk subset | 90% | 0.16% | 1/5 of the average probability
16. Example of results with customer portfolios: Fraudulent claims - 2
• A portfolio with a 3.6% average risk of fraud for early claims
• Prediction at claim notification
Segment | Share of early claims | Risk of fraud | Comment
All early claims | 100% | 3.6% |
High-risk subset | 6% | 30% | 8.3x of the average probability. Captures more than half of the risk in a small subset of size 6% of all early claims
Low-risk subset | 50% | 0.56% | 1/6 of the average probability
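One way to read the two fraudulent-claims examples in operational terms is to ask how many claims an investigator would expect to review per fraudulent claim found. This framing is an interpretation of the slide figures, not something stated in the deck, and `reviews_per_hit` is a hypothetical helper that takes the reciprocal of the fraud rate.

```python
def reviews_per_hit(fraud_rate):
    """Expected number of claims reviewed per fraudulent claim found."""
    return 1.0 / fraud_rate


# Rates taken from the two fraudulent-claims slides:
# portfolio average versus the flagged high-risk subset.
examples = {
    "Fraudulent claims - 1": {"average": 0.009, "flagged": 0.072},
    "Fraudulent claims - 2": {"average": 0.036, "flagged": 0.30},
}

results = {}
for name, rates in examples.items():
    baseline = reviews_per_hit(rates["average"])
    targeted = reviews_per_hit(rates["flagged"])
    results[name] = (baseline, targeted)
    print(f"{name}: about {baseline:.0f} reviews per fraud without the model, "
          f"about {targeted:.1f} within the flagged subset")
```

The ratio between the two numbers in each example equals the lift quoted on the corresponding slide.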
17. Benefits
• Investigate high risk proposals to reduce risk of early claims
• Optimize claim investigation efforts
• Quicker issuance for low risk proposals
• Reduce settlement TAT for low risk claims
• Improve customer experience
18. Companies Which Use ML
CUSTOMER EXPERIENCE
CUSTOMER BEHAVIOUR
FRAUD MITIGATION
BUYING BEHAVIOUR
TRAVEL SAFETY
VIEWING PREDICTION
Source : Secondary Research
19. Make Customer Experience your Differentiator.
www.aureusanalytics.com
@AureusAnalytics
info@aureusanalytics.com
blog.aureusanalytics.com
https://www.linkedin.com/company/aureus-analytics