Modeling Challenges for insurance pricing
Xavier Conort
Chief Data Scientist - DataRobot
Tuesday, February 23, 2016
Agenda
● Preamble
● Unfriendly distribution shape of claims cost
● Regulation or operational constraints
● Need to predict the future!
● Claims that are Incurred But Not Reported or Not Enough Reported (IBNR, IBNER)
Automation is an integral part of human civilization
Next generation tools, platforms, approaches to data science
Traditional Approach
● Ivy league approach - only for the chosen ones
● Focused on activities - detached from outcomes
● Assumption based: model selection is based on modeler's understanding of the world?
● Development is costly and limited
● Heavy dependence on programmers
Data Science Approach
● Common man approach - for everyone
● Focused on business outcome
● Validation based: model selected if it predicts well in real world
● Development is crowd sourced, peer reviewed
● Automated solutions are taking care of programming
open source programming | social network of coders | automated solutions
“90% of the data in the world today has been created in the last two years alone”
and some companies have been very successful in using data thanks to data science
better service | newer product | improved operations
SO WHY HAS THE ROBOT NOT REPLACED THE ACTUARY YET?
No Open Data => Slower innovation
Unfriendly distribution shape
● Low Claim frequency
● Claim severity follows a skewed distribution with sometimes a large tail
● And ...
○ Discontinuities caused by policy limits
○ Environmental and operational changes leading to distributions constantly changing over time
○ Heterogeneity of risk within risk pools, caused by fraud and imperfect measures of risk exposure
○ ...
info@DataRobot.com | @DataRobot | DataRobot, INC., CONFIDENTIAL
OUR APPROACH TO WIN 1st PLACE IN:
● High severity (presence of large claims) => Censor large claims
● Low frequency (0.26% of claims event) => Downsample majority events
● Many Features => Get rid of noise
● Let DataRobot explore the best transformations and Machine Learning algorithms for the data
● Experiment with other models in R
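Downsampling the majority (no-claim) rows inflates the event rate the model sees, so probabilities learned on the sample have to be mapped back to the full population. The sketch below is only an illustration of that idea, not the competition code: the function names, the keep rate of 5% and the toy portfolio are assumptions, and only the 0.26% event rate comes from the slide.

```python
import random

def downsample_non_events(rows, keep_rate, seed=0):
    """Keep every claim row; keep each non-claim row with probability keep_rate."""
    rng = random.Random(seed)
    return [r for r in rows if r["has_claim"] or rng.random() < keep_rate]

def correct_probability(p_sampled, keep_rate):
    """Map a probability learned on downsampled data back to the full population.

    Dropping non-events multiplies the odds by 1 / keep_rate, so the
    correction multiplies the odds by keep_rate again.
    """
    return p_sampled * keep_rate / (p_sampled * keep_rate + (1.0 - p_sampled))

# Toy portfolio (made up): 26 claims in 10,000 policies, i.e. a 0.26% event rate.
portfolio = [{"has_claim": i < 26} for i in range(10_000)]
keep_rate = 0.05  # hypothetical choice

sample = downsample_non_events(portfolio, keep_rate)
rate_in_sample = sum(r["has_claim"] for r in sample) / len(sample)
rate_corrected = correct_probability(rate_in_sample, keep_rate)

print("event rate in sample:   ", round(rate_in_sample, 4))
print("event rate after fix-up:", round(rate_corrected, 4))
```

The same correction applies row by row to model predictions; an alternative with the same effect is to keep the sample but give each retained non-event a weight of 1 / keep_rate.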
A more actuarial approach
● Modeling per risk type (bodily injury, damage, 3rd party...)
● Censoring large claims is well accepted, but don’t forget to reallocate the cost of large claims if you use your model for pricing
● Log transformation and downsampling are less common in practice. If you use them, don’t forget to correct for the bias they introduce
● Actuaries will mostly use GLMs (Generalized Linear Models with Poisson, Gamma and Tweedie distributions with a “log” link) and do either:
○ frequency modeling x severity modeling, but be aware that this makes the strong assumption that frequency and severity are independent
○ or cost modeling directly
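As a rough illustration of the censoring and frequency x severity points above, the sketch below caps large claims, spreads the removed excess back proportionally over the capped cost, and multiplies frequency by severity. The proportional loading is just one simple way to reallocate large-claim cost and is an assumption here, as are the function names, the cap and all the figures.

```python
def censor_claims(claims, cap):
    """Cap each claim at `cap`; return the capped claims and the total excess removed."""
    capped = [min(c, cap) for c in claims]
    excess = sum(claims) - sum(capped)
    return capped, excess

def large_claim_loading(claims, cap):
    """Multiplicative factor that spreads the censored excess over all capped cost."""
    capped, excess = censor_claims(claims, cap)
    return 1.0 + excess / sum(capped)

claims = [1_200.0, 800.0, 3_000.0, 250_000.0, 1_500.0]  # made-up claim amounts
exposure_years = 1_000.0                                 # made-up exposure
cap = 50_000.0                                           # hypothetical censoring threshold

capped, excess = censor_claims(claims, cap)
frequency = len(claims) / exposure_years        # claims per policy-year
capped_severity = sum(capped) / len(capped)     # average capped claim
loading = large_claim_loading(claims, cap)

# Pure premium = frequency x severity, then reloaded for the censored excess.
pure_premium = frequency * capped_severity * loading
print(round(pure_premium, 2))
```

With the loading applied, the pure premium equals total uncapped cost divided by exposure, which is a quick sanity check that no large-claim cost was lost in the censoring step.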
Can ML algos support Poisson,
Gamma and Tweedie loss functions?
● Yes! As an example, Regularized GLMs and Gradient Boosting Machines (GBMs) can support any distribution from the exponential family
● But … open source implementations rarely support Gamma and Tweedie loss functions…
● Good news!
○ Poisson loss function is supported by XGBoost, R gbm and R glmnet
○ open source algorithms are open! So you can patch them. In some cases, only a few lines of code are necessary
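To give a feel for why a patch can be only a few lines, here is a minimal sketch of the per-row gradient and Hessian that a second-order boosting library would need for Poisson and Gamma losses under a log link. It is not taken from any library's source: the function names are hypothetical, the Gamma shape parameter is fixed at 1, and the finite-difference check at the end is only there to confirm the algebra.

```python
import math

def poisson_grad_hess(y, f):
    """Poisson negative log-likelihood exp(f) - y*f, differentiated w.r.t. the raw score f."""
    mu = math.exp(f)          # log link: predicted mean = exp(raw score)
    return mu - y, mu

def gamma_loss(y, f):
    """Gamma negative log-likelihood (shape 1, constants dropped): y*exp(-f) + f."""
    return y * math.exp(-f) + f

def gamma_grad_hess(y, f):
    """First and second derivative of gamma_loss w.r.t. the raw score f."""
    ratio = y * math.exp(-f)  # actual / predicted
    return 1.0 - ratio, ratio

# Finite-difference check of the Gamma gradient on one made-up observation.
y, f, eps = 1500.0, 7.0, 1e-6
numeric = (gamma_loss(y, f + eps) - gamma_loss(y, f - eps)) / (2 * eps)
analytic, hessian = gamma_grad_hess(y, f)
print("gradients agree:", abs(numeric - analytic) < 1e-6)
```

Both Hessians are positive whenever the observed value is positive, which is what makes these losses convenient for Newton-style boosting updates.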
Regulation or operational constraints
Regulatory or operational constraints might force you to keep control of the model you build and/or keep it simple
➢ Example of sum insured: the premium should increase monotonically with sum insured, otherwise people will just purchase more cover and pay less...
➢ Many insurance companies use pricing tables
❖ use ML for feature selection and to get insight into non-linear relationships and interactions, then integrate this insight into your GLMs, where you have full control
❖ eliminate undesirable features
❖ patch ML algos to add monotonicity (R gbm already does it!) or interaction constraints
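The monotonicity requirement can be illustrated without touching a GBM at all. The sketch below uses a different and simpler technique than the constrained boosting mentioned above: a pool-adjacent-violators pass (isotonic regression) applied after the fact to a premium curve ordered by sum insured band. The function name and the premium figures are made up.

```python
def isotonic_increasing(values, weights=None):
    """Pool-adjacent-violators: closest non-decreasing sequence in weighted least squares."""
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block: [weighted mean, total weight, number of points]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backwards while the newest block is below the one before it.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted

# Made-up model premiums, ordered by increasing sum insured band.
raw_premium = [100.0, 120.0, 115.0, 140.0, 135.0, 160.0]
monotone_premium = isotonic_increasing(raw_premium)
print(monotone_premium)
```

Post-hoc smoothing like this fixes one rating factor at a time; constraining the learner itself, as the slide suggests, keeps the constraint consistent across interactions with other features.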
Predict the future
[Figure: actual risk vs. decision tree prediction as a function of sum insured, over history and new values]
Models in general are not very good at predicting the future
Two things to keep in mind when you use GBMs or RFs, or bin continuous variables:
● decision trees do not extrapolate to new values. They won't predict higher claim sizes for sums insured higher than those seen in history
● Machines can be naive and lazy
○ see the two examples on the next slide
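The extrapolation point can be reproduced with a toy piecewise-constant model, which is what a single tree (or a binned variable) amounts to. Everything in this sketch is made up for illustration: the split points, the 1% relationship between claim size and sum insured, and the function names.

```python
from bisect import bisect_right

def fit_binned_means(x, y, edges):
    """Piecewise-constant model: mean of y within each bin of x, like a tree leaf."""
    sums = [0.0] * (len(edges) + 1)
    counts = [0] * (len(edges) + 1)
    for xi, yi in zip(x, y):
        b = bisect_right(edges, xi)
        sums[b] += yi
        counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

def predict_binned(means, edges, xi):
    """Look up the mean of the bin that xi falls into."""
    return means[bisect_right(edges, xi)]

# Made-up history: claim size is 1% of sum insured, observed only up to 400k.
sum_insured = [100_000, 150_000, 200_000, 250_000, 300_000, 350_000, 400_000]
claim_size = [0.01 * s for s in sum_insured]
edges = [175_000, 275_000]  # hypothetical split points a tree might choose

means = fit_binned_means(sum_insured, claim_size, edges)
seen = predict_binned(means, edges, 400_000)
unseen = predict_binned(means, edges, 1_000_000)  # far beyond history

print("same prediction for 400k and 1M:", seen == unseen)
print("true cost at 1M under the toy rule:", 0.01 * 1_000_000)
```

A GLM with sum insured as a continuous term would keep increasing beyond 400k, which is one reason to carry ML insight back into a model form you control.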
Naive Machine
● One example is GE Flight Quest. To win the competition, my I2R colleagues and I censored the airport names to force the Gradient Boosting Machine
○ to learn the reasons for the delays
○ and not to learn that one airport never had delays in the past 3 months and conclude that it will never have delays in the future
● A real-life example of this in insurance is when an actuary uses the policy number as one of the features to predict the claims cost. A naive predictive model could use the policy number as a proxy for both inflationary effects and tenure effects.
Backtesting
To get good prediction models, you need to be fully aware of the model's limitations and of effects that change over time.
To get this insight, backtesting is an important step.
Backtesting is the process of applying an analytical method to historical data to see how accurately the method would have predicted actual results.
It should help you uncover underperformance due to poor modeling or environmental and operational changes.
Machine Learning can be used to automate this experience analysis. And this time, there are no regulatory or operational constraints!
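A walk-forward backtest can be sketched in a few lines: fit on every year before the test year, predict the test year, and track the actual-to-expected ratio. The data, the deliberately naive "historical mean" model and the function names below are all made up for illustration.

```python
def backtest(history, fit, predict):
    """Walk forward: fit on all years before each test year, then compare with the actual."""
    years = sorted(history)
    results = []
    for i in range(1, len(years)):
        train = {y: history[y] for y in years[:i]}
        model = fit(train)
        expected = predict(model, years[i])
        actual = history[years[i]]
        results.append((years[i], actual / expected))
    return results

def fit_mean(train):
    """Naive model: the historical average, ignoring any trend."""
    return sum(train.values()) / len(train)

def predict_mean(model, year):
    """The naive model predicts the same value for every year."""
    return model

# Made-up average claim cost per policy by accident year, with inflation from 2014 on.
history = {2011: 100.0, 2012: 101.0, 2013: 99.0, 2014: 110.0, 2015: 121.0}

results = backtest(history, fit_mean, predict_mean)
for year, ratio in results:
    print(year, "actual / expected =", round(ratio, 3))
```

A ratio that drifts steadily away from 1 in the later years is the kind of signal that points to an environmental change (here, inflation) rather than to random noise.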
IBNR and IBNER
Long claim developments make the modeling process harder
● It can take a long time to know that an event happened and even longer to know its final cost. The life cycle of a claim typically includes event occurrence, reporting, initial estimation, case estimate review, payment, recoveries... and the reported losses are usually uncertain
● Risks with such long claim developments are called long-tail risks. Personal injury compensation schemes (workers’ compensation and motor accident insurance) are typically long-tail risks.
Advice:
● first focus on short-tail risks
● recruit an experienced actuary
● or, more exciting, estimate IBNER on individual claims using machine learning
Key takeaways
● Modeling insurance pricing is not easy
● Machine Learning can definitely help in the modeling process
● Open source solutions are not designed for insurance pricing and
innovation will be slower as insurance is not an open world
● Plenty of other exciting modeling projects in insurance:
○ Scan through millions of potential consumers to choose the right few
○ Predict lapses and build price elasticity models
○ Select the top 10% of your selected risks to manually review / inspect further
○ Identify the claimants most likely to be fraudulent and review them manually
○ Text mine the beneficiary clause of life insurance contracts
○ Estimate which claims are likely to become problem claims that require special attention
○ Decide who should handle a claim, or whether to pay it without further checks
○ Geographic features / spatial smoothing
○ Detect changes in mix of business
○ ...
Automation is inevitable
The Economist, May 2015
Harvard Business Review, June 2015
Modeling Challenges for insurance pricing

  • 1. Modeling Challenges for insurance pricing Xavier Conort Chief Data Scientist - DataRobot Tuesday, February 23, 2016 @
  • 2. Agenda ● Preamble ● Unfriendly distribution shape of claims cost ● Regulation or operational constraints ● Need to predict the future! ● Claims that are Incurred But Not Reported or Not Enough Reported (IBNR, IBNER)
  • 3. Automation is an integral part of human civilization
  • 4. Next generation tools, platforms, approaches to data science. Traditional Approach: ● Ivy league approach - only for the chosen ones ● Focused on activities - detached from outcomes ● Assumption based: model selection is based on the modeler's understanding of the world? ● Development is costly and limited ● Heavy dependence on programmers. Data Science Approach: ● Common man approach - for everyone ● Focused on business outcome ● Validation based: a model is selected if it predicts well in the real world ● Development is crowd sourced, peer reviewed ● Automated solutions are taking care of programming (open source programming, social network of coders, automated solutions)
  • 5. “90% of the data in the world today has been created in the last two years alone” and some companies have been very successful in using data thanks to data science: better service, newer product, improved operations
  • 6. SO WHY HAS THE ROBOT NOT REPLACED THE ACTUARY YET?
  • 7. No Open Data => Slower innovation
  • 8. Unfriendly distribution shape ● Low Claim frequency ● Claim severity follows a skewed distribution with sometimes a large tail ● And ... ○ Discontinuities caused by policy limits ○ Environmental and operational changes leading to distributions constantly changing over time ○ Heterogeneity of risk within risk pools, caused by fraud and imperfect measures of risk exposure ○ ...
  • 9. OUR APPROACH TO WIN 1st PLACE IN: ● High severity (presence of large claims): censor large claims ● Low frequency (0.26% of claim events): downsample majority events ● Many features: get rid of noise ● Let DataRobot explore the best transformations and Machine Learning algorithms for the data ● Experiment with other models in R
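A minimal sketch of the two data-preparation steps named on this slide, censoring large claims and downsampling the majority (claim-free) records. The function names, the toy portfolio, the 50,000 cap and the 10% keep rate are all assumptions made for illustration; they are not taken from the competition solution.

```python
import random

def censor_claims(claims, cap):
    """Cap each claim at `cap`; return the capped claims and the total excess removed."""
    capped = [min(c, cap) for c in claims]
    excess = sum(c - k for c, k in zip(claims, capped))
    return capped, excess

def downsample_majority(records, keep_rate, seed=0):
    """Keep every record with a claim and each claim-free record with
    probability `keep_rate`. Returns (record, weight) pairs; the weight
    1 / keep_rate lets a weighted fit still represent the full portfolio."""
    rng = random.Random(seed)
    sample = []
    for rec in records:
        if rec["claim"] > 0:
            sample.append((rec, 1.0))
        elif rng.random() < keep_rate:
            sample.append((rec, 1.0 / keep_rate))
    return sample

# Hypothetical toy portfolio: 1,000 policies, 3 with a claim, one of them very large.
portfolio = [{"id": i, "claim": 0.0} for i in range(1000)]
portfolio[10]["claim"] = 2_000.0
portfolio[500]["claim"] = 5_000.0
portfolio[900]["claim"] = 250_000.0

capped, excess = censor_claims([p["claim"] for p in portfolio], cap=50_000.0)
sample = downsample_majority(portfolio, keep_rate=0.1)
```

The excess returned by `censor_claims` is kept on purpose: it is the amount that still has to be charged somewhere if the model is later used for pricing.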
  • 10. A more actuarial approach ● Modeling per risk type (bodily injury, damage, 3rd party...) ● Censoring large claims is well accepted, but don’t forget to reallocate the cost of large claims if you use your model for pricing ● Log transformation and downsampling are less common practice. If you use them, don’t forget to correct the bias they introduce ● Actuaries will mostly use GLMs (Generalized Linear Models with Poisson, Gamma and Tweedie distributions and a “log” link) and do either: ○ frequency modeling x severity modeling, but be aware that this makes the strong assumption that frequency and severity are independent ○ or cost modeling directly
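The adjustments mentioned above can be sketched as follows. This is an illustration under stated assumptions, not a prescribed actuarial method: the downsampling correction assumes all claim records were kept and claim-free records were kept with a known probability `keep_rate`, and the large-claim reallocation is done with a single flat loading, which is only one of several possible choices. All function names are hypothetical.

```python
def correct_downsampled_probability(p_sampled, keep_rate):
    """Map a claim probability estimated on downsampled data back to the
    full-portfolio scale (all claims kept, claim-free records kept with
    probability `keep_rate`)."""
    return keep_rate * p_sampled / (keep_rate * p_sampled + (1.0 - p_sampled))

def large_claim_loading(total_capped_cost, total_excess_cost):
    """Flat multiplicative loading that spreads the censored excess over all
    policies in proportion to their capped expected cost."""
    return 1.0 + total_excess_cost / total_capped_cost

def pure_premium(frequency, severity, loading=1.0):
    """Frequency x severity pure premium; assumes the two are independent."""
    return frequency * severity * loading

# A true claim probability of 0.26% looks like about 2.5% after keeping 10% of non-claims.
true_p = 0.0026
keep_rate = 0.1
p_sampled = true_p / (true_p + keep_rate * (1.0 - true_p))
recovered_p = correct_downsampled_probability(p_sampled, keep_rate)

loading = large_claim_loading(total_capped_cost=1_000_000.0, total_excess_cost=150_000.0)
premium = pure_premium(frequency=0.05, severity=4_000.0, loading=loading)
```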
  • 11. Can ML algos support Poisson, Gamma and Tweedie loss functions? ● Yes! As an example, Regularized GLMs and Gradient Boosting Machines (GBMs) can support any distribution of the exponential family ● But … open source implementations rarely support Gamma and Tweedie loss functions… ● Good news! ○ Poisson loss function is supported by XGBoost, R gbm and R glmnet ○ open source algorithms are open! So you can patch them. In some cases, only a few lines of code are necessary
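To give a sense of how little code such a loss function needs, here is a sketch of the Poisson, Gamma and Tweedie unit deviances, plus the gradient and Hessian of the Tweedie negative log-likelihood under a log link, which are the quantities a gradient boosting implementation typically asks of a custom objective. The function names are illustrative, and the Tweedie power `p = 1.5` is an assumed default (valid range here: 1 < p < 2).

```python
import math

def poisson_deviance(y, mu):
    """Unit Poisson deviance; y * log(y / mu) is taken as 0 when y == 0."""
    term = y * math.log(y / mu) if y > 0 else 0.0
    return 2.0 * (term - (y - mu))

def gamma_deviance(y, mu):
    """Unit Gamma deviance; requires y > 0 and mu > 0."""
    return 2.0 * ((y - mu) / mu - math.log(y / mu))

def tweedie_deviance(y, mu, p=1.5):
    """Unit Tweedie deviance for 1 < p < 2, y >= 0 and mu > 0."""
    return 2.0 * (
        y ** (2 - p) / ((1 - p) * (2 - p))
        - y * mu ** (1 - p) / (1 - p)
        + mu ** (2 - p) / (2 - p)
    )

def tweedie_grad_hess(y, f, p=1.5):
    """Gradient and Hessian of the Tweedie negative log-likelihood with
    respect to the raw score f = log(mu) (log link)."""
    a = math.exp((1 - p) * f)
    b = math.exp((2 - p) * f)
    grad = -y * a + b
    hess = -(1 - p) * y * a + (2 - p) * b
    return grad, hess
```

Each deviance is zero when the prediction equals the observation, and the Tweedie gradient vanishes at `f = log(y)`, which is a quick sanity check before plugging such a function into a library.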
  • 12. Regulation or operational constraints: these might force you to keep control of the model built and/or keep it simple ➢ Example of sum insured: the premium should be monotonically increasing with sum insured, otherwise people will just purchase more cover and pay less... ➢ Many insurance companies use pricing tables ❖ use ML for feature selection and to get insight into non-linear relationships and interactions, then integrate this insight into your GLMs, where you have full control ❖ eliminate undesirable features ❖ patch ML algos to add monotonicity (R gbm already does it!) or interaction constraints
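One simple way to illustrate the monotonicity requirement on a pricing table is isotonic regression with the pool-adjacent-violators algorithm, applied after the fact to premiums ordered by sum insured. Note that this is a swapped-in technique chosen for brevity: it is not how R gbm enforces monotonicity inside the trees, and the band premiums below are made up.

```python
def pava_increasing(values, weights=None):
    """Pool-adjacent-violators: closest non-decreasing sequence to `values`
    in the weighted least-squares sense."""
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block is [mean, total weight, number of points]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while the previous block is higher than the last one.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    result = []
    for mean, _, count in blocks:
        result.extend([mean] * count)
    return result

# Hypothetical model premiums by increasing sum insured band; the third band dips.
raw_premiums = [100.0, 120.0, 110.0, 150.0]
monotone_premiums = pava_increasing(raw_premiums)
```

Passing exposure as `weights` makes bands with more policies pull the pooled value toward their own premium.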
  • 13. Predict the future [Figure labels: sum insured, history, new values, decision tree prediction, actual risk] Models in general are not very good at predicting the future. 2 things to keep in mind when you use GBMs or RFs, or bin continuous variables: ● decision trees don’t extrapolate to new values. They won't predict higher claim sizes for sums insured higher than those seen in history ● The machine can be naive and lazy ○ see 2 examples in next slide
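The extrapolation point can be checked with a toy example. The sketch below fits a tiny one-dimensional regression tree written from scratch (not a real GBM or RF library) on made-up data where claim cost is exactly 1% of sum insured, then predicts for sums insured far above the historical range.

```python
def _sse(ys):
    """Sum of squared errors around the mean."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def fit_tree(xs, ys, depth=3):
    """Fit a tiny 1-D regression tree by minimizing squared error."""
    mean = sum(ys) / len(ys)
    if depth == 0 or len(set(xs)) < 2:
        return ("leaf", mean)
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        sse = _sse(left) + _sse(right)
        if best is None or sse < best[0]:
            best = (sse, i)
    i = best[1]
    threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    left_x, left_y = zip(*pairs[:i])
    right_x, right_y = zip(*pairs[i:])
    return ("split", threshold,
            fit_tree(left_x, left_y, depth - 1),
            fit_tree(right_x, right_y, depth - 1))

def predict(tree, x):
    """Walk down the tree until a leaf is reached."""
    while tree[0] == "split":
        tree = tree[2] if x <= tree[1] else tree[3]
    return tree[1]

# Hypothetical history: sums insured from 100 to 1,000, cost = 1% of sum insured.
sums_insured = list(range(100, 1001, 100))
costs = [s / 100 for s in sums_insured]
tree = fit_tree(sums_insured, costs)

pred_2000 = predict(tree, 2000)  # outside history
pred_5000 = predict(tree, 5000)  # even further outside history
```

Both out-of-range predictions fall into the rightmost leaf, so they are identical and never exceed the largest cost seen in history, while the true cost under the assumed 1% rule would be 20 and 50.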
  • 14. Naive Machine ● One example is GE Flight Quest. To win the competition, my I2R colleagues and I masked the airport names to force the Gradient Boosting Machine ○ to learn the reasons for the delays ○ and not to learn that one airport never had delays in the past 3 months and conclude it will never have delays in the future ● A real-life example of this in insurance is when an actuary uses the policy number as one of the features to predict the claims cost. A naive predictive model could use the policy number as a proxy for both inflationary effects and tenure effects.
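One possible guard against the policy-number trap is to flag features that are almost perfectly rank-correlated with time before modeling. The sketch below is an assumption-laden illustration, not the method used in the competition: the helper names, the 0.95 threshold and the toy data are all hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(a, b):
    """Spearman rank correlation; returns 0.0 if either input is constant."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    if va == 0 or vb == 0:
        return 0.0
    return cov / (va * vb) ** 0.5

def time_proxies(features, time_values, threshold=0.95):
    """Names of features whose rank correlation with time is suspiciously high."""
    return [name for name, vals in features.items()
            if abs(spearman(vals, time_values)) >= threshold]

# Hypothetical data: policy numbers are issued sequentially, vehicle age is not.
inception_year = [2010, 2010, 2011, 2011, 2012, 2012, 2013, 2013]
features = {
    "policy_number": [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008],
    "vehicle_age": [3, 9, 1, 7, 2, 8, 4, 6],
}
flagged = time_proxies(features, inception_year)
```

A flagged feature is a candidate for removal or masking, so the model has to learn inflation and tenure effects from features that describe them directly.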
  • 15. Backtesting To get good prediction models, you need to be fully aware of the model's limitations and of effects that change over time. Backtesting is an important step toward this insight. Backtesting is the process of applying an analytical method to historical data to see how accurately the method would have predicted actual results. It should help you uncover underperformance due to poor modeling or to environmental and operational changes. Machine Learning can be used to automate this experience analysis. And this time, there are no regulatory or operational constraints!
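A backtest can be sketched as a rolling-origin loop: for each period, fit on earlier periods only, predict the period, and compare actual to expected. The model here is deliberately naive (a flat burning cost per unit of exposure), and the yearly figures are invented to include claims inflation; the function names are illustrative.

```python
def backtest(history, fit, predict, min_train_periods=2):
    """history: list of (period, exposure, claim_cost) sorted by period.
    Returns (period, actual / expected) for each period that has at least
    `min_train_periods` earlier periods to train on."""
    results = []
    for i in range(min_train_periods, len(history)):
        model = fit(history[:i])  # only data available before period i
        period, exposure, actual = history[i]
        expected = predict(model, exposure)
        results.append((period, actual / expected))
    return results

def fit_burning_cost(train):
    """Average claim cost per unit of exposure over the training periods."""
    return sum(cost for _, _, cost in train) / sum(exp for _, exp, _ in train)

def predict_burning_cost(rate, exposure):
    return rate * exposure

# Hypothetical experience with claims inflation starting in 2013.
history = [
    (2011, 1000, 100_000.0),
    (2012, 1000, 100_000.0),
    (2013, 1000, 110_000.0),
    (2014, 1000, 121_000.0),
]
ae_ratios = backtest(history, fit_burning_cost, predict_burning_cost)
```

Actual/expected ratios that stay above 1 and keep growing, as they do on this toy data, are the kind of signal that points to an effect changing over time rather than to random noise.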
  • 16. IBNR and IBNER Long claim developments make the modeling process harder ● It can take a long time to know that an event happened and even longer to know its final cost. The life cycle of a claim typically includes event occurrence, reporting, initial estimation, case estimate review, payment, recoveries... and the reported losses are usually uncertain ● Risks with such long claim developments are called long-tail risks. Personal injury compensation schemes (workers’ compensation and motor accident insurance) are typically long-tail risks. Advice: ● first focus on short-tail risks ● recruit an experienced actuary ● or, more exciting, estimate IBNER on individual claims using machine learning
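For context on what claim development means in numbers, here is a sketch of the classical aggregate chain-ladder method on a small triangle of cumulative reported claims. This is the traditional actuarial baseline, not the individual-claim machine learning approach suggested above, and the triangle values are made up.

```python
def chain_ladder(triangle):
    """triangle[i][j]: cumulative reported claims for accident year i at
    development year j; later accident years have fewer entries.
    Returns (development factors, ultimates, reserves)."""
    n = len(triangle)
    factors = []
    for j in range(n - 1):
        # Use only accident years observed at both development j and j + 1.
        rows = [row for row in triangle if len(row) > j + 1]
        factors.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))
    ultimates = []
    for row in triangle:
        ultimate = row[-1]
        for j in range(len(row) - 1, n - 1):
            ultimate *= factors[j]  # project the remaining development
        ultimates.append(ultimate)
    reserves = [u - row[-1] for u, row in zip(ultimates, triangle)]
    return factors, ultimates, reserves

# Hypothetical cumulative reported claims for three accident years.
triangle = [
    [100.0, 150.0, 165.0],
    [110.0, 165.0],
    [120.0],
]
factors, ultimates, reserves = chain_ladder(triangle)
```

On a reported-claims triangle, each reserve is the estimated amount still to emerge for that accident year, covering both claims not yet reported and further development on known claims.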
  • 17. Key takeaways ● Modeling insurance pricing is not easy ● Machine Learning can definitely help in the modeling process ● Open source solutions are not designed for insurance pricing, and innovation will be slower as insurance is not an open world ● Plenty of other exciting modeling projects in insurance: ○ Scan through millions of potential consumers to choose the right few ○ Predict lapses and build price elasticity models ○ Select the top 10% of your selected risks to manually review / inspect further ○ Identify claimants with the highest likelihood of being fraudulent and review them manually ○ Text mine the beneficiary clause of life insurance contracts ○ Estimate which claims are likely to become problem claims that require special attention ○ Decide who should handle a claim or whether to pay it without further checks ○ Geographic features / spatial smoothing ○ Detect changes in mix of business ○ ….
  • 18. Automation is inevitable (The Economist, May 2015; Harvard Business Review, June 2015)