© 2016 LigaData, Inc. All Rights Reserved.
Production Model
Lifecycle Management
Presented: Tue, Sept 20, 2016
greg@ligadata.com www.Ligadata.org www.Kamanja.org
Contents
Develop a Robust Solution (or get fired)
Selecting the Best Model w/ Model Notebook
Describing the Model
Putting a Model in Production
Model Drift over Time (Non-Stationary)
Retrain or Refresh the Model
Kamanja Open Source PMML Scoring Platform
[Figure: a model with three attributes: Accurate, General, Understandable. Can you have all 3 model attributes?]
Develop a Robust Solution (or get fired)
Epsilon (then owned by American Express)
ACG’s first neural network (1992); the Analytic Consulting Group had ~40 quants
Score 250 million households every month, pick the best 5 million
The neural net built by a previous consultant
• did great “in the lab”
• did “reasonable” in month 1
• did “worse” in month 2
• was “bad” in month 3 (no lift over random)
The prior consultant was fired
I was hired, and told why I was replacing him
My model captured the same response with 4 million households mailed,
was stable for 24+ months, and saved $1 million / month
Why? Good KDD process (Knowledge Discovery in Databases)
General
Model Notebook
[Figure: model notebook examples, Bad vs. Good]
Accurate
Model Notebook
R package “caret”
One consistent parameter-search wrapper over 217 algorithms
http://topepo.github.io/caret/index.html
A caret run is one “section” of a model notebook
You still need to track the results of each section
Accurate
Model Notebook
[Figure: model notebook, Bad vs. Good; 217 R algorithms covered]
Do you really want a one-off solution?
• Experimenting with algorithms
• Experimenting with algorithm parameters
• Variable description → refine preprocessing
• …
• Deep Learning architectures have many parameters and network designs
Accurate
Model Notebook
[Figure: model notebook, Bad vs. Good]
Q) What is the best outcome metric?
ROC, R², Lift, MAD …
A) Deployment simulation of cost-value-strategy
Does the business problem mirror the 80-20 rule?
Just act on the top 1% or top 5%?
Is the business deployment over the whole score range [0 … 1]?
Or just over the top 1% or 5% of the score? (then NOT ROC, R², corr)
Are some records 5× or 20× more valuable?
→ Use cost-profit weighting, or a more complex system
Is this taught in mining competitions or classes?
Accurate in terms of business focus
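A deployment simulation for a "top 1% or 5%" campaign can be sketched as below. This is a minimal illustration, not the author's tooling: the function names and the toy scores and dollar values are assumptions. It ranks records by model score, acts only on the top fraction, and compares the business value captured with what random selection of the same size would be expected to capture.

```python
def value_in_top(scores, values, top_fraction):
    """Total business value of the records acted on (the top-scored fraction)."""
    ranked = sorted(zip(scores, values), key=lambda pair: pair[0], reverse=True)
    n_act = max(1, int(round(len(ranked) * top_fraction)))
    return sum(value for _, value in ranked[:n_act])


def lift_in_top(scores, values, top_fraction):
    """Value captured in the top fraction divided by the value expected from
    acting on the same number of records chosen at random."""
    n_act = max(1, int(round(len(scores) * top_fraction)))
    expected_random = sum(values) * n_act / len(scores)
    return value_in_top(scores, values, top_fraction) / expected_random


# Hypothetical toy data: ten records, unequal dollar values per record.
scores = [0.90, 0.80, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.05]
values = [100, 0, 0, 0, 0, 0, 0, 0, 0, 20]
top_10_value = value_in_top(scores, values, 0.10)
top_10_lift = lift_in_top(scores, values, 0.10)
```

Because only the top slice is scored, a model can rank well here while looking mediocre on a whole-range metric such as ROC or R², which is the point of the slide.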
Calculate $ of “Business Pain”
[Figure: business pain in $ plotted against forecast error around zero error, from Over Stock to Under Stock. Equal-sized mistakes cause unequal pain in $; labeled levels: 1% and 15% business pain $.]
Need to deeply understand business metrics
At least use Type I vs. Type II weighting
Accurate in terms of business focus
Calculate $ of “Business Pain”
No way – that could get you fired!
New progress in getting feedback
[Figure: the same chart with labeled levels of 1%, 15%, and 30% business pain $. Over Stock example: a 4 week supply of SKU → 30% off sale. Equal mistakes, unequal pain in $.]
Accurate in terms of business focus
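One way to turn "equal mistakes, unequal pain" into a number is an asymmetric cost function. The sketch below is an assumption-laden illustration: the 30% over-stock rate echoes the slide's "30% off sale" example, while the 15% under-stock rate, the function names, and the prices are placeholders that a real project would replace with figures from the business.

```python
def business_pain(forecast_units, actual_units, unit_price,
                  over_rate=0.30, under_rate=0.15):
    """Dollar pain of one stock forecast.

    over_rate:  assumed share of unit price lost per over-stocked unit
                (for example, clearing it in a 30% off sale).
    under_rate: assumed share of unit price lost per under-stocked unit
                (a placeholder for lost margin; not taken from the chart).
    """
    error = forecast_units - actual_units
    if error >= 0:                      # over stock
        return error * unit_price * over_rate
    return -error * unit_price * under_rate   # under stock


def total_pain(forecast_actual_pairs, unit_price, **rates):
    """Sum the dollar pain over many (forecast, actual) pairs."""
    return sum(business_pain(f, a, unit_price, **rates)
               for f, a in forecast_actual_pairs)


# Two mistakes of equal size (20 units) produce unequal pain in dollars.
over_pain = business_pain(120, 100, unit_price=10.0)
under_pain = business_pain(80, 100, unit_price=10.0)
```

A symmetric metric such as MAD would score both mistakes identically; ranking models by total dollar pain instead makes the Type I vs. Type II weighting explicit.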
Model Notebook
Outcome Details
• My Heuristic Design Objectives: (yours may be different)
– Accuracy in deployment
– Reliability and consistent behavior, a general solution
• Use one or more hold-out data sets to check consistency
• Penalize more, as the forecast becomes less consistent
– No penalty for model complexity (if it validates consistently)
– Develop a “smooth, continuous metric” to sort and find models that perform “best” in future deployment
What would you do?
Model Notebook
Outcome Details
• Training = results on the training set
• Validation = results on the validation hold-out
• Gap = abs( Training – Validation )
A bigger gap (volatility) is a bigger concern for deployment; it is a symptom of poor generalization
Minimize Senior VP heart attacks! (one penalty for volatility)
Set expectations & meet expectations
Regularization helps significantly
• Conservative Result = worst( Training, Validation ), penalized by the Gap
Corr / Lift / Profit → higher is better: Cons Result = min(Trn, Val) - Gap
MAD / RMSE / Risk → lower is better: Cons Result = max(Trn, Val) + Gap
Business Value or Pain ranking = function of( conservative result )
Generalization: You can’t optimize something you don’t measure
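The conservative result formulas above translate directly into a few lines of code. In this sketch the function name, the notebook entries, and their metric values are hypothetical; only the two formulas come from the slide.

```python
def conservative_result(train, val, higher_is_better=True):
    """Worst of training and validation, pushed further by the train/validation gap."""
    gap = abs(train - val)
    if higher_is_better:            # Corr / Lift / Profit
        return min(train, val) - gap
    return max(train, val) + gap    # MAD / RMSE / Risk


# Hypothetical notebook entries: (model name, training lift, validation lift).
notebook = [
    ("A: large neural net", 0.90, 0.70),   # accurate in the lab, volatile
    ("B: regularized net",  0.78, 0.75),   # slightly lower, consistent
]

# Sort so the model expected to behave best in deployment comes first.
ranked = sorted(
    ((name, conservative_result(trn, val)) for name, trn, val in notebook),
    key=lambda item: item[1],
    reverse=True,
)
```

Model B ranks first even though model A has the higher training result, because A's larger gap is penalized twice: once by taking the worse of the two numbers and once by subtracting the gap.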
Model Notebook
[Figure: model notebook, Bad vs. Good]
Accurate & General
Model Notebook Process
Tracking Detail ➔ Training the Data Miner
[Figure: notebook tracking table; visible labels include Input / Test, Outcome, Regression, AutoNeural, Neural, Top 5%, Top 10%, Top 20%, “Yippeee!”, and More]
The Data Mining Battle Field
Heuristic Strategy:
• Try a few models of many algorithm types (seed the search)
• Opportunistically spend more effort on what is working (invest in top stocks)
• Still try a few trials on medium success (diversify, limited by project time-box)
• Try ensemble methods, combining model forecasts & top source vars w/ model
When Rejecting Credit – Law Requires 4 Record Level Reasons
The law does not care how complex the model or ensemble was.
Reasons may NOT be sex, age, marital status, race, …
Reasons can be, e.g., “over 180 days late on 2+ bills”
There are solutions to this constraint, for an arbitrary black box
The solutions have broad use in many areas of the model lifecycle
Understandable
Should a data miner cut algorithm choices, so they can come up with reasons?
97% of the time, NO! (or let me compete with you)
Focus on the most GENERAL & ACCURATE system first
A VP does not need to know how to program a B+ tree in order to make a SQL vendor purchase decision. (Be a trusted advisor)
“I understand how a bike works, but I drive a car to work”
“I can explain the model, to the level of detail needed to drive your business”
Understandable
Description Solution – Sensitivity Analysis
(OAT) One At a Time
https://en.wikipedia.org/wiki/Sensitivity_analysis
[Figure: (S) source fields feed an arbitrarily complex data mining system, which outputs the target field; the delta in the forecast is measured]
Present record N, S times, each time with one input 5% bigger (fixed input delta)
Record the delta change in output, S times per record
Aggregate: average(abs(delta)), the target change per input field delta
For source fields with binned ranges, sensitivity tells you the importance of each range, i.e. “low”, … “high”
Can put sensitivity values in Pivot Tables or Cluster
Record-level “reason codes” can be extracted from the most important bins that apply to the given record
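The one-at-a-time procedure can be sketched for any black-box scoring function. This is a minimal illustration under stated assumptions: `model` is any callable that takes a list of numeric inputs and returns a number, and the toy model and records are invented for the example. Note that a multiplicative 5% bump leaves an input of exactly zero unchanged, so zero-valued fields would need an additive delta instead.

```python
def oat_sensitivity(model, records, delta=0.05):
    """Average absolute change in the forecast when each input field,
    one at a time, is made `delta` (5%) bigger."""
    n_fields = len(records[0])
    totals = [0.0] * n_fields
    for record in records:
        base = model(record)                       # forecast for the untouched record
        for j in range(n_fields):                  # present the record S times
            bumped = list(record)
            bumped[j] = record[j] * (1.0 + delta)  # only field j changes
            totals[j] += abs(model(bumped) - base)
    return [total / len(records) for total in totals]


# Hypothetical black box: linear in field 0, non-linear in field 1,
# and it ignores field 2 entirely.
def toy_model(x):
    return 3.0 * x[0] + x[1] * x[1]


records = [[1.0, 2.0, 5.0], [2.0, 4.0, 7.0]]
sensitivity = oat_sensitivity(toy_model, records)
```

The ignored field gets a sensitivity of zero, and the squared field dominates, which is the kind of ranking that can then be placed in a pivot table by field or by bin.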
Description Solution – Sensitivity Analysis
Applying Reasons per record (independent of var ranking)
• Reason codes are specific to the model and record
• Ranked predictive fields    record 1 (Mr. Smith)   record 2 (Mr. Jones)
max_late_payment_120d        0                      1
max_late_payment_90d         1                      0
bankrupt_in_last_5_yrs       1                      1
max_late_payment_60d         0                      0
• Mr. Smith’s reason codes include:
max_late_payment_90d 1
bankrupt_in_last_5_yrs 1
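The table above can be applied mechanically: walk the fields in ranked order and keep those that are set for the record. The sketch below uses the field names and values from the slide; the function name and the cap of four reasons (taken from the "4 record level reasons" requirement mentioned earlier in the deck) are illustrative choices.

```python
# Fields in order of model-level importance, most important first (from the slide).
RANKED_FIELDS = [
    "max_late_payment_120d",
    "max_late_payment_90d",
    "bankrupt_in_last_5_yrs",
    "max_late_payment_60d",
]

RECORDS = {
    "Mr. Smith": {"max_late_payment_120d": 0, "max_late_payment_90d": 1,
                  "bankrupt_in_last_5_yrs": 1, "max_late_payment_60d": 0},
    "Mr. Jones": {"max_late_payment_120d": 1, "max_late_payment_90d": 0,
                  "bankrupt_in_last_5_yrs": 1, "max_late_payment_60d": 0},
}


def reason_codes(record, ranked_fields, max_reasons=4):
    """Return the most important fields that actually apply to this record."""
    reasons = []
    for field in ranked_fields:
        if record.get(field, 0) == 1:
            reasons.append(field)
        if len(reasons) == max_reasons:
            break
    return reasons


smith_reasons = reason_codes(RECORDS["Mr. Smith"], RANKED_FIELDS)
jones_reasons = reason_codes(RECORDS["Mr. Jones"], RANKED_FIELDS)
```

The two customers share the same field ranking but receive different reason codes, because the reasons depend on which important bins are true for each record.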
Description Solution – Alternatives
R’s caret offers some feature selection
• http://topepo.github.io/caret/featureselection.html
Filter methods (univariate)
Wrapper methods
• Recursive feature elimination
• Simulated annealing
• Genetic algorithms
Variable Importance
• http://topepo.github.io/caret/varimp.html
• Algorithm specific (9 kinds)
• Model-independent metrics
If classification: ROC curve analysis (univariate) per predictor
If regression: fit a linear model
With variable ranking, you still need to relate the field ranking to record-level reasons
Univariate methods do NOT cover variable interactions in the model, or non-linear effects
Understandable
Description Solution
Local Interpretable Model-agnostic Explanations (LIME)
“Why Should I Trust You?” Explaining the Predictions of Any Classifier – Knowledge Discovery in Databases 2016 (August 13-17)
https://arxiv.org/abs/1602.04938 (PDF)
https://github.com/marcotcr/lime-experiments (Python code)
Describes models locally, in terms of their variables
Minimize locality-aware loss
Understandable
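For reference, the locality-aware loss is the objective stated in the cited LIME paper. The explanation for a record x is the interpretable model g that best matches the black box f near x while staying simple:

```latex
\xi(x) = \operatorname*{argmin}_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g)
```

Here f is the black-box model, G is a class of interpretable models (for example sparse linear models), \pi_x is a proximity measure that weights perturbed samples by how close they are to x, \mathcal{L} measures how poorly g approximates f in that neighborhood, and \Omega(g) penalizes the complexity of the explanation.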
Putting a Model in Production
Cut out extra preprocessed variables not used in the final model
Minimize passes over the data
In many situations, I have had to RECODE the prep and/or model to meet production system requirements
• BAD: recode to Oracle, move SAS to mainframe & create JCL
Could take 2 months for conversion & full QA
• GOOD: Generate PMML code for the model
Build up a PMML preprocessing library, like Netflix
www.DMG.org/PMML/products
Tracking Model Drift
(easy to see with 2 input dimensions vs. score)
[Figure: Training Data vs. Current Scoring Data]
General
Tracking Model Drift
A trained model is only as general as
• the variety of behavior in the training data
• the artifacts abstracted out by preprocessing
A good KDD process and variable design make the analysis universe like the general scoring universe
Over time, there is “drift” between the behavior represented in the scoring data and the original training data
Stock market cycles: Bull → Bear → Bull → …
General
Tracking Model Drift
MODEL DRIFT DETECTOR in N dimensions
• Change in distribution of target (alert over threshold)
During training, find thresholds for 10 or 20 equal-frequency bins of the score
During scoring, look at key thresholds around business decisions (act vs. not)
Has the % over the fixed threshold changed much?
• Change in distribution of most important input fields
Diagnose CAUSES: what is changing, how much …
Out of the top 25% of the most important input fields …
Which had the largest change in a contingency table metric?
General & Description
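Both halves of the drift detector can be sketched with the standard library. The assumptions here are loud ones: the function names, the 5-point alert tolerance, and the choice of a chi-square statistic as the "contingency table metric" are illustrative, not taken from the deck. Chi-square also grows with sample size, so compare fields scored on the same volume of data.

```python
import bisect


def equal_frequency_thresholds(train_values, n_bins=10):
    """Cut points found during training that split the values into equal-frequency bins."""
    ordered = sorted(train_values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def share_at_or_over(scores, threshold):
    """Fraction of records at or over a fixed business-decision threshold."""
    return sum(1 for s in scores if s >= threshold) / len(scores)


def score_drift(train_scores, new_scores, threshold, tolerance=0.05):
    """Return (alert, shift) for the share of records over the fixed threshold."""
    shift = share_at_or_over(new_scores, threshold) - share_at_or_over(train_scores, threshold)
    return abs(shift) > tolerance, shift


def bin_counts(values, thresholds):
    """Count values per bin, using the cut points fixed at training time."""
    counts = [0] * (len(thresholds) + 1)
    for v in values:
        counts[bisect.bisect_right(thresholds, v)] += 1
    return counts


def chi_square(train_counts, new_counts):
    """Chi-square statistic of the 2 x k table (training vs. scoring, per bin)."""
    rows = [train_counts, new_counts]
    row_totals = [sum(r) for r in rows]
    col_totals = [a + b for a, b in zip(train_counts, new_counts)]
    total = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def rank_changed_fields(train_fields, new_fields, n_bins=10):
    """Rank input fields by how much their binned distribution has changed."""
    changes = {}
    for name, train_values in train_fields.items():
        cuts = equal_frequency_thresholds(train_values, n_bins)
        changes[name] = chi_square(bin_counts(train_values, cuts),
                                   bin_counts(new_fields[name], cuts))
    return sorted(changes.items(), key=lambda kv: kv[1], reverse=True)
```

In use, the thresholds are computed once on training data and stored with the model; each scoring batch is then compared against them, and only the most important input fields (for example the top 25% by sensitivity) need to be passed to `rank_changed_fields`.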
Tracking Model Drift
A frequent process in companies – RETRAIN EVERY DAY
• Does yesterday’s 4th of July sale training data best represent your 5th of July activity?
• Have you “forgotten” past lessons that are not in yesterday’s data?
The Stability vs. Plasticity dilemma, or
learn how to play the guitar without forgetting your grandmother
What about fraud cases from 6 months ago?
The same issues exist in online training
• Drifting vs. forgetting?
Choose robustness and transparency, whichever you do
General & Description
Retrain, Refresh or Update DBC
Model Retrain (1-2 months)
• Brute force, most effort, most expense, most reliable
• Repeat the full data mining model training project
• Re-evaluate all algorithms, preprocessing, ensembles
Model Refresh (3-5 days)
• “Minimal retraining”
• Just run the final 1-3 model trainings on “fresher” data
• Do not repeat exploring all algorithms and ensembles
• Assume the “structure” is a reasonable solution
• Go back to your prior Model Notebook – choose the best as a short cut
General
Solution Architecture for Threat and Compliance
Lambda Architecture with Continuous Decisioning
[Figure: architecture diagram with numbered steps 1-6]
Solution Stack for Threat and Compliance
Leveraging Primarily Open Source Big Data Technologies
Continuous Decisioning
Use Case: Cyber Threat Detection & Response
Use Kamanja to detect potential cyber security breaches
Problem
Diverse Inputs
• Structured and unstructured data, with varying latencies
Data Enrichment
• Long and laborious process, manual and ad hoc
Quality of Threat Intelligence
• Lots of false positives waste analyst resources
Poor Integrations with Response Teams
• Manual and time-consuming process
Solution
• Ingest IP addresses, malware signatures, hash values, email addresses, etc. in real time
• Automatically enrich with third party data
• Check historical logs against new threats continuously
• Predictive analytics based on machine learning flag suspicious activity before it becomes a problem
• Direct integration with dashboards to generate alerts and speed up investigation
Continuous Decisioning
Use Case: Application Monitoring
Use Kamanja to detect insider attacks on sensitive data
Problem
• Legacy system is batch oriented
• Months required to create and implement new alerts
• Slow speed-to-market developing new source system extracts. Months required to assimilate new data.
• Risks to PII and NPI, with compliance implications
Solution
• Use open source big data stack to migrate to real time data streaming, rapid model deployment, and alerts with no manual intervention
• Calculate the number of times PII/NPI is accessed over an eight hour period, and calculate risk to generate alerts
• Machine learning to identify the normal pattern of out-of-office-hours access. Trigger automatic alerts when anomalies occur.
• Rapid implementation of new models to deal with emerging threats
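The eight-hour access count in this use case is a sliding-window counter. The sketch below is not Kamanja code; it is a plain Python illustration in which the class name, the per-user alert limit of 3, and the assumption that events arrive in time order per user are all hypothetical.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class AccessMonitor:
    """Count PII/NPI accesses per user over a trailing time window."""

    def __init__(self, window=timedelta(hours=8), max_accesses=3):
        self.window = window
        self.max_accesses = max_accesses      # hypothetical alert limit
        self._events = defaultdict(deque)     # user -> access timestamps in the window

    def record(self, user, when):
        """Register one access; return True if the user is now over the limit.

        Assumes events for a given user arrive in time order.
        """
        events = self._events[user]
        events.append(when)
        # Drop accesses that have fallen out of the trailing window.
        while events and events[0] <= when - self.window:
            events.popleft()
        return len(events) > self.max_accesses


monitor = AccessMonitor()
start = datetime(2016, 9, 20, 9, 0)
# Four accesses one hour apart: the fourth exceeds the limit of 3.
alerts = [monitor.record("analyst_1", start + timedelta(hours=h)) for h in range(4)]
# Twelve hours after the start, the earlier accesses have aged out.
later_alert = monitor.record("analyst_1", start + timedelta(hours=12))
```

A fixed limit is the simplest rule; the learned "normal pattern" mentioned in the slide would replace `max_accesses` with a per-user or per-hour baseline.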
Continuous Decisioning
Use Case: Unauthorized Trading Detection
Use Kamanja to reduce the risk of rogue behavior at an investment bank
Problem
• Need timely alerting of potentially unauthorized trading activity
• Must tie together voluminous data, reports, and risk measures
• Meet increasingly stringent time requirements
Solution
• Create a Trader Surveillance Dashboard
• Provide a holistic view of a trader, based on all relevant information about the trader, the marketplace, and peers
• Build supervised and unsupervised machine learning models based on operational, transactional, and financial data
• Real-time analysis and monitoring of trader activity automatically highlights unusual activity and triggers alerts on trades to investigate
Continuous Decisioning
Use Case: Credit Card Fraud Detection
Use Kamanja to incrementally reduce fraud losses by applying multiple predictive models for transaction authorization
Problem
• $16.3 billion in credit card fraud losses annually
• Fraud is growing more quickly than transaction value
• New types of fraud are one step ahead of existing solutions
• Dependence on third party proprietary systems means slow reaction times and expensive changes
Solution
• Apply Kamanja to IVR, web, and transactional data to trigger alerts
• Initial models detect suspicious web traffic, common purchase points, and application rarity
• Leverage existing infrastructure as well as existing third party systems (Falcon and TSYS)
• Reduce costs by 80% with open source software
Summary
You can have it all: accurate, general & describable
• You may fully understand a bike – but drive a car to work (level of detail)
Control and plan complexity: track in a model notebook
• Reuse notebook when you need to retrain
• Balance accuracy and generalization in the notebook outcomes
• Track business net value per model (be more competitive)
Model and record level description helps model lifecycle
• Helps during model building, to improve preprocessing, DBC
• Helps gain trust
• Helps track model drift and degradation
Use Kamanja, a real time decisioning engine for production deployment
[Figure: a model that is Accurate, General, and Understandable]
Thank You
Tuesday, September 20, 2016
greg@ligadata.com
www.Linkedin.com/in/GregMakowski
www.Kamanja.org (Apache open source licensed)

More Related Content

What's hot

Cloudera Data Science Challenge 3 Solution by Doug Needham
Cloudera Data Science Challenge 3 Solution by Doug NeedhamCloudera Data Science Challenge 3 Solution by Doug Needham
Cloudera Data Science Challenge 3 Solution by Doug NeedhamDoug Needham
 
Towards Human-Centered Machine Learning
Towards Human-Centered Machine LearningTowards Human-Centered Machine Learning
Towards Human-Centered Machine LearningSri Ambati
 
Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"Itai Yaffe
 
Spark-Zeppelin-ML on HWX
Spark-Zeppelin-ML on HWXSpark-Zeppelin-ML on HWX
Spark-Zeppelin-ML on HWXKirk Haslbeck
 
Egypt hackathon 2014 analytics & spss session
Egypt hackathon 2014   analytics & spss sessionEgypt hackathon 2014   analytics & spss session
Egypt hackathon 2014 analytics & spss sessionM Baddar
 
Mastering MapReduce: MapReduce for Big Data Management and Analysis
Mastering MapReduce: MapReduce for Big Data Management and AnalysisMastering MapReduce: MapReduce for Big Data Management and Analysis
Mastering MapReduce: MapReduce for Big Data Management and AnalysisTeradata Aster
 
Deep Credit Risk Ranking with LSTM with Kyle Grove
Deep Credit Risk Ranking with LSTM with Kyle GroveDeep Credit Risk Ranking with LSTM with Kyle Grove
Deep Credit Risk Ranking with LSTM with Kyle GroveDatabricks
 
The Data Science Process
The Data Science ProcessThe Data Science Process
The Data Science ProcessVishal Patel
 
SPSS Modeler 16 What's New!?
SPSS Modeler 16 What's New!?SPSS Modeler 16 What's New!?
SPSS Modeler 16 What's New!?Chris Sparshott
 
Engineering with Open Source - Hyonjee Joo
Engineering with Open Source - Hyonjee JooEngineering with Open Source - Hyonjee Joo
Engineering with Open Source - Hyonjee JooTwo Sigma
 
DutchMLSchool. ML for Logistics
DutchMLSchool. ML for LogisticsDutchMLSchool. ML for Logistics
DutchMLSchool. ML for LogisticsBigML, Inc
 
Smooth Storage - A distributed storage system for managing structured time se...
Smooth Storage - A distributed storage system for managing structured time se...Smooth Storage - A distributed storage system for managing structured time se...
Smooth Storage - A distributed storage system for managing structured time se...Two Sigma
 
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Impetus Technologies
 
Lessons From Integrating Machine Learning into Data Products | Wrangle Confer...
Lessons From Integrating Machine Learning into Data Products | Wrangle Confer...Lessons From Integrating Machine Learning into Data Products | Wrangle Confer...
Lessons From Integrating Machine Learning into Data Products | Wrangle Confer...Cloudera, Inc.
 
Mark Seiss, Dun & Bradstreet - Importance of Domain Expertise for Building ML...
Mark Seiss, Dun & Bradstreet - Importance of Domain Expertise for Building ML...Mark Seiss, Dun & Bradstreet - Importance of Domain Expertise for Building ML...
Mark Seiss, Dun & Bradstreet - Importance of Domain Expertise for Building ML...Sri Ambati
 
Transforming GE Healthcare with Data Platform Strategy
Transforming GE Healthcare with Data Platform StrategyTransforming GE Healthcare with Data Platform Strategy
Transforming GE Healthcare with Data Platform StrategyDatabricks
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in productionTuri, Inc.
 
BsidesLVPresso2016_JZeditsv6
BsidesLVPresso2016_JZeditsv6BsidesLVPresso2016_JZeditsv6
BsidesLVPresso2016_JZeditsv6Rod Soto
 
Mbaddar intro pred_anlaytics_spss
Mbaddar intro pred_anlaytics_spssMbaddar intro pred_anlaytics_spss
Mbaddar intro pred_anlaytics_spssM Baddar
 

What's hot (20)

Cloudera Data Science Challenge 3 Solution by Doug Needham
Cloudera Data Science Challenge 3 Solution by Doug NeedhamCloudera Data Science Challenge 3 Solution by Doug Needham
Cloudera Data Science Challenge 3 Solution by Doug Needham
 
Towards Human-Centered Machine Learning
Towards Human-Centered Machine LearningTowards Human-Centered Machine Learning
Towards Human-Centered Machine Learning
 
Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"
 
Spark-Zeppelin-ML on HWX
Spark-Zeppelin-ML on HWXSpark-Zeppelin-ML on HWX
Spark-Zeppelin-ML on HWX
 
Egypt hackathon 2014 analytics & spss session
Egypt hackathon 2014   analytics & spss sessionEgypt hackathon 2014   analytics & spss session
Egypt hackathon 2014 analytics & spss session
 
Mastering MapReduce: MapReduce for Big Data Management and Analysis
Mastering MapReduce: MapReduce for Big Data Management and AnalysisMastering MapReduce: MapReduce for Big Data Management and Analysis
Mastering MapReduce: MapReduce for Big Data Management and Analysis
 
Deep Credit Risk Ranking with LSTM with Kyle Grove
Deep Credit Risk Ranking with LSTM with Kyle GroveDeep Credit Risk Ranking with LSTM with Kyle Grove
Deep Credit Risk Ranking with LSTM with Kyle Grove
 
The Data Science Process
The Data Science ProcessThe Data Science Process
The Data Science Process
 
SPSS Modeler 16 What's New!?
SPSS Modeler 16 What's New!?SPSS Modeler 16 What's New!?
SPSS Modeler 16 What's New!?
 
Engineering with Open Source - Hyonjee Joo
Engineering with Open Source - Hyonjee JooEngineering with Open Source - Hyonjee Joo
Engineering with Open Source - Hyonjee Joo
 
DutchMLSchool. ML for Logistics
DutchMLSchool. ML for LogisticsDutchMLSchool. ML for Logistics
DutchMLSchool. ML for Logistics
 
Smooth Storage - A distributed storage system for managing structured time se...
Smooth Storage - A distributed storage system for managing structured time se...Smooth Storage - A distributed storage system for managing structured time se...
Smooth Storage - A distributed storage system for managing structured time se...
 
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
 
Lessons From Integrating Machine Learning into Data Products | Wrangle Confer...
Lessons From Integrating Machine Learning into Data Products | Wrangle Confer...Lessons From Integrating Machine Learning into Data Products | Wrangle Confer...
Lessons From Integrating Machine Learning into Data Products | Wrangle Confer...
 
QuantConnect - Options Backtesting
QuantConnect - Options BacktestingQuantConnect - Options Backtesting
QuantConnect - Options Backtesting
 
Mark Seiss, Dun & Bradstreet - Importance of Domain Expertise for Building ML...
Mark Seiss, Dun & Bradstreet - Importance of Domain Expertise for Building ML...Mark Seiss, Dun & Bradstreet - Importance of Domain Expertise for Building ML...
Mark Seiss, Dun & Bradstreet - Importance of Domain Expertise for Building ML...
 
Transforming GE Healthcare with Data Platform Strategy
Transforming GE Healthcare with Data Platform StrategyTransforming GE Healthcare with Data Platform Strategy
Transforming GE Healthcare with Data Platform Strategy
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in production
 
BsidesLVPresso2016_JZeditsv6
BsidesLVPresso2016_JZeditsv6BsidesLVPresso2016_JZeditsv6
BsidesLVPresso2016_JZeditsv6
 
Mbaddar intro pred_anlaytics_spss
Mbaddar intro pred_anlaytics_spssMbaddar intro pred_anlaytics_spss
Mbaddar intro pred_anlaytics_spss
 

Viewers also liked

Linked In Slides 2009 02 24 B
Linked In Slides 2009 02 24 BLinked In Slides 2009 02 24 B
Linked In Slides 2009 02 24 BGreg Makowski
 
SFbayACM ACM Data Science Camp 2015 10 24
SFbayACM ACM Data Science Camp 2015 10 24SFbayACM ACM Data Science Camp 2015 10 24
SFbayACM ACM Data Science Camp 2015 10 24Greg Makowski
 
The 360º Leader (Section 2 of 6)
The 360º Leader (Section 2 of 6)The 360º Leader (Section 2 of 6)
The 360º Leader (Section 2 of 6)Greg Makowski
 
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...Greg Makowski
 
Heuristic design of experiments w meta gradient search
Heuristic design of experiments w meta gradient searchHeuristic design of experiments w meta gradient search
Heuristic design of experiments w meta gradient searchGreg Makowski
 
The 360º Leader (Section 1 of 6)
The 360º Leader (Section 1 of 6)The 360º Leader (Section 1 of 6)
The 360º Leader (Section 1 of 6)Greg Makowski
 
Using Deep Learning to do Real-Time Scoring in Practical Applications
Using Deep Learning to do Real-Time Scoring in Practical ApplicationsUsing Deep Learning to do Real-Time Scoring in Practical Applications
Using Deep Learning to do Real-Time Scoring in Practical ApplicationsGreg Makowski
 
Three case studies deploying cluster analysis
Three case studies deploying cluster analysisThree case studies deploying cluster analysis
Three case studies deploying cluster analysisGreg Makowski
 
360-Degree Leadership
360-Degree Leadership360-Degree Leadership
360-Degree LeadershipChuck Terrell
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2aspyker
 
K-Means, its Variants and its Applications
K-Means, its Variants and its ApplicationsK-Means, its Variants and its Applications
K-Means, its Variants and its ApplicationsVarad Meru
 
Application of Clustering in Data Science using Real-life Examples
Application of Clustering in Data Science using Real-life Examples Application of Clustering in Data Science using Real-life Examples
Application of Clustering in Data Science using Real-life Examples Edureka!
 
Cluster analysis for market segmentation
Cluster analysis for market segmentationCluster analysis for market segmentation
Cluster analysis for market segmentationVishal Tandel
 

Viewers also liked (17)

Linked In Slides 2009 02 24 B
Linked In Slides 2009 02 24 BLinked In Slides 2009 02 24 B
Linked In Slides 2009 02 24 B
 
SFbayACM ACM Data Science Camp 2015 10 24
SFbayACM ACM Data Science Camp 2015 10 24SFbayACM ACM Data Science Camp 2015 10 24
SFbayACM ACM Data Science Camp 2015 10 24
 
The 360º Leader (Section 2 of 6)
The 360º Leader (Section 2 of 6)The 360º Leader (Section 2 of 6)
The 360º Leader (Section 2 of 6)
 
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
 
Heuristic design of experiments w meta gradient search
Production model lifecycle management 2016 09

  • 1. © 2016 LigaData, Inc. All Rights Reserved. Production Model Lifecycle Management Presented: Tue, Sept 22, 2016 greg@ligadata.com www.Ligadata.org www.Kamanja.org
  • 2. Contents: Develop a Robust Solution (or get fired); Selecting the Best Model w/ Model Notebook; Describing the Model; Putting a Model in Production; Model Drift over Time (Non-Stationary); Retrain or Refresh the Model; Kamanja Open Source PMML Scoring Platform. Model attributes: Accurate, General, Understandable. Can you have all 3 model attributes?
  • 3. Develop a Robust Solution (or get fired). Epsilon (then owned by American Express): ACG’s first neural network (1992); ~40 quants in the Analytic Consulting Group. Score 250mm households every month, pick the best 5mm households. A neural net by a previous consultant did great “in the lab” and did “reasonable” in month 1. Model attribute: General.
  • 4. Develop a Robust Solution (or get fired). Epsilon (then owned by American Express): ACG’s first neural network (1992); ~40 quants in the Analytic Consulting Group. Score 250mm households every month, pick the best 5mm households. A neural net by a previous consultant did great “in the lab”, did “reasonable” in month 1, did “worse” in month 2, and was “bad” in month 3 (no lift over random). The prior consultant was fired. I was hired, and told why I was replacing him. My model captured the same response with 4mm households mailed, was stable for 24+ months, and saved $1mm / month. Why? Good KDD Process (Knowledge Discovery in Databases). Model attribute: General.
  • 7. Model Notebook. R package “caret”: the same parameter search wrapper over 217 algorithms (http://topepo.github.io/caret/index.html). A “section” of a model notebook. Still need to track the results of each section. Model attribute: Accurate.
  • 8. Bad vs. Good. 217 R algorithms covered. Do you really want a one-off solution? • Experimenting with algorithms • Experimenting with algorithm parameters • Variable description → refine preprocessing • … • Deep Learning architectures have many parameters and network designs. Model attribute: Accurate.
  • 9. Model Notebook: Bad vs. Good. Q) What is the best outcome metric? ROC, R², Lift, MAD, … Model attribute: Accurate.
  • 10. Model Notebook: Bad vs. Good. Q) What is the best outcome metric? ROC, R², Lift, MAD, … A) Deployment simulation of cost-value-strategy. Does the business problem mirror the 80-20 rule? Just act on the top 1% or top 5%? Is the business deployment over the whole score range [0 … 1], or just over the top 1% or 5% of the score (then NOT ROC, R², corr)? Are some records 5× or 20× more valuable? → Use cost-profit weighting, or a more complex system. Is this taught in mining competitions or classes? Model attribute: Accurate in terms of business focus.
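The “deployment simulation” idea on this slide can be sketched as follows: rank records by model score, act only on the top fraction, and add up the business value captured there. This is a minimal illustration, not the author's tooling; the function name, the per-record `values`, and the two score lists are hypothetical.

```python
def value_in_top_fraction(scores, values, fraction):
    """Sum the business value of the records the deployment would act on.

    scores   -- model scores, one per record (higher = act first)
    values   -- hypothetical dollar value of each record if acted on
    fraction -- share of records the business acts on, e.g. 0.05 for top 5%
    """
    n_act = max(1, int(len(scores) * fraction))
    ranked = sorted(zip(scores, values), key=lambda pair: pair[0], reverse=True)
    return sum(value for _, value in ranked[:n_act])


# Ten records; one of them is 20x more valuable than the others.
values = [0, 20, 0, 1, 0, 0, 1, 0, 0, 1]
scores_a = [0.10, 0.95, 0.20, 0.30, 0.15, 0.05, 0.90, 0.25, 0.35, 0.40]
scores_b = [0.10, 0.50, 0.20, 0.90, 0.15, 0.05, 0.80, 0.25, 0.35, 0.70]

captured_a = value_in_top_fraction(scores_a, values, 0.20)
captured_b = value_in_top_fraction(scores_b, values, 0.20)
print(captured_a, captured_b)
```

Both hypothetical models place two responders in the top 20%, so a count-based metric would call them equal; only the value-weighted simulation shows that model A captures the high-value record.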
  • 11. Calculate $ of “Business Pain”. Need to deeply understand business metrics. Figure labels: zero error; Over Stock; Under Stock. Model attribute: Accurate.
  • 12. Calculate $ of “Business Pain”. Need to deeply understand business metrics. At least use Type I vs. Type II weighting. Figure labels: zero error; Over Stock; Under Stock; 1% bus pain $; 15% business pain $; “?”; ← Equal mistakes → Unequal PAIN in $. Model attribute: Accurate in terms of business focus.
  • 13. Calculate $ of “Business Pain”. No way – that could get you fired! New progress in getting feedback. Figure labels: zero error; Over Stock (4 week supply of SKU → 30% off sale); Under Stock; 1% bus pain $; 15% business pain $; 30% bus pain $; ← Equal mistakes → Unequal PAIN in $. Model attribute: Accurate in terms of business focus.
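The “equal mistakes, unequal pain” point can be illustrated with an asymmetric cost function. The sketch below is an assumption-laden example: the per-unit costs are made up and are not the slide's 1% / 15% / 30% figures, and the function names are hypothetical.

```python
def business_pain(forecast, actual, over_cost_per_unit, under_cost_per_unit):
    """Dollar pain of one forecast; over-stock and under-stock are priced differently."""
    error = forecast - actual
    if error > 0:
        # Over stock: surplus units must be discounted or carried.
        return error * over_cost_per_unit
    # Under stock (or zero error): missed sales on the shortfall.
    return -error * under_cost_per_unit


def total_pain(forecasts, actuals, over_cost_per_unit, under_cost_per_unit):
    return sum(
        business_pain(f, a, over_cost_per_unit, under_cost_per_unit)
        for f, a in zip(forecasts, actuals)
    )


def mean_abs_deviation(forecasts, actuals):
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)


# Hypothetical costs: each surplus unit hurts 3x as much as each missing unit.
OVER_COST, UNDER_COST = 3.0, 1.0
actuals = [100, 100]
always_over = [110, 110]
always_under = [90, 90]

pain_over = total_pain(always_over, actuals, OVER_COST, UNDER_COST)
pain_under = total_pain(always_under, actuals, OVER_COST, UNDER_COST)
print(mean_abs_deviation(always_over, actuals), pain_over)
print(mean_abs_deviation(always_under, actuals), pain_under)
```

Both forecasts have the same MAD, yet under these assumed costs the over-stocking forecast is three times as painful, which is why a symmetric metric alone can mislead model selection.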
  • 14. Model Notebook Outcome Details. What would you do? • My heuristic design objectives (yours may be different): – Accuracy in deployment – Reliability and consistent behavior, a general solution • Use one or more hold-out data sets to check consistency • Penalize more, as the forecast becomes less consistent – No penalty for model complexity (if it validates consistently) – Develop a “smooth, continuous metric” to sort and find models that perform “best” in future deployment
  • 15. Model Notebook Outcome Details. Generalization: you can’t optimize something you don’t measure. • Training = results on the training set • Validation = results on the validation hold-out • Gap = abs(Training - Validation). A bigger gap (volatility) is a bigger concern for deployment, a symptom. Minimize Senior VP heart attacks! (one penalty for volatility). Set expectations & meet expectations. Regularization helps significantly. • Conservative Result = worst(Training, Validation) + Gap_penalty. Corr / Lift / Profit → higher is better: Cons Result = min(Trn, Val) - Gap. MAD / RMSE / Risk → lower is better: Cons Result = max(Trn, Val) + Gap. Business Value or Pain ranking = function of (conservative result).
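The conservative-result formulas on this slide translate almost directly into code. In the sketch below, `gap_weight` is an added assumption (the slide uses the gap itself, i.e. a weight of 1), and the model names and metric values in the notebook are hypothetical.

```python
def conservative_result(training, validation, higher_is_better, gap_weight=1.0):
    """Worst of the two results, pushed further in the bad direction by the gap.

    higher_is_better=True  for Corr / Lift / Profit: min(Trn, Val) - Gap
    higher_is_better=False for MAD / RMSE / Risk:    max(Trn, Val) + Gap
    gap_weight is an assumption; 1.0 reproduces the slide's formulas.
    """
    gap = abs(training - validation)
    if higher_is_better:
        return min(training, validation) - gap_weight * gap
    return max(training, validation) + gap_weight * gap


# Hypothetical notebook entries: (training correlation, validation correlation).
notebook = {
    "overfit_net": (0.75, 0.5),
    "stable_tree": (0.625, 0.5625),
}
ranked = sorted(
    notebook,
    key=lambda name: conservative_result(*notebook[name], higher_is_better=True),
    reverse=True,
)
print(ranked)
```

The overfit model has the better training result, but its large train/validation gap drops it below the more consistent model, which is the behavior the slide asks for.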
  • 17. Model Notebook Process Tracking Detail ➔ Training the Data Miner. The Data Mining Battle Field. Figure labels: Input / Test; Outcome; Regression; AutoNeural; Neural; Top 5%; Top 10%; Top 20%; Yippeee! More heuristic strategy: • Try a few models of many algorithm types (seed the search) • Opportunistically spend more effort on what is working (invest in top stocks) • Still try a few trials on medium success (diversify, limited by project time-box) • Try ensemble methods, combining model forecasts & top source vars w/ model
  • 19. When Rejecting Credit – Law Requires 4 Record Level Reasons. The law does not care how complex the model or ensemble was. Reasons are, e.g., “over 180 days late on 2+ bills”, NOT sex, age, marital status, race, … There are solutions to this constraint, for an arbitrary black box. The solutions have broad use in many areas of the model lifecycle. Model attribute: Understandable.
  • 20. Should a data miner cut algorithm choices, so they can come up with reasons?
  • 21. Should a data miner cut algorithm choices, so they can come up with reasons? 97% of the time, NO! (or let me compete with you). Focus on the most GENERAL & ACCURATE system first. A VP does not need to know how to program a B+ tree in order to make a SQL vendor purchase decision. (Be a trusted advisor.) “I understand how a bike works, but I drive a car to work.” “I can explain the model, to the level of detail needed to drive your business.” Model attribute: Understandable.
  • 22. Description Solution – Sensitivity Analysis (OAT, One At a Time), https://en.wikipedia.org/wiki/Sensitivity_analysis. Diagram: (S) source fields → Arbitrarily Complex Data Mining System → target field. For source fields with binned ranges, sensitivity tells you the importance of the range, i.e. “low”, … “high”. Can put sensitivity values in Pivot Tables or Cluster. Record-level “reason codes” can be extracted from the most important bins that apply to the given record.
  • 23. Description Solution – Sensitivity Analysis (OAT, One At a Time). Diagram: (S) source fields → Arbitrarily Complex Data Mining System → target field (delta in forecast). Present record N, S times, each input 5% bigger (fixed input delta). Record the delta change in output, S times per record. Aggregate: average(abs(delta)), target change per input field delta. For source fields with binned ranges, sensitivity tells you the importance of the range, i.e. “low”, … “high”. Can put sensitivity values in Pivot Tables or Cluster. Record-level “reason codes” can be extracted from the most important bins that apply to the given record.
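The OAT procedure on this slide (perturb one input at a time, record the change in output, average the absolute deltas per field) can be sketched for any black-box scoring function. The sketch assumes “5% bigger” means 5% of the field's own value; `toy_model` and its fields are hypothetical stand-ins for an arbitrarily complex system.

```python
def oat_sensitivity(model, records, fields, relative_delta=0.05):
    """One-at-a-time sensitivity: mean absolute output change per input field.

    model   -- any callable mapping a record (dict) to a numeric score
    records -- list of dicts holding numeric inputs
    fields  -- the S source fields to perturb, one at a time
    """
    totals = {field: 0.0 for field in fields}
    for record in records:
        baseline = model(record)
        for field in fields:
            perturbed = dict(record)  # copy so only one field changes
            perturbed[field] = record[field] * (1.0 + relative_delta)
            totals[field] += abs(model(perturbed) - baseline)
    return {field: totals[field] / len(records) for field in fields}


def toy_model(record):
    # Hypothetical black box: "a" matters a lot, "b" a little, "c" not at all.
    return 10.0 * record["a"] + 1.0 * record["b"]


records = [
    {"a": 1.0, "b": 1.0, "c": 1.0},
    {"a": 2.0, "b": 2.0, "c": 2.0},
]
sensitivity = oat_sensitivity(toy_model, records, ["a", "b", "c"])
ranking = sorted(sensitivity, key=sensitivity.get, reverse=True)
print(ranking, sensitivity)
```

One caveat of a relative delta: a field whose value is zero is never perturbed, so a fixed absolute step (for example, a share of the field's range) may be the safer choice for binary or zero-heavy inputs.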
  • 24. Description Solution – Sensitivity Analysis. Applying reasons per record (independent of var ranking). • Reason codes are specific to the model and record.
      Ranked predictive fields    record 1 (Mr. Smith)    record 2 (Mr. Jones)
      max_late_payment_120d       0                       1
      max_late_payment_90d        1                       0
      bankrupt_in_last_5_yrs      1                       1
      max_late_payment_60d        0                       0
    • Mr. Smith’s reason codes include: max_late_payment_90d = 1, bankrupt_in_last_5_yrs = 1
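The record-level lookup on this slide can be sketched as a filter over the ranked field list: keep the fields, in importance order, whose flag is set for the given record. The field names and values are taken from the slide; the function name and the cap of four reasons (from the credit-rejection slide) are illustrative choices.

```python
# Fields in the slide's ranked order (most important first).
RANKED_FIELDS = [
    "max_late_payment_120d",
    "max_late_payment_90d",
    "bankrupt_in_last_5_yrs",
    "max_late_payment_60d",
]


def reason_codes(record, ranked_fields, max_reasons=4):
    """Return the highest-ranked fields that apply (flag == 1) to this record."""
    applicable = [field for field in ranked_fields if record.get(field, 0) == 1]
    return applicable[:max_reasons]


mr_smith = {
    "max_late_payment_120d": 0,
    "max_late_payment_90d": 1,
    "bankrupt_in_last_5_yrs": 1,
    "max_late_payment_60d": 0,
}
mr_jones = {
    "max_late_payment_120d": 1,
    "max_late_payment_90d": 0,
    "bankrupt_in_last_5_yrs": 1,
    "max_late_payment_60d": 0,
}

smith_reasons = reason_codes(mr_smith, RANKED_FIELDS)
jones_reasons = reason_codes(mr_jones, RANKED_FIELDS)
print(smith_reasons)
print(jones_reasons)
```

The same ranked list yields different reasons for the two records, which is the slide's point that reason codes depend on both the model and the record.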
  • 25. Description Solution – Alternatives. R’s caret offers some feature selection (http://topepo.github.io/caret/featureselection.html): filter methods (univariate) and wrapper methods (recursive feature elimination, simulated annealing, genetic algorithms). Variable importance (http://topepo.github.io/caret/varimp.html): algorithm specific (9 kinds), or model-independent metrics (if classification: ROC curve analysis (univariate) per predictor; if regression: fit a linear model). With variable ranking, you still need to relate field ranking to record reason. Univariate methods do NOT cover variable interactions in the model, or non-linear effects. Model attribute: Understandable.
  • 26. Description Solution: Local Interpretable Model-agnostic Explanations (LIME). “Why Should I Trust You?” Explaining the Predictions of Any Classifier, Knowledge Discovery in Databases 2016 (August 13-17). https://arxiv.org/abs/1602.04938 (PDF), https://github.com/marcotcr/lime-experiments (Python code). Describes models locally, in terms of their variables. Minimize locality-aware loss. Model attribute: Understandable.
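For reference, the “locality-aware loss” mentioned on this slide is the objective from the cited LIME paper. The notation below follows the paper rather than the slides: f is the black-box model, g an interpretable surrogate drawn from a class G, π_x a proximity weighting around the record x being explained, and Ω(g) a complexity penalty on the surrogate.

```latex
\xi(x) = \operatorname*{argmin}_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g)
```

Read this as: pick the simplest surrogate that still tracks the black box closely in the neighborhood of x.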
  • 27. Description Solution: Local Interpretable Model-agnostic Explanations (LIME). Model attribute: Understandable.
  • 29. Putting a Model in Production. Cut out extra preprocessed variables not used in the final model. Minimize passes of the data. In many situations, I have had to RECODE prep and/or model to meet production system requirements. • BAD: recode to Oracle, move SAS to mainframe & create JCL. Could take 2 months for conversion & full QA. • GOOD: generate PMML code for the model. Build up a PMML preprocessing library, like Netflix.
  • 30. Putting a Model in Production. www.DMG.org/PMML/products
  • 32. Tracking Model Drift (easy to see with 2 input dimensions vs. score). Figure labels: Training Data; Current Scoring Data. Model attribute: General.
  • 33. Tracking Model Drift. A trained model is only as general as: the variety of behavior in the training data; the artifacts abstracted out by preprocessing. A good KDD process and variable designs make the analysis universe like the general scoring universe. Over time, the behavior represented in the scoring data “drifts” from the original training data, e.g. stock market cycles: Bull → Bear → Bull → … Model attribute: General.
  • 34. Tracking Model Drift: MODEL DRIFT DETECTOR in N dimensions. • Change in distribution of target (alert over threshold). During training, find thresholds for 10 or 20 equal frequency bins of the score. During scoring, look at key thresholds around business decisions (act vs. not). Has the % over the fixed threshold changed much? • Change in distribution of most important input fields. Diagnose CAUSES: what is changing, how much… Out of the top 25% of the most important input fields, which had the largest change in contingency table metric? Model attributes: General & Description.
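The score-distribution half of this drift detector can be sketched as follows: fix equal-frequency thresholds on the training scores, then check how far the share of current scores above a business-critical threshold has moved. The function names, the 5-point tolerance, and the synthetic scores are assumptions for illustration; the input-field diagnosis step is not shown.

```python
def equal_frequency_thresholds(train_scores, n_bins=10):
    """Score cut points that split the training scores into n_bins equal-count bins."""
    ordered = sorted(train_scores)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def fraction_over(scores, threshold):
    """Share of records at or above the (fixed) threshold."""
    return sum(1 for s in scores if s >= threshold) / len(scores)


def drift_alert(train_scores, current_scores, threshold, tolerance=0.05):
    """Return (alert, change) where change is the shift in the share over the threshold."""
    change = fraction_over(current_scores, threshold) - fraction_over(train_scores, threshold)
    return abs(change) > tolerance, change


# Synthetic example: training scores spread evenly over 0.00 .. 0.99.
train_scores = [i / 100 for i in range(100)]
thresholds = equal_frequency_thresholds(train_scores, n_bins=10)
top_decile_cut = thresholds[-1]  # the "act vs. not" line in this example

# Current scoring data has shifted upward: only the upper half of the range appears.
current_scores = train_scores[50:] * 2

alert, change = drift_alert(train_scores, current_scores, top_decile_cut)
print(top_decile_cut, alert, change)
```

Because the thresholds are frozen at training time, a jump in the share of records over the action threshold (here from 10% to 20%) is directly interpretable as a change in how many records the business would act on.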
  • 35. Tracking Model Drift (General & Description): A frequent process in companies: RETRAIN EVERY DAY. • Does yesterday's 4th of July sale training data best represent your 5th of July activity? • Have you “forgotten” past lessons that are not in yesterday's data? This is the Stability vs. Plasticity dilemma, or "learn how to play the guitar without forgetting grandmother". What about fraud cases from 6 months ago? The same issues exist in online training. • Drifting vs. forgetting? Choose robustness and transparency, whichever you do.
  • 37. Retrain, Refresh or Update DBC (General): Model Retrain (1-2 months): • Brute force: most effort, most expense, most reliable • Repeat the full data mining model training project • Re-evaluate all algorithms, preprocessing, and ensembles. Model Refresh (3-5 days): • “Minimal retraining” • Just run the final 1-3 model trainings on “fresher” data • Do not repeat exploring all algorithms and ensembles • Assume the “structure” is a reasonable solution • Go back to your prior Model Notebook and choose the best entries as a shortcut.
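The refresh path above can be sketched as follows. This is a minimal illustration, assuming the Model Notebook is available as a list of records with the parameters and holdout metric of each earlier run; `NotebookEntry`, `refresh_model`, and the `train` and `evaluate` callables are hypothetical names supplied by the caller, not part of any tool named in the slides.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class NotebookEntry:
    """One row of a (hypothetical) model notebook from the original project."""
    name: str
    params: dict
    holdout_metric: float  # higher is better


def refresh_model(
    notebook: List[NotebookEntry],
    train: Callable[[dict, Any], Any],
    evaluate: Callable[[Any, Any], float],
    fresh_train: Any,
    fresh_holdout: Any,
    top_k: int = 3,
) -> Tuple[float, str, Any]:
    """Rerun only the best top_k notebook entries on fresher data and keep the winner.

    The model structure (algorithm and parameters) is assumed to still be
    reasonable; nothing outside the notebook's finalists is explored.
    """
    finalists = sorted(notebook, key=lambda e: e.holdout_metric, reverse=True)[:top_k]
    results = []
    for entry in finalists:
        model = train(entry.params, fresh_train)
        results.append((evaluate(model, fresh_holdout), entry.name, model))
    return max(results, key=lambda r: r[0])
```

A full retrain would instead repeat the search over all algorithms, preprocessing choices, and ensembles, which is why it costs months rather than days.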
  • 39. Solution Architecture for Threat and Compliance: Lambda Architecture with Continuous Decisioning. [Architecture diagram with callouts 1-6]
  • 40. Solution Stack for Threat and Compliance: Leveraging Primarily Open Source Big Data Technologies
  • 41. Continuous Decisioning Use Case: Cyber Threat Detection & Response. Use Kamanja to detect potential cyber security breaches. Problem: • Diverse inputs: structured and unstructured data, with varying latencies • Data enrichment: long and laborious process, manual and ad hoc • Quality of threat intelligence: lots of false positives waste analyst resources • Poor integrations with response teams: manual and time-consuming process. Solution: • Ingest IP addresses, malware signatures, hash values, email addresses, etc. in real time • Automatically enrich with third party data • Check historical logs against new threats continuously • Predictive analytics based on machine learning flag suspicious activity before it becomes a problem • Direct integration with dashboards to generate alerts and speed up investigation.
  • 42. Continuous Decisioning Use Case: Application Monitoring. Use Kamanja to detect insider attacks on sensitive data. Problem: • Legacy system is batch oriented • Months required to create and implement new alerts • Slow speed-to-market developing new source system extracts; months required to assimilate new data • Risks to PII and NPI, with compliance implications. Solution: • Use an open source big data stack to migrate to real-time data streaming, rapid model deployment, and alerts with no manual intervention • Calculate the number of times PII/NPI is accessed over an eight-hour period, and calculate risk to generate alerts • Use machine learning to identify the normal pattern of out-of-office-hours access, and trigger automatic alerts when anomalies occur • Rapid implementation of new models to deal with emerging threats.
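The eight-hour access count in the solution above is a sliding-window aggregate. The sketch below shows one way to compute it in plain Python; it is not Kamanja code, the `AccessMonitor` class and its `max_accesses` limit are hypothetical, and the "risk" rule is reduced to a simple count threshold for illustration. Events are assumed to arrive in time order for each user.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class AccessMonitor:
    """Counts sensitive-data accesses per user over a sliding time window."""

    def __init__(self, max_accesses, window=timedelta(hours=8)):
        self.max_accesses = max_accesses  # assumed alert limit per window
        self.window = window
        self._events = defaultdict(deque)  # user -> access timestamps in window

    def record(self, user, when):
        """Record one access; return (count in window, alert flag)."""
        events = self._events[user]
        events.append(when)
        # Drop accesses that fell out of the window ending at `when`.
        while events[0] <= when - self.window:
            events.popleft()
        count = len(events)
        return count, count > self.max_accesses


monitor = AccessMonitor(max_accesses=2)
day = datetime(2016, 9, 20)
for hour in (9, 10, 11):
    count, alert = monitor.record("user_a", day.replace(hour=hour))
print(count, alert)  # third access within eight hours exceeds the limit of 2
```

Keeping only the timestamps inside the window bounds the memory per user and makes each update cheap, which suits a streaming setting.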
  • 43. Continuous Decisioning Use Case: Unauthorized Trading Detection. Use Kamanja to reduce the risk of rogue behavior at an investment bank. Problem: • Need timely alerting of potentially unauthorized trading activity • Must tie together voluminous data, reports, and risk measures • Meet increasingly stringent time requirements. Solution: • Create a Trader Surveillance Dashboard • Provide a holistic view of a trader, based on all relevant information about the trader, the marketplace, and peers • Build supervised and unsupervised machine learning models based on operational, transactional, and financial data • Real-time analysis and monitoring of trader activity automatically highlights unusual activity and triggers alerts on trades to investigate.
  • 44. Continuous Decisioning Use Case: Credit Card Fraud Detection. Use Kamanja to incrementally reduce fraud losses by applying multiple predictive models for transaction authorization. Problem: • $16.3 billion in credit card fraud losses annually • Fraud is growing more quickly than transaction value • New types of fraud are one step ahead of existing solutions • Dependence on third party proprietary systems means slow reaction times and expensive changes. Solution: • Apply Kamanja to IVR, web, and transactional data to trigger alerts • Initial models detect suspicious web traffic, common purchase points, and application rarity • Leverage existing infrastructure as well as existing third party systems (Falcon and TSYS) • Reduce costs by 80% with open source software.
  • 45. Summary: You can have it all: accurate, general & describable. • You may fully understand a bike, but drive a car to work (level of detail). Control and plan complexity: track it in a model notebook. • Reuse the notebook when you need to retrain • Balance accuracy and generalization in the notebook outcomes • Track business net value per model (be more competitive). Model-level and record-level description helps the model lifecycle: • Helps during model building, to improve preprocessing, DBC • Helps gain trust • Helps track model drift and degradation. Use Kamanja, a real-time decisioning engine, for production deployment.
  • 46. Thank You. Tuesday, September 20, 2016. greg@ligadata.com, www.Linkedin.com/in/GregMakowski, www.Kamanja.org (Apache open source licensed)