In part one of this two-part series, you learned some of the common reasons enterprises struggle to turn insights into actions as well as a strategy for overcoming these challenges to successfully operationalize data science. In part two, it’s time to fill in the architectural and technological details of that strategy.
Pivotal Data Scientist Megha Agarwal will share the key ingredients for successfully putting data science models into production and using them to drive actions in real time. In this webinar, you will learn about:
- Adopting extreme programming practices for data science
- The importance of working in a balanced team
- Putting machine learning models into production and maintaining them
- End-to-end pipeline design
Presenter: Megha Agarwal, Data Scientist
2. Today’s Speaker
Megha Agarwal
Data Scientist, Pivotal
Megha has been with Pivotal for two years, helping clients
ranging from startups to Fortune 500 companies identify and
deliver value through their data. She is focused on developing
smart apps by applying machine learning and statistics.
Prior to Pivotal, she worked in a credit risk department,
identifying potential delinquency and fraud patterns. She holds
a Master's degree in Machine Learning and HPC from the
University of Bristol.
3. Operationalizing Data Science: Common Pitfalls
  1. Predictive Insights are Insufficient
  2. Lack of Business Process Integration
  3. Pace of Insight Generation Mismatch
  4. Inability to Act on Perishable Insights
  5. Failing to Learn from Past Experience
4. Operationalizing Data Science: A Strategy for Success
  1. Prescriptive Insights
  2. Business Process Integration
  3. Right Insight, Right Time
  4. Software Automation
  5. Close the Analytics Loop
5. The Right Tools and Architecture
OPERATIONALIZING DATA SCIENCE
10. Our products are smart from the start
First version of the product: meets user needs, easy to use, smart
● No missed opportunities: laying the data foundation from the start allows us to easily add smart features
● Iterative without losing the bigger picture: customers expect apps to be personalised. The iterative process allows the product to learn and improve over time
11. MVM & Continuous Deployment of DS Models
● Scoping
● Data Review
● Feature Engineering
● Feature Review
● Model Building
● Model Evaluation
● Operationalization
● User Feedback
20. Digital Messaging App
Customer: A multinational banking and financial services company
Problem: Build a digital messaging app that gives customers relevant information about their finances at the appropriate time
21. Design + Data
● User Centric
● Persona Identification
● Product Features
22. Improving customer banking experience
● Unusual Direct Debits
● Scheduled Direct Debits
● Future Insufficient Funds
25. Model Nuances
● Online Learning Model
● Personalised (per customer and per DD company)
● ~18 GB of data flowing in every day
26. Data Exploration
● Begin the exploration with an end-to-end wiring discussion with the developers
● Explore the direct debit transactions to identify how the overall population behaves
● Minimum Viable Model (MVM): Median Deviation from Mean
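The Minimum Viable Model named on this slide can be sketched in a few lines. This is only an illustration under assumptions the slides do not confirm: "median deviation from mean" is read here as the median of the absolute deviations from the mean, and the function names and the threshold multiplier `k` are hypothetical.

```python
from statistics import mean, median


def fit_mvm(amounts):
    """Summarise past direct debit amounts as (mean, median deviation from mean).

    Assumption: the deviation is taken as an absolute value.
    """
    if not amounts:
        raise ValueError("need at least one historical amount")
    centre = mean(amounts)
    spread = median([abs(a - centre) for a in amounts])
    return centre, spread


def is_unusual(amount, centre, spread, k=3.0):
    """Flag an amount that lies more than k spreads from the mean.

    k is a hypothetical tuning parameter, not a value from the talk.
    """
    return abs(amount - centre) > k * spread


history = [40, 42, 38, 41, 39]
centre, spread = fit_mvm(history)
flag_large = is_unusual(45, centre, spread)
flag_typical = is_unusual(42, centre, spread)
```

A statistic this simple is cheap to compute per customer and per direct debit, which fits the point of getting a first model into production early.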
29. UDD Micro-services Pipeline
● Inputs: daily transactions (~18 GB of transaction data daily) and customer information
● Parse, enrich, filter transaction data
● DS microservice to create, score and update UDD models
● Output: UDD alert
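The parse, enrich, filter stage of this pipeline might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the CSV input format, the field names (`customer_id`, `type`, `payee`, `amount`), the `"DD"` type code, and the function names are hypothetical and do not come from the talk.

```python
import csv
import io


def parse(raw_csv):
    """Parse raw CSV text into transaction dicts with numeric amounts."""
    for row in csv.DictReader(io.StringIO(raw_csv)):
        row["amount"] = float(row["amount"])
        yield row


def enrich(transactions, customers):
    """Attach customer information; skip transactions for unknown customers."""
    for tx in transactions:
        info = customers.get(tx["customer_id"])
        if info is not None:
            yield {**tx, **info}


def only_direct_debits(transactions):
    """Keep only direct debit transactions (hypothetical type code 'DD')."""
    return (tx for tx in transactions if tx["type"] == "DD")


def run_pipeline(raw_csv, customers):
    """Chain the three stages and materialise the result."""
    return list(only_direct_debits(enrich(parse(raw_csv), customers)))


raw = (
    "customer_id,type,payee,amount\n"
    "c1,DD,GymCo,30.0\n"
    "c1,CARD,Shop,12.5\n"
    "c2,DD,PowerCo,55.0\n"
)
customers = {"c1": {"segment": "retail"}}
result = run_pipeline(raw, customers)
```

Because each stage is a generator, records stream through one at a time, which is one plausible way to handle a daily feed of this size without loading it all into memory.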
30. DS Python Micro-service
● Historical direct debits (12M): for each DD that a customer has, compute the mean and the median deviation from the mean, and store the model in the model repo
● New direct debit: retrieve the required DD model for the customer
● Decision: Stable? (Yes / No)
● Decision: Within limit? (Yes / No)
● Update model
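One possible reading of this flow, written as a small in-memory sketch. The slide does not spell out what each Yes/No branch leads to, so the choices here are assumptions: a model counts as "stable" once it has seen a minimum number of payments, an alert is raised only when a stable model sees an amount outside its limit, and the model is updated after every payment. The class names, `min_obs`, and `k` are hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean, median


@dataclass
class DDModel:
    """Per customer, per payee model: mean and median deviation from mean."""
    amounts: list = field(default_factory=list)

    def is_stable(self, min_obs=3):
        # Assumption: "stable" means enough history has been observed.
        return len(self.amounts) >= min_obs

    def within_limit(self, amount, k=3.0):
        # Assumption: the limit is k median deviations around the mean.
        centre = mean(self.amounts)
        spread = median([abs(a - centre) for a in self.amounts])
        return abs(amount - centre) <= k * spread

    def update(self, amount):
        self.amounts.append(amount)


class ModelRepo:
    """In-memory stand-in for the model repo, keyed by (customer, payee)."""

    def __init__(self):
        self._models = {}

    def get(self, customer_id, payee):
        return self._models.setdefault((customer_id, payee), DDModel())


def score_direct_debit(repo, customer_id, payee, amount):
    """Return True if the new direct debit should raise an alert."""
    model = repo.get(customer_id, payee)
    alert = model.is_stable() and not model.within_limit(amount)
    model.update(amount)  # assumption: update on every payment
    return alert


repo = ModelRepo()
warmup = [score_direct_debit(repo, "c1", "GymCo", a) for a in [40, 42, 38, 41, 39]]
spike = score_direct_debit(repo, "c1", "GymCo", 90)
new_payee = score_direct_debit(repo, "c1", "PowerCo", 500)
```

Updating the model even after an alert keeps the sketch short, but it lets an outlier shift the mean; a production service might instead wait for user feedback before absorbing a flagged payment.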
32. Key Takeaways
● Product Team vs Siloed Data Science Team
● User Centric
● Extreme Programming practices can be applied to DS, and they help ship features faster
● It's important to have an MVM up and running in production rather than waiting for the perfect model
33. Other Resources
● Scoring as a service
● Operationalising DS Models on Pivotal Stack
● API First for Data Science
● Pairing for Data Scientists
● Test Driven Development for Data Science
● Continuous Integration for Data Science