Learn How to Operationalize IoT Apps on Pivotal Cloud Foundry
1. © Copyright 2016 Pivotal. All rights reserved.
Data Science-Powered Apps for
the Internet of Things
Chris Rawles¹ and Jarrod Vawdrey²
¹ Sr. Data Scientist in New York, New York
² Sr. Data Scientist in Atlanta, Georgia
2.
‘By the year 2025, $4 to $11 trillion of
economic value could be created through
the Internet of Things.’
Michael Chui
Partner, McKinsey & Company
3.
IoT
Platform
Applications
Data Science
4.
New business models
Improve efficiencies
Personalized experiences
_____________________
Trillions $ Economic Value
5.
Today’s Speaker
Chris Rawles
Senior Data Scientist
6.
Today’s talk
1. A real-time data science app
A. The app: a live demonstration
B. How can a data scientist build a data science application?
C. Revisiting the app
2. Generalizing the framework: Solving new data science
challenges
A. Internet of Things – Creating a smart app to prevent oil spill disasters
B. Financial data - How can retail banks influence their cardholders’
behavior?
9.
Today’s talk
1. A real-time data science app
A. The app: a live demonstration
B. How can a data scientist build a data science application?
C. Revisiting the app
2. Generalizing the framework: Solving new data science
challenges
A. Internet of Things – creating a smart app
B. Financial data - How can retail banks influence their cardholders’
behavior?
10.
[Diagram: Sensor app, Training app, Scoring app, Dashboard app; Model Training as a service; Model Scoring as a service; API Call]
Data science workflow: Movement classification
1. Sensor + Dashboard
2. Redis
3. Training app
4. Scoring app
11.
here is my source code
run it on the cloud for me
i do not care how
- Onsi Fakhouri
@onsijoe
12.
cf push
• CF determines app type (Java, Python, Ruby, …)
• Installs necessary environment
• Provisions and binds data services
• Creates domain, routing, and load balancing
• Continual app health checks and restarts
13.
Data ingestion: Accelerometric data
• Accelerometric data streamed from a mobile phone at 15 Hz (15 samples per second)
• Other sensor data: gyroscopic data, magnetometer data, lon/lat, etc.
[Figure: Accelerometer axes]
14.
Data ingestion
• For real-time applications, low-latency data ingestion into the data store is essential
• WebSocket protocol - socket.io
– Mobile phone → Webserver
– Webserver → Dashboard
• socket.io → redis
[Diagram: Sensor app, Training app]
15.
Data storage
• We are using a Redis store for:
– Storing training data
– Model persistence
– Storing a micro-batch of scoring data
• Other storage systems include Pivotal Cloud Cache, GemFire, HAWQ/Hadoop, Greenplum Database, PostgreSQL, …
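The three storage roles listed on this slide can be sketched as follows. This is a minimal illustration, not the speakers' code: `InMemoryStore` is a hypothetical stand-in for a Redis client so the example runs without a server, and the key names (`train:…`, `model:…`, `score:…`), the placeholder model object, and the 30-sample micro-batch size are all assumptions. The comments name the Redis commands each method is meant to mimic.

```python
import pickle
from collections import defaultdict


class InMemoryStore:
    """Hypothetical stand-in for a Redis client (lists and string keys only)."""

    def __init__(self):
        self._lists = defaultdict(list)
        self._values = {}

    def rpush(self, key, value):
        # Append to the tail of a list, like Redis RPUSH.
        self._lists[key].append(value)
        return len(self._lists[key])

    def ltrim_last(self, key, n):
        # Keep only the n most recent items (Redis: LTRIM key -n -1).
        self._lists[key] = self._lists[key][-n:]

    def lrange_all(self, key):
        # Return the whole list (Redis: LRANGE key 0 -1).
        return list(self._lists[key])

    def set(self, key, value):
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)


MICRO_BATCH = 30  # assumed: 2 seconds of samples at 15 Hz

store = InMemoryStore()

# 1. Training data: one list per (channel, activity label); key layout is assumed.
store.rpush("train:1234:walking", (0.1, 9.7, 0.3))

# 2. Model persistence: serialize any picklable model object.
model = {"kind": "placeholder-model"}
store.set("model:1234", pickle.dumps(model))

# 3. Scoring micro-batch: append each sample, keep only the newest window.
for t in range(100):
    store.rpush("score:1234", (0.0, 9.8, float(t)))
    store.ltrim_last("score:1234", MICRO_BATCH)

window = store.lrange_all("score:1234")
restored = pickle.loads(store.get("model:1234"))
```

Trimming after every append keeps the scoring list bounded, so a scoring app only ever reads the most recent window.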
16.
Modeling
Scalable machine learning applications in Pivotal
Cloud Foundry
1. Training app
2. Scoring app
17.
Modeling – Training app
• Goal: build a data-driven model that learns the accelerometric motions associated with each activity
Training data → Feature Engineering → Machine Learning Classification Model → Trained model
Feature Engineering
• Time-domain transformations
• Fast Fourier Transform analysis
Machine Learning Classification Model
• Random Forest Model using 2 second time windows (30 samples)
…
18.
Model building
• 20 seconds per training activity
• Two second moving window on training data
• Features: time-domain summary statistics and Fourier transform coefficients
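The windowing and feature steps on this slide can be sketched in standard-library Python. The slides do not say which summary statistics or how many Fourier coefficients were used, so the choices here (mean, standard deviation, min, max, and the first five non-DC magnitudes), the one-sample step between windows, and the synthetic 2 Hz test signal are assumptions. A naive DFT stands in for an FFT library call; for 30-sample windows the result is the same.

```python
import cmath
import math
import statistics

SAMPLE_RATE_HZ = 15
WINDOW_SECONDS = 2
WINDOW_SIZE = SAMPLE_RATE_HZ * WINDOW_SECONDS  # 30 samples


def moving_windows(samples, size=WINDOW_SIZE, step=1):
    """Yield every full window of `size` samples, advancing by `step`."""
    for start in range(0, len(samples) - size + 1, step):
        yield samples[start:start + size]


def dft_magnitudes(window, n_coeffs=5):
    """Magnitudes of the first `n_coeffs` non-DC DFT coefficients (naive O(n^2))."""
    n = len(window)
    mags = []
    for k in range(1, n_coeffs + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(window))
        mags.append(abs(coeff) / n)
    return mags


def window_features(window):
    """Time-domain summary statistics followed by Fourier magnitudes."""
    time_domain = [
        statistics.fmean(window),
        statistics.pstdev(window),
        min(window),
        max(window),
    ]
    return time_domain + dft_magnitudes(window)


# 20 seconds of one axis: a synthetic 2 Hz oscillation (assumed test signal).
signal = [math.sin(2 * math.pi * 2.0 * i / SAMPLE_RATE_HZ)
          for i in range(20 * SAMPLE_RATE_HZ)]
feature_rows = [window_features(w) for w in moving_windows(signal)]
```

With 20 seconds at 15 Hz (300 samples) and a step of one sample, this yields 271 windows of 9 features each. Each row, paired with its activity label, would be one training example for a classifier such as the random forest named on the slides.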
19.
Model training approaches
1. Near-real-time model training
– Use small batches to train model
2. Real-time model training
– Online machine learning algorithm: continually update model using each new data point
3. Offline model training
– Build a model offline using batches
– Useful for models requiring finer model tuning and calibration
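To illustrate the second approach, real-time training with an online algorithm, here is a minimal sketch of a model that updates with each new data point. It uses a nearest-centroid classifier with an incremental mean, which is a deliberately simple stand-in and not the random forest used in the demo. The class name, feature vectors, and labels are hypothetical.

```python
import math


class OnlineCentroidClassifier:
    """Nearest-centroid classifier whose centroids update one sample at a time."""

    def __init__(self):
        self.centroids = {}  # label -> running mean feature vector
        self.counts = {}     # label -> number of samples seen

    def partial_fit(self, features, label):
        # Incremental mean: m_new = m_old + (x - m_old) / n
        n = self.counts.get(label, 0) + 1
        mean = self.centroids.get(label, [0.0] * len(features))
        self.centroids[label] = [m + (x - m) / n for m, x in zip(mean, features)]
        self.counts[label] = n

    def predict(self, features):
        # Return the label whose centroid is closest in Euclidean distance.
        return min(self.centroids,
                   key=lambda lbl: math.dist(features, self.centroids[lbl]))


clf = OnlineCentroidClassifier()
stream = [
    ([0.1, 0.2], "standing"),
    ([2.0, 1.8], "walking"),
    ([0.3, 0.0], "standing"),
    ([2.2, 2.2], "walking"),
]
for features, label in stream:
    clf.partial_fit(features, label)  # model is usable after every update
```

The near-real-time and offline approaches differ only in when fitting happens: they would refit on a small or large batch of stored rows instead of calling an update per sample.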
20.
PCF App: Scoring app
• Real-time model scoring
• The dashboard initiates a request via an API call and receives a model prediction
Streaming input window → Feature Engineering → Machine Learning Classification Model (trained model) → Model prediction
Feature Engineering
• Time-domain transformations
• Fast Fourier Transform analysis
Machine Learning Classification Model
• Random Forest Model using 2 second time windows (30 samples)
Example model prediction returned by the API call:
{ "channel": "1234",
  "label": "walking",
  "score": 0.746 }
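The prediction payload above could be assembled along these lines. The slides do not say how the score is computed, so this sketch assumes it is the fraction of ensemble trees voting for the winning label. The function name `score_window` and the vote list are hypothetical.

```python
import json
from collections import Counter


def score_window(channel, tree_votes):
    """Turn per-tree votes for one window into the JSON prediction payload."""
    counts = Counter(tree_votes)
    label, n_votes = counts.most_common(1)[0]
    return {
        "channel": channel,
        "label": label,
        # Assumed scoring rule: share of trees that voted for the winning label.
        "score": round(n_votes / len(tree_votes), 3),
    }


# Hypothetical votes from a 4-tree ensemble for one 30-sample window.
response = score_window("1234", ["walking", "walking", "walking", "standing"])
payload = json.dumps(response)
```

Because the function depends only on its inputs, any number of scoring app instances can serve requests interchangeably.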
21.
Scaling the model scoring application
$ cf scale -i 10 Scoring App
Scoring App
Scoring App
Scoring App
Horizontal scaling
22.
Operationalizing scalable data science applications
Model scoring as a service: Why?
1. Application auto-scaling
– As the data grows, the model scales
2. Building a model factory: evaluate many models in production
3. Application autonomy
– The model application is independent of other applications = faster development iterations
– Faster development = rapid feedback loop
4. Multiple applications can access the model scoring app
23.
Today’s talk
1. A real-time data science app
A. The app: a live demonstration
B. How can a data scientist build a data science application?
C. Revisiting the app
2. Generalizing the framework: Solving new data science
challenges
A. Internet of Things – creating a smart app
B. Financial data - How can retail banks influence their cardholders’
behavior?
25.
Today’s talk
1. A real-time data science app
A. The app: a live demonstration
B. How can a data scientist build a data science application?
C. Revisiting the app
2. Generalizing the framework: Solving new data science
challenges
A. Internet of Things – Creating a smart app to prevent oil spill disasters
B. Financial data - How can retail banks influence their cardholders’
behavior?
26.
In all industries billions of data points represent opportunities for the Internet of Things
• Gene Sequencing: cost to sequence one genome has fallen from $100M in 2001 to $10K in 2011 to $1K in 2014
• Smart Grids: reading smart meters every 15 minutes is 3000x more data intensive
• Stock Market
• Social Media: Facebook uploads 250 million photos each day
• Oil Exploration: oil rigs generate 25000 data points per second
• Video Surveillance
• Medical Imaging
• Mobile Sensors
27.
How can we use data to help prevent accidents like the Macondo disaster?
28.
…by creating a Smart Application
29.
[Diagram: Sensor app, Training app, Scoring app, Dashboard app; Model Training as a service; Model Scoring as a service; API Call]
Data science workflow: Movement classification
30.
[Diagram: Sensor app, Training app, Scoring app, Dashboard app; Model Training as a service; Model Scoring as a service; API Call]
Data science workflow: Creating a smart app to prevent oil spill disasters
• Alert operator
• Send signal to control system to change operating parameters
• Replace old machinery
• Shut down plant
31.
[Diagram: Sensor app, Training app, Scoring app, Dashboard app; Model Training as a service; Model Scoring as a service; API Call]
Data science workflow: How can retail banks influence their cardholders’ behavior?
• Provide customized services and promotions
• Next best offer
• Characterize and improve customer satisfaction
32.
Blogs on Building Data Science Apps
Blogs
• Scoring-as-a-Service To Operationalize Algorithms For Real Time
• How to Scale a Machine Learning Model Using Pivotal Cloud Foundry
• Data Science How-To: Text Analytics as a Service
crawles@pivotal.io