© Copyright 2016 Pivotal. All rights reserved.
Data Science-Powered Apps for
the Internet of Things
Chris Rawles¹ and Jarrod Vawdrey²
¹ Sr. Data Scientist in New York, New York
² Sr. Data Scientist in Atlanta, Georgia
‘By the year 2025, $4 to $11 trillion of
economic value could be created through
the Internet of Things.’
Michael Chui
Partner, McKinsey & Company
IoT
Platform
Applications
Data Science
New business models
Improve efficiencies
Personalized experiences
_____________________
Trillions $ Economic Value
Today’s Speaker
Chris Rawles
Senior Data Scientist
Today’s talk
1.  A real-time data science app
A.  The app: a live demonstration
B.  How can a data scientist build a data science application?
C.  Revisiting the app
2.  Generalizing the framework: Solving new data science
challenges
A.  Internet of Things – Creating a smart app to prevent oil spill disasters
B.  Financial data - How can retail banks influence their cardholders’
behavior?
App
Data science workflow: Movement classification
[Diagram: Sensor app, Training app (model training as a service), Scoring app (model scoring as a service, reached by API call), Dashboard app]
1.  Sensor + Dashboard
2.  Redis
3.  Training app
4.  Scoring app
here is my source code
run it on the cloud for me
i do not care how
-  Onsi Fakhouri, @onsijoe
cf push
•  CF determines app type (Java, Python, Ruby, …)
•  Installs necessary environment
•  Provisions and binds data services
•  Creates domain, routing, and load balancing
•  Continual app health checks and restarts
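When a data service is bound to an app, Cloud Foundry passes its credentials to the app as JSON in the `VCAP_SERVICES` environment variable. The following is a minimal sketch of reading them from Python. The service label `p-redis`, the instance name, and the credential keys (`host`, `port`, `password`) are assumptions for illustration; the actual names depend on the service broker.

```python
import json
import os


def find_service_credentials(vcap_json, label_fragment):
    """Return the credentials dict of the first bound service whose label
    contains label_fragment, or None if no such service is bound."""
    services = json.loads(vcap_json or "{}")
    for label, instances in services.items():
        if label_fragment in label and instances:
            return instances[0].get("credentials", {})
    return None


# Hypothetical example payload; real key names depend on the service broker.
EXAMPLE_VCAP = json.dumps({
    "p-redis": [{
        "name": "sensor-store",
        "credentials": {"host": "10.0.0.5", "port": 6379, "password": "secret"},
    }]
})

if __name__ == "__main__":
    # On Cloud Foundry the platform sets the variable; fall back to the example locally.
    creds = find_service_credentials(
        os.environ.get("VCAP_SERVICES", EXAMPLE_VCAP), "redis")
    print(creds)
```

Reading credentials from the environment keeps the same source code deployable to any space without edits, which fits the "here is my source code, run it on the cloud for me" idea.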
Data ingestion: Accelerometric data
•  Accelerometric data streamed from mobile phone at 15 Hz (15x / second)
•  Other sensor data: gyroscopic data, magnetometer data, lon/lat, etc.
[Figure: Accelerometer axes]
Data ingestion
•  For real-time applications, low-latency data ingestion into the data store is essential
•  WebSocket protocol - socket.io
–  Mobile phone → Webserver
–  Webserver → Dashboard
•  socket.io → Redis
[Diagram: Sensor app, Training app]
Data storage
•  We are using a Redis store for:
–  Storing training data
–  Model persistence
–  Storing a micro-batch of scoring data
•  Other storage systems include Pivotal Cloud Cache, GemFire, HAWQ/Hadoop, Greenplum Database, PostgreSQL, …
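The micro-batch of scoring data can be pictured as a fixed-length buffer that always holds the most recent window of samples. The sketch below uses an in-memory `collections.deque` as a stand-in for the Redis store, so it is not the deck's implementation. The class name `MicroBatch` is hypothetical, and the 30-sample length follows from the 15 Hz rate and 2-second window given in the slides. In Redis itself, a capped list (for example `LPUSH` followed by `LTRIM`) could play the same role.

```python
from collections import deque

SAMPLE_RATE_HZ = 15      # streaming rate stated in the slides
WINDOW_SECONDS = 2       # window length stated in the slides
WINDOW_SIZE = SAMPLE_RATE_HZ * WINDOW_SECONDS  # 30 samples


class MicroBatch:
    """Keeps only the newest `size` accelerometer samples per channel."""

    def __init__(self, size=WINDOW_SIZE):
        self.size = size
        self._buffers = {}

    def append(self, channel, sample):
        """Add one (x, y, z) sample; the oldest sample is dropped when full."""
        buf = self._buffers.setdefault(channel, deque(maxlen=self.size))
        buf.append(sample)

    def window(self, channel):
        """Return the full window as a list, or None until enough data has arrived."""
        buf = self._buffers.get(channel)
        if buf is None or len(buf) < self.size:
            return None
        return list(buf)
```

Returning `None` until the buffer is full lets the scoring side skip predictions during the first two seconds of a stream instead of scoring a partial window.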
Modeling
Scalable machine learning applications in Pivotal Cloud Foundry
1.  Training app
2.  Scoring app
Modeling – Training app
•  Goal: build a data-driven model that learns accelerometric motions associated with each activity
[Diagram: Training data → Feature Engineering → Machine Learning Classification Model → Trained model]
Feature Engineering
•  Time-domain transformations
•  Fast Fourier Transform analysis
Machine Learning Classification Model
•  Random Forest Model using 2-second time windows (30 samples)
Model building
•  20 seconds per training activity
•  Two-second moving window on training data
•  Features: time-domain summary statistics and Fourier transform coefficients
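The windowing and feature steps can be sketched with the Python standard library alone. This is an illustration under assumptions, not the deck's code: the slides do not say which summary statistics or how many Fourier coefficients are used, so the choices below (mean, standard deviation, min, max, and the first five non-DC magnitudes) and the one-sample step are hypothetical. A direct discrete Fourier transform stands in for the FFT; it yields the same coefficients, only more slowly.

```python
import cmath
import math
import statistics

SAMPLE_RATE_HZ = 15
WINDOW_SIZE = 30  # 2 seconds at 15 Hz, as in the slides


def sliding_windows(signal, size=WINDOW_SIZE, step=1):
    """Yield consecutive windows of `size` samples, advancing by `step`."""
    for start in range(0, len(signal) - size + 1, step):
        yield signal[start:start + size]


def dft_magnitudes(window, n_coeffs=5):
    """Normalized magnitudes of the first n_coeffs non-DC Fourier coefficients."""
    n = len(window)
    mags = []
    for k in range(1, n_coeffs + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(window))
        mags.append(abs(coeff) / n)
    return mags


def extract_features(window):
    """Time-domain summary statistics followed by frequency-domain magnitudes."""
    time_feats = [
        statistics.fmean(window),
        statistics.pstdev(window),
        min(window),
        max(window),
    ]
    return time_feats + dft_magnitudes(window)
```

With 20 seconds of data at 15 Hz (300 samples) and a step of one sample, `sliding_windows` produces 271 overlapping windows per axis, each of which becomes one training row.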
Model training approaches
1.  Near-real-time model training
–  Use small batches to train model
2.  Real-time model training
–  Online machine learning algorithm: continually update model using each new data point
3.  Offline model training
–  Build a model offline using batches
–  Useful for models requiring finer model tuning and calibration
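To make the second approach concrete, here is a minimal sketch of an online learner that updates its model with each new data point. It is a nearest-centroid classifier chosen for brevity, not the Random Forest used in the demo, and the class and method names are hypothetical.

```python
import math


class OnlineCentroidClassifier:
    """Nearest-centroid classifier whose class means are updated one sample at a time."""

    def __init__(self):
        self.counts = {}
        self.centroids = {}

    def partial_fit(self, features, label):
        """Fold a single labeled feature vector into the running mean of its class."""
        if label not in self.centroids:
            self.counts[label] = 0
            self.centroids[label] = [0.0] * len(features)
        self.counts[label] += 1
        n = self.counts[label]
        centroid = self.centroids[label]
        for i, value in enumerate(features):
            centroid[i] += (value - centroid[i]) / n  # incremental mean update

    def predict(self, features):
        """Return the label of the closest centroid (Euclidean distance)."""
        if not self.centroids:
            raise ValueError("model has not seen any training data yet")
        return min(self.centroids,
                   key=lambda label: math.dist(features, self.centroids[label]))
```

Because each update touches only one running mean, no training history has to be stored. By contrast, the near-real-time and offline approaches retrain on a stored batch.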
PCF App: Scoring app
•  Real-time model scoring
•  The dashboard initiates a request via an API call and receives a model prediction
[Diagram: Streaming input window → Feature Engineering → Machine Learning Classification Model (trained model) → Model prediction, returned via API call]
Feature Engineering
•  Time-domain transformations
•  Fast Fourier Transform analysis
Machine Learning Classification Model
•  Random Forest Model using 2-second time windows (30 samples)
{ "channel": "1234",
"label": "walking",
"score": 0.746 }
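A response of this shape could be assembled as sketched below. The function name is hypothetical, and treating `score` as the probability of the winning class (for a random forest, for instance, the share of trees voting for it) is an assumption; the slides do not define the field.

```python
import json


def build_prediction(channel, class_probabilities):
    """Pick the most probable activity and package it like the example response."""
    label = max(class_probabilities, key=class_probabilities.get)
    return {
        "channel": channel,
        "label": label,
        "score": round(class_probabilities[label], 3),
    }


if __name__ == "__main__":
    # Hypothetical class probabilities for one two-second window.
    response = build_prediction("1234", {"walking": 0.746, "standing": 0.254})
    print(json.dumps(response))
```

Echoing the `channel` back lets a dashboard that follows several phones match each prediction to the right stream.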
Scaling the model scoring application
$ cf scale APP_NAME -i 10
(APP_NAME is a placeholder for the scoring app's name.)
[Diagram: horizontal scaling across multiple Scoring App instances]
Operationalizing scalable data science applications
Model scoring as a service
Why?
1.  Application auto-scaling
–  As the data grows, the model scales
2.  Building a model factory: evaluate many models in production
3.  Application autonomy
–  The model application is independent of other applications = faster development iterations
–  Faster development = rapid feedback loop
4.  Multiple applications can access the model scoring app
App
In all industries, billions of data points represent opportunities for the Internet of Things
•  Gene sequencing: the cost to sequence one genome has fallen from $100M in 2001 to $10K in 2011 to $1K in 2014
•  Smart grids: reading smart meters every 15 minutes is 3000x more data intensive
•  Social media: Facebook uploads 250 million photos each day
•  Oil exploration: oil rigs generate 25,000 data points per second
•  Other domains shown: stock market, video surveillance, medical imaging, mobile sensors
How can we use data to help prevent accidents like the Macondo Disaster?
…by creating a Smart Application
Data science workflow: Movement classification
[Diagram: Sensor app, Training app (model training as a service), Scoring app (model scoring as a service, reached by API call), Dashboard app]
Data science workflow: Creating a smart app to prevent oil spill disasters
[Diagram: Sensor app, Training app (model training as a service), Scoring app (model scoring as a service, reached by API call), Dashboard app]
•  Alert operator
•  Send signal to control system to change operating parameters
•  Replace old machinery
•  Shut down plant
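One way to connect a model score to the actions listed above is a simple escalation table, sketched here. The thresholds are hypothetical and the action order is an assumption; neither comes from the slides, and real values would have to come from domain experts and the plant's safety procedures.

```python
# Hypothetical escalation thresholds, checked from most to least severe.
ACTIONS = [
    (0.9, "shut down plant"),
    (0.7, "send signal to control system"),
    (0.5, "alert operator"),
]


def choose_action(risk_score):
    """Return the most severe action whose threshold the risk score reaches, else None."""
    for threshold, action in ACTIONS:
        if risk_score >= threshold:
            return action
    return None
```

Keeping the table outside the model means operators can tune the thresholds without retraining or redeploying the scoring app.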
Data science workflow: How can retail banks influence their cardholders’ behavior?
[Diagram: Sensor app, Training app (model training as a service), Scoring app (model scoring as a service, reached by API call), Dashboard app]
•  Provide customized services and promotions
•  Next best offer
•  Characterize and improve customer satisfaction
Blogs on Building Data Science Apps
•  Scoring-as-a-Service To Operationalize Algorithms For Real Time
•  How to Scale a Machine Learning Model Using Pivotal Cloud Foundry
•  Data Science How-To: Text Analytics as a Service
crawles@pivotal.io
Learn How to Operationalize IoT Apps on Pivotal Cloud Foundry
