From Intelligent Transportation in Madrid to Smart Homes
in Taipei: An IoT Data Analytics architecture applicable to
multiple real world use cases
Thursday, 23 June 2016
Adnan Akbar
Institute for Communication Systems (ICS)
5G Innovation Centre (5GIC)
University of Surrey, UK
Adnan.akbar@surrey.ac.uk
Joint work with:
Paula Ta-Shma, IBM Research
Michael Factor, IBM Research
Guy Hadash, IBM Research
Juan Sancho, ATOS
What is the Internet of Things?
• “Internet of Things is based on the vision of connecting everyday objects to internet to form a cyber-physical system, where every object will be represented by its virtual representation enabling the control of physical world remotely” (F. Mattern and C. Floerkemeier)
• Connecting Everyday Objects
– Physical things containing chips/sensors
– Capture and communicate all types of data
• Virtual Representation
• Control of Physical World
– Interact with other devices, computing systems and the external environment, including people
IoT Data Analytics
• More data, more opportunities, but more challenges for analyzing and extracting knowledge from this data
• Which is the right set of tools?
• Which processing model should be used to analyze this data?
• Which analytic methods are available to get more value from this data?
Which processing model to use?
Batch Processing vs Event Processing, or Real-time vs Historical
Options for IoT data:
• Batch Processing
• Event Processing
• Complex Event Processing
• Machine Learning
• Statistical Methods
• Hybrid Solutions
Right combination of tools for IoT data?
Plethora of open source projects for storing and processing big data
(Logos shown include: Swift, Secor, Elasticsearch)
Generic IoT Architecture – Data Flow
1. Ingestion: collect historical time series data
– Collect data from devices
– Aggregate into objects
– Index and/or partition
2. Historical Data Access and Analytics: learn patterns in data
– May be time/location dependent
– Generate thresholds, classifiers etc.
3. Real-Time Data Analytics: apply what was learned to the real-time data stream
– Take action
(Diagram labels: IoT, Secor, Swift, CEP)
Proposed Solution: A Lambda Architecture for IoT
A generic IoT analytics architecture
1) Ingestion
2) Historical Data Analytics (Batch Processing)
3) Real-time Data Analytics (Event Processing)
(Diagram labels: IoT, CEP, Secor, Swift. Green flows: real time. Purple flows: batch.)
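The three stages of the proposed architecture can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the COSMOS implementation: the function names `ingest`, `learn_threshold` and `on_event` are hypothetical, an in-memory list stands in for the Kafka/Secor/Swift path, and a plain mean stands in for the learned thresholds.

```python
from statistics import mean

# Hypothetical in-memory stand-in for the historical store (Swift in the slides).
historical_store = []

def ingest(reading):
    """Stage 1 (ingestion): append every reading to the historical store."""
    historical_store.append(reading)

def learn_threshold(store):
    """Stage 2 (batch): derive a speed threshold from historical data.

    Assumption: the mean speed is a deliberately simple placeholder for
    whatever model the batch layer actually learns.
    """
    return mean(r["speed"] for r in store)

def on_event(reading, threshold):
    """Stage 3 (real time): apply the learned threshold to a single event."""
    return "ALERT" if reading["speed"] < threshold else "OK"

# Made-up readings, for illustration only.
for speed in (60, 55, 20, 65):
    ingest({"speed": speed})

threshold = learn_threshold(historical_store)   # batch path
status = on_event({"speed": 15}, threshold)     # real-time path
```

The point of the split is that the batch path can be re-run at any time to refresh `threshold` without interrupting the real-time path.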
Use Case 1: Intelligent Transportation System for Madrid Council
• Problem
– Over 3000 traffic sensors are deployed throughout the city of Madrid
– EMT needs to staff control rooms where employees manually analyze Madrid traffic sensor output. This can be slow and costly.
• Objective
– Improve customer satisfaction and reduce costs by responding more efficiently and quickly to real-time traffic problems
• Approach
– Ingest data from up to 3000 sensors into our architecture, learn patterns from historical data, apply them to real-time data using CEP, and react by alerting drivers, calling emergency vehicles, rerouting buses, modifying traffic lights, etc.
IoT Architecture – Madrid Traffic – Ingestion Flow
Aim: Collect historical time series data for analysis
– Continuously collect data from up to 3000 Madrid council traffic sensors via a web service
• Data includes traffic speeds and intensities, updated every 5 mins
– Push the messages to Kafka
– Use Secor to aggregate multiple messages into a single Swift object
• According to policy, e.g., every 60 mins
• Possibly partition the data, e.g., according to date
• Convert to Parquet format
• Annotate with metadata, e.g., min/max speed, start/end time
– Index Swift objects according to their metadata using Elasticsearch
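The aggregation and annotation step of the ingestion flow can be illustrated as follows. This is a minimal sketch using only the Python standard library; it does not call Kafka, Secor, Swift or Elasticsearch, and the message fields `ts` and `speed` as well as the function name `aggregate` are assumptions made for the example.

```python
from datetime import datetime

def aggregate(messages):
    """Batch messages into one object plus summary metadata.

    Assumption: each message is a dict with 'ts' (ISO 8601 string) and
    'speed' keys; the field names are illustrative only.
    """
    if not messages:
        raise ValueError("cannot aggregate an empty batch")
    speeds = [m["speed"] for m in messages]
    times = [datetime.fromisoformat(m["ts"]) for m in messages]
    metadata = {
        "min_speed": min(speeds),
        "max_speed": max(speeds),
        "start_time": min(times).isoformat(),
        "end_time": max(times).isoformat(),
        "partition": min(times).date().isoformat(),  # partition by date
    }
    return {"records": list(messages), "metadata": metadata}

# Made-up readings at the 5-minute update interval.
batch = [
    {"ts": "2016-06-23T08:00:00", "speed": 42},
    {"ts": "2016-06-23T08:05:00", "speed": 35},
    {"ts": "2016-06-23T08:10:00", "speed": 58},
]
obj = aggregate(batch)
```

The `metadata` dictionary is the part that would be indexed, so that later queries can decide whether an object is relevant without reading its records.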
IoT Architecture – Madrid Traffic – Data Access
Aim: Access data efficiently and cost effectively
– Store IoT data in OpenStack Swift object storage
• Open source, low cost deployment, and highly scalable
– Parquet data is accessible via Spark SQL
– Optimized predicate pushdown
• Custom Spark SQL external data source driver
• Uses object metadata indexes
• Searches for Swift objects whose min/max values overlap requested ranges
Get all data for morning traffic:
SELECT codigo, intensidad, velocidad
FROM madridtraffic
WHERE tf >= '08:00:00' AND tf <= '12:00:00'
– Brute force method: 13245 Swift requests
– Optimized predicate pushdown: 616 Swift requests
– 21.5 times improvement
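The idea behind metadata-based predicate pushdown can be shown with a small range-overlap check. The sketch below is not the custom Spark SQL driver from the slides; the `index` dictionary, the object names and the `min_tf`/`max_tf` fields are hypothetical. It also reproduces the improvement factor from the two request counts quoted above.

```python
def overlaps(meta, lo, hi):
    """True if the object's [min_tf, max_tf] range intersects [lo, hi].

    Zero-padded HH:MM:SS strings compare correctly as plain strings.
    """
    return meta["min_tf"] <= hi and meta["max_tf"] >= lo

# Hypothetical metadata index: object name -> min/max of the 'tf' column.
index = {
    "obj-0000": {"min_tf": "00:00:00", "max_tf": "05:59:59"},
    "obj-0600": {"min_tf": "06:00:00", "max_tf": "08:59:59"},
    "obj-0900": {"min_tf": "09:00:00", "max_tf": "13:59:59"},
    "obj-1400": {"min_tf": "14:00:00", "max_tf": "23:59:59"},
}

def objects_to_fetch(index, lo, hi):
    """Return only the objects that may contain rows in [lo, hi]."""
    return sorted(name for name, meta in index.items() if overlaps(meta, lo, hi))

# Same range as the morning-traffic query.
selected = objects_to_fetch(index, "08:00:00", "12:00:00")

# Request counts reported on the slide: brute force vs. pushdown.
improvement = 13245 / 616
```

Objects whose range cannot intersect the query are never requested, which is where the reduction in Swift requests comes from.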
IoT Architecture – Madrid Traffic – Machine Learning
Aim: Learn to differentiate between ‘good’ and ‘bad’ traffic
– Depends on context
• Time (morning/evening), Day (weekday/weekend)
• Location
– Use Spark MLlib k-means clustering
– Produce threshold values for real-time decision making
– Re-run algorithm when quality of clusters decreases
• Can use silhouette index to measure quality
IoT Architecture – Madrid Traffic – Machine Learning
Event Detection:
• Use Spark MLlib k-means clustering to separate data into 2 clusters
• Find the midpoint between the 2 cluster centres
• Use this midpoint to generate the thresholds
• Repeat for each context, e.g., time period (morning, afternoon, evening, night)
Anomaly Detection:
• Use a single cluster and define an anomaly to be further than a certain distance from the cluster centre
(Figure: Morning Traffic on Weekdays)
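Both detection schemes can be sketched on one-dimensional data. The slides use Spark MLlib k-means; the example below swaps in a small pure-Python version of Lloyd's algorithm for k = 2 so that it runs without Spark. The function names and the speed values are assumptions made for illustration.

```python
from statistics import mean

def two_means_1d(values, iterations=20):
    """Lloyd's algorithm for k=2 on 1-D data; returns (low_centre, high_centre)."""
    lo, hi = min(values), max(values)          # simple deterministic initialisation
    for _ in range(iterations):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not high:                           # all values identical
            break
        lo, hi = mean(low), mean(high)
    return lo, hi

def midpoint_threshold(values):
    """Event detection: threshold halfway between the two cluster centres."""
    lo, hi = two_means_1d(values)
    return (lo + hi) / 2

def is_anomaly(value, values, max_distance):
    """Anomaly detection: a single cluster whose centre is the mean."""
    return abs(value - mean(values)) > max_distance

# Illustrative weekday-morning speeds in km/h (made-up numbers).
speeds = [12, 15, 18, 20, 55, 60, 62, 65]
threshold = midpoint_threshold(speeds)
```

To cover several contexts, the same functions would be run once per context (for example per time period), giving one threshold per context.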
IoT Architecture – Madrid Traffic – Real Time Decision Making
Aim: Respond in real time to traffic conditions
– Use Complex Event Processing (CEP) approach
• Rule based
• Process events record by record
• CEP rules are typically defined manually, but in many cases it is difficult to get them right
– We automate this process and make it smart
Prediction
Proactive approach:
• Use Spark streaming linear regression to predict traffic behavior (e.g. speed, intensity) for the near future
• Apply CEP on predicted data
• Respond proactively to predicted events such as traffic congestion
– e.g. EMT can proactively re-route buses
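The proactive approach, predicting first and then applying a rule to the prediction, can be sketched as below. Note the substitution: instead of Spark streaming linear regression, this example fits an ordinary least-squares line over a short window of recent readings in pure Python. The function names, the window contents and the threshold value are all hypothetical.

```python
def fit_line(ys):
    """Ordinary least squares for y = a + b*t over t = 0..n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope

def predict_next(ys, steps_ahead=1):
    """Extrapolate the fitted line steps_ahead intervals past the last sample."""
    intercept, slope = fit_line(ys)
    return intercept + slope * (len(ys) - 1 + steps_ahead)

def congestion_rule(predicted_speed, threshold):
    """CEP-style rule evaluated on the predicted value, not the measured one."""
    return predicted_speed < threshold

# One reading per 5-minute interval (made-up numbers).
recent_speeds = [60, 55, 50, 45, 40]
predicted = predict_next(recent_speeds)
alert = congestion_rule(predicted, threshold=38.0)
```

Here the latest measured speed (40) is still above the threshold, but the predicted value is not, so the rule fires one interval earlier than it would on measured data alone.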
Use Case 2: Taipei Smart Homes
Real-time monitoring, control, and reporting of home appliances' energy usage
• Taipei test scenario comprised 50 volunteer households
• Installed with Smart Energy kit (incl. home gateway, smart plugs, and smart strips)
• Real-time energy usage
(Pictured: smart plugs, home gateway)
Goal: Real-time monitoring of appliances in order to detect anomalies
Taipei Smart Homes
• Examples of anomalies
– Short circuit of a device
– Devices being operated at unusual times
– An anomaly at night might not be an anomaly in the daytime
• Same architecture is used for monitoring energy data
– Only difference lies in the type of analytics and rules
• Historical data analytics
– Learn normal patterns from historical data
– Use CEP rules to detect deviations from normal
• Different models for different contexts
– Time of day (morning, afternoon, evening, night)
– Weekday or weekend
– Winter or summer
– Rainy or sunny
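Context-dependent anomaly detection for energy data can be sketched as one simple model per context. This is an assumption-laden illustration: the hour boundaries, the mean-plus-k-standard-deviations rule, the function names and the readings are not taken from the slides, and only two of the listed context dimensions (time of day, weekday/weekend) are modelled.

```python
from collections import defaultdict
from statistics import mean, pstdev

def context_of(hour, weekday):
    """Map a reading to a (time-of-day, day-type) context.

    Assumption: the hour boundaries below are illustrative; the slides
    do not define them.
    """
    if 6 <= hour < 12:
        period = "morning"
    elif 12 <= hour < 18:
        period = "afternoon"
    elif 18 <= hour < 23:
        period = "evening"
    else:
        period = "night"
    return period, "weekend" if weekday >= 5 else "weekday"

def learn_models(history):
    """Learn mean and standard deviation of power draw per context."""
    grouped = defaultdict(list)
    for hour, weekday, watts in history:
        grouped[context_of(hour, weekday)].append(watts)
    return {ctx: (mean(v), pstdev(v)) for ctx, v in grouped.items()}

def is_anomalous(models, hour, weekday, watts, k=3.0):
    """Rule: flag readings more than k standard deviations from normal."""
    ctx = context_of(hour, weekday)
    if ctx not in models:
        return False            # no model for this context yet
    mu, sigma = models[ctx]
    return abs(watts - mu) > k * sigma

# (hour, weekday 0=Mon..6=Sun, watts): made-up readings for one appliance.
history = [(2, 1, 5), (3, 1, 7), (2, 2, 6), (20, 1, 90), (21, 1, 110), (20, 2, 100)]
models = learn_models(history)
```

With these models, a 100 W reading is flagged at 03:00 on a weekday but not at 20:00, which mirrors the point that the same reading can be anomalous in one context and normal in another.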
Real-Time Anomaly Detection using COSMOS Data Analytics Architecture
(Diagram labels: COSMOS Data Analytics with CEP, Secor and Swift; Node-RED; devices: PC/monitor, istrip, refrigerator sensor, fan/lighting; output: real-time warning messages)
Our Architecture Applies to Many IoT Use Cases
• Healthcare
– Patient monitoring/alert/response
• Logistics
– Monitoring of sensitive goods
• Social Media
– Event detection when the number of posts is high compared to normal behavior
• Insurance
– Driver behavior and location monitoring
• Transportation
– Connected vehicles, engine diagnostics, automated service scheduling
COSMOS
Funding: EU FP7 at level of 2PY x 3 years
Started: Sept 2013
Coordinator: ATOS
Technical partners: University of Surrey, IBM, NTUA, Siemens, ATOS
Use Case Partners: Hildebrand/Camden, EMT Madrid Bus Transport/Madrid Council, III Taiwan – Smart Cities use cases
Project Vision: Enable ‘things’ to interact with each other based on shared experience, trust, reputation etc.
Thank you.
Any questions?
For more details, Email: adnan.akbar@surrey.ac.uk

COSMOS Data Analytics Architecture

  • 1. From Intelligent Transportation in Madrid to Smart Homes in Taipei: An IoT Data Analytics architecture applicable to multiple real world use cases Thursday, 23 June 2016 Adnan Akbar Institute for Communication Systems (ICS) 5G Innovation Centre (5GIC) University of Surrey, UK Adnan.akbar@surrey.ac.uk Joint work with: Paula Ta-Shma, IBM Research Michael Factor, IBM Research Guy Hadash, IBM Research Juan Sancho, ATOS
  • 2. What is Internet of Things ? • “Internet of Things is based on the vision of connecting everyday objects to internet to form a cyber- physical system, where every object will be represented by its virtual representation enabling the control of physical world remotely” (F. Mattern and C. Floerkemeier) • Connecting Everyday Objects – Physical things containing chips/ sensors – capture and communicate all types of data • Virtual Representation • Control of Physical World – interact with other devices, computing systems and the external environment, including people Thursday, 23 June 2016
  • 3. IoT Data Analytics • More Data, More opportunities, But More Challenges for analyzing and extracting knowledge from this data Thursday, 23 June 2016 Which are the right set of tools ? Which processing model should be used to analyze this data ? Which analytic methods are available to get more value from this data ? IoT Data
  • 4. Which processing Model to use ? Thursday, 23 June 2016 Batch Processing vs Event Processing or Real-time vs Historical IoT Data Batch Processing Event Processing Complex Event Processing Machine Learning Statistical Methods Hybrid Solutions
  • 5. Right combination of tools for IoT data ? Thursday, 23 June 2016 Plethora of open source projects for storing and Processing Big data SwiftSecor Elasticsearch
  • 6. Generic IoT Architecture – Data Flow Thursday, 23 June 2016 Ingestion 1. Collect historical time series data – Collect data from devices – Aggregate into objects – Index and/or partition Secor IoT Swift
  • 7. Generic IoT Architecture – Data Flow Thursday, 23 June 2016 Historical Data Access and Analytics Secor Swift 2. Learn patterns in data – May be time/location dependent – Generate thresholds, classifiers etc.
  • 8. Generic IoT Architecture – Data Flow Thursday, 23 June 2016 Real-Time Data Analytics IoT Secor CEP Swift 3. Apply what was learned on real time data stream – Take action
  • 9. Proposed Solution: A Lambda Architecture for IoT 1) Ingestion 2) Historical Data Analytics (Batch Processing) 3) Real-time Data Analytics (Event Processing) Thursday, 23 June 2016 A generic IoT Analytics architecture IoT CEP Secor Swift Green Flows: Real time Purple Flows: Batch
  • 10. Use Case 1: Intelligent Transportation System for Madrid Council • Problem • Over 3000 traffic sensors deployed through city of Madrid • EMT needs to staff control rooms where employees manually analyze Madrid traffic sensor output. This can be slow and costly. • Objective • Improve customer satisfaction and reduce costs by responding more efficiently and quickly to real- time traffic problems • Approach • Ingest data from up to 3000 sensors in to our architecture, learn patterns from historical data, apply it in real-time data using CEP and React by alerting drivers, calling emergency vehicles, rerouting buses, modifying traffic lights, etc Thursday, 23 June 2016 Today Tomorrow
  • 11. IoT Architecture – Madrid Traffic – Ingestion Flow Aim: Collect historical timeseries data for analysis – Continuously collect data from up to 3000 Madrid council traffic sensors via web service • Data includes traffic speeds and intensities, updated every 5 mins – Push the messages to Kafka – Use Secor to aggregate multiple messages into a single Swift object • According to policy, e.g., every 60 mins • Possibly partition the data, e.g. according to date • Convert to Parquet format • Annotate with metadata, e.g., min/max speed, start/end time – Index Swift objects according to their metadata using ElasticSearch Secor Swift IoT Thursday, 23 June 2016
  • 12. IoT Architecture – Madrid Traffic – Data Access Aim: Access data efficiently and cost effectively – Store IoT data in OpenStack Swift object storage • Open source, low cost deployment, and highly scalable – Parquet data is accessible via Spark SQL – Optimized predicate pushdown • Custom Spark SQL external data source driver • Uses object metadata indexes • Searches for Swift objects whose min/max values overlap requested ranges Get all data for morning traffic: SELECT codigo, intensidad, velocidad FROM madridtraffic WHERE tf >= '08:00:00' AND tf <= '12:00:00' Brute force method 13245 Swift requests Optimized predicate pushdown 616 Swift requests 21.5 times improvement Swift Thursday, 23 June 2016
  • 13. IoT Architecture – Madrid Traffic – Machine Learning Aim: Learn to differentiate between ‘good’ and ‘bad’ traffic – Depends on context • Time (morning/evening), Day (weekday/weekend) • Location – Use Spark MLlib k-means clustering – Produce threshold values for real-time decision making – Re-run algorithm when quality of clusters decreases • Can use silhouette index to measure quality Swift Thursday, 23 June 2016
  • 14. IoT Architecture – Madrid Traffic – Machine Learning Event Detection: • Use Spark MLlib k-means clustering to separate data into 2 clusters • Find the midpoint between the 2 cluster centres • Use this midpoint to generate the thresholds • Repeat for each context e.g. time period (morning, afternoon, evening, night) Anomaly Detection: • Use a single cluster and define an anomaly to be further than a certain distance from the cluster centre Morning Traffic on Weekdays Thursday, 23 June 2016
  • 15. IoT Architecture – Madrid Traffic – Real Time Decision Making Aim: Respond in real time to traffic conditions – Use Complex Event Processing (CEP) approach • Rule based • Process events record by record • CEP rules are typically defined manually but in many cases it is difficult to get them right – We automate this process and make it smart CEP IoT Prediction Proactive approach: • Use Spark streaming linear regression to predict traffic behavior (e.g. speed, intensity) for near future • Apply CEP on predicted data • Respond pro-actively to predicted events such as traffic congestion – e.g. EMT can proactively re- route buses Thursday, 23 June 2016
  • 16. Use Case 2: Taipei Smart Homes Thursday, 23 June 2016 Smart plugs Home Gateway Real-time monitoring, control, and report of home appliances energy usage • Taipei test scenario comprised of fifty 50 volunteer households • Installed with Smart Energy kit (incl. home gateway, smart plugs, and smart strips) • Real-time Energy usage Goal: Real time Monitoring of Appliances in order to detect anomalies
  • 17. Taipei Smart Homes • Example of Anomalies • Short circuit of a device • Devices being operated at unusual times • An Anomaly at night might not be an anomaly at daytime • Same Architecture is used for monitoring Energy data • Only difference lies in the type of Analytics and Rules • Historical Data Analytics • Learn normal patterns from historical data • Use CEP rules to detect the deviation from normal • Different Models for different context • Time of a day (Morning, Afternoon, Evening, Night) • Weekday or weekend • Winter or summer • Rainy or sunny Thursday, 23 June 2016
  • 18. Real-Time Anomaly detection using COSMOS Data Analytics Architecture CEP Secor Swift Node- Red 7 …… PC/monitor …… istrip Refrigerator sensor Fan / Lighting Real-time warning messages Thursday, 23 June 2016 COSMOS Data Analytics
  • 19. Our Architecture Applies to Many IoT Use Cases • Healthcare • Patient monitoring/alert/response • Logistics • Monitoring of sensitive goods • Social Media • Event detection when the number of posts is high compared to normal behavior • Insurance • Driver behavior and location monitoring • Transportation • Connected vehicles, engine diagnostics, automated service scheduling
  • 20. COSMOS Funding: EU FP7 at level of 2PY x 3 years Started: Sept 2013 Coordinator: ATOS Technical partners: University of Surrey, IBM, NTUA, Siemens, ATOS Use Case Partners: Hildebrand/Camden, EMT Madrid Bus Transport/Madrid Council, III Taiwan – Smart Cities use cases Project Vision: Enable ‘things’ to interact with each other based on shared experience, trust, reputation etc.
  • 21. Thank you. Any questions? For more details, email: adnan.akbar@surrey.ac.uk
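The threshold-learning and anomaly-detection steps described on slide 14 can be sketched in a few lines. This is a minimal illustration only: the slides use Spark MLlib k-means, whereas the sketch below swaps in a small pure-Python k-means (k = 2) so that it runs without a cluster. The function names, the sample readings (speed in km/h, intensity in vehicles/hour) and the distance limit are all hypothetical and are not taken from the Madrid data.

```python
from math import dist
from statistics import fmean


def kmeans_two(points, iterations=20):
    """Plain Lloyd's algorithm for k=2 (a stand-in for Spark MLlib k-means).

    Centres start at the lexicographically smallest and largest points,
    which keeps the sketch deterministic.
    """
    centres = [min(points), max(points)]
    for _ in range(iterations):
        groups = ([], [])
        for p in points:
            nearest = 0 if dist(p, centres[0]) <= dist(p, centres[1]) else 1
            groups[nearest].append(p)
        # Recompute each centre as the mean of its group; keep the old
        # centre if a group ends up empty.
        centres = [
            tuple(fmean(axis) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centres)
        ]
    return centres


def midpoint_thresholds(points):
    """Per-feature thresholds: the midpoint between the two cluster centres."""
    a, b = kmeans_two(points)
    return tuple((x + y) / 2 for x, y in zip(a, b))


def is_anomaly(point, normal_points, max_distance):
    """Single-cluster check: flag points too far from the cluster centre."""
    centre = tuple(fmean(axis) for axis in zip(*normal_points))
    return dist(point, centre) > max_distance


# Hypothetical (speed, intensity) readings, grouped by context.
readings_by_context = {
    "morning": [(80.0, 300.0), (78.0, 320.0), (82.0, 280.0),
                (25.0, 900.0), (30.0, 950.0), (20.0, 1000.0)],
    "night": [(95.0, 100.0), (90.0, 120.0), (92.0, 110.0),
              (60.0, 400.0), (55.0, 420.0), (65.0, 380.0)],
}

# One set of thresholds per context, as the slide suggests.
thresholds = {
    context: midpoint_thresholds(points)
    for context, points in readings_by_context.items()
}
```

In practice the features would probably need scaling before clustering, since speed and intensity live on very different ranges; the sketch skips that step for brevity.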

Editor's Notes

  1. I will start with a brief introduction to IoT. I will not go into details, as I assume that everyone here is quite familiar with the term. You will find many definitions of IoT, but this one is my personal favourite. It has three main parts. The first is connecting everyday objects, meaning any physical entity: it can be your shoe, your fridge or your bus. The second is that every object has its own virtual representation, where its properties are exposed using sensors. And last but not least is control of the physical world. In order to control the physical world, you need to understand the context and meaning of the data measured by these objects. We have heard in the last 2 days that the number of connected devices is increasing, and so is the data generated by these devices. Data is increasing not only in size but also in complexity, and data is of no value until high-level knowledge is extracted from it in order to control the physical world. So what really is the Internet of Things? It is made up of physical objects (“things”) that have chips and sensors embedded in them that allow the sensing, capturing and communication of all types of data. These devices are then linked through both wired and wireless networks to the Internet. Advanced “things” have actuators embedded in them as well, giving them the capability to interact with other devices, computing systems and the external environment, including people. IoT takes this one step further: actuation. Quantity of data and quality of solution (actuation). Sensors have existed for a long time. Think how many sensors you need to send a rocket into space. But today this is not rocket science: sensors are becoming commodities, leading to adoption on a massive scale and making new applications possible, e.g. placing large numbers of sensors in agricultural fields to measure soil humidity and nutrient levels.
  2. The advent of IoT has resulted in a trend towards more innovative and automated applications. Data is increasing not only in size but in complexity as well, and data itself is of no value until high-level knowledge is extracted from it. When we talk about extracting high-level knowledge, there are three main questions surrounding it.
  3. But in IoT, data is generated in the form of real-time events which form complex patterns, where each complex pattern represents a unique event. These unique events must be interpreted with minimal latency in order to apply them to decision making in the context of the current situation. The need for processing, analyzing and inferring from these complex patterns in near real time forms the basis of a research area called Complex Event Processing (CEP) [4]. The research area of CEP includes processing, analyzing and correlating event streams from different data sources to infer more complex events in near real time.
  4. Kafka: In our architecture, we use Apache Kafka as the message broker for events generated in real time. It is an open source tool for real-time publishing and subscribing of messages or data. It provides a scalable architecture for high-throughput data feeds with very low latency. What distinguishes Kafka from other available systems is that it persists messages for a set amount of time in the form of a log (an ordered sequence of messages). Secor: Secor is an open source tool which takes multiple messages from a Kafka topic, aggregates them together, and stores them in object storage. Until now it only supported Amazon S3 as its object storage, but we added support for OpenStack Swift as well. OpenStack Swift: The OpenStack Object Store project, known as Swift, offers cloud storage software so that data can be stored and retrieved efficiently with a simple API. It's built for scale and optimized for durability, availability, and concurrency across the entire data set. Swift is ideal for storing unstructured IoT data that can grow without bound. Parquet: Apache Parquet is a columnar storage format available to any project in the Hadoop ecosystem, regardless of the choice of data processing framework, data model or programming language. (And Elasticsearch definitions.) As it is IoT data, like traffic readings from Madrid, it consists of JSON objects with timestamps, IDs, etc., i.e. semi-structured data. In order to use Spark SQL, we want to store it in Parquet format.
  5. Apache Spark: Apache Spark™ is a fast and general engine for large-scale data processing. Spark SQL is Apache Spark's module for working with structured data.
  6. Object storage (OpenStack Swift) serves as a long-term repository for IoT data: it is scalable and relatively low cost. By adding metadata to describe what is contained in each object, together with metadata search, we can access it efficiently. Databases are often overkill for what is needed by analytics. Secor works according to a defined policy. We can define a size-based policy, to create a new object when the size reaches 1 MB, or alternatively a time-based policy, i.e. to create a new object every 60 minutes. This is what the Swift objects look like. It is actually a flat namespace, but the object names have slashes inside them, and this is basically what partitioned data looks like in systems such as Hive; it is also supported by Spark SQL. We are using the Parquet data format, which is nice for IoT data: you can do column-based compression, and if you are interested in reading only specific columns, Parquet allows that too. We extended Secor to support converting data to Parquet format. We also extended Secor to allow annotating objects with metadata: in Swift, when you create an object, you can also annotate it with metadata.
  7. Can depend on other elements of context, like weather, etc. Note: the table is for one location only.
  8. The same architecture can be used to detect events, which in this case are good and bad traffic. And the same architecture can be used to detect anomalies, which might be an accident or congestion. For detecting anomalies, we use a single cluster, and if a new point is further than a certain distance from the centre, we classify it as an anomaly.
  9. Importance of responding in time
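To make the point about responding in time concrete, the proactive approach from slide 15 (predict near-future traffic, then apply a CEP rule to the prediction) can be sketched as follows. This is only an illustration under stated assumptions: the talk uses Spark streaming linear regression and a CEP engine, whereas the sketch substitutes an ordinary least-squares fit over a small sliding window and a single hand-written rule. The class name, the event format, the window size, the prediction horizon and the speed threshold are all hypothetical.

```python
from collections import deque


class ProactiveCongestionRule:
    """Apply a threshold rule to the *predicted* speed, not the current one."""

    def __init__(self, speed_threshold, window=5, horizon=2):
        self.speed_threshold = speed_threshold  # hypothetical, in km/h
        self.horizon = horizon                  # steps ahead to predict
        self.window = deque(maxlen=window)      # most recent speed readings

    def predict(self):
        """Least-squares line through the window, extrapolated `horizon` steps."""
        n = len(self.window)
        if n < 2:
            return None  # not enough history to fit a line
        mean_t = (n - 1) / 2
        mean_y = sum(self.window) / n
        var_t = sum((t - mean_t) ** 2 for t in range(n))
        cov = sum((t - mean_t) * (y - mean_y)
                  for t, y in enumerate(self.window))
        slope = cov / var_t
        intercept = mean_y - slope * mean_t
        return slope * (n - 1 + self.horizon) + intercept

    def on_event(self, speed):
        """Process one reading; return an alert if congestion is predicted."""
        self.window.append(speed)
        predicted = self.predict()
        if predicted is not None and predicted < self.speed_threshold:
            return {"type": "PredictedCongestion", "predicted_speed": predicted}
        return None


# Speeds are falling steadily; the rule fires while the measured speed
# is still above the threshold.
rule = ProactiveCongestionRule(speed_threshold=52.5)
alerts = [rule.on_event(speed) for speed in (80.0, 74.0, 68.0, 62.0, 56.0)]
```

The design choice worth noting is that the rule itself stays unchanged; only its input moves from measured to predicted values, which is what allows a response (such as re-routing buses) before the congestion is actually observed.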