© Copyright 2014 EMC Corporation. All rights reserved.
Building a Data Analytics
PaaS for Smart Cities
Smiti Sharma, EMC Virtustream
Keith Manthey, EMC ETD
BRDC
Intelligent Communities
Cities and Regions that use technology not just to
save money or make things work better, but also to
create high-quality employment, increase citizen
participation and become great places to live and
work.
ICF – Intelligent Community Forum
“Smart Cities that use Big Data are neither about intuition nor about
looking back and analyzing what went wrong and could be better.
They spot patterns. They look forward. They predict potential crisis situations.
They find what could be better and make it better.
Smart cities don’t guess.
They are sure!”
VISION FOR CITIES OF THE FUTURE
Become an innovative city
SAFE
Anticipate risks and
protect people and
information
EFFICIENT
Optimized use of
city resources
SEAMLESS
Integrated daily
life services
IMPACTFUL
Enriched life and
business experiences
for all
IMPLICATIONS TO THE CITY
Empower the city, citizens, visitors, and businesses
IMPROVE
Quality of
urban living
CREATE
Efficient city
and transparent
government
DEVELOP
Vital
economy
REDUCE
Environmental
impact
ADDRESS
Infrastructure,
buildings and
urban planning
IMPROVE
Tourism,
recreation,
and city image
“BIG DATA” ENABLES CITIES OF THE FUTURE
Any data-set that cannot be processed with traditional systems
Social Networks, UGC
Public records
Location Data
Internet of things
Emerging Data Sources
Unstructured Data
Dark Data
Structured Data
Traditional Data Sources
BUILD SMART CITY: THE PROBLEM
Understand the city data challenge
Geo
Distributed
Data
Source
Satellite-borne
Imaging Device
Airborne
Imaging
Device
Webcam
Environmental
Monitor
Health
Monitor
Traffic
monitor
Industrial
Process
Monitor
Data Center
Centralized Storage
and Analytics Systems
City Network
DATA CHALLENGE
These systems have a massive number of endpoints.
Managing massive, heterogeneous data
becomes an enormous challenge.
Diverse data sources require normalization
and standardization to address data
orchestration and integration.
DATA USAGE CHALLENGE
How can we build innovative businesses
that use this data to create more value
and make full use of current investments?
Architecture
Guiding Principles
Agile
Open
Portable
Extensible
Modular/Ftl. Blocks
Analytics
Driven
Software
Defined
Architecture
Ingestion Layer
Spring XD
Transformation Layer
Python
/Transformed_Files
KPIs
Metrics
exploration
Maps & Graphs
Visualization Layer
API
Open Data
Data Integration Layer
Python
/Transformed_Files
Schema and
Instance
Alignment
Data Sources
GUIs, Dashboard that access the
underlying databases and promote an
excellent User experience
Data modelling, metrics, ETL
mechanisms, definitions and
variable selection
Proprietary and
Open Data
sources
APIs to expose data
Analytics
Data mining prediction
Ingestion
Ingestion Layer
Spring XD
Data Sources
Transformation
Transformation Layer
Python
/Transformed_Files
Data Cleaning Conversion
Integration
Schema
Integration
Instance
Alignment
Integration Layer
Python
/Transformed_Files
Schema and
Instance
Alignment
Integration
• Schema DB (INPUT)
• Schema Matching (Algorithms &
Heuristics)
• Suggest Attribute Mappings
(OUTPUT : SEMI-SUPERVISED)
• Instances of DB tables & Integration Rules (INPUT)
• Deduplication, Record Consolidation (Algorithms & Heuristics)
• Instance alignment using a two-phase pass algorithm to avoid
duplicate insertion (in a semi-supervised data integration
tool)
• Attribute name similarity: fuzzy string comparisons
(cosine similarity)
• Levenshtein similarity: Categorical/String Data
• http://pgsimilarity.projects.pgfoundry.org/
• List of deduplicated instances (OUTPUT - SEMI-SUPERVISED)
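The two similarity measures named above can be sketched in plain Python. This is a minimal illustration, not the presenters' implementation: the function names, the use of character bigrams for the cosine measure, and the 0.5 threshold are all assumptions made for the example. The suggestions it returns are meant to be confirmed by a person, in line with the semi-supervised output described on the slide.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Normalize edit distance to [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def bigrams(s: str) -> Counter:
    """Character bigram counts (assumed tokenization for attribute names)."""
    s = s.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the two bigram count vectors."""
    va, vb = bigrams(a), bigrams(b)
    dot = sum(va[g] * vb[g] for g in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0


def suggest_mappings(source_attrs, target_attrs, threshold=0.5):
    """Return (source, target, score) suggestions for a human to confirm."""
    if not target_attrs:
        return []
    suggestions = []
    for s in source_attrs:
        best = max(target_attrs, key=lambda t: cosine_similarity(s, t))
        score = cosine_similarity(s, best)
        if score >= threshold:
            suggestions.append((s, best, round(score, 3)))
    return suggestions
```

For example, `suggest_mappings(["bus_id"], ["busid", "latitude"])` proposes mapping `bus_id` to `busid`, while `levenshtein_similarity` can be applied to categorical or string values during instance alignment.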
Schema
Integration
Instance
Alignment
Integration Layer
Python
/Transformed_Files
Schema and
Instance
Alignment
Integration
Schema
Integration
Instance
Alignment
Integration Layer
Python
/Transformed_Files
Schema and
Instance
Alignment
Deduplication
Similarity Join
Attribute
mapping (Insert)
Attribute
mapping (Select)
Cosine
Levenshtein
Visualization
KPIs
Metrics
exploration
Maps & Graphs
Visualization Layer
API implementation
API
Open Data
Example use Case
Transportation
• Available Data:
– City bus movement information from on-board devices
(lat-long, time, date, bus line, bus ID)
• Goals:
– Predict the arrival time at a bus stop
• Challenges
– Lack of data in certain areas of the city
– GPS precision
Prediction of bus arrival
Architecture
Bus GPS
Gemfire XD
Routes & Bus Stop
Data Lake
Streaming
Scheduler
Lazy-write
GPS
• Use each bus stop as a node and the streets
between stops as edges
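Transportation Network
Figure labels: nodes $x_i$, $x_j$; edge labels $a_{ij} = +1$, $a_{ij} = -1$
$X = \{x_1, \ldots, x_N\}, \quad x_i = (\mathrm{lat}(i), \mathrm{long}(i))$
$E = \{e_1, \ldots, e_M\}, \quad e_k = (x_i, x_j)$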
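A small sketch of how such a network could be held in memory, with stops as nodes and consecutive stops on a route as directed edges. The `Stop` class, the `build_network` helper, and the route dictionary are hypothetical names for illustration. Edge length here is the straight-line (haversine) distance between stops, which is an assumption: the actual street distance would normally be longer.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class Stop:
    """A node x_i of the network: a bus stop with its coordinates."""
    stop_id: str
    lat: float
    lon: float


def haversine_km(a: Stop, b: Stop) -> float:
    """Great-circle distance between two stops, in kilometres."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def build_network(stops, routes):
    """Build directed edges e_k = (x_i, x_j) from ordered stop lists.

    routes maps a bus line to its ordered list of stop ids.
    Returns {(from_id, to_id): length_km}.
    """
    by_id = {s.stop_id: s for s in stops}
    edges = {}
    for stop_ids in routes.values():
        for i, j in zip(stop_ids, stop_ids[1:]):
            edges[(i, j)] = haversine_km(by_id[i], by_id[j])
    return edges
```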
• The goal is to find, for each edge $e_j$, an estimate of the
average speed at an instant $t$, $v(e_j, t)$.
• Default model - estimate the velocity on each edge,
using historical data from the last month.
– Different hourly models for each day of the week
• Online model - use real-time data to calculate the
speed.
The model
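The two models can be sketched as follows. This is an illustrative outline under stated assumptions, not the presenters' code: the observation tuple layout, the function names, the `fallback_kmh` value, and the rule "use recent speeds when any exist, otherwise the default model" are all hypothetical. Speeds are assumed to be positive.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def fit_default_model(observations):
    """Default model: mean speed per (edge, weekday, hour).

    observations is an iterable of (edge, timestamp, speed_kmh),
    e.g. one month of historical data.
    """
    buckets = defaultdict(list)
    for edge, ts, speed in observations:
        buckets[(edge, ts.weekday(), ts.hour)].append(speed)
    return {key: mean(vals) for key, vals in buckets.items()}


def speed_estimate(model, edge, ts, recent_speeds=None, fallback_kmh=15.0):
    """Estimate v(e, t): online model if recent data exists, else default."""
    if recent_speeds:
        return mean(recent_speeds)  # online model from real-time data
    return model.get((edge, ts.weekday(), ts.hour), fallback_kmh)


def eta_minutes(model, path, lengths_km, ts, recent=None):
    """Travel time along a list of edges, summing length / speed per edge."""
    recent = recent or {}
    total_hours = 0.0
    for edge in path:
        v = speed_estimate(model, edge, ts, recent.get(edge))
        total_hours += lengths_km[edge] / v
    return total_hours * 60.0
```

With a 5 km edge and a default speed of 25 km/h, `eta_minutes` gives 12 minutes; if real-time data reports 50 km/h on that edge, the estimate drops to 6 minutes.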
Average speed(km/h)
Default Model
Prediction of Bus Arrival
• Building on the data from the previous use case,
extrapolate to verify the quality of the service
• Each bus trip must be identified in order to evaluate the time
interval between two buses of the same line at each
bus stop.
Another use case - Auditing
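The interval between buses of the same line at a stop (the headway) could be computed along these lines. The arrival tuple layout and both function names are assumptions for illustration, and the sketch presumes trips have already been identified so that each tuple is one bus passing one stop.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def headways(arrivals):
    """Time gaps between consecutive arrivals per (line, stop).

    arrivals is an iterable of (line, stop_id, bus_id, arrival_time).
    Returns {(line, stop_id): [timedelta, ...]} in chronological order.
    """
    by_key = defaultdict(list)
    for line, stop_id, _bus_id, t in arrivals:
        by_key[(line, stop_id)].append(t)
    result = {}
    for key, times in by_key.items():
        times.sort()
        result[key] = [b - a for a, b in zip(times, times[1:])]
    return result


def late_gaps(headway_map, max_gap):
    """Keep only the gaps that exceed the audited service interval."""
    flagged = {}
    for key, gaps in headway_map.items():
        over = [g for g in gaps if g > max_gap]
        if over:
            flagged[key] = over
    return flagged
```

An auditor could then call `late_gaps(headways(arrivals), timedelta(minutes=15))` to list the stops where a line exceeded a 15-minute interval; the 15-minute figure is only an example.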
Screen shot - Auditing
Data Quality Issues
Route A
Route B
Bus GPS
Extending
PaaS for
Smart Cities
Real-time
Dashboard
Personal
Dashboard
Government
Transactional
Applications
Commercial
Big Data
Application
Government
Big Data
Application
Commercial
Transactional
Applications
Unified
Control
Center
Application
Layer
Security
Rules
Payment
Gateway
Trust
Authentication
Identity
Management
Locations &
Mapping
Platform as
a Service
Data
Governance
DATA ANALYTICS TOOLS Historic & Predictive/DATA APIs
Transactional
Data Store
Data
Transformation
Unstructured
Data
Structured
Data
City
Semantics
Audit
Open Standards Data Ingestion Interfaces and Storage
CITY IoT INFRASTRUCTURE CITY DATA SOURCES CITY ICT INFRASTRUCTURE
Government
Devices
Commercial
Devices
Utility
Devices
Personal
Devices
IoT Data Aggregation
Government
Systems
Social
Media
Commercial
Systems
Archived
Data
Fixed &
Wireless
Networks
Cloud
Services
Enablement
Layer
Data
Orchestration
Layer
Infrastructure
Layer
SECURITY
Smart City Platform requirements
High level Smart City Platform components
PCF Pivotal Cloud Foundry
EMC STORAGE
ISILON and/or
CLOUD NATIVE SOFTWARE DEFINED STORAGE
VMWARE vRealize Cloud Suite & BIG DATA
EXTENSIONS
PIVOTAL BIG DATA SUITE
ADVANCED ANALYTICS
APPLICATIONS
AT SCALE
DATA
PROCESSING
GREENPLUM
DATABASE
HAWQ
SPRING XD
SPARK
REDIS
RABBITMQ
GEMFIRE
HADOOP
Pivotal GPDB Delivers
• Massively Parallel Analytics Performance
• In-Database Analytical Extensions
• Industry-Leading Load Speed
• Rich SQL with Schema Agnosticism
• Industry-Leading Workload Mgmt.
• SAS Acceleration Options
• Parallel Co-Processing with Hadoop
• No-Forklift Scalability
• Multi-Level Redundancy
• Rich, Easy-to-Use Administration Tools
• Big Data Backup
• Comprehensive Security
• Software-only or DCA
Simple to manage
Single file system, single volume, global namespace
Massively scalable
Scales from 16 TB to over 50 PB in a single cluster
200GB/s throughput, 3.75M IOPS
Unmatched efficiency
Over 80% storage utilization, automated tiering and SmartDedupe
Enterprise data protection
Efficient backup and disaster recovery, and N+1 thru N+4 redundancy
Robust security and compliance options
RBAC, Access Zones, WORM data security, File System Auditing
Data At Rest Encryption with SEDs, STIG hardening
CAC/PIV Smartcard authentication, FIPS OpenSSL support
Operational flexibility
Multi-protocol support including NFS, SMB, HTTP, FTP and HDFS
Object and Cloud computing including OpenStack Swift
Isilon Scale-Out NAS
Lots of Little Files
Hadoop Impact on Telemetry
AKA - Small Files Problem for Hadoop
Rio Smart Sensors - ESRI
NameNode = 512 GB of RAM
Each file consumes about 1 KB of NameNode RAM
512 GB / 1 KB = at most about 500M files,
assuming no other processes on the
box.
Rio has 12.5K sensors for the 2016
Olympics. Assuming each sensor
sends a file every minute, that is 18M files in
1 day.
EMC believes in storing metadata
on SSD. This lets metadata scale out
and gets around the file-growth
limits of the scale-up NameNode.
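The arithmetic on this slide can be checked with a few lines of Python. The figures come from the slide; treating GB and KB as binary units (GiB, KiB) is an assumption, and the "days until full" value is a derived illustration that ignores everything else stored on the cluster.

```python
GIB = 1024 ** 3
KIB = 1024

# NameNode capacity: 512 GB of RAM at roughly 1 KB per file.
namenode_ram_bytes = 512 * GIB
bytes_per_file = 1 * KIB
max_files = namenode_ram_bytes // bytes_per_file  # 536,870,912 (about 500M)

# Sensor load: 12.5K sensors, one file per sensor per minute.
sensors = 12_500
files_per_sensor_per_day = 24 * 60
files_per_day = sensors * files_per_sensor_per_day  # 18,000,000

# Derived: how long until the NameNode limit is reached at this rate.
days_until_full = max_files / files_per_day  # a little under 30 days

print(f"max files: {max_files:,}")
print(f"files per day: {files_per_day:,}")
print(f"days until full: {days_until_full:.1f}")
```

At one file per sensor per minute, the sensor feed alone would exhaust the NameNode's file capacity in about a month, which is the small-files problem the slide describes.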
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLDataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...DataWorks Summit/Hadoop Summit
 

More from DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 

Recently uploaded

Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 

Recently uploaded (20)

Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 

Building a Data Analytics PaaS for Smart Cities

  • 1. Building a Data Analytics PaaS for Smart Cities. Smiti Sharma, EMC Virtustream; Keith Manthey, EMC ETD. BRDC. © Copyright 2014 EMC Corporation. All rights reserved.
  • 2. Intelligent Communities: Cities and regions that use technology not just to save money or make things work better, but also to create high-quality employment, increase citizen participation and become great places to live and work. (ICF, Intelligent Community Forum)
  • 3. “Smart Cities that use Big Data are neither about intuition nor about looking back and analyzing what went wrong and could be better. They spot patterns. They look forward. They predict potential crisis situations. They find what could be better and make it better. Smart cities don’t guess. They are sure!”
  • 4. VISION FOR CITIES OF THE FUTURE: Become an innovative city. SAFE: Anticipate risks and protect people and information. EFFICIENT: Optimized use of city resources. SEAMLESS: Integrated daily life services. IMPACTFUL: Enriched life and business experiences for all.
  • 5. IMPLICATIONS TO THE CITY: Empower the city, citizens, visitors, and businesses. IMPROVE: Quality of urban living. CREATE: Efficient city and transparent government. DEVELOP: Vital economy. REDUCE: Environmental impact. ADDRESS: Infrastructure, buildings and urban planning. IMPROVE: Tourism, recreation, and city image.
  • 6. “BIG DATA” ENABLES CITIES OF THE FUTURE: Any data set that cannot be processed with traditional systems. Social Networks, UGC; Public records; Location Data; Internet of Things; Emerging Data Sources; Unstructured Data; Dark Data; Structured Data; Traditional Data Sources.
  • 7. BUILD SMART CITY: THE PROBLEM. Understand the city data challenge. Geo-distributed data sources: satellite-borne imaging device, airborne imaging device, webcam, environmental monitor, health monitor, traffic monitor, industrial process monitor. Data center: centralized storage and analytics systems. City network. DATA CHALLENGE: These systems have a massive number of endpoints, so managing massive, heterogeneous data becomes an enormous challenge. Diverse data sources require normalization and standardization to support data orchestration and integration. DATA USAGE CHALLENGE: How can innovative businesses be built on this data to create more value and make full use of current investment?
  • 8. Architecture
  • 9. Guiding Principles: Agile; Open; Portable; Extensible; Modular/Ftl. Blocks; Analytics Driven; Software Defined
  • 10. Architecture. Ingestion Layer: Spring XD. Transformation Layer: Python, /Transformed_Files. Visualization Layer: KPIs, metrics exploration, maps and graphs. API: Open Data. Data Integration Layer: Python, /Transformed_Files, schema and instance alignment. Data Sources. GUIs and dashboards that access the underlying databases and promote an excellent user experience. Data modelling, metrics, ETL mechanisms, definitions and variable selection. Proprietary and open data sources. APIs to expose data. Analytics: data mining, prediction.
  • 11. Ingestion. Ingestion Layer: Spring XD. Data Sources.
  • 12. Transformation. Transformation Layer: Python, /Transformed_Files. Data cleaning, conversion.
  • 13. Integration. Schema integration, instance alignment. Integration Layer: Python, /Transformed_Files, schema and instance alignment.
  • 14. Integration • Schema DB (INPUT) • Schema matching (algorithms and heuristics) • Suggested attribute mappings (OUTPUT: SEMI-SUPERVISED) • Instances of DB tables and integration rules (INPUT) • Deduplication, record consolidation (algorithms and heuristics) • Instance alignment using a two-phase pass algorithm to avoid duplicate insertion in a semi-supervised data integration tool • Attribute name similarity: fuzzy string comparisons (cosine similarity) • Levenshtein similarity: categorical/string data • http://pgsimilarity.projects.pgfoundry.org/ • List of deduplicated instances (OUTPUT: SEMI-SUPERVISED). Schema integration, instance alignment. Integration Layer: Python, /Transformed_Files, schema and instance alignment.
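The two similarity measures named on slide 14 can be sketched in plain Python. This is only an illustration under stated assumptions: the deck does not show its implementation (it points to the pg_similarity PostgreSQL extension), and the character-bigram vectorisation, the 0.5 threshold and the `suggest_mappings` helper are hypothetical choices made for this example.

```python
from collections import Counter
from math import sqrt


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the classic two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # drop a character of a
                               current[j - 1] + 1,       # add a character of b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Normalise the distance to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def bigrams(s: str) -> Counter:
    """Character bigram counts (an assumed vectorisation, not from the deck)."""
    s = s.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def cosine_similarity(a: str, b: str) -> float:
    va, vb = bigrams(a), bigrams(b)
    dot = sum(va[g] * vb[g] for g in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def suggest_mappings(source_cols, target_cols, threshold=0.5):
    """Hypothetical helper: propose the best-matching target column per source column.

    The suggestions are meant to be reviewed by a person (semi-supervised).
    """
    suggestions = {}
    for s in source_cols:
        best = max(target_cols, key=lambda t: cosine_similarity(s, t))
        if cosine_similarity(s, best) >= threshold:
            suggestions[s] = best
    return suggestions
```

For example, `suggest_mappings(["bus_id"], ["busid", "latitude"])` proposes mapping `bus_id` to `busid`, which an operator would then confirm or reject.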
  • 15. Integration. Schema integration, instance alignment. Integration Layer: Python, /Transformed_Files, schema and instance alignment. Deduplication, similarity join. Attribute mapping (insert), attribute mapping (select). Cosine, Levenshtein.
  • 16. Visualization. KPIs, metrics exploration, maps and graphs. Visualization Layer.
  • 17. API implementation. API: Open Data.
  • 18. Example use case: Transportation
  • 19. Prediction of bus arrival • Available data: city bus movement information from on-board devices (lat-long, time, date, bus line, bus ID) • Goal: predict the time of arrival at a bus stop • Challenges: lack of data in certain areas of the city; GPS precision
  • 20. Architecture. Bus GPS; GemFire XD; Routes & Bus Stop; Data Lake; Streaming; Scheduler; Lazy-write; GPS.
  • 21. Transportation Network • Use each bus stop as a node and the streets as edges. Nodes: $X = \{x_1, \dots, x_N\}$, with $x_i = (\mathrm{lat}(i), \mathrm{long}(i))$. Edges: $E = \{e_1, \dots, e_M\}$, with $e_k = (x_i, x_j)$. Diagram labels: $x_i$, $x_j$, $a_{ij} = +1$, $a_{ij} = -1$.
  • 22. The model • The goal is to find, for each edge $e_j$, an estimate of the average speed at an instant $t$, $v(e_j, t)$. • Default model: estimate the velocity on each edge using historical data from the last month, with different hourly models for each day of the week. • Online model: use real-time data to calculate the speed.
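A minimal sketch of how the default and online models on slide 22 might fit together, assuming historical observations arrive as `(edge_id, timestamp, speed_kmh)` tuples and a route is a list of `(edge_id, length_km)` pairs. These data shapes, the function names and the "prefer live data, else fall back" rule are assumptions for illustration; the deck does not show how the two models are combined.

```python
from collections import defaultdict
from datetime import datetime


def build_default_model(observations):
    """Average historical speed per (edge, weekday, hour).

    observations: iterable of (edge_id, timestamp, speed_kmh).
    Returns {(edge_id, weekday, hour): mean speed in km/h}.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for edge_id, ts, speed_kmh in observations:
        key = (edge_id, ts.weekday(), ts.hour)
        totals[key][0] += speed_kmh
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def speed(model, edge_id, t, live_speeds=None):
    """v(e_j, t): use a real-time value when one exists, else the default model."""
    if live_speeds and edge_id in live_speeds:
        return live_speeds[edge_id]
    return model.get((edge_id, t.weekday(), t.hour))


def eta_hours(model, route, t, live_speeds=None):
    """Travel time in hours along route, or None if an edge has no usable speed.

    Simplification: every edge is evaluated at the same instant t.
    """
    total = 0.0
    for edge_id, length_km in route:
        v = speed(model, edge_id, t, live_speeds)
        if not v:  # no data for this edge and hour, or a zero speed
            return None
        total += length_km / v
    return total
```

Returning `None` when an edge has no data makes the "lack of data in certain areas of the city" challenge explicit instead of hiding it behind a guessed speed.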
  • 23. Default Model. Average speed (km/h).
  • 24. Prediction of Bus Arrival
  • 25. Another use case: Auditing • Based on the information from the previous use case, extrapolate to verify the quality of the service. • Each bus trip must be identified in order to evaluate the time interval between two buses of the same line at each bus stop.
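The interval check described for the auditing use case can be sketched as follows, assuming arrivals have already been derived as `(line, stop_id, bus_id, arrival_time)` records. That record shape and the `late_gaps` helper are hypothetical; the deck does not describe how trips are identified from raw GPS data.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def headways(arrivals):
    """Time between consecutive buses of the same line at each stop.

    arrivals: iterable of (line, stop_id, bus_id, arrival_time).
    Returns {(line, stop_id): [timedelta, ...]} in chronological order.
    """
    by_stop = defaultdict(list)
    for line, stop_id, _bus_id, when in arrivals:
        by_stop[(line, stop_id)].append(when)
    result = {}
    for key, times in by_stop.items():
        times.sort()
        result[key] = [later - earlier for earlier, later in zip(times, times[1:])]
    return result


def late_gaps(arrivals, max_gap):
    """Hypothetical audit rule: (line, stop) pairs whose largest gap exceeds max_gap."""
    return sorted(key for key, gaps in headways(arrivals).items()
                  if gaps and max(gaps) > max_gap)
```

With a 15-minute target, a stop that saw buses of one line at 08:00, 08:25 and 08:35 would be flagged because of the 25-minute gap.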
  • 26. Screenshot: Auditing
  • 27. Data Quality Issues. Route A, Route B, Bus GPS.
  • 28. Extending PaaS for Smart Cities
  • 29. Smart City Platform requirements. Real-time Dashboard Personal Dashboard Government Transactional Applications Commercial Big Data Application Government Big Data Application Commercial Transactional Applications Unified Control Center Application Layer Security Rules Payment Gateway Trust Authentication Identity Management Locations & Mapping Platform as a Service Data Governance DATA ANALYTICS TOOLS Historic & Predictive/DATA APIs Transactional Data Store Data Transformation Unstructured Data Structured Data City Semantics Audit Open Standards Data Ingestion Interfaces and Storage CITY IoT INFRASTRUCTURE CITY DATA SOURCES CITY ICT INFRASTRUCTURE Government Devices Commercial Devices Utility Devices Personal Devices IoT Data Aggregation Government Systems Social Media Commercial Systems Archived Data Fixed & Wireless Networks Cloud Services Enablement Layer Data Orchestration Layer Infrastructure Layer SECURITY
  • 30. High level Smart City Platform components. Real-time Dashboard Personal Dashboard Government Transactional Applications Commercial Big Data Application Government Big Data Application Commercial Transactional Applications Unified Control Center Application Layer Security Rules Payment Gateway Trust Authentication Identity Management Locations & Mapping Platform as a Service Data Governance DATA ANALYTICS TOOLS Historic & Predictive/DATA APIs Transactional Data Store Data Transformation Unstructured Data Structured Data City Semantics Audit Open Standards Data Ingestion Interfaces and Storage CITY IoT INFRASTRUCTURE CITY DATA SOURCES CITY ICT INFRASTRUCTURE Government Devices Commercial Devices Utility Devices Personal Devices IoT Data Aggregation Government Systems Social Media Commercial Systems Archived Data Fixed & Wireless Networks Cloud Services Enablement Layer Data Orchestration Layer Infrastructure Layer SECURITY. Components: PCF (Pivotal Cloud Foundry); EMC STORAGE: ISILON and/or CLOUD NATIVE SOFTWARE DEFINED STORAGE; VMWARE vRealize Cloud Suite & BIG DATA EXTENSIONS; PIVOTAL BIG DATA SUITE: ADVANCED ANALYTICS, APPLICATIONS AT SCALE, DATA PROCESSING; GREENPLUM DATABASE; HAWQ; SPRING XD; SPARK; REDIS; RABBITMQ; GEMFIRE; HADOOP
  • 31. Pivotal GPDB Delivers • Massively Parallel Analytics Performance • In-Database Analytical Extensions • Industry-Leading Load Speed • Rich SQL with Schema Agnosticism • Industry-Leading Workload Mgmt. • SAS Acceleration Options • Parallel Co-Processing with Hadoop • No-Forklift Scalability • Multi-Level Redundancy • Rich, Easy-to-Use Administration Tools • Big Data Backup • Comprehensive Security • Software-only or DCA
  • 32. Isilon Scale-Out NAS. Simple to manage: single file system, single volume, global namespace. Massively scalable: scales from 16 TB to over 50 PB in a single cluster; 200 GB/s throughput, 3.75M IOPS. Unmatched efficiency: over 80% storage utilization, automated tiering and SmartDedupe. Enterprise data protection: efficient backup and disaster recovery, and N+1 through N+4 redundancy. Robust security and compliance options: RBAC, Access Zones, WORM data security, File System Auditing; Data At Rest Encryption with SEDs, STIG hardening; CAC/PIV Smartcard authentication, FIPS OpenSSL support. Operational flexibility: multi-protocol support including NFS, SMB, HTTP, FTP and HDFS; object and cloud computing including OpenStack Swift.
  • 33. Lots of Little Files: Hadoop Impact on Telemetry (AKA the Small Files Problem for Hadoop). Rio Smart Sensors - ESRI. NameNode = 512 GB of RAM. Each file eats away 1K in RAM. 512 GB / 1K = at most 500M files, assuming no other processes on the box. Rio has 12.5K sensors for the 2016 Olympics. Assuming each sensor sends a file every minute, that is 18M files in 1 day. EMC believes in storing metadata on SSD. This allows the NameNode to scale out and get around the file-growth limitations of the scale-up NameNode.
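The arithmetic on slide 33 can be reproduced in a few lines. The inputs are the slide's own figures; treating "GB" and "1K" as binary units (GiB, KiB) is an assumption, which is why the file limit comes out slightly above the slide's rounded "500M". The days-until-full figure is derived here and is not stated in the deck.

```python
# Figures taken from the slide; binary units are an assumption.
NAMENODE_RAM_BYTES = 512 * 1024**3   # 512 GB of NameNode RAM
BYTES_PER_FILE = 1024                # roughly 1K of RAM per file
SENSORS = 12_500                     # Rio sensors for the 2016 Olympics
FILES_PER_SENSOR_PER_DAY = 24 * 60   # one file per sensor per minute

# Upper bound on files, assuming the NameNode heap holds nothing else.
max_files = NAMENODE_RAM_BYTES // BYTES_PER_FILE

# Daily file count produced by the sensor fleet.
files_per_day = SENSORS * FILES_PER_SENSOR_PER_DAY

# Days of telemetry before the NameNode's memory is exhausted.
days_until_full = max_files / files_per_day

print(f"max files:       {max_files:,}")
print(f"files per day:   {files_per_day:,}")
print(f"days until full: {days_until_full:.1f}")
```

Under these assumptions the sensor fleet alone would exhaust the NameNode in about a month, which is the motivation the slide gives for moving metadata off a single scale-up NameNode.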