SlideShare a Scribd company logo
1 of 19
1© Cloudera, Inc. All rights reserved.
Smarter Decisions in Less Time
Operational Analytics with Cloudera
2© Cloudera, Inc. All rights reserved.
Trends in the Market
“5 percent more productive and 6
percent more profitable than other
companies.”
Source: McKinsey & Company
1 millisecond advantage can be
worth $100 million to a major
brokerage firm.
Source: Information Week
80% of CEOs said analytics is a
strategic objective for 2015.
Source: PWC CEO Survey
Data Driven Pays Cost of Latency Strategic Direction
Trends Driving Change
3© Cloudera, Inc. All rights reserved.
Trends in the Market
Trends Driving Change
Operational analytic applications
increase data returns.
4© Cloudera, Inc. All rights reserved.
Operational Analytics (OA) Overview
Automating analytics pipelines providing individuals the
right information at the right time.
(AKA: Recommendation engines, CEP solutions, Rules Engines)
5© Cloudera, Inc. All rights reserved.
Operational Analytics (OA) Overview
Serving product, content, or
services recommendations
(Wibi, Oryx)
Flagging outliers based on past
behaviors
(SIEM, Blueprint)
Running aggregates for large
scale report serving
(SAS)
Recommendation Engine Event Detection Model Scoring
6© Cloudera, Inc. All rights reserved.
Operationalizing Reports, Models, or Rules
Recommendation
Engine
Event
Detection
Model
Scoring
Point Solutions
Custom Development 3rd Party
Data Discovery
& Analytics
7© Cloudera, Inc. All rights reserved.
Custom Development Use Cases
Recommendation
Engine
Event
Detection
Model
Scoring
Fraud Detection
Spam Filter
Marketing Alerts
Embedded Analytics
Analytic Aggregates
Reports
Next Best Offer
Content Rec
Services Rec
8© Cloudera, Inc. All rights reserved.
The Process of Operational Analytics
Data Discovery
Advanced Analytics
Data Volumes
Stream & Batch Processing
Data
Generation
Operational
Analytics
Flow
Optimize Analytic
Function
Processing
Respond to Data
Feed Data
Application
Act and
Measure
Model Flexibility
Scalability
Embedded Analytics
Reports
9© Cloudera, Inc. All rights reserved.
Operational Analytic Needs
Scale Embed Analytics
Enterprise Data Warehouse
DataData Sources
ETL
Structured
Unstructured
Database
ELT
Store & Process
Traditional Architecture
Archive
Serve
Action
Online Model
f (D1, DN)
Structured
Unstructured
Machine
Drill Down
Human
API
Ingest
Little Latency
Process
10© Cloudera, Inc. All rights reserved.
Challenges with Traditional Operational Analytic
1) Limited Data 3) Analytic Latency2) No Granularity
Enterprise Data Warehouse
DataData Sources
ETL
Structured
Unstructured
Database
ELT
Store & Process
Traditional Architecture
Archive
Serve
Action
Online Model
Process
f (D1, DN)
Structured
Unstructured
Machine
Drill Down
Human
API
Ingest1
2
1
3
11© Cloudera, Inc. All rights reserved.
A New Way Forward
1) Data Scale 3) Little Latency2) Granularity
Enterprise Data Warehouse
DataData Sources
ETL
Structured
Unstructured
Enterprise
Data Hub
ELT
Store & Process
Modern Architecture
Serve
Action
Process
f (D1, DN) Structured
Unstructured
Machine
Drill Down
Human
API
Ingest
1
1
2
3
Online Model
12© Cloudera, Inc. All rights reserved.
Opower Customer Story
13© Cloudera, Inc. All rights reserved.
Opower Overview
The Company
• Serving 95+ utilities in 9 countries
• Over 5TWh saved to date
• 40% of US household data under management totaling 300
billion reads
Our DNA
• Behavioral science software
• Data analytics
• Consumer marketing
• User-centric design
A Software as a Service Customer Engagement Platform
14© Cloudera, Inc. All rights reserved.
Opower’s Personalized Insights
Neighbor comparisons Usage trend analysis
15© Cloudera, Inc. All rights reserved.
Initial Hadoop Architecture
1
2
3
Ingest performance
Complex query paths
1
3
2
Challenges
Multiple workloads
16© Cloudera, Inc. All rights reserved.
Modern Hadoop Architecture
Offline Data Discovery & AnalyticsOnline Operational Analytics
Ingest Performance
Workload separation3
1 2
Improvements
Entity-centric HBase schema2 1
3
17© Cloudera, Inc. All rights reserved.
Insight Creation Environments
Insight Delivery
Insight Calculation
Online Operational Analytics Offline Data Discovery & Analytics
Hive BI
Raw
MR
Batch Tools
HDFS
Reporting
External
Feeds
HBase Export
Non-product
Insights
18© Cloudera, Inc. All rights reserved.
What does this mean to end users?
Batch Analytic Calculations Individual Insight Query Latency
Pre-Hadoop Modern Hadoop
Hours
12
24
48
Hours
Days
Pre-Hadoop
Seconds
1
2
3
~10ms
3 secs
Analytic Development Time
Pre-Hadoop
Months
1
3
5
Weeks
Months
Modern HadoopModern Hadoop
Thank you.

More Related Content

What's hot

Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data HubCloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data HubCloudera, Inc.
 
Hadoop: Extending your Data Warehouse
Hadoop: Extending your Data WarehouseHadoop: Extending your Data Warehouse
Hadoop: Extending your Data WarehouseCloudera, Inc.
 
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Cloudera, Inc.
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubCloudera, Inc.
 
Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...Cloudera, Inc.
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Cloudera, Inc.
 
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...Seeling Cheung
 
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldPart 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldCloudera, Inc.
 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataCloudera, Inc.
 
Hadoop and Manufacturing
Hadoop and ManufacturingHadoop and Manufacturing
Hadoop and ManufacturingCloudera, Inc.
 
Breakout: Data Discovery with Hadoop
Breakout: Data Discovery with HadoopBreakout: Data Discovery with Hadoop
Breakout: Data Discovery with HadoopCloudera, Inc.
 
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...Cloudera, Inc.
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaCloudera, Inc.
 
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchCloudera, Inc.
 
The Vortex of Change - Digital Transformation (Presented by Intel)
The Vortex of Change - Digital Transformation (Presented by Intel)The Vortex of Change - Digital Transformation (Presented by Intel)
The Vortex of Change - Digital Transformation (Presented by Intel)Cloudera, Inc.
 
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Cloudera, Inc.
 
Data Offload for the Chief Data Officer – how to move data onto Hadoop withou...
Data Offload for the Chief Data Officer – how to move data onto Hadoop withou...Data Offload for the Chief Data Officer – how to move data onto Hadoop withou...
Data Offload for the Chief Data Officer – how to move data onto Hadoop withou...DataWorks Summit
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesCloudera, Inc.
 

What's hot (20)

Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data HubCloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
 
Hadoop: Extending your Data Warehouse
Hadoop: Extending your Data WarehouseHadoop: Extending your Data Warehouse
Hadoop: Extending your Data Warehouse
 
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
 
Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8
 
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
 
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldPart 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big Data
 
Hadoop and Manufacturing
Hadoop and ManufacturingHadoop and Manufacturing
Hadoop and Manufacturing
 
Breakout: Data Discovery with Hadoop
Breakout: Data Discovery with HadoopBreakout: Data Discovery with Hadoop
Breakout: Data Discovery with Hadoop
 
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
 
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science Workbench
 
The Vortex of Change - Digital Transformation (Presented by Intel)
The Vortex of Change - Digital Transformation (Presented by Intel)The Vortex of Change - Digital Transformation (Presented by Intel)
The Vortex of Change - Digital Transformation (Presented by Intel)
 
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence

 
Data Offload for the Chief Data Officer – how to move data onto Hadoop withou...
Data Offload for the Chief Data Officer – how to move data onto Hadoop withou...Data Offload for the Chief Data Officer – how to move data onto Hadoop withou...
Data Offload for the Chief Data Officer – how to move data onto Hadoop withou...
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
 

Viewers also liked

Operational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresOperational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresDATAVERSITY
 
Data Analytics on Healthcare Fraud
Data Analytics on Healthcare FraudData Analytics on Healthcare Fraud
Data Analytics on Healthcare FraudNicholas Szeto
 
Introduction to Cloudera Search Training
Introduction to Cloudera Search TrainingIntroduction to Cloudera Search Training
Introduction to Cloudera Search TrainingCloudera, Inc.
 
Operationalizing analytics to scale
Operationalizing analytics to scaleOperationalizing analytics to scale
Operationalizing analytics to scaleLooker
 
Complex Analytics with NoSQL Data Store in Real Time
Complex Analytics with NoSQL Data Store in Real TimeComplex Analytics with NoSQL Data Store in Real Time
Complex Analytics with NoSQL Data Store in Real TimeNati Shalom
 
Building a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopBuilding a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopSlim Baltagi
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceHortonworks
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 
Data analysis powerpoint
Data analysis powerpointData analysis powerpoint
Data analysis powerpointSarah Hallum
 
Big Data Analytics 2014
Big Data Analytics 2014Big Data Analytics 2014
Big Data Analytics 2014Stratebi
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with HadoopPhilippe Julio
 

Viewers also liked (18)

Operational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresOperational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data Stores
 
Data Analytics on Healthcare Fraud
Data Analytics on Healthcare FraudData Analytics on Healthcare Fraud
Data Analytics on Healthcare Fraud
 
Healthcare fraud detection
Healthcare fraud detectionHealthcare fraud detection
Healthcare fraud detection
 
Introduction to Cloudera Search Training
Introduction to Cloudera Search TrainingIntroduction to Cloudera Search Training
Introduction to Cloudera Search Training
 
Operationalizing analytics to scale
Operationalizing analytics to scaleOperationalizing analytics to scale
Operationalizing analytics to scale
 
Complex Analytics with NoSQL Data Store in Real Time
Complex Analytics with NoSQL Data Store in Real TimeComplex Analytics with NoSQL Data Store in Real Time
Complex Analytics with NoSQL Data Store in Real Time
 
Building a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopBuilding a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise Hadoop
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data Governance
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
Data analysis powerpoint
Data analysis powerpointData analysis powerpoint
Data analysis powerpoint
 
Big Data Analytics 2014
Big Data Analytics 2014Big Data Analytics 2014
Big Data Analytics 2014
 
Cisco OpenSOC
Cisco OpenSOCCisco OpenSOC
Cisco OpenSOC
 
Chapter 10-DATA ANALYSIS & PRESENTATION
Chapter 10-DATA ANALYSIS & PRESENTATIONChapter 10-DATA ANALYSIS & PRESENTATION
Chapter 10-DATA ANALYSIS & PRESENTATION
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
What is Big Data?
What is Big Data?What is Big Data?
What is Big Data?
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 
Big data ppt
Big  data pptBig  data ppt
Big data ppt
 

Similar to Breakout: Operational Analytics with Hadoop

Capgemini Leap Data Transformation Framework with Cloudera
Capgemini Leap Data Transformation Framework with ClouderaCapgemini Leap Data Transformation Framework with Cloudera
Capgemini Leap Data Transformation Framework with ClouderaCapgemini
 
There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?Aerospike, Inc.
 
Big data oracle_introduccion
Big data oracle_introduccionBig data oracle_introduccion
Big data oracle_introduccionFran Navarro
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduCloudera, Inc.
 
Webinar: The Modern Streaming Data Stack with Kinetica & StreamSets
Webinar: The Modern Streaming Data Stack with Kinetica & StreamSetsWebinar: The Modern Streaming Data Stack with Kinetica & StreamSets
Webinar: The Modern Streaming Data Stack with Kinetica & StreamSetsKinetica
 
Analytic Excellence - Saying Goodbye to Old Constraints
Analytic Excellence - Saying Goodbye to Old ConstraintsAnalytic Excellence - Saying Goodbye to Old Constraints
Analytic Excellence - Saying Goodbye to Old ConstraintsInside Analysis
 
Impala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on HadoopImpala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on HadoopCloudera, Inc.
 
SOUG Day - autonomous what is next
SOUG Day - autonomous what is nextSOUG Day - autonomous what is next
SOUG Day - autonomous what is nextThomas Teske
 
Big Data for Product Managers
Big Data for Product ManagersBig Data for Product Managers
Big Data for Product ManagersPentaho
 
Digital Transformation: How to Run Best-in-Class IT Operations in a World of ...
Digital Transformation: How to Run Best-in-Class IT Operations in a World of ...Digital Transformation: How to Run Best-in-Class IT Operations in a World of ...
Digital Transformation: How to Run Best-in-Class IT Operations in a World of ...Precisely
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Denodo
 
Kudu Forrester Webinar
Kudu Forrester WebinarKudu Forrester Webinar
Kudu Forrester WebinarCloudera, Inc.
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataMatt Stubbs
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataMatt Stubbs
 
The new dominant companies are running on data
The new dominant companies are running on data The new dominant companies are running on data
The new dominant companies are running on data SnapLogic
 
Splunk Webinar: IT Operations Demo für Troubleshooting & Dashboarding
Splunk Webinar: IT Operations Demo für Troubleshooting & DashboardingSplunk Webinar: IT Operations Demo für Troubleshooting & Dashboarding
Splunk Webinar: IT Operations Demo für Troubleshooting & DashboardingGeorg Knon
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoopDr. Wilfred Lin (Ph.D.)
 
Dell Digital Transformation Through AI and Data Analytics Webinar
Dell Digital Transformation Through AI and  Data Analytics WebinarDell Digital Transformation Through AI and  Data Analytics Webinar
Dell Digital Transformation Through AI and Data Analytics WebinarBill Wong
 

Similar to Breakout: Operational Analytics with Hadoop (20)

CS-Op Analytics
CS-Op AnalyticsCS-Op Analytics
CS-Op Analytics
 
Capgemini Leap Data Transformation Framework with Cloudera
Capgemini Leap Data Transformation Framework with ClouderaCapgemini Leap Data Transformation Framework with Cloudera
Capgemini Leap Data Transformation Framework with Cloudera
 
There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?
 
Big data oracle_introduccion
Big data oracle_introduccionBig data oracle_introduccion
Big data oracle_introduccion
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
 
Webinar: The Modern Streaming Data Stack with Kinetica & StreamSets
Webinar: The Modern Streaming Data Stack with Kinetica & StreamSetsWebinar: The Modern Streaming Data Stack with Kinetica & StreamSets
Webinar: The Modern Streaming Data Stack with Kinetica & StreamSets
 
Analytic Excellence - Saying Goodbye to Old Constraints
Analytic Excellence - Saying Goodbye to Old ConstraintsAnalytic Excellence - Saying Goodbye to Old Constraints
Analytic Excellence - Saying Goodbye to Old Constraints
 
Impala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on HadoopImpala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on Hadoop
 
SOUG Day - autonomous what is next
SOUG Day - autonomous what is nextSOUG Day - autonomous what is next
SOUG Day - autonomous what is next
 
Big Data for Product Managers
Big Data for Product ManagersBig Data for Product Managers
Big Data for Product Managers
 
Digital Transformation: How to Run Best-in-Class IT Operations in a World of ...
Digital Transformation: How to Run Best-in-Class IT Operations in a World of ...Digital Transformation: How to Run Best-in-Class IT Operations in a World of ...
Digital Transformation: How to Run Best-in-Class IT Operations in a World of ...
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)
 
Kudu Forrester Webinar
Kudu Forrester WebinarKudu Forrester Webinar
Kudu Forrester Webinar
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on Data
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on Data
 
The new dominant companies are running on data
The new dominant companies are running on data The new dominant companies are running on data
The new dominant companies are running on data
 
Splunk Webinar: IT Operations Demo für Troubleshooting & Dashboarding
Splunk Webinar: IT Operations Demo für Troubleshooting & DashboardingSplunk Webinar: IT Operations Demo für Troubleshooting & Dashboarding
Splunk Webinar: IT Operations Demo für Troubleshooting & Dashboarding
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop
 
Dell Digital Transformation Through AI and Data Analytics Webinar
Dell Digital Transformation Through AI and  Data Analytics WebinarDell Digital Transformation Through AI and  Data Analytics Webinar
Dell Digital Transformation Through AI and Data Analytics Webinar
 
Oracle GoldenGate
Oracle GoldenGate Oracle GoldenGate
Oracle GoldenGate
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

More from Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 
Cloudera SDX
Cloudera SDXCloudera SDX
Cloudera SDX
 

Recently uploaded

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 

Recently uploaded (20)

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

Breakout: Operational Analytics with Hadoop

  • 1. 1© Cloudera, Inc. All rights reserved. Smarter Decisions in Less Time Operational Analytics with Cloudera
  • 2. 2© Cloudera, Inc. All rights reserved. Trends in the Market “5 percent more productive and 6 percent more profitable than other companies.” Source: McKinsey & Company 1 millisecond advantage can be worth $100 million to a major brokerage firm. Source: Information Week 80% of CEOs said analytics is a strategic objective for 2015. Source: PWC CEO Survey Data Driven Pays Cost of Latency Strategic Direction Trends Driving Change
  • 3. 3© Cloudera, Inc. All rights reserved. Trends in the Market Trends Driving Change Operational analytic applications increase data returns.
  • 4. 4© Cloudera, Inc. All rights reserved. Operational Analytics (OA) Overview Automating analytics pipelines providing individuals the right information at the right time. (AKA: Recommendation engines, CEP solutions, Rules Engines)
  • 5. 5© Cloudera, Inc. All rights reserved. Operational Analytics (OA) Overview Serving product, content, or services recommendations (Wibi, Oryx) Flagging outliers based on past behaviors (SIEM, Blueprint) Running aggregates for large scale report serving (SAS) Recommendation Engine Event Detection Model Scoring
  • 6. 6© Cloudera, Inc. All rights reserved. Operationalizing Reports, Models, or Rules Recommendation Engine Event Detection Model Scoring Point Solutions Custom Development 3rd Party Data Discovery & Analytics
  • 7. 7© Cloudera, Inc. All rights reserved. Custom Development Use Cases Recommendation Engine Event Detection Model Scoring Fraud Detection Spam Filter Marketing Alerts Embedded Analytics Analytic Aggregates Reports Next Best Offer Content Rec Services Rec
  • 8. 8© Cloudera, Inc. All rights reserved. The Process of Operational Analytics Data Discovery Advanced Analytics Data Volumes Stream & Batch Processing Data Generation Operational Analytics Flow Optimize Analytic Function Processing Respond to Data Feed Data Application Act and Measure Model Flexibility Scalability Embedded Analytics Reports
  • 9. 9© Cloudera, Inc. All rights reserved. Operational Analytic Needs Scale Embed Analytics Enterprise Data Warehouse DataData Sources ETL Structured Unstructured Database ELT Store & Process Traditional Architecture Archive Serve Action Online Model f (D1, DN) Structured Unstructured Machine Drill Down Human API Ingest Little Latency Process
  • 10. 10© Cloudera, Inc. All rights reserved. Challenges with Traditional Operational Analytic 1) Limited Data 3) Analytic Latency2) No Granularity Enterprise Data Warehouse DataData Sources ETL Structured Unstructured Database ELT Store & Process Traditional Architecture Archive Serve Action Online Model Process f (D1, DN) Structured Unstructured Machine Drill Down Human API Ingest1 2 1 3
  • 11. 11© Cloudera, Inc. All rights reserved. A New Way Forward 1) Data Scale 3) Little Latency2) Granularity Enterprise Data Warehouse DataData Sources ETL Structured Unstructured Enterprise Data Hub ELT Store & Process Modern Architecture Serve Action Process f (D1, DN) Structured Unstructured Machine Drill Down Human API Ingest 1 1 2 3 Online Model
  • 12. 12© Cloudera, Inc. All rights reserved. Opower Customer Story
  • 13. 13© Cloudera, Inc. All rights reserved. Opower Overview The Company • Serving 95+ utilities in 9 countries • Over 5TWh saved to date • 40% of US household data under management totaling 300 billion reads Our DNA • Behavioral science software • Data analytics • Consumer marketing • User-centric design A Software as a Service Customer Engagement Platform
  • 14. 14© Cloudera, Inc. All rights reserved. Opower’s Personalized Insights Neighbor comparisons Usage trend analysis
  • 15. 15© Cloudera, Inc. All rights reserved. Initial Hadoop Architecture 1 2 3 Ingest performance Complex query paths 1 3 2 Challenges Multiple workloads
  • 16. 16© Cloudera, Inc. All rights reserved. Modern Hadoop Architecture Offline Data Discovery & AnalyticsOnline Operational Analytics Ingest Performance Workload separation3 1 2 Improvements Entity-centric HBase schema2 1 3
  • 17. 17© Cloudera, Inc. All rights reserved. Insight Creation Environments Insight Delivery Insight Calculation Online Operational Analytics Offline Data Discovery & Analytics Hive BI Raw MR Batch Tools HDFS Reporting External Feeds HBase Export Non-product Insights
  • 18. 18© Cloudera, Inc. All rights reserved. What does this mean to end users? Batch Analytic Calculations Individual Insight Query Latency Pre-Hadoop Modern Hadoop Hours 12 24 48 Hours Days Pre-Hadoop Seconds 1 2 3 ~10ms 3 secs Analytic Development Time Pre-Hadoop Months 1 3 5 Weeks Months Modern HadoopModern Hadoop

Editor's Notes

  1. McKinsey & Company: http://www.mckinsey.com/insights/business_technology/getting_the_cmo_and_cio_to_work_as_partners “Wall Street’s quest to process data at the speed of light,” Information Week, April 21, 2007
  2. McKinsey & Company: http://www.mckinsey.com/insights/business_technology/getting_the_cmo_and_cio_to_work_as_partners “Wall Street’s quest to process data at the speed of light,” Information Week, April 21, 2007
  3. Only talking about numerical and categorical data. Not talking text or image because of limitations within Hbase around 1-2G for each entity. Recommendations is a list of content that based on the entity behavior fits the next best action Anomaly detection is the inverse of recommendations. They should fit into these recommendations based on their behaviors, but they are outside of the norm. Signal alert. Model scoring are numerical aggregates that you can send to the right system
  4. Only talking about numerical and categorical data. Not talking text or image because of limitations within Hbase around 1-2G for each entity. Recommendations is a list of content that based on the entity behavior fits the next best action Anomaly detection is the inverse of recommendations. They should fit into these recommendations based on their behaviors, but they are outside of the norm. Signal alert. Model scoring are numerical aggregates that you can send to the right system
  5. Limited Data – Can’t bring in unstructured data, historic data is moved offline because the system can’t scale. Drill down performance – Individual insight drilldowns take time, ad-hoc queries steal resources from operational analytics workload Analytic Latency – Processing data volumes at speed is inefficient on traditional systems, latency can’t occur (hurts business or customer relationship)
  6. Data Scale – Bring in structured and unstructured data, keep data online for deeper drill downs Ad-hoc Queries – Perform drill downs without compromising system performance Little Latency – Scale data processing to meet analytics SLAs.
  7. Opower Intro: Who is Opower and what does Opower do? Produce energy insights to help utilities and customers manage energy consumption. 100+millions of meter reads received daily. Millions of individual insight calculations routinely created, from simple trending analytics, to more advanced forecasting/prediction. Energy saved: 5+TW hours, $500M energy bill savings, >6 billion lbs CO2 Product lines: Consumer engagement Energy efficiency Demand Response Hadoop-based insights are a critical portion of each of these product lines. Transition: Some example hadoop-based insights:
  8. Two example of Opower’s personalized insights that use hadoop components: neighbor comparisons and unusual usage alerts Energy usage is stored in HBase, along with insights derived from the energy usage. Billions of energy usage data reads are stored in HBase. Insights are served directly from HBase. Unusual usage alerts were the first use case for HBase/hadoop. We sold a deal that required us to generate “unusual usage alerts” at a scale we had yet hit UUA are email or phone messages we send to let customers know if they are trending towards higher than usual energy usage We also project the bill for them and can let them know if they are going to pay more than expected Transition: The initial architecture we built to calculate and deliver this insight
  9. Hadoop has been used in production at Opower since 2012. Overview of end-to-end architecture: Data is copied from single-tenant mysql databases into HBase. MySQL is single tenant (one DB per opower client), and we have > 100 MySQL dbs in production. Batch clients read from HBase. Other workloads running on the cluster as an attempt to eliminate the need to support clusters for separate workloads. Sqoop is a mapreduce job that reads data from mysql and outputs to some other source, like hive+hdfs or in our case HBase. Challenges: Sqoop ingest introduced a lot of memory pressure on region servers and traffic on mysql read slaves. Need to take caution to not introduce excessive MySQL load from sqoop queries, as the databases are serving other critical apps Queries required longer multi-row scans and aggregations. Lot’s of tuning was necessary, such as increase in region file sizes, memstore sizes, heap size. Disable major compactions, HDFS short-circuit. Composite row keys with timestamps in them, thinking about Hbase more like a relational table than a big sorted map We had supporting data in single-tenanted because we were sqooping it over from the mysql databases Because of how we designed the schema, we needed multiple tables to store the data Single-tenanted tables adds operational overhead and difficulty in tracking bottlenecks in the process Initial support of ad-hoc MR jobs via Hive was quickly removed due to unmanageable load This architecture has been successful, but difficult to scale. The hbase schema was difficult to extend to support new insights and there was no story for offline analytics and experimentation. Transition: V2 (the modern) Opower hadoop architecture addresses these issues
  10. Overview/walk through of the major components. Usage data is collected from the utility and directly ingested into HBase via bulkloading MR jobs. [Explain bulkloading] Data is stored in an Entity-centric table, where each entity is a single hbase row containing the energy usage history for a household, and any derived analytics from that energy, such as bill forecasts and neighbor comparisons. MapReduce jobs will periodically referesh these analytics, but some are also refreshed on-demand in a streaming fashion, as insights are queried. Data is replicated to the data warehouse cluster via a combination of HBase replication (for direct puts) and as an HFile distcp step during the intial bulkload ingest (not pictured). Full, multi-tenant datasets are now available to be analyzed in the data warehouse, which has enabled new off-line analytics such as product eligibility calculations, and a general test-bed experimenting with new insights. There is no longer a need to painstakingly collect data from multiple sources or worry about crashing a mysql slave when running a full table scan. Improvements: Write path performance via bulkloading. less GC pressure in the region server, no memstore flushes. fewer RPC’s/round-trips to the databsae. Simultaneous bulkloading via distcp into the data warehouse hbase instance, so the data warehouse has fresh data. Entity-centric HBase schema provides ability to add new analytics/insights in a scalable manner. Data used to derive a personalized insight is stored in a single HBase row, providing data locality for scans and eliminating hbase overhead of multi-row traversals and aggregations. Secondary analyics were moved to data warehouse, reducing the memory pressure and task contention on the service cluster. MR jobs on the service cluster are specific to generation of personalized insights served at low-latency. The new architecture has worked, but there are still areas we want to improve, such as automation and ETL tooling that will make it easier to load new datasets and create new insights. Transition: This new architecture enables two distinct environments for creating new data insights
  11. Product calculations are built as producer-style mapreduce jobs – reading and writing to the same HBase row. For example, a trend in energy usage for the current bill period will be derived from the usage data present in the row and used to forecast the customers energy consumption and spending for the current period. Insights are accessed by a service query layer. An template HBase service container can be easily extended to create service API’s for different insight products. Service client applications are used by reporting pipelines and embedded web components. Offline analysis and experimentation occurs in the data warehouse. Hive, BI tools (platfora, datameer), and raw mapreduce jobs are used to create aggregate reports, and non-product analytics such as customer program eligibility. These tools are also used for ad-hoc analysis of full energy usage datasets, such as electric car charging trends or the impact of the super bowl on energy consumption. In the future we look to link the two systems, enabling analytics developed offline to be ‘promoted’ to product calculations. Transition: What’s been the result of a switch to hadoop architecture?
  12. Batch analytics calculated via the producer pattern are much more amenable to the MR parallelization and take advantage of HBase row locality. Run time reduced significantly. Some jobs could be multi-tenant, which are easier to operate. Individual insight query latency dropped from several seconds to ~10ms. Our performance tests measure at the 99.999% point on the latency tail, so average time is even faster. Query latency has been critical for SOA-model SLA’s, since multiple external services will access this data in real time. Analytic development time is faster, although it could still be improved. Development speedups came from adding a data warehouse cluster for development and experimentation, which used more analyst friendly tools like hive and scalding. Also, the entity-based schema used in production is more amenable to adding new data. Transition: We’ve had some success but encountered challenges along the way. Here are some lessons we learned: