Information-Driven
Manufacturing
Capture Value from Manufacturing Data with an Enterprise
Data Hub
Speaker name // Speaker title
2© Cloudera, Inc. All rights reserved.
Trends in Manufacturing
Instrumentation
Everything that can be measured will be measured, and the volume is only increasing.
Efficiency
Continuous improvement in cost and efficiency in all areas of manufacturing operations.
Quality
Now more than ever, quality is a top concern from the consumer, dealer, and regulatory standpoints.
Manufacturers are collecting data at an
exponential rate, yet struggle to derive value
from all that data...
Manufacturing Enterprise Data Hub
Provides the ability to store and analyze all the data, quickly uncover new insights, and derive value in all phases of the process, from initial design to final delivery.
Manufacturing Enterprise Data Hub Overview
Keep all the data
Keep all the data, whether it is people-generated, machine-generated, or external.
Advanced Analytics
Statistical and machine learning analyses on all the data using advanced analytic tools (Spark, R, Python, SAS, Matlab).
Leverage all the data
Access to all the enterprise and manufacturing data at your fingertips; consolidate silos (Self-Service BI, Search).
Where Is the Manufacturing Data?
Mapping and Consolidation Are the Tip of the Iceberg for Big Data
Devices & Sensors
• Device Readings
• Device Performance
• Device Diagnostics
• Battery / Power Consumption
• Software Logs
• Environmental Interactions
• R&D
• Quality / Testing
Plant & Operations
• MES
• Sensors
• Video / Surveillance
• Line Productivity
• Machines
• Staffing / Scheduling
• Quality data
Supply Chain & Inventory
• ERP
• Supplier / Manufacturer
• Orders / Receivables
• Commodity Supplies / Prices
• Chargebacks
• Scorecards
• Delivery Metrics
Marketing & CRM
• Transactions
• Accounts
• Warranties / Aftermarket
• Customer Service Logs
• Campaigns / Promotions
• Website / SEO
• Affiliates / Merchants
• Surveys
• Competitive Intelligence
Public & Trade
• Market Intelligence
• Policy / Regulation
• Demographic / Census
• Psychographic
• Inflation / Macroeconomic
• Gas Prices
• Labor Statistics
• Social / Search
• Public Health Data
• Clinical Studies
• Store Schematics
• Journals / Editorial
• Seismic / Speculation
A Traditional Architecture: What have we tried
[Diagram: Traditional Architecture. Labels: Data Sources (Structured, Unstructured, Machine Data); Ingest; ETL; Store & Process (Storage #1, 2, N; Enterprise Data Warehouse (EDW); Archive); ELT; Serve; Access Data; Analyze Data; Search; SQL; Statistical; Machine Learning; Experiment Fast; Optimize; Implement; Custom Application; Point Solution; "Filter?" at several points]
Challenges with Traditional Architectures
1) Limited Data
2) Long Time to Value
3) Suboptimal Decisions
[Diagram: the Traditional Architecture labels (Data Sources, Ingest, ETL, Storage #1, 2, N, EDW, Archive, ELT, Serve, Access Data, Analyze Data, Search, SQL, Statistical, Machine Learning, Optimize, Implement, Custom Application, Point Solution, "Filter?"), annotated with markers 1, 2, and 3 for the three challenges]
The New Way Forward
1) Unlimited Data Access 2) Reduce Time to Value 3) Decision on all data
[Diagram: Modern Architecture. Labels: Data Sources (Structured, Unstructured, Machine Data); Ingest; ETL; ELT; Load; Store & Process; Cloudera EDH; Active; Archive; Enterprise Data Warehouse (EDW); Serve; Access Data; Analyze Data; Search; SQL; Statistical; Machine Learning; Optimize; Implement; Custom Application; Point Solution; markers 1, 2, and 3 for the three points listed under the slide title]
© 2014 Cloudera, Inc. All rights reserved.
Overview on Data Flow in Cloudera EDH
3rd party or
public
Network
Equipment
Traditional
RDBMS
EDW
Event Based, Near Real Time
• Flume
• Spark Streaming
• Kafka (coming soon)
SQL / Relational
• Sqoop – SQL Import including Metadata
Web Services/API/Cmd line
• Put/Store Copy/Move files
• NFS Gateway
HUE Web GUI
• User Upload
• User Copy/Move/Rename
Third Party Integrations
Ingest/Storage Process/Transformation
WORKLOAD MANAGEMENT / YARN (Resource Management) & Oozie (Workflow Engine)
Hadoop File System (HDFS) / Distributed File Storage
ELT, ETL, Transform, Cleanse, Pre-aggregate, Analyze, etc.
SQL / Relational
• Hive (Batch SQL)
• Impala (Interactive SQL)
MapReduce – Java-based distributed processing
• Machine Learning libraries
• Pig – scripting language to perform MapReduce
Spark – In-memory distributed processing
• Java, Scala or Python
Third Party Integrations
SQL / Relational (ODBC/JDBC)
• Hive (Batch SQL)
• Impala (Interactive SQL)
Web Services/API/Cmd line
• Get/Move files
• NFS Gateway
HUE Web GUI
• User Download
• User Copy/Move/Rename
Search Index
• Solr Search, full featured with Facets, NLP, etc.
Third Party Integrations
Raw Data Insight and Value to User
Publish/Consume
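The Process/Transformation stage above lists "Cleanse" and "Pre-aggregate" as steps between raw data and published insight. As a minimal sketch of that idea in plain Python (no Hadoop component is used; the record layout, machine names, and function names are hypothetical and chosen only for illustration):

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw sensor readings as they might land in the hub (assumed layout).
RAW_READINGS = [
    {"machine": "press-1", "temp_c": "71.5"},
    {"machine": "press-1", "temp_c": "73.5"},
    {"machine": "press-2", "temp_c": "n/a"},   # malformed value, dropped by cleanse()
    {"machine": "press-2", "temp_c": "65.0"},
]


def cleanse(rows):
    """Keep only rows whose temperature parses as a float."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append({"machine": row["machine"], "temp_c": float(row["temp_c"])})
        except ValueError:
            continue  # skip unparseable readings
    return cleaned


def pre_aggregate(rows):
    """Compute the average temperature per machine."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["machine"]].append(row["temp_c"])
    return {machine: mean(values) for machine, values in grouped.items()}


def run_pipeline(raw):
    """Raw data in, small publishable summary out."""
    return pre_aggregate(cleanse(raw))


summary = run_pipeline(RAW_READINGS)
print(summary)
```

In a real deployment the same two steps would more likely be expressed in Hive, Impala, Pig, or Spark, as the slide lists; the sketch only shows the shape of the transformation.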
Security Important?
Cloudera Enterprise Data Hub provides Enterprise-Grade Security, Audit and Regulatory Compliance
AUTHENTICATION
Guarding access to the system, its data, and its various systems
LDAP, Kerberos RPC
PROTECTION
Encryption for data at rest or in motion with full key management
Cloudera Navigator: Encrypt & Key Trustee
AUTHORIZATION
Controlling who or what has access to a resource or service
POSIX Permissions, Apache Sentry
AUDIT
Capture a complete and immutable record of all activity
Cloudera Navigator, SIEM Tools
Governing Access to and Management of All Data-at-Rest and Data-in-Motion
• Cloudera Manager and Navigator automate protections for Hadoop and related projects
• Perimeter security
• Role-based access control
• The only complete policy-based management of sensitive data
• Data lineage and discoverability
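To make the authorization and audit pillars concrete, here is a minimal sketch of a role-based access check that records every decision. It is plain Python and does not use Apache Sentry or Cloudera Navigator; the roles, users, and class name are hypothetical, and the in-memory list only stands in for a real immutable audit store.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission mapping (assumption for illustration).
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "engineer": {"read", "write"},
}


@dataclass
class AccessController:
    """Role-based access control with an append-only audit trail."""
    user_roles: dict
    audit_log: list = field(default_factory=list)

    def is_allowed(self, user, action, resource):
        role = self.user_roles.get(user)
        allowed = action in ROLE_PERMISSIONS.get(role, set())
        # Record every decision, whether it was allowed or denied.
        self.audit_log.append((user, action, resource, allowed))
        return allowed


controller = AccessController(user_roles={"ana": "analyst", "eli": "engineer"})
print(controller.is_allowed("ana", "read", "claims"))
print(controller.is_allowed("ana", "write", "claims"))
print(controller.audit_log)
```

Unknown users have no role and are denied by default, which keeps the check closed unless a permission is granted explicitly.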
Core Benefits of a Manufacturing Enterprise Data Hub
• Full-Fidelity Active Archive
• Any and All Kinds of Data
• Accelerate Time to Insight (Scale)
• Unlock Agility and Exploration
• Consolidate Silos for 360° View
• Enable Pervasive Analytics across the entire Value Chain (Design to Post-Sales Delivery and Warranty)
What value is there in a Manufacturing Data Hub?
Manufacturing Data Hub (Hadoop, Cloudera): Secure, Scalable, Flexible, Open
Design, R&D, PD, Engineering
• What product issues are paramount?
• What are the technology trends?
• Efficient parts utilization: what is the best part for my design?
• Is all my machine data being utilized?
Production, Quality, Manufacturing
• Diagnose production problems
• What is the cause? People, parts, process, suppliers?
• Plant inventory
• Resource utilization
• Is all my shop floor data being analyzed?
Supply Chain, Purchasing
• Who are my best suppliers?
• Who are my worst suppliers?
• Consolidated view of the supply chain?
• Supply chain disruption impact analysis?
• Consolidated purchasing (360 supplier view)
Delivery, Warranty, Support, Service
• Review Customer 360
• Analyze product launch information
• Detect emerging warranty issues
• Decrease correction times
• Increase accuracy of warranty forecasts
• Knowledge base for after-delivery service
Ask Bigger Questions of all the data
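The supply chain questions above ("Who are my best suppliers? Who are my worst?") can be illustrated with a small sketch. It assumes defect rate is the ranking criterion, which the slide does not specify; the supplier names, record layout, and function names are hypothetical.

```python
# Hypothetical inspection records: (supplier, parts_received, parts_defective).
INSPECTIONS = [
    ("Acme", 1000, 10),
    ("Bolt Co", 500, 25),
    ("Acme", 1000, 30),
    ("Cog Ltd", 2000, 10),
]


def defect_rates(inspections):
    """Return the overall defect rate per supplier across all deliveries."""
    totals = {}
    for supplier, received, defective in inspections:
        got, bad = totals.get(supplier, (0, 0))
        totals[supplier] = (got + received, bad + defective)
    return {supplier: bad / got for supplier, (got, bad) in totals.items()}


def rank_suppliers(inspections):
    """Order suppliers from lowest (best) to highest (worst) defect rate."""
    rates = defect_rates(inspections)
    return sorted(rates, key=rates.get)


ranking = rank_suppliers(INSPECTIONS)
print("best:", ranking[0], "worst:", ranking[-1])
```

Consolidating delivery, quality, and claims data in one place is what makes a ranking like this possible across plants; the metric itself is a design choice.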
Customer Story
About Vehicle Manufacturer
What do we do?
Manufacture, sell, and service vehicles
Who is this manufacturer?
A leading worldwide manufacturer of vehicles
Our Objectives
Store and analyze worldwide data from dealers, customers, and vehicles
Better, deeper analysis
Smarter predictions, earlier detection
The Pre-Hadoop Environment
Challenge
1 Difficult to connect to multiple sources
[Diagram labels: Parts, Suppliers, Dealers, Claims, Machine Data, IDLE, Vehicle Data; BI/RDBMS/DW; question marks on the connections]
WHY?
• Volume: too much to store, let alone query
• Variety: different formats, not all table-based data
The Pre-Hadoop Environment
Another Challenge
2 Impossible to analyze all that data
[Diagram labels: Parts, Suppliers, Dealers, Claims, Machine Data, IDLE, Vehicle Data; BI/RDBMS/DW; Advanced Analytics]
WHY?
• It wasn’t even in one system.
• Different workloads (macro vs. micro analysis)
Vehicle Manufacturer Modern Hadoop Architecture
Improvements
1 Complete storage of data (structured and unstructured)
[Diagram labels: Claims, Machine Data, IDLE, Vehicle Data; Sqoop; Flume, Kafka; Copy (XML); Store (HDFS, HBase); Process]
Vehicle Manufacturer Modern Hadoop Architecture
Improvements
1 Complete storage of data (structured and unstructured)
2 Process data as needed
[Diagram labels: Claims, Machine Data, IDLE, Vehicle Data; Sqoop; Flume, Kafka; Copy (XML); Store (HDFS, HBase); Process (MR, Pig, Spark, ETL Tools on Hadoop)]
Vehicle Manufacturer Modern Hadoop Architecture
Improvements
1 Complete storage of data (structured and unstructured)
2 Process data as needed
3 Analysis and large-scale ad hoc queries
[Diagram labels: Claims, Machine Data, IDLE, Vehicle Data; Sqoop; Flume, Kafka; Copy (XML); Store (HDFS, HBase); Process (MR, Pig, Spark, ETL Tools on Hadoop); Discover; Access; HUE; Impala; Solr; Hive; BI]
Vehicle Manufacturer Modern Hadoop Architecture
Improvements
1 Complete storage of data (structured and unstructured)
2 Process data as needed
3 Analysis and large-scale ad hoc queries
4 Advanced Analytics
[Diagram labels: Claims, Machine Data, IDLE, Vehicle Data; Sqoop; Flume, Kafka; Copy (XML); Store (HDFS, HBase); Process (MR, Pig, Spark, ETL Tools on Hadoop); Discover; Access; HUE; Impala; Solr; Hive; BI; Advanced Analytics (R, MLlib, Spark)]
Business and Technical ROI
Business ROI
Proactive Quality Assurance
Build machine learning algorithms that identify production anomalies prior to field testing and find performance flaws that could not be identified in R&D.
Predictive Intervention
Combine data streaming from machine data (vehicles, plant floor), diagnostics, and product/engineering data to proactively avoid or address issues and deploy upgrades.
Technology ROI
Merge storage systems for simpler management – Active Archive – Retire legacy systems
Unified access to disparate, siloed data – Retire single-use systems
Scale affordably – Grow without destroying the budget
Flexible and agile – IT can focus on solutions for the business vs. being a data plumber
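The Proactive Quality Assurance item talks about identifying production anomalies. As a deliberately simple stand-in for the machine learning algorithms the slide mentions, the sketch below flags readings by z-score. This is a basic statistical rule, not the manufacturer's method; the torque values, threshold, and function name are assumptions for illustration.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return indexes of values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all readings identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical torque readings from one station; index 8 is the odd one out.
TORQUE = [50.1, 49.9, 50.0, 50.2, 49.8, 50.0, 50.1, 49.9, 58.0, 50.0]

anomalies = find_anomalies(TORQUE)
print(anomalies)
```

A z-score rule is sensitive to the outliers it is trying to find, since they inflate the standard deviation, so a production system would likely use a more robust model trained on historical data.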
What happened at the parts supplier that caused a spike in support calls
during the past 30 minutes for devices manufactured in Birmingham?
How many devices were returned last month?
Reduce the time to QC issue resolution from weeks to hours.
Drive $15 to $25 million annual savings for each manufacturer.
Can we predict which chips have the highest likelihood of failure
and intervene to proactively prevent manufacturing issues?
Which chips most commonly failed last week?
Analyses now executable on hundreds of thousands of units in just seconds.
60x faster data reload and 300% query speedup enable real-time debugging.
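The question "Which chips most commonly failed last week?" amounts to a filtered count. A minimal sketch, assuming a hypothetical failure log of (chip model, date) pairs; the model names, dates, and function name are invented for illustration and do not come from the customer described:

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical failure log: (chip_model, failure_date).
FAILURES = [
    ("CX-100", date(2014, 6, 2)),
    ("CX-200", date(2014, 6, 3)),
    ("CX-100", date(2014, 6, 5)),
    ("CX-300", date(2014, 5, 20)),  # falls outside the 7-day window
    ("CX-100", date(2014, 6, 8)),
    ("CX-200", date(2014, 6, 7)),
]


def top_failures(failures, as_of, days=7, n=3):
    """Count failures per chip model in the `days` ending at `as_of`, most frequent first."""
    start = as_of - timedelta(days=days)
    counts = Counter(model for model, day in failures if start < day <= as_of)
    return counts.most_common(n)


top = top_failures(FAILURES, as_of=date(2014, 6, 8))
print(top)
```

On a cluster the same question would more likely be a GROUP BY query in Impala or Hive over the full failure history; the logic is the same.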
Thank you
Cloudera Snapshot
Founded: 2008, by former employees of
Employees Today: ~850
World Class Support: 24x7 global staff; proactive and predictive support programs
Mission Critical: Thousands of enterprise users; 500+ paying subscription customers
The Largest Ecosystem: 1,450+ partners
Cloudera University: 100,000+ trained
Open Source Leaders: Cloudera employees are leading developers and contributors
Total Capital Raised: $1B+ (from Intel, Google, Dell, T. Rowe Price, Accel, Greylock)
Mission: Help organizations leverage the power of all their data to ask bigger questions.
Expanding Data Requires A New Approach
What we do
Copy Data to Applications
Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only
• Multiple copies of data
What we should do
Bring Applications to Data
Information-centric businesses use all data: multi-structured, internal and external data of all types
Hadoop Changes the Game: Storage & Compute Together
The Old Way
Expensive, Special purpose, “Reliable” Servers
Expensive Licensed Software
[Diagram labels: Compute (RDBMS, EDW); Network; Data Storage (SAN, NAS)]
$30,000+ per TB
Expensive & Unattainable
• Hard to scale
• Network is a bottleneck
• Only handles relational data
• Difficult to add new fields & data types
The Hadoop Way
Commodity “Unreliable” Servers
Hybrid Open Source Software
[Diagram labels: Compute (CPU); Memory; Storage (Disk)]
$300-$1,000 per TB
Affordable & Attainable
• Scales out forever
• No bottlenecks
• Easy to ingest any data
• Agile data access
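To put the per-terabyte figures on this slide side by side, here is a small worked calculation. It uses the slide's numbers ($30,000+ per TB and $300-$1,000 per TB); the 100 TB data volume and the function names are hypothetical, and the "old way" figure is treated as a lower bound because the slide says "+".

```python
OLD_WAY_PER_TB = 30_000        # "$30,000+ per TB" on the slide, used as a lower bound
HADOOP_PER_TB = (300, 1_000)   # "$300-$1,000 per TB" on the slide


def storage_cost(terabytes, price_per_tb):
    """Total cost in dollars for a given volume at a flat per-TB price."""
    return terabytes * price_per_tb


def compare(terabytes):
    """Compare the two approaches and report the range of cost ratios."""
    old = storage_cost(terabytes, OLD_WAY_PER_TB)
    low, high = (storage_cost(terabytes, p) for p in HADOOP_PER_TB)
    return {
        "old_way_min": old,
        "hadoop_low": low,
        "hadoop_high": high,
        "ratio_min": old / high,  # most conservative advantage
        "ratio_max": old / low,   # most favorable advantage
    }


result = compare(100)  # 100 TB is an assumed volume for illustration
print(result)
```

At these list figures the difference is between 30x and 100x per terabyte regardless of volume; real totals would also depend on replication, operations, and licensing, which the slide does not break out.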
Enabling the “App Store” of Big Data (Large Ecosystem)
[Diagram labels: Enterprise Data Hub (Unlimited Storage; Process, Discover, Model, Serve; Security and Administration); partner categories: Data Systems, Applications, Operational Tools, System Integration, Infrastructure]
More than 1,450 partners ensure compatibility with existing investments, lower skill barriers, and help maximize value from your data.
The Modern Information Architecture
[Diagram labels: SYS LOGS, WEB LOGS, FILES, RDBMS; ENTERPRISE DATA HUB; ONLINE SERVING SYSTEM; ENTERPRISE DATA WAREHOUSE; CONVERGED APPLICATIONS; CLOUDERA MANAGER; META DATA / ETL TOOLS; WEB/MOBILE APPLICATIONS; ENTERPRISE REPORTING; BI / ANALYTICS; MACHINE LEARNING]
Users: Data Architects, System Operators, Engineers, Data Scientists, Analysts, Business Users, Customers & End Users
A High Level View of the Journey
[Diagram labels: axis from IT to Business; stages include Cheap Storage, ETL Acceleration, EDW Optimization, Not Only SQL, Agile Exploration, and Pervasive Analytics; outcomes are Operational Efficiency (Faster, Bigger, Cheaper) and Transformative Applications (New Business Value)]

More Related Content

What's hot

Managing your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariManaging your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariDataWorks Summit
 
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKai Wähner
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaKai Wähner
 
Introduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matterIntroduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matterconfluent
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks DeltaDatabricks
 
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar ZecevicDataScienceConferenc1
 
Apache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and LogisticsApache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and LogisticsKai Wähner
 
Streaming architecture patterns
Streaming architecture patternsStreaming architecture patterns
Streaming architecture patternshadooparchbook
 
Fifth Elephant Apache Atlas Talk
Fifth Elephant Apache Atlas TalkFifth Elephant Apache Atlas Talk
Fifth Elephant Apache Atlas TalkVimal Sharma
 
Apache Hadoop Security - Ranger
Apache Hadoop Security - RangerApache Hadoop Security - Ranger
Apache Hadoop Security - RangerIsheeta Sanghi
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks FundamentalsDalibor Wijas
 
Building a Big Data Pipeline
Building a Big Data PipelineBuilding a Big Data Pipeline
Building a Big Data PipelineJesus Rodriguez
 
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...Simplilearn
 
Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)James Serra
 

What's hot (20)

Integrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data LakesIntegrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data Lakes
 
Managing your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariManaging your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache Ambari
 
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology Comparison
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
 
Introduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matterIntroduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matter
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks Delta
 
Hive
HiveHive
Hive
 
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
 
Apache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and LogisticsApache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and Logistics
 
Streaming architecture patterns
Streaming architecture patternsStreaming architecture patterns
Streaming architecture patterns
 
Building your Datalake on AWS
Building your Datalake on AWSBuilding your Datalake on AWS
Building your Datalake on AWS
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
 
Fifth Elephant Apache Atlas Talk
Fifth Elephant Apache Atlas TalkFifth Elephant Apache Atlas Talk
Fifth Elephant Apache Atlas Talk
 
Apache Hadoop Security - Ranger
Apache Hadoop Security - RangerApache Hadoop Security - Ranger
Apache Hadoop Security - Ranger
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks Fundamentals
 
Building a Big Data Pipeline
Building a Big Data PipelineBuilding a Big Data Pipeline
Building a Big Data Pipeline
 
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
 
Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)
 
Apache Kafka
Apache Kafka Apache Kafka
Apache Kafka
 

Viewers also liked

Predictive Analytics Project in Automotive Industry
Predictive Analytics Project in Automotive IndustryPredictive Analytics Project in Automotive Industry
Predictive Analytics Project in Automotive IndustryMatouš Havlena
 
All Grown Up: Maturation of Analytics in the Cloud
All Grown Up: Maturation of Analytics in the CloudAll Grown Up: Maturation of Analytics in the Cloud
All Grown Up: Maturation of Analytics in the CloudInside Analysis
 
One on One with Wayne Eckerson
One on One with Wayne EckersonOne on One with Wayne Eckerson
One on One with Wayne EckersonInside Analysis
 
Data as a Product by Wayne Eckerson
Data as a Product by Wayne EckersonData as a Product by Wayne Eckerson
Data as a Product by Wayne EckersonZoomdata
 
Big data and its impact on SOA
Big data and its impact on SOABig data and its impact on SOA
Big data and its impact on SOADemed L'Her
 
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014cdmaxime
 
Digital Transformation with AI and Data - H2O.ai and Open Source
Digital Transformation with AI and Data - H2O.ai and Open SourceDigital Transformation with AI and Data - H2O.ai and Open Source
Digital Transformation with AI and Data - H2O.ai and Open Sourcesrisatish ambati
 
Business driven BI - Self-service Techniques
Business driven BI - Self-service TechniquesBusiness driven BI - Self-service Techniques
Business driven BI - Self-service TechniquesEckerson Group
 
Big Data Use Cases
Big Data Use CasesBig Data Use Cases
Big Data Use Casesboorad
 
Key Considerations for Putting Hadoop in Production SlideShare
Key Considerations for Putting Hadoop in Production SlideShareKey Considerations for Putting Hadoop in Production SlideShare
Key Considerations for Putting Hadoop in Production SlideShareMapR Technologies
 
Big Data Startups - Top Visualization and Data Analytics Startups
Big Data Startups - Top Visualization and Data Analytics StartupsBig Data Startups - Top Visualization and Data Analytics Startups
Big Data Startups - Top Visualization and Data Analytics Startupswallesplace
 
Big Data Testing: Ensuring MongoDB Data Quality
Big Data Testing: Ensuring MongoDB Data QualityBig Data Testing: Ensuring MongoDB Data Quality
Big Data Testing: Ensuring MongoDB Data QualityRTTS
 
Big Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case StudyBig Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case StudyNati Shalom
 
Hadoop project design and a usecase
Hadoop project design and  a usecaseHadoop project design and  a usecase
Hadoop project design and a usecasesudhakara st
 

Viewers also liked (18)

Wayne Eckerson: Secrets of Analytical Leaders
Wayne Eckerson: Secrets of Analytical LeadersWayne Eckerson: Secrets of Analytical Leaders
Wayne Eckerson: Secrets of Analytical Leaders
 
Predictive Analytics Project in Automotive Industry
Predictive Analytics Project in Automotive IndustryPredictive Analytics Project in Automotive Industry
Predictive Analytics Project in Automotive Industry
 
All Grown Up: Maturation of Analytics in the Cloud
All Grown Up: Maturation of Analytics in the CloudAll Grown Up: Maturation of Analytics in the Cloud
All Grown Up: Maturation of Analytics in the Cloud
 
One on One with Wayne Eckerson
One on One with Wayne EckersonOne on One with Wayne Eckerson
One on One with Wayne Eckerson
 
Data as a Product by Wayne Eckerson
Data as a Product by Wayne EckersonData as a Product by Wayne Eckerson
Data as a Product by Wayne Eckerson
 
SOA & Big Data
SOA & Big DataSOA & Big Data
SOA & Big Data
 
Big data and its impact on SOA
Big data and its impact on SOABig data and its impact on SOA
Big data and its impact on SOA
 
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
 
Digital Transformation with AI and Data - H2O.ai and Open Source
Digital Transformation with AI and Data - H2O.ai and Open SourceDigital Transformation with AI and Data - H2O.ai and Open Source
Digital Transformation with AI and Data - H2O.ai and Open Source
 
Business driven BI - Self-service Techniques
Business driven BI - Self-service TechniquesBusiness driven BI - Self-service Techniques
Business driven BI - Self-service Techniques
 
Going MAD: A Framework For Delivering Pervasive BI Solutions
Going MAD: A Framework For Delivering Pervasive BI SolutionsGoing MAD: A Framework For Delivering Pervasive BI Solutions
Going MAD: A Framework For Delivering Pervasive BI Solutions
 
Business Intelligence In The Cloud
Business Intelligence In The CloudBusiness Intelligence In The Cloud
Business Intelligence In The Cloud
 
Big Data Use Cases
Big Data Use CasesBig Data Use Cases
Big Data Use Cases
 
Key Considerations for Putting Hadoop in Production SlideShare
Key Considerations for Putting Hadoop in Production SlideShareKey Considerations for Putting Hadoop in Production SlideShare
Key Considerations for Putting Hadoop in Production SlideShare
 
Big Data Startups - Top Visualization and Data Analytics Startups
Big Data Startups - Top Visualization and Data Analytics StartupsBig Data Startups - Top Visualization and Data Analytics Startups
Big Data Startups - Top Visualization and Data Analytics Startups
 
Big Data Testing: Ensuring MongoDB Data Quality
Big Data Testing: Ensuring MongoDB Data QualityBig Data Testing: Ensuring MongoDB Data Quality
Big Data Testing: Ensuring MongoDB Data Quality
 
Big Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case StudyBig Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case Study
 
Hadoop project design and a usecase
Hadoop project design and  a usecaseHadoop project design and  a usecase
Hadoop project design and a usecase
 

Similar to Hadoop and Manufacturing

The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubCloudera, Inc.
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubCloudera, Inc.
 
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, ClouderaMongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, ClouderaMongoDB
 
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...Cloudera, Inc.
 
Intel and Cloudera: Accelerating Enterprise Big Data Success
Intel and Cloudera: Accelerating Enterprise Big Data SuccessIntel and Cloudera: Accelerating Enterprise Big Data Success
Intel and Cloudera: Accelerating Enterprise Big Data SuccessCloudera, Inc.
 
Seeking Cybersecurity--Strategies to Protect the Data
Seeking Cybersecurity--Strategies to Protect the DataSeeking Cybersecurity--Strategies to Protect the Data
Seeking Cybersecurity--Strategies to Protect the DataCloudera, Inc.
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera, Inc.
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduCloudera, Inc.
 
Turning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data PlatformTurning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data PlatformCloudera, Inc.
 
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...jdijcks
 
The 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: ExposedThe 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: ExposedCloudera, Inc.
 

Hadoop and Manufacturing

  • 1. Information-Driven Manufacturing: Capture Value from Manufacturing Data with an Enterprise Data Hub
  • 2. © Cloudera, Inc. All rights reserved. Trends in Manufacturing. Instrumentation: Everything that can be measured will be measured, and the volume is only increasing. Efficiency: Continuous improvement in cost and efficiency in all areas of manufacturing operations. Quality: Now more than ever, quality is a top concern from consumer, dealer, and regulatory standpoints.
  • 3. Manufacturers are collecting data at an exponential rate, yet struggle to derive value from all that data...
  • 4. Manufacturing Enterprise Data Hub: Provides the ability to store and analyze all the data, quickly uncover new insights, and derive value in all phases of the process, from initial design to final delivery.
  • 5. Manufacturing Enterprise Data Hub Overview. Keep all the data: Keep all the data, whether it's people-generated, machine-generated, or external. Advanced Analytics: Statistical and machine learning analyses using advanced analytic tools on all the data (Spark, R, Python, SAS, Matlab). Leverage all the data: Access to all the data from the enterprise and manufacturing at your fingertips; consolidate silos (Self-Service BI, Search).
  • 6. Where Is the Manufacturing Data? Mapping and Consolidation Are the Tip of the Iceberg for Big Data. Devices & Sensors: Device Readings, Device Performance, Device Diagnostics, Battery / Power Consumption, Software Logs, Environmental Interactions, R&D, Quality / Testing. Plant & Operations: MES, Sensors, Video / Surveillance, Line Productivity, Machines, Staffing / Scheduling, Quality Data. Supply Chain & Inventory: ERP, Supplier / Manufacturer, Orders / Receivables, Commodity Supplies / Prices, Chargebacks, Scorecards, Delivery Metrics. Marketing & CRM: Transactions, Accounts, Warranties / Aftermarket, Customer Service Logs, Campaigns / Promotions, Website / SEO, Affiliates / Merchants, Surveys, Competitive Intelligence. Public & Trade: Market Intelligence, Policy / Regulation, Demographic / Census, Psychographic, Inflation / Macroeconomic, Gas Prices, Labor Statistics, Social / Search, Public Health Data, Clinical Studies, Store Schematics, Journals / Editorial, Seismic / Speculation.
  • 7. A Traditional Architecture: What Have We Tried? [Diagram labels: Data Sources (Structured, Unstructured, Machine Data); Ingest; Storage #1, 2, N; Store & Process; ETL; ELT; Filter?; Enterprise Data Warehouse; EDW Archive; Search; Serve; Custom Application; Point Solution; Access Data; Analyze Data (SQL, Statistical, Machine Learning); Optimize; Implement; Experiment Fast.]
  • 8. Challenges with Traditional Architectures: 1) Limited Data 2) Long Time to Value 3) Suboptimal Decisions. [Diagram: the traditional architecture from slide 7, annotated with callouts 1, 2, and 3.]
  • 9. The New Way Forward: 1) Unlimited Data Access 2) Reduce Time to Value 3) Decisions on All Data. [Diagram labels, Modern Architecture: Data Sources (Structured, Unstructured, Machine Data); Active Ingest; Cloudera EDH; Store & Process; Archive; Search; ELT; ETL; Load; Enterprise Data Warehouse (EDW); Serve; Custom Application; Point Solution; Access Data; Analyze Data (SQL, Statistical, Machine Learning); Optimize; Implement; callouts 1, 2, and 3.]
  • 10. © 2014 Cloudera, Inc. All rights reserved. Overview of Data Flow in Cloudera EDH, from Raw Data to Insight and Value to User. Raw Data: 3rd party or public; Network Equipment; Traditional RDBMS; EDW. Ingest/Storage: Event Based, Near Real Time (Flume, Spark Streaming, Kafka (coming soon)); SQL / Relational (Sqoop: SQL import including metadata); Web Services/API/Cmd line (Put/Store, Copy/Move files, NFS Gateway); HUE Web GUI (User Upload, User Copy/Move/Rename); Third Party Integrations. Process/Transformation (ELT, ETL, Transform, Cleanse, Pre-aggregate, Analyze, etc.): SQL / Relational (Hive (Batch SQL), Impala (Interactive SQL)); MapReduce, Java-based distributed processing (Machine Learning libraries; Pig, a scripting language to perform MapReduce); Spark, in-memory distributed processing (Java, Scala, or Python); Third Party Integrations. Publish/Consume: SQL / Relational over ODBC/JDBC (Hive (Batch SQL), Impala (Interactive SQL)); Web Services/API/Cmd line (Get/Move files, NFS Gateway); HUE Web GUI (User Download, User Copy/Move/Rename); Search Index (Solr Search, full featured with Facets, NLP, etc.); Third Party Integrations. Underlying layers: Workload Management with YARN (Resource Management) and Oozie (Workflow Engine); Hadoop Distributed File System (HDFS) for distributed file storage.
  • 11. Security Important? Cloudera Enterprise Data Hub provides enterprise-grade security, audit, and regulatory compliance, governing access to and management of all data-at-rest and data-in-motion. Authentication: Guarding access to the system, its data, and its various systems (LDAP Kerberos RPC). Protection: Encryption for data at rest or in motion with full key management (Cloudera Navigator: Encrypt & Key Trustee). Authorization: Controlling who or what has access to a resource or service (POSIX Permissions, Apache Sentry). Audit: Capture a complete and immutable record of all activity (Cloudera Navigator, SIEM Tools). Highlights: Cloudera Manager and Navigator automate protections for Hadoop and related projects; perimeter security; role-based access control; the only complete policy-based management of sensitive data; data lineage and discoverability.
  • 12. Core Benefits of a Manufacturing Enterprise Data Hub: Full-Fidelity Active Archive; Any and All Kinds of Data; Accelerate Time to Insight (Scale); Unlock Agility and Exploration; Consolidate Silos for a 360° View; Enable Pervasive Analytics across the Entire Value Chain (Design to Post-Sales Delivery and Warranty).
  • 13. What value is there in a Manufacturing Data Hub? Ask bigger questions of all the data. Design, R&D, PD, Engineering: What product issues are paramount? What are the technology trends? Efficient parts utilization: what is the best part for my design? Is all my machine data being utilized? Production, Quality, Manufacturing: Diagnose production problems. What is the cause: people, parts, process, suppliers? Plant inventory. Resource utilization. Is all my shop floor data being analyzed? Supply Chain, Purchasing: Who are my best suppliers? Who are my worst suppliers? Consolidated view of the supply chain? Supply chain disruption impact analysis? Consolidated purchasing (360 supplier view). Delivery, Warranty, Support, Service: Review Customer 360. Analyze product launch information. Detect emerging warranty issues. Decrease correction times. Increase accuracy of warranty forecasts. Knowledge base for after-delivery service. [Diagram labels: Manufacturing Data Hub; Hadoop; Cloudera; Secure; Scalable; Flexible; Open.]
  • 14. Customer Story
  • 15. About the Vehicle Manufacturer. Who is this manufacturer? A leading worldwide manufacturer of vehicles. What do we do? Manufacture, sell, and service vehicles.
  • 16. Our Objectives: Store and analyze worldwide data from dealers, customers, and vehicles. Better, deeper analysis. Smarter predictions, earlier detection.
  • 17. The Pre-Hadoop Environment. Challenge 1: Difficult to connect to multiple sources. Why? Volume: too much to store, let alone query. Variety: different formats, not all table-based data. [Diagram labels: Parts; Suppliers; Dealers; Claims; Machine Data; IDLE; Vehicle Data; BI/RDBMS/DW.]
  • 18. The Pre-Hadoop Environment. Challenge 2: Impossible to analyze all that data. Why? It wasn't even in one system. Different workloads (macro vs. micro analysis). [Diagram labels: Parts; Suppliers; Dealers; Claims; Machine Data; IDLE; Vehicle Data; BI/RDBMS/DW; Advanced Analytics.]
  • 19. Vehicle Manufacturer Modern Hadoop Architecture. Improvements: 1) Complete storage of data (structured and unstructured). [Diagram labels: Claims; Machine Data; IDLE; Vehicle Data; Sqoop; Flume, Kafka; Copy (XML); Store: HDFS, HBase.]
  • 20. Vehicle Manufacturer: Modern Hadoop Architecture. Improvement 1: Complete storage of data (structured and unstructured). Improvement 2: Process data as needed. (Diagram labels: sources: Claims, Machine Data, IDLE, Vehicle Data; ingest: Sqoop, Flume, Kafka, Copy (XML); Store: HDFS, HBase; Process: MR, Pig, Spark, ETL tools on Hadoop)
  • 21. Vehicle Manufacturer: Modern Hadoop Architecture. Improvement 1: Complete storage of data (structured and unstructured). Improvement 2: Process data as needed. Improvement 3: Analysis and large-scale ad hoc queries. (Diagram labels: sources: Claims, Machine Data, IDLE, Vehicle Data; ingest: Sqoop, Flume, Kafka, Copy (XML); Store: HDFS, HBase; Process: MR, Pig, Spark, ETL tools on Hadoop; Discover / Access: HUE, Impala, Solr, BI, Hive)
  • 22. Vehicle Manufacturer: Modern Hadoop Architecture. Improvement 1: Complete storage of data (structured and unstructured). Improvement 2: Process data as needed. Improvement 3: Analysis and large-scale ad hoc queries. Improvement 4: Advanced analytics. (Diagram labels: sources: Claims, Machine Data, IDLE, Vehicle Data; ingest: Sqoop, Flume, Kafka, Copy (XML); Store: HDFS, HBase; Process: MR, Pig, Spark, ETL tools on Hadoop; Discover / Access: HUE, Impala, Solr, BI, Hive; Advanced Analytics: R, MLlib, Spark)
  • 23. Business and Technical ROI
       Business ROI: • Proactive Quality Assurance: Build machine learning algorithms that identify production anomalies prior to field testing and find performance flaws that could not be identified in R&D. • Predictive Intervention: Combine data streaming from machine data (vehicles, plant floor), diagnostics, and product/engineering data to proactively avoid or address issues and deploy upgrades.
       Technology ROI: • Merge storage systems for simpler management (active archive, retire legacy systems) • Unified access to disparate, siloed data (retire single-use systems) • Scale affordably: grow without destroying the budget • Flexible and agile: IT can focus on solutions for the business rather than being a data plumber
  • 24. What happened at the parts supplier that caused a spike in support calls during the past 30 minutes for devices manufactured in Birmingham? How many devices were returned last month? Reduce the time to QC issue resolution from weeks to hours. Drive $15 to $25 million annual savings for each manufacturer. © 2014 Cloudera, Inc. All rights reserved.
  • 25. Can we predict which chips have the highest likelihood of failure and intervene to proactively prevent manufacturing issues? Which chips most commonly failed last week? Analyses now executable on hundreds of thousands of units in just seconds. 60x faster data reload and 300% query speedup enable real-time debugging.
  • 26. Thank you
  • 27. Cloudera Snapshot
       • Founded 2008, by former employees of
       • Employees today: ~850
       • World-class support: 24x7 global staff; proactive and predictive support programs
       • Mission critical: thousands of enterprise users; 500+ paying subscription customers
       • The largest ecosystem: 1,450+ partners
       • Cloudera University: 100,000+ trained
       • Open source leaders: Cloudera employees are leading developers and contributors
       • Total capital raised: $1B+ (from Intel, Google, Dell, T. Rowe Price, Accel, Greylock)
       • Mission: Help organizations leverage the power of all their data to ask bigger questions.
  • 28. Expanding Data Requires a New Approach
       What we do: copy data to applications. Process-centric businesses use: • Structured data mainly • Internal data only • “Important” data only • Multiple copies of data
       What we should do: bring applications to data. Information-centric businesses use all data: multi-structured, internal and external data of all types.
  • 29. Hadoop Changes the Game: Storage & Compute Together
       The Old Way: $30,000+ per TB. Expensive & unattainable. • Hard to scale • Network is a bottleneck • Only handles relational data • Difficult to add new fields & data types. (Diagram labels: expensive, special-purpose, “reliable” servers; expensive licensed software; Network; Data Storage (SAN, NAS); Compute (RDBMS, EDW))
       The Hadoop Way: $300-$1,000 per TB. Affordable & attainable. • Scales out forever • No bottlenecks • Easy to ingest any data • Agile data access. (Diagram labels: commodity “unreliable” servers; hybrid open source software; Compute (CPU); Memory; Storage (Disk))
  • 30. Enabling the “App Store” of Big Data (Large Ecosystem). More than 1,450 partners ensure compatibility with existing investments, lower skill barriers, and help maximize value from your data. (Diagram labels: Data Systems; Applications; Operational Tools; System Integration; Infrastructure; Enterprise Data Hub: Security and Administration, Unlimited Storage, Process, Discover, Model, Serve)
  • 31. The Modern Information Architecture. (Diagram labels: Web/Mobile Applications; Online Serving System; Enterprise Data Warehouse; Enterprise Reporting; BI / Analytics; Machine Learning; Converged Applications; Cloudera Manager; Meta Data / ETL Tools; Enterprise Data Hub. Users: Data Architects, System Operators, Engineers, Data Scientists, Analysts, Business Users, Customers & End Users. Sources: Sys Logs, Web Logs, Files, RDBMS)
  • 32. A High-Level View of the Journey. (Diagram labels: Not Only SQL; Agile Exploration; ETL Acceleration; Operational Efficiency (Faster, Bigger, Cheaper); Transformative Applications (New Business Value); Cheap Storage; Business; IT; EDW Optimization; Pervasive Analytics)

Editor's Notes

  1. The manufacturing sector was an early and intensive user of data to drive quality and efficiency, adopting information technology and automation to design, build, and distribute products since the dawn of the computer era. In the 1990s, manufacturing companies racked up impressive annual productivity gains because of both operational improvements that increased the efficiency of their manufacturing processes and improvements in the quality of products they manufactured. For example, advanced manufactured products such as computers became much more powerful. Manufacturers also optimized their global footprints by placing sites in, or outsourcing production to, low-cost regions. But despite such advances, manufacturing, arguably more than most other sectors, faces the challenge of generating significant productivity improvement in industries that have already become relatively efficient. We believe that big data can underpin another substantial wave of gains. These gains will come from improved efficiency in design and production, further improvements in product quality, and better meeting customer needs through more precisely targeted products and effective promotion and distribution. For example, big data can help manufacturers reduce product development time by 20 to 50 percent and eliminate defects prior to production through simulation and testing. Using real-time data, companies can also manage demand planning across extended enterprises and global supply chains, while reducing defects and rework within production plants. Overall, big data provides a means to achieve dramatic improvements in the management of the complex, global, extended value chains that are becoming prevalent in manufacturing and to meet customers’ needs in innovative and more precise ways, such as through collaborative product development based on customer data.
  2. No individual record is particularly valuable, but having every record opens the door to extreme value. This sector generates data from a multitude of sources, from instrumented production machinery (process control), to supply chain management systems, to systems that monitor the performance of products that have already been sold (e.g., during a single cross-country flight, a Boeing 737 generates 240 terabytes of data). And the amount of data generated will continue to grow exponentially. The number of RFID tags sold globally is projected to rise from 12 million in 2011 to 209 billion in 2021. IT systems installed along the value chain to monitor the extended enterprise are creating additional stores of increasingly complex data, which currently tends to reside only in the IT system where it is generated. Manufacturers will also begin to combine data from different systems including, for example, computer-aided design, computer-aided engineering, computer-aided manufacturing, collaborative product development management, and digital manufacturing, and across organizational boundaries in, for instance, end-to-end supply chain data.
  3. Key takeaway: It is not just a BI or analytics challenge; it is the way that data is managed. Keeping in mind the three main high-level objectives of an architecture built for data discovery (accessing data, analyzing data, and experimenting and iterating fast), we can examine a traditional architecture and see where organizations might run into issues. Questions for customer: Does this look like your architecture? What limitations are you “living with” today?
  4. Limited data access: data silos; archived or deleted data; no unstructured data; only SQL. Long time to value: resource-intensive ad hoc ELT (convert to tables, SQL); inflexible; adding dimensions takes months; slow large-scale queries. Sub-optimal decisions: limits on data sets; guessing?; missing critical items; frustrated users!
  5. Key takeaway: An EDH provides the foundation to change the way you collect and manage data so that your analysts get what they need in less time. No filter, no missing data! ETL on the fly: talk to schema-on-write vs. schema-on-read (http://www.slideshare.net/awadallah/schemaonread-vs-schemaonwrite). 1) Unlimited data access (active archive, scalable storage, unstructured data). 2) Reduced time to value (ETL on the fly, parallel processing, complete data access, flexible: any schema, any file). 3) Best decisions (decisions on all the data).
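     The schema-on-read idea in this note can be illustrated with a minimal Python sketch. It is not Cloudera's implementation: the sample records, the field names (`vin`, `temp_c`, `dtc`), and the helper `read_with_schema` are all hypothetical. Raw records are kept exactly as they arrive, and a schema is applied only when a question is asked.

```python
import json

# Raw events are stored untouched, as they arrive (hypothetical sample data).
RAW_EVENTS = [
    '{"vin": "A1", "temp_c": "91.5", "dtc": "P0301"}',
    '{"vin": "B2", "temp_c": "88.0"}',
    '{"vin": "C3", "rpm": 3000, "temp_c": "104.2"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time: pick and cast only the requested fields.

    Fields missing from a record come back as None instead of failing the load.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: (cast(record[field]) if field in record else None)
            for field, cast in schema.items()
        }


# Question 1: engine temperature per vehicle.
temps = list(read_with_schema(RAW_EVENTS, {"vin": str, "temp_c": float}))

# Question 2, asked later: which vehicles reported a diagnostic code?
# No reload or table redesign is needed, because the raw data was never filtered.
codes = [
    row["vin"]
    for row in read_with_schema(RAW_EVENTS, {"vin": str, "dtc": str})
    if row["dtc"] is not None
]
```

     With schema-on-write, the second question would have required the `dtc` field to be part of the table design before any data was loaded; here the decision is deferred until the query.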
  6. Pulling from the “Insights Section”
  7. Why Hadoop (slide content): Even with primarily relational systems, the environment involved hundreds of sources. Getting a BI tool to connect to so many sources is … not fun. More often than not, we needed to understand a subset or aggregate of this data, not all of it. We can use Pig to process, extract, and filter the data, and Hive, with its SQL-like query language, to query it.
  11. Link to account record in SFDC: https://na6.salesforce.com/0018000000y2EIt?srPos=0&srKp=001
      Omneo, a division of Camstar, drives $15 to $25 million in annual savings for electronics manufacturers based on its ability to address supply chain issues in near real time.
      Background: Today’s consumers have high expectations for the products we use every day, particularly when it comes to our devices. We want new products to come out faster, at lower prices, with more capabilities than before. But we also demand increased reliability. Camstar, a 30-year veteran in the enterprise manufacturing and supply chain space, saw this trend and identified an opportunity.
      Challenge: Electronic device manufacturers are responsible for delivering millions of products, each composed of hundreds of components that are sourced from all over the globe, put together, and pushed through distribution channels to customers. There’s a large margin for error. Camstar set out to address this by spinning off a division called Omneo, which set out to build a 360-degree view into supply chain and product quality.
      Solution: After evaluating IBM Netezza, Infobright, Cassandra, MongoDB, and Hadoop, Omneo decided to try out Hadoop based on three main factors: (1) scalability to grow with customers’ needs over time; (2) flexibility to meet the needs of diverse customers and data sets in a multi-tenant environment; (3) low TCO for an efficient big data solution. The team downloaded Cloudera Express because it was easy to get started with and no one on the team had prior experience with the technology. After a few months of demonstrating promising results, Omneo decided to perform a TCO analysis of Cloudera vs. IBM Netezza and their legacy (Oracle) data warehouse. Cloudera’s costs came in 75% lower per TB than IBM Netezza and 90% lower per TB than the incumbent. But before moving forward with a Cloudera Enterprise subscription, the team compared the different Hadoop vendors. They ultimately decided to move forward with Cloudera due to four main factors: (1) long-term company strategy and viability; (2) ease of use and maturity of Cloudera Manager; (3) enterprise-grade support; (4) dedication to open source. Omneo has deployed a multi-tenant enterprise data hub from Cloudera as the platform behind its supply chain cloud solution, which ingests machine data and existing system data from throughout the manufacturing process, including clients’ factory data, supplier data, field services, after-market repairs, and re-manufacturing data. The company uses MapReduce to transform and manipulate data into any structure needed; HBase to access specific records in real time; and Cloudera Search to rapidly index all raw data in a way that makes sense for customers.
      Results: Omneo’s supply chain SaaS delivers a 360-degree view of the supply chain process in seconds, allowing manufacturers to access their data in different ways, on the fly. If something happens at any supplier that drives a sudden increase in quality issues, they can figure out where the issue stems from and why in minutes or hours. In traditional environments, these investigations would take weeks or months. Instead of spending time trying to pinpoint challenges, manufacturers can spend their time resolving them. Omneo’s clients report total annual savings between $15 million and $25 million each, conservatively.
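      The "transform and manipulate data into any structure needed" step can be pictured with a small map/reduce sketch. This is single-process Python that mimics the pattern only; it does not use Hadoop's MapReduce API and is not Omneo's code. The repair records, supplier names, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical field-repair records: (device_id, component_supplier, failed)
REPAIRS = [
    ("d1", "supplier_a", True),
    ("d2", "supplier_b", False),
    ("d3", "supplier_a", True),
    ("d4", "supplier_c", True),
    ("d5", "supplier_a", False),
]


def map_phase(record):
    """Emit (supplier, 1) for every failed unit; emit nothing otherwise."""
    _device_id, supplier, failed = record
    if failed:
        yield supplier, 1


def reduce_phase(pairs):
    """Sum the emitted counts per key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Run the map step over every record, then reduce the emitted pairs.
pairs = (pair for record in REPAIRS for pair in map_phase(record))
failures_by_supplier = reduce_phase(pairs)

# The supplier with the most failures is the first place to investigate.
worst_supplier = max(failures_by_supplier, key=failures_by_supplier.get)
```

      On a cluster the map step would run in parallel across data blocks and the reduce step would run per key; the shape of the computation is the same.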
  12. AMD improves yield predictions with a Cloudera-powered engineering data warehouse.
      Background: Advanced Micro Devices (AMD) is a multinational semiconductor manufacturer that designs and builds graphics cards and microprocessors powering millions of the world's personal computers, tablets, gaming consoles, embedded devices, and cloud servers. All of the world’s leading PC and major video game console manufacturers have AMD technology inside. AMD relies on manufacturing test data to ensure product quality and perform engineering analysis in order to improve upon its world-class product designs.
      Challenge: The company wanted to empower its engineers by giving them access to larger data sets at faster speeds. But the incumbent environment stored less than 30% of available data elements, was built with several different integration tools, had many integration steps, and relied on a large IT team for support and maintenance. In 2011, there was an environment outage that took weeks to recover from, so AMD initiated an Engineering Data Warehouse (EngDW) project to find a more agile, cost-effective solution and a simpler, more robust way to store, process, and fetch larger amounts of data for AMD’s engineers.
      Solution: The semiconductor manufacturer replaced its legacy engineering data warehouse with the Dell Cloudera Solution for Apache Hadoop. AMD runs a 34-node production cluster today, which collects data throughout the manufacturing process. Hundreds of millions of new digital and parametric test readings are loaded to the cluster every day. At the heart of the EngDW project are CDH and HBase. A custom query engine reads from HBase to put the test measurements in the hands of the company’s engineers.
      Results: AMD's decision to move from an RDBMS to a Hadoop platform that uses Cloudera on Dell servers powered by AMD Opteron processors has resulted in orders-of-magnitude performance improvements, in terms of both data loads and analytics. Queries run up to 300% faster, on larger data sets than before. 99% of all queries execute in 15 minutes or less, with a median execution time of just 23 seconds. Queries on hundreds of thousands of units execute two orders of magnitude faster than before. Data reloads at a rate of three months per day, whereas it used to take a full day to reload 1.5 days’ data, which is 60x faster. Not only has AMD's EngDW project brought significant performance benefits, but it delivers greater functionality and value as well. Query results on EngDW now have no row limit, compared to the previous limit of just 100,000 rows (which had been set to ensure queries would return results in a given period of time). The EngDW project's Hadoop-based cluster allows AMD to store more than 90% of available data elements spanning 1.5-plus years’ history, whereas the previous system stored less than 30% of available data for only three to four months’ history. Now that AMD engineers can access greater amounts of test data in higher detail and at faster speeds, they can apply insights to debug and make continuous improvements to ensure their products meet customer needs. AMD has also significantly reduced the TCO of its EngDW through lower vendor support costs for relational database management software, less vendor support for data integration tools and software, fewer steps and tools needed for data integration, less vendor support for high-end storage arrays (external SAN storage), and a smaller IT support staff needed for end-to-end management.
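      The 60x reload figure follows from the two rates quoted in this note. A short Python check, assuming "three months" means roughly 90 days (an assumption, not a figure from the case study); the full-history reload time at the end is derived here for illustration and was not reported by AMD.

```python
DAYS_PER_MONTH = 30  # assumption: "three months" is taken as roughly 90 days

old_rate = 1.5                 # days of data reloaded per day, legacy system
new_rate = 3 * DAYS_PER_MONTH  # days of data reloaded per day, EngDW

# Ratio of the two reload rates.
speedup = new_rate / old_rate

# Derived illustration: time to reload the 1.5 years of history now retained.
history_days = 1.5 * 365
full_reload_days = history_days / new_rate

print(f"Reload speedup: {speedup:.0f}x")
print(f"Reloading 1.5 years of history: about {full_reload_days:.1f} days")
```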
  13. B+
  14. Today we're in the middle of a shift in how businesses use information. In the past, you'd define a set of business processes, build applications around each of them, and then go about gathering, conforming, and merging the necessary data sets to support those applications. From an infrastructure perspective, you'd be bringing the data over to the compute, often in relational databases. But you'd be leaving quite a lot on the table. The modern realities of business demand a new approach. Today companies need, more than ever, to become information-driven, but given the amount and diversity of information available, and the rate of change in business, it's simply unsustainable to keep moving around and transforming huge volumes of data.
  15. Pricing data. Cloudera: HW + SW per-year list prices for Basic through EDH at various configurations. Old way: various sources. One of note is the Cowen / Goldmacher coverage initiation of Teradata, June 17, 2013: • List price of the high-end appliance (which he thinks is more comparable to our solution) is $57K/TB + maintenance, for an annual cost of $39K/TB • Prices have likely decreased, but we estimate they are still in excess of $30K/TB/year • List price of the low-end appliance is $12K/TB + maintenance, or $8K per year
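      To put the per-TB figures side by side, here is a small Python calculation. It uses only the $30K/TB/year estimate from this note and the $300-$1,000 per TB range from the slide; the 500 TB capacity and the helper `annual_cost` are hypothetical, chosen purely for illustration.

```python
OLD_WAY_PER_TB = 30_000       # USD per TB per year, lower-bound estimate from the note
HADOOP_PER_TB = (300, 1_000)  # USD per TB, low and high ends of the slide's range

# Cost ratio of old way to Hadoop way, conservative and optimistic ends.
ratio_conservative = OLD_WAY_PER_TB / HADOOP_PER_TB[1]
ratio_optimistic = OLD_WAY_PER_TB / HADOOP_PER_TB[0]


def annual_cost(terabytes, usd_per_tb):
    """Annual cost in USD for a given capacity at a flat per-TB price."""
    return terabytes * usd_per_tb


capacity_tb = 500  # hypothetical cluster size, not from the source
old_cost = annual_cost(capacity_tb, OLD_WAY_PER_TB)
hadoop_cost_max = annual_cost(capacity_tb, HADOOP_PER_TB[1])

print(f"Old way is {ratio_conservative:.0f}x to {ratio_optimistic:.0f}x more expensive per TB")
print(f"{capacity_tb} TB: ${old_cost:,} vs. at most ${hadoop_cost_max:,} per year")
```

      Even at the high end of the Hadoop range, the gap is more than an order of magnitude, which is the point the slide makes with "Expensive & Unattainable" vs. "Affordable & Attainable".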
  16. Cloudera partners more broadly and deeply across the Hadoop ecosystem than any other vendor. With over 1,200 partners and counting, our partnerships offer: • Compatibility with your existing tools and skills: 160+ certified on Cloudera 5, including all 12 of the 12 Gartner Business Intelligence Magic Quadrant leaders • Flexible deployment options: on-premises; public, private, or hybrid cloud; appliances and engineered systems • Partnerships you can trust: deep engineering relationships; comprehensive certification program