We’ll get started soon… 
Q&A box is available for your questions 
Webinar will be recorded for future viewing 
Thank you for joining! 
Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Deliver the Data Lake (demo/deep dive) 
…using HDP and Red Hat JBoss Data Virtualization 
We do Hadoop.
Your speakers… 
Raghu Thiagarajan, Dir, Partner Product Management, Hortonworks 
Kimberly Palko, Principal Product Manager, Red Hat 
Kenny Peeples, Principal Technical Marketing Manager, Red Hat 
An architectural shift towards an HDP Data Lake 
Unlocking the Data Lake 
[Figure: SCALE and SCOPE axes; labels: RDBMS, MPP, EDW, Data Lake, New Analytic Apps or IT Optimization]
Data Lake, Enabled by YARN 
• Single data repository, shared infrastructure 
• Multiple biz apps accessing all the data 
• Enable a shift from reactive to proactive interactions 
• Gain new insight across the entire enterprise 
[Figure: HDP 2.1 components: Governance & Integration, Security, Operations, Data Access, YARN, Data Management]
What is a Data Lake? 
Architectural Pattern in the Data Center 
Uses Hadoop to deliver deeper insight across a large, broad, diverse set of data efficiently 
• Multipurpose, Open PLATFORM for Data (NOT a database) 
• Land all data in a single place and interact with it in many ways 
• Allows for the ecosystem to provide higher-level services (SAS, SAP, Microsoft for Streaming, MPP, In-memory, etc.) 
• First-class data management capabilities (metadata management, security, transformation pipelines, replication, retention, etc.) 
HDP Data Lake Solution Architecture 
[Figure: HDP Data Lake solution architecture. Recoverable labels:]
• Manage Steps 1-4: Data Management with Falcon, Security with HDP Advanced Security 
• Step 1: Extract & Load 
• Step 2: Model/Apply Metadata 
• Step 3: Transform, Aggregate & Materialize 
• Step 4: Schedule and Orchestrate 
• SOURCE DATA: Click Stream, Sales Transactions, Product Data, Marketing/Inventory, Social Data, EDW 
• Ingestion: SQOOP, FLUME, Web HDFS, NFS, STORM, JMS, REST, HTTP, Streaming 
• Data Lake HDP Grid: compute & storage nodes, YARN, AMBARI, HCATALOG (table & user-defined metadata), FALCON (Data pipeline & flow management), Apache Argus (Unified Access Controls and Audit) 
• Data processing: HIVE, PIG, Cascading, MR2, TEZ, Mahout, Graph, SAS 
• YARN Apps (Stream Processing, Real-time Search, MPI, etc.; SolR, Storm): Opens up Many New Use Cases 
• Use Case Type 1: Materialize & Exchange. Exchange via HBase Client, Sqoop/Hive; Downstream Data Sources: OLTP, HBase, EDW (Teradata) 
• Use Case Type 2: Explore/Visualize. Interactive Hive Server (Tez/Stinger); Query/Analytics/Reporting Tools: Tableau, Excel, Microstrategy, Datameer, Platfora, Business Objects
HDP Data Lake Solution Architecture + Virtual Data Mart 
[Figure: the HDP Data Lake solution architecture diagram from the previous slide, with a virtual data mart layer added. Added labels: Dept Base Virtual Database (VDB), Team 1 VDB, Team 2 VDB, View1, View2]
YARN allows for new processing engines 
[Figure: the HDP Data Lake solution architecture diagram, repeated from the preceding slides]
Falcon enables Governance of Data Pipelines 
[Figure: the HDP Data Lake solution architecture diagram, repeated from the preceding slides]
Apache Falcon: Data Governance in the Lake 
Falcon adds the required data governance features 
[Figure: a data pipeline with Raw, Clean and Prep stages; auto-generated and orchestrated as multiple complex Oozie workflows (Job1 to JobN) and other Hadoop ecosystem tools, e.g. DistCp]
DEFINITION 
Replication | Retention 
Eviction | Late data 
MONITORING 
TRACING 
Audit | Lineage 
Tagging
Mashing up diverse data types in the Data Lake 
Virtual Data Marts with Red Hat JBoss 
Data Virtualization and Hortonworks HDP 
Kimberly Palko 
Data Supply and Integration Solution 
Data Virtualization sits in front of multiple data sources and 
✓ allows them to be treated as a single source 
✓ delivering the desired data 
✓ in the required form 
✓ at the right time 
✓ to any application and/or user. 
THINK VIRTUAL MACHINE FOR DATA 
Easy Access to Big Data 
• Reporting tool accesses the data virtualization server via a rich SQL dialect 
• The data virtualization server translates the rich SQL dialect to HiveQL 
• Hive translates HiveQL to MapReduce 
• MapReduce runs the MR job on the big data 
[Figure: Analytical Reporting Tool, Data Virtualization Server, Hive, MapReduce, HDFS (Hadoop, Big Data)]
Use Case 1: Combine data from 
Hadoop with traditional data 
sources 
Problem: 
Data from new data sources like social media, clickstream and sensors needs to be combined with data from traditional sources to get the full value. 
Solution: 
Leverage JBoss Data Virtualization to mash up new data in Hadoop with data in traditional data sources without moving or copying any data, and access it through a variety of BI tools and SOA technologies. 
[Figure: JBoss Data Virtualization (Consume, Compose, Connect) in front of Hive. Data can be accessed by multiple tools and methods already in-house. SOURCE 1: Hive/Hadoop contains data from new data sources like social media, clickstream and sensor data. SOURCE 2: Traditional relational databases in the enterprise.]
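A minimal sketch of the federation idea in Use Case 1, not the JBoss Data Virtualization API. Two in-memory SQLite databases stand in for Hive/Hadoop and for a traditional relational source; the table names, columns and the `clicks_per_customer` helper are hypothetical. The point it illustrates is that the join happens at query time in the layer in front of the sources, and no rows are copied from one store into the other.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an in-memory database that plays the role of one data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


# Hypothetical stand-in for SOURCE 1 (Hive/Hadoop): clickstream events.
hadoop = make_source(
    "CREATE TABLE clicks (customer_id INTEGER, page TEXT)",
    "INSERT INTO clicks VALUES (?, ?)",
    [(1, "/pricing"), (1, "/signup"), (2, "/pricing")],
)

# Hypothetical stand-in for SOURCE 2 (traditional relational database).
crm = make_source(
    "CREATE TABLE customers (id INTEGER, name TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)


def clicks_per_customer(hadoop_conn, crm_conn):
    """Join both sources at query time; neither store receives the other's data."""
    # The aggregation is pushed down to the clickstream source.
    counts = dict(hadoop_conn.execute(
        "SELECT customer_id, COUNT(*) FROM clicks GROUP BY customer_id"))
    # The customer names come from the relational source and are joined here.
    customers = crm_conn.execute("SELECT id, name FROM customers ORDER BY id")
    return [(name, counts.get(cid, 0)) for cid, name in customers]


result = clicks_per_customer(hadoop, crm)
print(result)  # [('Acme', 2), ('Globex', 1)]
```

A real deployment would expose this join as a view in a virtual database, so BI tools see one SQL source.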
Use Case 2: Federating across 
Geographically Distributed 
Hadoop Clusters 
Problem: 
Geographically distributed Hadoop clusters contain sensitive data, like patient records or customer identification, that cannot be accessed by other regions due to regulatory policy. IT needs access to all data, but users can only access the data in their region. 
Solution: 
Leverage JBoss Data Virtualization to provide row-level security and column masking while federating across Hadoop clusters. 
[Figure: JBoss Data Virtualization (Consume, Compose, Connect) in front of two Hive endpoints: a Hadoop cluster in one geographic region and a Hadoop cluster in a second geographic region. Data can be accessed by multiple tools and methods already in-house.]
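The policy in Use Case 2 can be sketched in plain Python. This is an illustration under stated assumptions, not how JBoss Data Virtualization implements it: the `User` type, the `CLUSTERS` data, the `mask` helper and the choice to mask `patient_id` for everyone except IT are all hypothetical. It shows row-level security (a user only sees rows from their own region) and column masking applied while reading across both clusters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    region: str
    is_it: bool = False  # assumption: IT may read every region, unmasked


# Hypothetical contents of two regional Hadoop clusters.
CLUSTERS = {
    "us": [{"patient_id": "U-1001", "region": "us", "diagnosis": "flu"}],
    "eu": [{"patient_id": "E-2001", "region": "eu", "diagnosis": "asthma"}],
}

MASKED_COLUMNS = {"patient_id"}  # assumption: identifiers are the masked columns


def mask(value):
    """Hide everything except the last two characters."""
    return "*" * (len(value) - 2) + value[-2:]


def query_patients(user):
    """Federate across all clusters, then apply row and column policies."""
    visible = []
    for rows in CLUSTERS.values():
        for row in rows:
            # Row-level security: non-IT users only see their own region.
            if not (user.is_it or row["region"] == user.region):
                continue
            out = dict(row)
            # Column masking: non-IT users never see the raw identifier.
            if not user.is_it:
                for col in MASKED_COLUMNS:
                    out[col] = mask(out[col])
            visible.append(out)
    return visible


eu_rows = query_patients(User("anna", "eu"))
it_rows = query_patients(User("ops", "us", is_it=True))
print(eu_rows)
print(len(it_rows))
```

Keeping the policy in the federation layer means each regional cluster can stay unchanged while access rules are defined once.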
Data for entire organization in Hadoop Data Lake 
Problem: How does IT control access and give business users just the 
data they need? 
- Does every line of business have access to everyone’s data? 
- How do business users get access to the data they need in a 
simple (even self-service) way? 
[Figure: Hadoop Data Lake containing HR Employee Files, Server Logs, Marketing Clickstream Data, Finance Expense Reports, Sales Transactions, Customer Accounts and Twitter Sentiment Data]
Secure, Self-Service Virtual Data Marts for Hadoop 
Solution: Use JBoss Data Virtualization to create virtual data marts 
on top of a Hadoop cluster 
- Lines of Business get access to the data they need in a simple manner 
- IT maintains the process and control it needs 
- All data remains in the data lake, nothing is copied or moved 
[Figure: virtual data marts (labels: Marketing, Finance, IT, Sales) on top of the Hadoop Data Lake, which holds Marketing Clickstream Data, HR Employee Files, Sales Transactions, Finance Expense Reports, Customer Accounts, Twitter Sentiment Data and Server Logs]
Optional hierarchical data architectures with virtual data mart 
Can be combined with security features like user role access and row and 
column masking 
[Figure: hierarchy of virtual databases; labels: Dept Base Virtual Database (VDB), Team 1 VDB, Team 2 VDB, View1, View2]
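One way to picture the hierarchy is as layered SQL views. The sketch below uses SQLite views as a stand-in for virtual databases; the `sales` table, the view names and the filter rules are hypothetical and are not taken from the demo. Team views are defined on the department base view rather than on the raw table, so any restriction applied at the base layer is inherited by every team.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (dept TEXT, team TEXT, region TEXT, amount REAL);
INSERT INTO sales VALUES
  ('retail',    'team1', 'east', 100.0),
  ('retail',    'team2', 'west', 250.0),
  ('wholesale', 'team9', 'east', 999.0);

-- Department base layer: exposes only the department's own rows.
CREATE VIEW dept_base AS
  SELECT team, region, amount FROM sales WHERE dept = 'retail';

-- Team layers are built on the base layer, not on the raw table.
CREATE VIEW team1_view AS
  SELECT region, amount FROM dept_base WHERE team = 'team1';
CREATE VIEW team2_view AS
  SELECT region, amount FROM dept_base WHERE team = 'team2';
""")

team1_rows = conn.execute("SELECT * FROM team1_view").fetchall()
team2_rows = conn.execute("SELECT * FROM team2_view").fetchall()
print(team1_rows)  # [('east', 100.0)]
print(team2_rows)  # [('west', 250.0)]
```

No data is copied at any layer; each view is only a stored query over the layer beneath it.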
Want most recent data in an operational data store 
Problem: All the legacy and archived data is in the Hadoop data lake. We want to access the most recent, up-to-the-minute operational data often and quickly. 
[Figure: Hadoop Data Lake holding Historical Data: Marketing Clickstream Data, Finance Expense Reports, HR Employee Files, Server Logs, Sales Transactions, Customer Accounts, Twitter Sentiment Data]
Caching For Faster Performance – Materialized View 
[Figure: Query 1 and Query 2 against a Virtual Database (VDB), both served from the cached or materialized View 1]
• Same cached view for multiple queries 
• Refreshed automatically or manually 
• Cache repository can be any supported data source
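The three bullets can be sketched as a small cache wrapper. This is an illustrative toy, not the product's caching mechanism: the `MaterializedView` class, its `ttl_seconds` parameter and the `slow_source` loader are hypothetical names. It shows one cached result serving several queries, an automatic refresh once the cached copy is older than the time-to-live, and a manual `refresh()`.

```python
import time


class MaterializedView:
    """Cache the rows of an expensive query and reuse them across queries."""

    def __init__(self, load, ttl_seconds=None, clock=time.monotonic):
        self._load = load          # callable that runs the expensive query
        self._ttl = ttl_seconds    # None means refresh only on demand
        self._clock = clock
        self._rows = None
        self._loaded_at = None
        self.loads = 0             # how many times the source was hit

    def refresh(self):
        """Manual refresh: reload from the source immediately."""
        self._rows = list(self._load())
        self._loaded_at = self._clock()
        self.loads += 1

    def rows(self):
        """Return cached rows, reloading automatically if they are stale."""
        expired = (
            self._ttl is not None
            and self._loaded_at is not None
            and self._clock() - self._loaded_at >= self._ttl
        )
        if self._rows is None or expired:
            self.refresh()
        return self._rows


def slow_source():
    # Hypothetical stand-in for a long-running query against the data lake.
    return [("east", 100), ("west", 250)]


view1 = MaterializedView(slow_source, ttl_seconds=3600)
query1 = sum(amount for _, amount in view1.rows())   # first query loads the view
query2 = [region for region, _ in view1.rows()]      # second query reuses it
print(query1, query2, view1.loads)  # 350 ['east', 'west'] 1
```

Here the cache lives in process memory; the slide notes that the cache repository can be any supported data source.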
Want most recent data in an operational data store 
Solution: Use JBoss Data Virtualization to integrate up-to-the-minute data from multiple diverse data sources that can be quickly queried. 
- Use HDP for all data older than today. 
- Use JDV to materialize the data in HDP for faster access and to combine it with the operational VDB. 
[Figure: a Materialized View of the Historical Data in the Hadoop Data Lake, combined with an Operational VDB with up-to-the-minute data. The data lake (Marketing Clickstream Data, HR Employee Files, Finance Expense Reports, Server Logs, Sales Transactions, Customer Accounts, Twitter Sentiment Data) is loaded by a Nightly Transfer from Data Sources.]
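The split between historical and operational data can be sketched as a simple routing rule. The function name, the row layout `(day, amount)` and the sample dates are hypothetical; the rule itself (everything older than today from the materialized HDP data, today's rows from the operational source) follows the two bullets above.

```python
from datetime import date


def combined_sales(historical_rows, operational_rows, today):
    """Combine materialized history with today's operational rows.

    Rows are (day, amount) tuples. History supplies every day before
    `today`; the operational source supplies `today` onward, so a day
    is never counted twice even if the sources overlap.
    """
    older = [row for row in historical_rows if row[0] < today]
    current = [row for row in operational_rows if row[0] >= today]
    return older + current


today = date(2014, 9, 3)  # hypothetical "today"
historical = [
    (date(2014, 9, 1), 10),
    (date(2014, 9, 2), 20),
    (date(2014, 9, 3), 1),   # partial copy of today, ignored in favor of live data
]
operational = [(date(2014, 9, 3), 5)]

rows = combined_sales(historical, operational, today)
total = sum(amount for _, amount in rows)
print(total)  # 35
```

After the nightly transfer, yesterday's operational rows become part of the history and the boundary moves forward by one day.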
Demonstration 
Virtual Data Marts with Hadoop Data Lake 
Kenny Peeples
Use Case 3 - Overview 
Objective: 
– Purpose-oriented data views for functional teams over a rich variety of semi-structured and structured data 
Problem: 
– Data Lakes have large volumes of consolidated clickstream data, product and customer data that need to be constrained for multi-departmental use. 
Solution: 
– Leverage HDP to mash up Clickstream analysis data with product and customer data on HDP to answer 
– Leverage JBoss Data Virtualization to provide virtual data marts for each of Marketing and Product teams to ….. 
Use Case 3 - Architecture 
[Figure: three tiers. APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications. DATA SYSTEM: VIRTUAL DATA MART and HDP 2.1 (Governance & Integration, Security, Operations, Data Access, Data Management). SOURCES: Emerging Sources (Sensor, Sentiment, Geo, Unstructured) and Existing Sources (CRM, ERP, Clickstream, Logs).]
Use Case 3 - Resources 
• GUIDE 
How to guide: https://github.com/DataVirtualizationByExample/HortonworksUseCase3 
Tutorial: Available soon 
• VIDEOS: 
http://vimeo.com/user16928011/hwxuc3configuration 
http://vimeo.com/user16928011/hwxuc3run 
http://vimeo.com/user16928011/hwxuc3overview 
• SOURCE: 
https://github.com/DataVirtualizationByExample/HortonworksUseCase3 
Benefits of JBoss Data Virtualization with 
Hortonworks HDP 2.1 
• Creates virtual databases for controlling 
access to data in a data lake while giving 
lines of business the autonomy they seek 
• Combines new data in Hadoop with data in 
traditional data sources without moving or 
copying data 
• Gives access to a variety of BI and analytics 
tools 
• Provides caching for faster access to data 
• Provides consistent security policy across 
multiple data sources 
Thank you! 
Hortonworks and Red Hat JBoss Data Virtualization
Next Steps... 
More about Red Hat & Hortonworks 
http://hortonworks.com/partner/redhat 
Download the Hortonworks Sandbox 
Learn Hadoop 
Build Your Analytic App 
Try Hadoop 2 
Contact us: events@hortonworks.com 
Don’t Forget to Register for our Next Webinar! 
September 17th, 10 AM PST 
Red Hat JBoss Data Virtualization and Hortonworks Data Platform 
http://info.hortonworks.com/RedHatSeries_Hortonworks.html

More Related Content

What's hot

Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramHortonworks
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextHortonworks
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHortonworks
 
Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchHortonworks
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataHortonworks
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...Hortonworks
 
Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'Hortonworks
 
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache HiveDiscover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache HiveHortonworks
 
Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationHortonworks
 
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDriving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDataWorks Summit
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez Hortonworks
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopHortonworks
 
Bigger Data For Your Budget
Bigger Data For Your BudgetBigger Data For Your Budget
Bigger Data For Your BudgetHortonworks
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataHortonworks
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Hortonworks
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHortonworks
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsHortonworks
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceHortonworks
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks
 

What's hot (20)

Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready Program
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
 
Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop Search
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big Data
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'
 
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache HiveDiscover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
 
Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop Implementation
 
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDriving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - Webinar
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
 
Bigger Data For Your Budget
Bigger Data For Your BudgetBigger Data For Your Budget
Bigger Data For Your Budget
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar Slides
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data Governance
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?
 

Similar to Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3

Realtime analytics + hadoop 2.0
Realtime analytics + hadoop 2.0Realtime analytics + hadoop 2.0
Realtime analytics + hadoop 2.0Rommel Garcia
 
Realtime Analytics in Hadoop
Realtime Analytics in HadoopRealtime Analytics in Hadoop
Realtime Analytics in HadoopRommel Garcia
 
Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2Hortonworks
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Innovative Management Services
 
Cloud Austin Meetup - Hadoop like a champion
Cloud Austin Meetup - Hadoop like a championCloud Austin Meetup - Hadoop like a champion
Cloud Austin Meetup - Hadoop like a championAmeet Paranjape
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to HadoopPOSSCON
 
Discover hdp 2.2 hdfs - final
Discover hdp 2.2   hdfs - finalDiscover hdp 2.2   hdfs - final
Discover hdp 2.2 hdfs - finalHortonworks
 
Hadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop SummitHadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop SummitDataWorks Summit
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataWANdisco Plc
 
YARN - Strata 2014
YARN - Strata 2014YARN - Strata 2014
YARN - Strata 2014Hortonworks
 
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUGReal-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUGskumpf
 
How YARN Enables Multiple Data Processing Engines in Hadoop
How YARN Enables Multiple Data Processing Engines in HadoopHow YARN Enables Multiple Data Processing Engines in Hadoop
How YARN Enables Multiple Data Processing Engines in HadoopPOSSCON
 
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder HortonworksThe Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder HortonworksData Con LA
 
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in HadoopDiscover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in HadoopHortonworks
 
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...Hortonworks
 
Spark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark Summit
 
Spark Summit EMEA - Arun Murthy's Keynote
Spark Summit EMEA - Arun Murthy's KeynoteSpark Summit EMEA - Arun Murthy's Keynote
Spark Summit EMEA - Arun Murthy's KeynoteHortonworks
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3Hortonworks
 
Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks
 

Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3

  • 1. We’ll get started soon… Q&A box is available for your questions Webinar will be recorded for future viewing Thank you for joining! Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
  • 2. Deliver the Data Lake (demo/deep dive) …using HDP and Red Hat JBoss Data Virtualization. We do Hadoop.
  • 3. Your speakers… Raghu Thiagarajan, Dir, Partner Product Management, Hortonworks; Kimberly Palko, Principal Product Manager, Red Hat; Kenny Peeples, Principal Technical Marketing Manager, Red Hat
  • 4. An architectural shift towards an HDP Data Lake: Unlocking the Data Lake. [Diagram: SCALE and SCOPE axes positioning RDBMS, MPP, EDW and Data Lake] Data Lake Enabled by YARN • Single data repository, shared infrastructure • Multiple biz apps accessing all the data • Enable a shift from reactive to proactive interactions • Gain new insight across the entire enterprise. New Analytic Apps or IT Optimization. HDP 2.1: Governance & Integration, Security, Operations, Data Access, YARN, Data Management
  • 5. What is a Data Lake? Architectural Pattern in the Data Center. Uses Hadoop to deliver deeper insight across a large, broad, diverse set of data efficiently • Multipurpose, Open PLATFORM for Data (NOT a database) • Land all data in a single place and interact with it in many ways • Allows for the ecosystem to provide higher-level services (SAS, SAP, Microsoft for Streaming, MPP, In-memory, etc.) • First-class data management capabilities (metadata management, security, transformation pipelines, replication, retention, etc.)
  • 6. HDP Data Lake Solution Architecture. [Diagram] SOURCE DATA: Click Stream, Sales Transactions, Product Data, Marketing/Inventory, Social Data, EDW. Step 1: Extract & Load (Ingestion: SQOOP, FLUME, Web HDFS, NFS, REST, HTTP, Streaming, STORM, JMS). Step 2: Model/Apply Metadata (HCATALOG: table & user-defined metadata). Step 3: Transform, Aggregate & Materialize (data processing: HIVE, PIG, Cascading, MR2, TEZ, Mahout, Graph, SAS). Step 4: Schedule and Orchestrate (FALCON: data pipeline & flow management). Manage Steps 1-4: Data Management with Falcon, Security with HDP Advanced Security; Apache Argus (Unified Access Controls and Audit). Data Lake HDP Grid: YARN and AMBARI over compute & storage nodes. Use Case Type 1: Materialize & Exchange (Exchange via HBase Client, Sqoop/Hive to Downstream Data Sources: OLTP, HBase, EDW (Teradata)). Use Case Type 2: Explore/Visualize (Interactive Hive Server (Tez/Stinger); Query/Analytics/Reporting Tools: Tableau, Excel, Microstrategy, Datameer, Platfora, Business Objects). YARN Apps (SolR, Storm: Stream Processing, Real-time Search, MPI, etc.) Opens up Many New Use Cases
  • 7. HDP Data Lake Solution Architecture + Virtual Data Mart. [Diagram: the HDP Data Lake Solution Architecture from slide 6, extended with a Dept Base Virtual Database (VDB) that feeds a Team 1 VDB and a Team 2 VDB, exposed as View1 and View2]
  • 8. YARN allows for new processing engines. [Diagram: the HDP Data Lake Solution Architecture from slide 6]
  • 9. Falcon enables Governance of Data Pipelines. [Diagram: the HDP Data Lake Solution Architecture from slide 6]
  • 10. Apache Falcon: Data Governance in the Lake. Data pipeline (Raw, Clean, Prep) defined in Falcon. Falcon auto-generates and orchestrates multiple complex Oozie workflows (Job1, Job2, … JobN) and other Hadoop ecosystem tools, e.g. DistCp. Falcon adds the required data governance features: DEFINITION (Replication | Retention | Eviction | Late data), MONITORING, TRACING (Audit | Lineage | Tagging)
  • 11. Mashing up diverse data types in the Data Lake
  • 12. Mashing up diverse data types in the Data Lake
  • 13. Mashing up diverse data types in the Data Lake
  • 14. Mashing up diverse data types in the Data Lake
  • 15. Mashing up diverse data types in the Data Lake
  • 16. Mashing up diverse data types in the Data Lake
  • 17. Virtual Data Marts with Red Hat JBoss Data Virtualization and Hortonworks HDP. Kimberly Palko
  • 18. Data Supply and Integration Solution. Data Virtualization sits in front of multiple data sources and ✓ allows them to be treated as a single source ✓ delivering the desired data ✓ in the required form ✓ at the right time ✓ to any application and/or user. THINK VIRTUAL MACHINE FOR DATA
  • 19. Easy Access to Big Data • The reporting tool accesses the data virtualization server via a rich SQL dialect • The data virtualization server translates the rich SQL dialect to HiveQL • Hive translates HiveQL to MapReduce • MapReduce runs the MR job on big data. Diagram: Analytical Reporting Tool, Data Virtualization Server, Hive, MapReduce, HDFS (Hadoop, Big Data)
  • 20. Use Case 1: Combine data from Hadoop with traditional data sources. Problem: Data from new data sources like social media, clickstream and sensors needs to be combined with data from traditional sources to get the full value. Solution: Leverage JBoss Data Virtualization to mash up new data in Hadoop with data in traditional data sources, without moving or copying any data, and access it through a variety of BI tools and SOA technologies. Diagram: JBoss Data Virtualization (Consume, Compose, Connect); data can be accessed by multiple tools and methods already in-house. SOURCE 1: Hive/Hadoop contains data from new data sources like social media, clickstream and sensor data. SOURCE 2: Traditional relational databases in the enterprise.
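The mashup in Use Case 1 can be sketched as a single join across two separate stores. This is only an illustration of the idea, not JBoss Data Virtualization itself: two in-memory SQLite databases stand in for Hive and the relational source, and the table and column names (`clicks`, `customers`, `customer_id`) are hypothetical.

```python
import sqlite3


def build_federated_connection():
    """Open one connection that can see two independent databases."""
    conn = sqlite3.connect(":memory:")  # "main" stands in for Hive/Hadoop
    # A second, separate in-memory database stands in for the relational source.
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("CREATE TABLE main.clicks (customer_id INTEGER, page TEXT)")
    conn.execute("CREATE TABLE crm.customers (customer_id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO main.clicks VALUES (?, ?)",
        [(1, "/home"), (1, "/pricing"), (2, "/home")],
    )
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    return conn


def clicks_per_customer(conn):
    """Join clickstream rows with customer records without copying either table."""
    return conn.execute(
        """
        SELECT c.name, COUNT(*) AS clicks
        FROM crm.customers AS c
        JOIN main.clicks AS k ON k.customer_id = c.customer_id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
```

The consumer issues one query and never needs to know that the two tables live in different stores, which is the property the slide attributes to the virtualization layer.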
  • 21. Use Case 2: Federating across Geographically Distributed Hadoop Clusters. Problem: Geographically distributed Hadoop clusters contain sensitive data, like patient records or customer identification, that cannot be accessed by other regions due to regulatory policy. IT needs access to all data, but users can only access the data in their region. Solution: Leverage JBoss Data Virtualization to provide row-level security and masking of columns while federating across Hadoop clusters. Diagram: JBoss Data Virtualization (Consume, Compose, Connect); data can be accessed by multiple tools and methods already in-house. Hive: Hadoop cluster in one geographic region. Hive: Hadoop cluster in a second geographic region.
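Row-level security and column masking over federated clusters can be illustrated with a small filter. This is a sketch under stated assumptions, not the product's policy mechanism: the `User` type, the `patient_id` column, and the rule that non-IT users see masked identifiers are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    region: str
    is_it: bool = False  # assumption: IT staff may see every region unmasked


def mask(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with '*'."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def federated_patients(user, clusters):
    """Return the rows `user` may see across all regional clusters.

    `clusters` maps a region name to a list of row dicts.
    """
    result = []
    for region, rows in clusters.items():
        if not user.is_it and region != user.region:
            continue  # row-level security: skip other regions
        for row in rows:
            visible_row = dict(row, region=region)
            if not user.is_it:
                # column masking: hide most of the identifier
                visible_row["patient_id"] = mask(row["patient_id"])
            result.append(visible_row)
    return result
```

A regional user querying the federated view gets only local rows with masked identifiers, while an IT user gets every row from every cluster.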
  • 22. Data for entire organization in Hadoop Data Lake. Problem: How does IT control access and give business users just the data they need? - Does every line of business have access to everyone’s data? - How do business users get access to the data they need in a simple (even self-service) way? Diagram: Hadoop Data Lake containing HR Employee Files, Server Logs, Marketing Clickstream Data, Finance Expense Reports, Sales Transactions, Customer Accounts, Twitter Sentiment Data.
  • 23. Secure, Self-Service Virtual Data Marts for Hadoop. Solution: Use JBoss Data Virtualization to create virtual data marts on top of a Hadoop cluster - Lines of business get access to the data they need in a simple manner - IT maintains the process and control it needs - All data remains in the data lake; nothing is copied or moved. Diagram: Marketing, Finance and IT virtual data marts on top of the Hadoop Data Lake (Marketing Clickstream Data, HR Employee Files, Sales Transactions, Finance Expense Reports, Customer Accounts, Twitter Sentiment Data, Server Logs).
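A virtual data mart behaves much like a per-department view over shared tables. The sketch below uses SQLite views purely as a stand-in for virtual databases; the mart names, table names and columns are hypothetical, and real access control would be enforced by the virtualization server rather than by a dictionary lookup.

```python
import sqlite3

# Hypothetical mapping: each line of business gets one view over the lake.
MART_VIEWS = {
    "marketing": "SELECT page, visits FROM clickstream",
    "finance": "SELECT employee, amount FROM expense_reports",
}


def build_lake():
    """Create the shared tables once, then define one view per mart."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clickstream (page TEXT, visits INTEGER)")
    conn.execute("CREATE TABLE expense_reports (employee TEXT, amount REAL)")
    conn.execute("INSERT INTO clickstream VALUES ('/home', 120)")
    conn.execute("INSERT INTO expense_reports VALUES ('lee', 42.5)")
    for mart, select in MART_VIEWS.items():
        # A view stores only the query, so no data is copied or moved.
        conn.execute(f"CREATE VIEW {mart}_mart AS {select}")
    return conn


def query_mart(conn, line_of_business):
    """Return the rows of the mart belonging to `line_of_business`."""
    if line_of_business not in MART_VIEWS:
        raise PermissionError(f"no data mart defined for {line_of_business!r}")
    return conn.execute(f"SELECT * FROM {line_of_business}_mart").fetchall()
```

Each department queries only its own mart, while IT keeps a single copy of the data and a single place where the mart definitions live.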
  • 24. Optional hierarchical data architectures with virtual data marts. Can be combined with security features like user role access and row and column masking. Diagram: Dept Base Virtual Database (VDB), Team 1 VDB, Team 2 VDB, View1, View2.
  • 25. Want most recent data in an operational data store. Problem: All the legacy and archived data is in the Hadoop data lake. We want to access the most recent, up-to-the-minute operational data often and quickly. Diagram: Hadoop Data Lake (historical data) containing Marketing Clickstream Data, Finance Expense Reports, HR Employee Files, Server Logs, Sales Transactions, Customer Accounts, Twitter Sentiment Data.
  • 26. Caching For Faster Performance – Materialized View • Same cached view for multiple queries • Refreshed automatically or manually • Cache repository can be any supported data source. Diagram: Query 1 and Query 2 against a Virtual Database (VDB); View 1 and Cached or Materialized View 1.
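The caching behaviour listed on this slide (one cached result shared by several queries, refreshed automatically or manually) can be sketched as a small class. This is an illustrative model only: the `MaterializedView` class, its time-to-live policy and the injectable `clock` are assumptions made for the example, not the product's caching API.

```python
import time


class MaterializedView:
    """Cache the rows of an expensive query and reuse them across callers."""

    def __init__(self, load, ttl_seconds, clock=time.monotonic):
        self._load = load          # callable that runs the expensive query
        self._ttl = ttl_seconds    # automatic refresh interval
        self._clock = clock        # injectable so the policy can be tested
        self._rows = None
        self._loaded_at = None

    def refresh(self):
        """Manual refresh: reload the rows immediately."""
        self._rows = self._load()
        self._loaded_at = self._clock()

    def rows(self):
        """Return cached rows, reloading first if the cache is empty or stale."""
        stale = (
            self._loaded_at is None
            or self._clock() - self._loaded_at >= self._ttl
        )
        if stale:
            self.refresh()
        return self._rows
```

Two queries that call `rows()` within the time-to-live hit the source only once, which is the performance benefit the slide describes.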
  • 27. Want most recent data in an operational data store. Solution: Use JBoss Data Virtualization to integrate up-to-the-minute data from multiple diverse data sources that can be quickly queried. - Use HDP for all data older than today. - Use JDV to materialize the data in HDP for faster access and to combine it with the operational VDB. Diagram: Materialized View plus Operational VDB give historical data with up-to-the-minute data; Hadoop Data Lake (Marketing Clickstream Data, HR Employee Files, Finance Expense Reports, Server Logs, Sales Transactions, Customer Accounts, Twitter Sentiment Data); Nightly Transfer from Data Sources.
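The split described here (HDP for everything older than today, an operational source for today's data) amounts to a union of two row sets with a date cutoff. The function below is a minimal sketch of that idea; the `day` field and the in-memory row lists are hypothetical stand-ins for the materialized view and the operational VDB.

```python
from datetime import date


def combined_view(historical_rows, operational_rows, today):
    """Merge historical rows (before `today`) with operational rows (from `today`).

    Filtering both sides on the same cutoff avoids double-counting rows
    that a nightly transfer may already have copied into the data lake.
    """
    older = [row for row in historical_rows if row["day"] < today]
    current = [row for row in operational_rows if row["day"] >= today]
    return sorted(older + current, key=lambda row: row["day"])
```

For example, with `today = date(2014, 9, 3)`, a historical row dated September 2 and an operational row dated September 3 are both returned, while a stale operational row dated September 2 is dropped in favour of the historical copy.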
  • 28. Demonstration: Virtual Data Marts with Hadoop Data Lake (Kenny Peeples)
  • 29. Use Case 3 - Overview. Objective: - Purpose-oriented data views for functional teams over a rich variety of semi-structured and structured data. Problem: - Data lakes have large volumes of consolidated clickstream, product and customer data that need to be constrained for multi-departmental use. Solution: - Leverage HDP to mash up clickstream analysis data with product and customer data on HDP to answer - Leverage JBoss Data Virt to provide virtual data marts for each of the Marketing and Product teams to …..
  • 30. Use Case 3 - Architecture. APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications. VIRTUAL DATA MART. DATA SYSTEM: HDP 2.1 (Governance & Integration, Security, Operations, Data Access, Data Management). SOURCES: Emerging Sources (Sensor, Sentiment, Geo, Unstructured), Existing Sources (CRM, ERP, Clickstream, Logs).
  • 31. Use Case 3 - Resources • GUIDE: How-to guide: https://github.com/DataVirtualizationByExample/HortonworksUseCase3 Tutorial: Available soon • VIDEOS: http://vimeo.com/user16928011/hwxuc3configuration http://vimeo.com/user16928011/hwxuc3run http://vimeo.com/user16928011/hwxuc3overview • SOURCE: https://github.com/DataVirtualizationByExample/HortonworksUseCase3
  • 32. Benefits of JBoss Data Virtualization with Hortonworks HDP 2.1 • Creates virtual databases for controlling access to data in a data lake while giving lines of business the autonomy they seek • Combines new data in Hadoop with data in traditional data sources without moving or copying data • Gives access to a variety of BI and analytics tools • Provides caching for faster access to data • Provides consistent security policy across multiple data sources
  • 33. Thank you! Hortonworks and Red Hat JBoss Data Virtualization
  • 34. Next Steps... More about Red Hat & Hortonworks: http://hortonworks.com/partner/redhat • Download the Hortonworks Sandbox • Learn Hadoop • Build Your Analytic App • Try Hadoop 2 • Contact us: events@hortonworks.com
  • 35. Don’t Forget to Register for our Next Webinar! September 17th, 10 AM PST: Red Hat JBoss Data Virtualization and Hortonworks Data Platform. http://info.hortonworks.com/RedHatSeries_Hortonworks.html