Cloudbreak - Technical Deep Dive
DataWorks Summit/Hadoop Summit
1.
Cloudbreak – Technical Deep Dive
Janos Matyas & Krisztian Horvath
Hortonworks
2.
© Hortonworks Inc. 2011 – 2016. All Rights Reserved
Presenters
– Krisztian Horvath: Senior Member of Technical Staff, Cloudbreak; former Co-Founder at SequenceIQ
– Janos Matyas: Senior Director of Engineering, Cloudbreak; former Co-Founder and CTO of SequenceIQ
3.
Agenda
– Goals and Motivations
– Technology Stack + Deep Dive
– Lessons Learned + Best Practices
– Demo + Q & A
4.
Goals and Motivations – What We Wanted to Do…
– Declarative/full Hadoop stack provisioning in all major cloud providers
– Automate and unify the process
– Zero-configuration approach
– Same process through a cluster lifecycle (Dev, QA, UAT, Prod)
– Provide tooling - UI, REST API and CLI/shell
– Secure and multi-tenant
– SLA policy based autoscaling
5.
Goals and Motivations – What We Wanted to Do…
– All cloud providers are fundamentally different…
– Compute, network, security, performance
– We want to share what we found, and how we made it work!
6.
Agenda
– Goals and Motivations
– Technology Stack + Deep Dive
– Lessons Learned + Best Practices
– Demo + Q & A
7.
Technology Stack
– Apache Ambari
– Cloud provider API
– Salt
– Docker
– Packer
8.
Deep Dive - Overview
Cloudbreak Deployer (CBD)
– Tool to deploy the Cloudbreak application
– Microservice architecture (using Docker)
– DevOps friendly
Cloudbreak Application
– Extensible, available through UI, CLI, REST API
– SLA auto-scaling policy management
Cluster deployed with Cloudbreak
9.
Deep Dive – Cloudbreak Deployer
Installation
– Single binary, written in Go
– Requires Docker 1.9.1+
– DIY installation on any RHEL / CentOS / Oracle Linux 7 (64-bit) distro
– Use one of the pre-built cloud images (AWS, Azure, GCP, OpenStack)
Operations
– Easy upgrades/downgrades, automatic schema migration
Cloud provider support
– AWS – generates IAM roles
– Azure – ARM and DASH config
Utilities
– Cloudbreak shell support - interactive, remote, automated execution, OAuth2 token generation
– Local development environment setup
10.
Deep Dive – Cloudbreak Application
Installation
– Done with Cloudbreak Deployer (CBD)
Operations
– Consistent feature set through UI, CLI and secure REST API
– Multi-tenant, ACL setup, usage reports
– Custom stack repositories, failure actions
– Event history, cluster management
– SLA based auto-scaling policy configs, enforcement
Cloud provider support
– Agnostic API – AWS, Azure, GCP, OpenStack, Mesos
– SPI interface – bring your own provider, stack under Cloudbreak management
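The SLA-based auto-scaling policies listed on this slide can be pictured roughly as a threshold rule with node-count bounds. The following is a minimal sketch under that assumption, not Cloudbreak's actual policy model: `ScalingPolicy`, its fields, the metric name, and `desired_node_count` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy shape; the field names are illustrative only."""
    metric: str        # assumed metric name, e.g. "pending_containers"
    threshold: float   # the policy fires when the metric exceeds this value
    adjustment: int    # nodes to add (positive) or remove (negative)
    min_nodes: int     # lower bound on cluster size
    max_nodes: int     # upper bound on cluster size


def desired_node_count(policy: ScalingPolicy, current_nodes: int, metric_value: float) -> int:
    """Apply the policy once and clamp the result to [min_nodes, max_nodes]."""
    if metric_value <= policy.threshold:
        return current_nodes  # threshold not exceeded: leave the cluster as it is
    target = current_nodes + policy.adjustment
    return max(policy.min_nodes, min(policy.max_nodes, target))
```

Keeping the rule a pure function of the current size and the observed metric makes it easy to test apart from any cloud provider call.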
11.
Deep Dive – Cluster deployed with Cloudbreak
Installation
– Managed by Cloudbreak using cloud provider API
– Default (optimized) configs – specific to cloud provider
Operations
– Default, custom configs for stacks, services, network, storage, security
– Declarative Hadoop cluster
– Custom instance types (heterogeneous clusters)
– Different storage types
– Configurable network
– Security (access, Kerberos, SSSD, FreeIPA)
Utilities
– Ambari Views
– Metadata/shared clusters support
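A declarative cluster description in the Ambari world usually takes the form of a blueprint: a JSON document naming the stack and listing host groups with their components. The sketch below builds a blueprint-shaped structure in Python. The helper `build_blueprint`, the cluster name, the stack version, and the component selection are assumptions for illustration; the result is deliberately trimmed and would need more components before it could be deployed.

```python
import json


def build_blueprint(name, stack_version, host_groups):
    """Assemble a blueprint-shaped dict from {group: (cardinality, [components])}."""
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": "HDP",
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": group,
                "cardinality": str(cardinality),
                "components": [{"name": component} for component in components],
            }
            for group, (cardinality, components) in host_groups.items()
        ],
    }


# Illustrative values only; a deployable blueprint needs more components.
blueprint = build_blueprint(
    "example-cluster",
    "2.4",
    {
        "master": (1, ["NAMENODE", "RESOURCEMANAGER"]),
        "worker": (3, ["DATANODE", "NODEMANAGER"]),
    },
)

if __name__ == "__main__":
    print(json.dumps(blueprint, indent=2))
```

Giving each host group its own instance type is what makes a heterogeneous cluster possible: the blueprint says what runs where, and the provisioning layer decides on which hardware.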
12.
Agenda
– Goals and Motivations
– Technology Stack + Deep Dive
– Lessons Learned + Best Practices
– Demo + Q & A
13.
Lessons Learned
Not all cloud providers are the same
– Difference in performance, storage and functionality
(Capacity) planning
– Based on workload type (batch / interactive and ad-hoc / long running)
– Use heterogeneous clusters
– Trial and error – mistakes are cheap, iterate until you find your best fit
– Leverage the cloud - scale your cluster on demand
Number one consideration – storage
– Multiple choices (ephemeral, block storage and BLOB store)
– Bring compute to storage – might not work (everywhere) – in the cloud everything is offered as a service
– Independently scale storage from compute, partition your data
Security
– Consider using strict security rules (private subnets, access, etc.) and use edge nodes
14.
Lessons Learned - AWS
Compute
– Find your instance types for the workload, use heterogeneous clusters
– Different instance types for transient (e.g. C4, M4) and long running (e.g. H2, D2) clusters
– Dedicated instances (to avoid noise, or for regulations, e.g. HIPAA)
Storage
– Use latest version of Hadoop (Hortonworks contributed cloud specific optimizations)
– Note that S3 gives you only eventual consistency
– Different driver implementations: S3n (native, jets3t based), S3a (successor of S3n), S3 (block based)
Network
– Use enhanced networking (Amazon Linux by default, RHEL based – apply patch)
– Placement groups
– Not all instance types can use the 10Gbit network (e.g. use 8x)
Security
– Use instance roles to access S3, deploy in a private subnet/VPC
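The three S3 drivers listed under Storage are selected by the URI scheme, so moving from S3n to its successor S3a is largely a matter of changing `s3n://` to `s3a://` in job paths. The helper below is a minimal sketch of that rewrite; `to_s3a` is a hypothetical name, and the bucket and paths in any usage are made up. It leaves `s3://` untouched on the assumption that the block-based driver stores data in its own layout and is not interchangeable with the other two.

```python
def to_s3a(uri: str) -> str:
    """Rewrite an s3n:// URI to s3a://; return every other URI unchanged."""
    scheme, separator, rest = uri.partition("://")
    if separator and scheme == "s3n":
        return "s3a://" + rest
    # Block-based s3:// and non-S3 schemes are left alone on purpose.
    return uri
```

For example, `to_s3a("s3n://bucket/data/part-0")` yields `s3a://bucket/data/part-0`.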
15.
Lessons Learned - AWS
* d2.8xlarge used as instance type
16.
Lessons Learned - AWS
* d2.8xlarge used as instance type
17.
Lessons Learned - Azure
Compute
– Find your instance types for the workload, use heterogeneous clusters
– Different instance types for transient (e.g. A and D family) and long running (e.g. Dv2) clusters
– Use ARM instead of old API
Storage
– Use latest version of Hadoop (Hortonworks contributed cloud specific optimizations)
– Storage account scaling limitations
– Use WASB or WASB with DASH (default with Cloudbreak)
– Azure Data Lake Store – soon
– Ephemeral disk is faster than root disk – does not survive auto-updates
Network
– No PTR record/reverse lookup support
Security
– Integrate/sync with your corporate AD
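With plain WASB, data is addressed as `wasb://<container>@<account>.blob.core.windows.net/<path>`. The sketch below builds such a URI. `wasb_uri` is a hypothetical helper, the container and account names in any usage are placeholders, and the DASH setup mentioned on the slide is not covered here.

```python
def wasb_uri(container: str, account: str, path: str = "") -> str:
    """Build a WASB URI: wasb://<container>@<account>.blob.core.windows.net/<path>."""
    if not container or not account:
        raise ValueError("container and account must be non-empty")
    # Drop leading slashes so the path is joined with exactly one separator.
    return f"wasb://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"
```

Because the storage account is part of the URI, spreading data over several accounts (one way to live with per-account scaling limits) shows up directly in the paths that jobs read and write.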
18.
Lessons Learned - Azure
19.
Lessons Learned - GCP
Compute
– Find your instance types for the workload, use heterogeneous clusters
– No template based provisioning
Storage
– Use latest version of Hadoop (Hortonworks contributed cloud specific optimizations)
– Use Google Cloud Storage Connector
Network
– Network isolation/DNS problem
Security
20.
Lessons Learned - OpenStack
Compute
– Find your instance types for the workload, use heterogeneous clusters
– Use Heat templates instead of API calls (we support both)
Storage
– Currently we support only Cinder volumes – Swift and Ceph are planned
– Data locality through Cloudbreak – let us know your topology or rack/hypervisor mapping
Network
– Configure DNS properly
– Use multiple network (Neutron) nodes in case of a large cluster
Security
– Use Keystone v3 (support for OAuth, Federation, introduction of groups/domains)
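A rack/hypervisor mapping like the one mentioned under Storage is typically handed to Hadoop through a topology script, configured with the `net.topology.script.file.name` property: Hadoop passes host names or IPs as arguments and reads one rack path per argument from standard output. The script below is a minimal sketch of that mechanism, not a description of how Cloudbreak implements it. The addresses and hypervisor names in `HYPERVISOR_BY_HOST` are invented for illustration.

```python
import sys

# Hypothetical mapping from VM address to the hypervisor that hosts it.
HYPERVISOR_BY_HOST = {
    "10.0.0.11": "hv-a",
    "10.0.0.12": "hv-a",
    "10.0.0.21": "hv-b",
}

DEFAULT_RACK = "/default-rack"


def rack_for(host: str) -> str:
    """Treat each hypervisor as a rack so replicas spread across physical machines."""
    hypervisor = HYPERVISOR_BY_HOST.get(host)
    return f"/{hypervisor}" if hypervisor else DEFAULT_RACK


def main(hosts) -> None:
    """Print one rack path per host argument, in the order received."""
    for host in hosts:
        print(rack_for(host))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Mapping hypervisors to racks matters in a virtualized environment because two VMs on the same physical host fail together, which placing replicas by "rack" is meant to avoid.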
21.
Lessons Learned - Mesos
In Tech Preview
– Come and talk to us after the talk
– Or at the Hortonworks booth
22.
Agenda
– Goals and Motivations
– Technology Stack + Deep Dive
– Lessons Learned + Best Practices
– Demo + Q & A
23.
Thank You