Cloudbreak - Technical Deep Dive

Janos Matyas & Krisztian Horvath
Hortonworks

© Hortonworks Inc. 2011 – 2016. All Rights Reserved
Presenters
Krisztian Horvath
Senior Member of Technical Staff, Cloudbreak
Former Co-Founder at SequenceIQ
Janos Matyas
Senior Director of Engineering, Cloudbreak
Former Co-Founder and CTO for SequenceIQ

Agenda
Goals and Motivations
Technology Stack + Deep Dive
Lessons Learned + Best Practices
Demo + Q & A

Goals and Motivations – What We Wanted to Do…
 Declarative/full Hadoop stack provisioning in all major cloud providers
 Automate and unify the process
 Zero-configuration approach
 Same process through a cluster lifecycle (Dev, QA, UAT, Prod)
 Provide tooling - UI, REST API and CLI/shell
 Secure and multi-tenant
 SLA policy based autoscaling

Goals and Motivations – What We Wanted to Do…
 All cloud providers are fundamentally different…
 Compute, network, security, performance
 We want to share what we found, and how we made it work!

Technology Stack
 Apache Ambari
 Cloud provider API
 Salt
 Docker
 Packer

Deep Dive - Overview
 Cloudbreak Deployer (CBD)
– Tool to deploy the Cloudbreak application
– Microservice architecture (using Docker)
– DevOps friendly
 Cloudbreak Application
– Extensible, available through UI, CLI, REST API
– SLA auto-scaling policy management
 Cluster deployed with Cloudbreak

Deep Dive – Cloudbreak Deployer
 Installation
– Single binary, written in Go
– Requires Docker 1.9.1+
– DIY installation on any RHEL / CentOS / Oracle Linux 7 (64-bit) distro
– Use one of the pre-built cloud images (AWS, Azure, GCP, OpenStack)
 Operations
– Easy upgrades/downgrades, automatic schema migration
 Cloud provider support
– AWS – generates IAM roles
– Azure – ARM and DASH config
 Utilities
– Cloudbreak shell support - interactive, remote, automated execution, OAuth2 token generation
– Local development environment setup

Deep Dive – Cloudbreak Application
 Installation
– Done with Cloudbreak Deployer (CBD)
 Operations
– Consistent feature set through UI, CLI and secure REST API
– Multi-tenant, ACL setup, usage reports
– Custom stack repositories, failure actions
– Event history, cluster management
– SLA based auto-scaling policy configs, enforcement
 Cloud provider support
– Agnostic API
– AWS, Azure, GCP, OpenStack, Mesos
– SPI – bring your own provider, stack under Cloudbreak management
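
The SLA-based auto-scaling policies listed above can be pictured as threshold rules evaluated against cluster metrics. The sketch below is illustrative only: the `ScalingPolicy` fields, the `pending_containers` metric name and the `desired_node_count` helper are hypothetical and do not represent Cloudbreak's actual policy model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical threshold rule; not Cloudbreak's real policy schema."""
    metric: str        # assumed metric name, e.g. "pending_containers"
    threshold: float   # the rule fires when the metric exceeds this value
    adjustment: int    # nodes to add (positive) or remove (negative)
    min_nodes: int     # lower bound applied when the rule fires
    max_nodes: int     # upper bound applied when the rule fires


def desired_node_count(policy: ScalingPolicy, metrics: dict, current_nodes: int) -> int:
    """Return the node count after applying the policy, clamped to its limits."""
    value = metrics.get(policy.metric)
    if value is None or value <= policy.threshold:
        return current_nodes  # rule did not fire, keep the cluster as is
    target = current_nodes + policy.adjustment
    return max(policy.min_nodes, min(policy.max_nodes, target))


if __name__ == "__main__":
    policy = ScalingPolicy("pending_containers", 50, 3, min_nodes=3, max_nodes=10)
    print(desired_node_count(policy, {"pending_containers": 120}, current_nodes=5))
    print(desired_node_count(policy, {"pending_containers": 120}, current_nodes=9))
```

Clamping to `min_nodes` and `max_nodes` keeps a noisy metric from growing or shrinking the cluster without bound, which is the kind of guardrail a policy-driven scaler typically needs.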

Deep Dive – Cluster deployed with Cloudbreak
 Installation
– Managed by Cloudbreak using cloud provider API
– Default (optimized) configs – specific to cloud provider
 Operations
– Default, custom configs for stacks, services, network, storage, security
– Declarative Hadoop cluster
– Custom instance types (heterogeneous clusters)
– Different storage types
– Configurable network
– Security (access, Kerberos, SSSD, FreeIPA)
 Utilities
– Ambari Views
– Metadata/shared clusters support
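
A declarative cluster definition and heterogeneous host groups can be illustrated with an Ambari blueprint, the JSON document Ambari uses to describe which components run on which group of hosts. The sketch below builds such a document in Python. The overall shape follows Ambari's blueprint format, but the blueprint name, stack version and component selection are assumptions chosen for illustration and do not form a complete, deployable cluster definition.

```python
import json


def build_blueprint(name: str, stack_version: str, worker_count: int) -> dict:
    """Build a minimal blueprint-shaped dict with one master and one worker group."""
    return {
        "Blueprints": {
            "blueprint_name": name,          # illustrative name
            "stack_name": "HDP",
            "stack_version": stack_version,  # assumed version string
        },
        "host_groups": [
            {
                "name": "master",
                "cardinality": "1",
                # Deliberately incomplete component list, for illustration only
                "components": [{"name": "NAMENODE"}, {"name": "RESOURCEMANAGER"}],
            },
            {
                "name": "worker",
                "cardinality": str(worker_count),
                "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_blueprint("demo-cluster", "2.4", worker_count=3), indent=2))
```

Because each host group is described separately, a different instance type or storage layout can be assigned per group, which is what makes heterogeneous clusters straightforward to express.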

Lessons Learned
 Not all cloud providers are the same
– Difference in performance, storage and functionality
 (Capacity) planning
– Based on workload type (batch / interactive and ad-hoc / long running)
– Use heterogeneous clusters
– Trial and error – mistakes are cheap, iterate until you find your best fit
– Leverage the cloud - scale your cluster on demand
 Number one consideration – storage
– Multiple choices (ephemeral, block storage and BLOB store)
– Bring compute to storage – might not work (everywhere) – in the cloud everything is a service
– Independently scale storage from compute, partition your data
 Security
– Consider using strict security rules (private subnets, restricted access, etc.) and use edge nodes

Lessons Learned - AWS
 Compute
– Find your instance types for the workload, use heterogeneous clusters
– Different instance types for transient (e.g. C4, M4) and long running (e.g. H2, D2) clusters
– Dedicated instances (to avoid noisy neighbors, or for regulations, e.g. HIPAA)
 Storage
– Use latest version of Hadoop (Hortonworks contributed cloud specific optimizations)
– Note that S3 gives you only eventual consistency
– Different driver implementations: S3n (native, jets3t based), S3a (successor of S3n), S3 (block based)
 Network
– Use enhanced networking (enabled by default on Amazon Linux; RHEL based images – apply patch)
– Placement groups
– Not all instance types can use the 10 Gbit network (e.g. use 8xlarge sizes)
 Security
– Use instance roles to access S3, deploy in a private subnet/VPC
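
Since S3n and S3a both read ordinary S3 objects, moving to the newer driver is largely a matter of changing the URI scheme in job configurations. The helper below is a minimal sketch of that rewrite; `to_s3a` is a hypothetical name, and it deliberately leaves the block-based `s3://` scheme untouched because that driver stores data in its own block format.

```python
from urllib.parse import urlsplit


def to_s3a(uri: str) -> str:
    """Rewrite an s3n:// URI to s3a://; return any other URI unchanged."""
    parts = urlsplit(uri)
    if parts.scheme == "s3n":
        # Only the scheme changes; bucket and key stay exactly the same
        return parts._replace(scheme="s3a").geturl()
    return uri


if __name__ == "__main__":
    print(to_s3a("s3n://my-bucket/logs/2016/part-0000"))
    print(to_s3a("hdfs://namenode:8020/data"))
```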

* d2.8xlarge used as instance type

Lessons Learned - Azure
 Compute
– Different instance types for transient (e.g. A and D family) and long running (e.g. Dv2) clusters
– Use ARM instead of old API
 Storage
– Storage account scaling limitations
– Use WASB or WASB with DASH (default with Cloudbreak)
– Azure Data Lake Store – soon
– Ephemeral disk is faster than root disk, but does not survive auto-updates
 Network
– No PTR record/reverse lookup support
 Security
– Integrate/sync with your corporate AD

Lessons Learned - GCP
 Compute
– No template based provisioning
 Storage
– Use Google Cloud Storage Connector
 Network
– Network isolation/DNS problem
 Security

Lessons Learned - OpenStack
 Compute
– Use Heat templates instead of API calls (we support both)
 Storage
– Currently we support only Cinder volumes
– Swift and Ceph are planned
– Data locality through Cloudbreak – let us know your topology or rack/hypervisor mapping
 Network
– Configure DNS properly
– Use multiple network (Neutron) nodes in case of a large cluster
 Security
– Use Keystone v3 (support for OAuth, Federation, introduction of groups/domains)
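
One common way Hadoop consumes a rack or hypervisor mapping is a topology script: Hadoop invokes it with one or more host names or IP addresses as arguments and expects a rack path for each on standard output. The sketch below shows that contract. The addresses and rack names in `RACK_BY_HOST` are made up, and nothing here is taken from how Cloudbreak itself implements data locality.

```python
import sys

# Hypothetical mapping from instance address to the rack (or hypervisor) it runs on
RACK_BY_HOST = {
    "10.0.1.11": "/rack-a",
    "10.0.1.12": "/rack-a",
    "10.0.2.21": "/rack-b",
}

# Fallback rack for hosts missing from the mapping
DEFAULT_RACK = "/default-rack"


def resolve_racks(hosts, mapping=RACK_BY_HOST):
    """Return one rack path per host, in the same order as the input."""
    return [mapping.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Print one rack path per argument, separated by spaces
    print(" ".join(resolve_racks(sys.argv[1:])))
```

With a mapping like this in place, HDFS can spread block replicas across racks instead of treating every virtual machine as if it sat on the same physical host.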

Lessons Learned - Mesos
 In Tech Preview
– Come and talk to us after the talk
– Or at the Hortonworks booth

Thank You
