Scalable On-Demand Hadoop Clusters with Docker and Mesos
Andrew Nelson, Nutanix
@vmwnelson http://virtual-hiking.blogspot.com
Chris Mutchler, VMware
@chrismutchler http://virtualelephant.com
Agenda
 New Approach for Hadoop Ops
 Infrastructure Resource Considerations
 Docker as the new “Unit of Work”
 Future Work
Last Year’s State of the Art
 Self-service and multi-tenant Hadoop
 Elastic and decoupled infrastructure
 Extensible blueprinting
New Goals
 Operationalize multiple frameworks
 Decoupled service architecture
 Flexible and developer-friendly form factor
Apache Mesos Introduction
 Started at Berkeley
 Graduated to top-level Apache project in 2013
 Commercial entity is Mesosphere
 https://github.com/apache/mesos/
Mesos Architecture
Source: http://mesos.apache.org/assets/img/documentation/architecture3.jpg
Mesos as a Multi-Tenant Resource Pool
Source: https://github.com/mesos/myriad/blob/phase1/docs/how-it-works.md
Tools to Build and Scale
 Serengeti, VMware
 https://github.com/vmware-serengeti
 BOSH, Pivotal
 https://github.com/cloudfoundry/bosh
 Cloudify, GigaSpaces
 https://github.com/CloudifySource/cloudify
 Cloudbreak, SequenceIQ
 https://github.com/sequenceiq/cloudbreak
Advantages for Ops
 Mesos as a Resource Pool
 Multiple concurrent frameworks
 Decouple frameworks from resource pools
Compute Partitions on Mesos
[Diagram: shared vs. siloed compute partitions for Hadoop, Storm, Spark, Kafka, Cassandra, and Marathon]
HDFS as a Service
[Diagram: HDFS with Namenode, Standby Namenode, and Secondary Namenode, serving MapReduce, Spark, Hive, Storm, …]
Networking Services
 Service Discovery
 Handled per framework
 Port range resource managed by Mesos slave
 For example, Marathon uses HAProxy for request routing
 Per-container network monitoring
 Egress rate-limiting
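As a rough illustration of the port-range bullet above: ports are offered to a framework as ranges, and the framework picks concrete ports from what it is offered. The sketch below models that idea in plain Python. The function name `allocate_ports` and the port numbers are assumptions made for illustration, and nothing here calls the Mesos API.

```python
from typing import List, Tuple

PortRange = Tuple[int, int]  # inclusive (begin, end)


def allocate_ports(offered: List[PortRange], count: int) -> Tuple[List[int], List[PortRange]]:
    """Take `count` ports from the offered ranges, lowest first.

    Returns the chosen ports and the ranges left over.
    Raises ValueError if the offer holds fewer than `count` ports.
    """
    available = sum(end - begin + 1 for begin, end in offered)
    if count > available:
        raise ValueError(f"need {count} ports, offer has {available}")

    chosen: List[int] = []
    remaining: List[PortRange] = []
    for begin, end in sorted(offered):
        # Take as many ports from this range as are still needed.
        take = min(count - len(chosen), end - begin + 1)
        chosen.extend(range(begin, begin + take))
        # Whatever is left of the range stays available for other tasks.
        if begin + take <= end:
            remaining.append((begin + take, end))
    return chosen, remaining


if __name__ == "__main__":
    ports, left = allocate_ports([(31000, 31002), (31010, 31019)], 5)
    print(ports)  # ports handed to the task
    print(left)   # ranges still free on this node
```

A router such as HAProxy would then map a stable service port to whichever host ports were chosen, which is why service discovery has to be solved per framework.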
Scheduling Options
 Mesos scheduling
 Capacity Scheduler
 Fair Scheduler
 Tenant scheduling examples
 Hadoop on Mesos
 Myriad (YARN) on Mesos
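To make the fair-scheduling idea concrete, here is a minimal sketch of max-min fair sharing of a single resource (say, CPU cores) among tenants. It illustrates the general principle only. It is not the Hadoop Fair Scheduler or the Mesos allocator, and the tenant names and demands are made up.

```python
from typing import Dict


def max_min_fair_share(capacity: float, demands: Dict[str, float]) -> Dict[str, float]:
    """Split `capacity` so that no tenant gets more than it asked for and
    unused share is redistributed evenly among still-unsatisfied tenants."""
    allocation = {tenant: 0.0 for tenant in demands}
    unsatisfied = {t: d for t, d in demands.items() if d > 0}
    remaining = capacity

    while unsatisfied and remaining > 1e-9:
        equal_share = remaining / len(unsatisfied)
        # Tenants whose demand fits within the equal share are fully served.
        satisfied = {t: d for t, d in unsatisfied.items() if d <= equal_share}
        if not satisfied:
            # Everyone wants more than the equal share: split evenly and stop.
            for t in unsatisfied:
                allocation[t] = equal_share
            break
        for t, d in satisfied.items():
            allocation[t] = d
            remaining -= d
            del unsatisfied[t]
    return allocation


if __name__ == "__main__":
    shares = max_min_fair_share(10.0, {"hadoop": 2.0, "spark": 4.0, "storm": 8.0})
    print(shares)
```

In this example the two smaller tenants receive their full demand and the largest tenant receives whatever capacity is left, which is the behavior a tenant should expect when a shared pool is oversubscribed.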
Dev Workflow
 Code Repo / Registry
 Pull / Push / Commit / Run
 Automated Builds
 Version tagging
 Marathon CI / CD
 Dependencies
 Rolling restarts
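The version-tagging and rolling-restart bullets can be sketched as a Marathon app definition that pins a Docker image to a tag. The app id, registry host, image name, and resource sizes below are hypothetical. The key layout follows Marathon's JSON app format for Docker containers, but it differs between Marathon versions, so treat the exact keys as an assumption to verify against the version in use.

```python
import json
from typing import Any, Dict


def marathon_app(app_id: str, image: str, tag: str, instances: int) -> Dict[str, Any]:
    """Build an app definition that pins a Docker image to a version tag."""
    return {
        "id": app_id,
        "instances": instances,
        "cpus": 1.0,
        "mem": 2048,
        "container": {
            "type": "DOCKER",
            "docker": {"image": f"{image}:{tag}", "network": "BRIDGE"},
        },
        # Keep at least half of the instances healthy during a rolling restart,
        # and allow up to 20% extra instances while old and new versions overlap.
        "upgradeStrategy": {
            "minimumHealthCapacity": 0.5,
            "maximumOverCapacity": 0.2,
        },
    }


if __name__ == "__main__":
    # Hypothetical app id and image name, for illustration only.
    app = marathon_app("/hadoop/datanode", "registry.example.com/hadoop-datanode", "2.6.0", 3)
    print(json.dumps(app, indent=2))
```

Deploying a new version then amounts to submitting the same definition with a different tag, which lets a CI job drive the rollout.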
Registry Services
 Pluggable storage
 Webhooks
 Image control
 Security
 Logging
[Diagram: a registry holds repositories; each repository holds images]
Advantages for Developers
 Interchangeable verbs for code<->containers
 Choice of framework to use as their PaaS
 Adopt microservices approach to app pipeline
Recommendations for Success
 Start small, scale fast
 Use most appropriate framework for the job
 Think ahead, decouple
 Plan for rolling restart capacity up front
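One way to reason about the last recommendation: during a rolling restart, old and new instances run side by side, so the resource pool needs headroom beyond steady state. The arithmetic below is a sketch under stated assumptions. The `over_capacity` fraction is modeled on the idea of an upgrade over-capacity setting, and rounding up with `ceil` is an assumption rather than any scheduler's documented behavior.

```python
import math
from dataclasses import dataclass


@dataclass
class Headroom:
    extra_instances: int
    extra_cpus: float
    extra_mem_mb: float


def rolling_restart_headroom(instances: int, cpus_per_instance: float,
                             mem_mb_per_instance: float, over_capacity: float) -> Headroom:
    """Estimate spare capacity needed while old and new instances overlap."""
    if not 0.0 <= over_capacity <= 1.0:
        raise ValueError("over_capacity must be between 0 and 1")
    # Assumption: round up, so a fractional instance reserves a whole one.
    extra = math.ceil(instances * over_capacity)
    return Headroom(extra, extra * cpus_per_instance, extra * mem_mb_per_instance)


if __name__ == "__main__":
    # 8 instances of 2 CPUs / 4096 MB each, allowing 25% overlap during restart.
    print(rolling_restart_headroom(8, 2.0, 4096.0, 0.25))
```

Reserving this headroom when the pool is sized avoids a restart that stalls because no node can fit the replacement instances.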
Gap Analysis
 Be prepared to “look under the hood”
 Variable maturity and resiliency of the layers
 Networking
 Security
Where Are We Going Next
 Scale and learn
 Container-focused OS
 Software-defined networking services
 Discover key performance and availability metrics
Wrapping up
 Mesos allows for choice of framework
 Devs utilize Docker with familiar workflow
 Portable, flexible, and scalable architecture

2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 

Scalable On-Demand Hadoop Clusters with Docker and Mesos

  • 1. Scalable On-Demand Hadoop Clusters with Docker and Mesos. Andrew Nelson, Nutanix, @vmwnelson, http://virtual-hiking.blogspot.com. Chris Mutchler, VMware, @chrismutchler, http://virtualelephant.com
  • 2. Agenda: New Approach for Hadoop Ops; Infrastructure Resource Considerations; Docker as the new “Unit of Work”; Future Work
  • 3. Last Year’s State of the Art: Self-service and multi-tenant Hadoop; Elastic and decoupled infrastructure; Extensible blueprinting
  • 4. New Goals: Operationalize multiple frameworks; Decoupled service architecture; Flexible and developer-friendly form factor
  • 5. Apache Mesos Introduction: Started at Berkeley; Graduated to top level Apache project 2013; Commercial entity is Mesosphere; https://github.com/apache/mesos/
  • 7. Mesos as a Multi-Tenant Resource Pool. Source: https://github.com/mesos/myriad/blob/phase1/docs/how-it-works.md
  • 8. Tools to Build and Scale: Serengeti, VMware (https://github.com/vmware-serengeti); BOSH, Pivotal (https://github.com/cloudfoundry/bosh); Cloudify, Gigaspaces (https://github.com/CloudifySource/cloudify); Cloudbreak, SequenceIQ (https://github.com/sequenceiq/cloudbreak)
  • 9. Advantages for Ops: Mesos as a Resource Pool; Multiple concurrent frameworks; Decouple frameworks from resource pools
  • 10. Compute Partitions on Mesos [diagram comparing shared and siloed partitions; labels: Hadoop, Storm, Spark, Kafka, Cassandra, Marathon]
  • 11. HDFS as a Service [diagram labels: Namenode, Standby Namenode, Secondary Namenode, HDFS, MapReduce, Spark, Hive, Storm, …]
  • 12. Networking Services: Service Discovery; Handled per framework; Port range resource managed by Mesos slave; For example, Marathon uses HAProxy for request routing; Per-container network monitoring; Egress rate-limiting
  • 13. Scheduling Options: Mesos scheduling; Capacity Scheduler; Fair Scheduler; Tenant scheduling examples; Hadoop on Mesos; Myriad (YARN) on Mesos
  • 14. Dev Workflow: Code Repo / Registry; Pull / Push / Commit / Run; Automated Builds; Version tagging; Marathon CI / CD; Dependencies; Rolling restarts
  • 15. Registry Services: Pluggable storage; Webhooks; Image control; Security; Logging [diagram labels: Registry, Repository, Image]
  • 16. Advantages for Developers: Interchangeable verbs for code<->containers; Choice of framework to use as their PaaS; Adopt microservices approach to app pipeline
  • 17. Recommendations for Success: Start small, scale fast; Use most appropriate framework for the job; Think ahead, decouple; Plan for rolling restart capacity up front
  • 18. Gap Analysis: Be prepared to “look under the hood”; Variable maturity and resiliency of the layers; Networking; Security
  • 19. Where Are We Going Next: Scale and learn; Container-focused OS; Software-defined networking services; Discover key performance and availability metrics
  • 20. Wrapping up: Mesos allows for choice of framework; Devs utilize Docker with familiar workflow; Portable, flexible, and scalable architecture

Editor's Notes

  1. I'm going to be discussing some new opportunities to change the operational model of Hadoop, how to accommodate new services, and how to work on better integration and end-to-end testing of modern application pipelines. This has everything to do with how ops can provide devs with the most flexible building environment without stretching too far to try and support everything. Key takeaways: Hadoop + Docker for lightweight self-service on your laptop or in your cloud. For building modern app pipelines you need CI/CD, and to iterate faster you need this self-service, customizable framework to build what the devs want to build. Evaluate whether YARN or Mesos fits your needs. Just pick a physical form factor or pick a cloud and move on, with portability in mind; we are in a unique situation where so many software choices will affect your ultimate product more than hardware will. Test and iterate, scale and learn.
  2. Last year, Chris and I talked about how Adobe was virtualizing their Hadoop clusters in order to emulate a public cloud environment. Developers wanted to be able to be more flexible in what kind of Hadoop cluster was deployed, sizing, which templates, and which distro they wanted to work with. All of these things could be customized and were enabled for self-service. Potentially each developer could utilize their own private, dedicated cluster for experimentation and not have to worry about dedicated hardware. The automation and blueprints necessary were shared via catalog and extended to accommodate more than just Hadoop to include other distributed systems such as Storm, Kafka, Mesos, etc.
  3. One key realization is that you can't get there with just one framework. There are a ton of different solutions out there for cluster management and for different frameworks, different building blocks that devs can use to build their app and its data pipeline. So we needed to be more flexible in giving developers options for building their desired service. Should they be building realtime or batch workloads? How will they scale? What if parameters need to be changed as they scale? There are so many questions and so much new code to look at, and devs need to be just as quick about evaluating which tools are helpful and worth including as about the code they are adding themselves. With all of these different frameworks, and to retain the element of flexibility once they go down a road, the devs need to ensure they remain loosely coupled. Otherwise all this flexibility was kinda pointless. What's flexible about having to go back and start from scratch? You could do that before, and it was in a lot simpler system, right? Now we're all platform-building, even if we're using someone else's services to bootstrap basic functionality. We need to deliver reliability somewhere before we get to the top of the stack. That's what CI and CD are basically about, imo. So what do we need that is relatively portable, easily resizable across these different frameworks, and reasonably self-contained so that we can pick it up and move it around when we need to? Last year the currency was VMs. We could resize, repurpose, share hardware, and blueprint. I have worked with VMs in high-performance settings and I don't think performance is the issue. However, they are not developer-friendly. Dev-friendly to me is basically infrastructure as code, or even infra as text files. As an architect I want devs to feel free to customize, do it themselves, and be able to interact with the system in a form factor that is consistent with their processes. A key part of self-service is choice.
  4. Users: Twitter, Airbnb, Apple, eBay. Frameworks: Aurora, Marathon, Chronos.
  5. http://mesos.apache.org/documentation/latest/mesos-architecture/ http://mesos.apache.org/assets/img/documentation/architecture3.jpg So from an infra perspective, why not just work on YARN? Well, YARN is not a hierarchical scheduler framework. It’s a framework for writing scalable analytics jobs and it does that really well. But how do you encapsulate infra for jobs that don't fit that model? Maybe next year YARN will have a completely different set of capabilities, but for now we have devs with a diverse set of job characteristics. Mesos allows for multiple executors, multiple independent schedulers, and multiple frameworks / toolsets, with a highly available master. The master enables fine-grained sharing of resources (cpu, ram, …) across applications by making them resource offers. Each resource offer contains a list of <slave ID, resource1: amount1, resource2: amount2, ...>. The master decides how many resources to offer to each framework according to a given organizational policy, such as fair sharing or strict priority. To support a diverse set of policies, the master employs a modular architecture that makes it easy to add new allocation modules via a plugin mechanism. A framework running on top of Mesos consists of two components: a scheduler that registers with the master to be offered resources, and an executor process that is launched on slave nodes to run the framework’s tasks (see the App/Framework development guide for more details about application schedulers and executors). While the master determines how many resources are offered to each framework, the frameworks' schedulers select which of the offered resources to use. When a framework accepts offered resources, it passes to Mesos a description of the tasks it wants to run on them. In turn, Mesos launches the tasks on the corresponding slaves.
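A minimal sketch of the master-side decision described above, assuming a simplified fair-sharing policy based on dominant share (the largest fraction of any one cluster resource that a framework already holds). The `Framework` class, the function names, and all the numbers are hypothetical illustrations, not the Mesos API; a real allocation module also deals with roles, weights, and declined offers.

```python
from dataclasses import dataclass, field


@dataclass
class Framework:
    """A registered framework and the resources it currently holds (hypothetical model)."""
    name: str
    allocated: dict = field(default_factory=lambda: {"cpus": 0.0, "mem_gb": 0.0})


def dominant_share(framework, cluster_total):
    """Largest fraction of any single cluster resource held by this framework."""
    return max(framework.allocated[r] / cluster_total[r] for r in cluster_total)


def next_framework_to_offer(frameworks, cluster_total):
    """Fair sharing: the framework with the smallest dominant share gets the next offer."""
    return min(frameworks, key=lambda fw: dominant_share(fw, cluster_total))


# Made-up cluster size and current allocations.
cluster_total = {"cpus": 64.0, "mem_gb": 256.0}
hadoop = Framework("hadoop", {"cpus": 32.0, "mem_gb": 64.0})
spark = Framework("spark", {"cpus": 8.0, "mem_gb": 96.0})
storm = Framework("storm", {"cpus": 4.0, "mem_gb": 8.0})

chosen = next_framework_to_offer([hadoop, spark, storm], cluster_total)
print(chosen.name)
```

With these numbers the next offer goes to `storm`, which holds the smallest dominant share. A strict-priority policy would swap the `min` key for a priority lookup and leave the rest unchanged.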
  6. https://github.com/mesos/myriad/blob/phase1/docs/how-it-works.md https://github.com/mesos/myriad/raw/phase1/docs/images/how-it-works.png Each tenant has their own framework. Each tenant can derive their own scheduling. Each tenant can leverage services in a decoupled fashion.
  7. This list will probably keep growing before it consolidates. This is about blueprinting the distributed systems. There will typically be an infrastructure layer and a configuration management layer. The VMware option (Serengeti) is a solution based on VMware vCenter and Chef. There is the flexibility of creating your own roles and recipes, but it is dependent on VMware licensing, which is based on sockets. There is only a single template at any given time, and calls are blocking, meaning only one cluster can be in any stage of creation at any given time. BOSH is its own animal, originally conceived as a way to stand up Cloud Foundry, because Cloud Foundry is its own distributed system that can't instantiate itself. There is a director-based version, or bosh-init as a quick and less heavyweight CLI. BOSH uses YAML as its conf format of choice. It can handle any cloud platform with a known CPI (cloud provider interface). Its templates are called stemcells. It has an async queue KV store with multiple workers that can build in parallel. Networking and DNS are fully declared in the manifest but have to be much more explicit. Cloudbreak is a relatively new cloud-agnostic framework that uses cloud-specific APIs for building out components, for example AWS CloudFormation. For Hadoop blueprints it uses Ambari, and at the guest-image level everything is Docker, with Swarm for clustering and Consul for communication and service management. Cloudify uses open source TOSCA blueprints, which are YAML files that contain service definitions, tiers, and dependencies. Cloudify determines the infra compatibility layer, and config management is Chef or Puppet.
  8. Mesos is fundamentally a framework for accommodating different frameworks on the same hardware, using cgroups and Docker.
  9. http://mesos.apache.org/documentation/latest/mesos-frameworks/ Compute is determined by resource offers. Instead of trying to fit a workload on what's left of a host, the host or worker advertises some resources, and it's up to the framework to decide what it can accept and provision, or whether to wait.
  10. You have HA, checkpointing, and a common durable and resilient storage layer that can support the ecosystem of compute platforms: MapReduce (batch), Spark (in-memory), Hive (SQL), Storm (streaming), Solr (Lucene search), Flume, Kafka (with Camus).
  11. Imo, this is the most immature portion of the tenant services of Mesos, but it is still headed in the right direction. Frameworks don’t want to manage ports or physical networking. Mesos allows for per-container granularity in monitoring and logging, which is good for debugging.
  12. These are the top-level scheduling algorithms that Mesos can use. Remember that it’s a hierarchy. When a job request comes into the YARN resource manager, YARN evaluates all the resources available, and it places the job. It’s the one making the decision where jobs should go… YARN is optimized for scheduling Hadoop jobs, which are historically (and still typically) batch jobs with long run times. This means that YARN was not designed for long-running services, nor for short-lived interactive queries…, and while it’s possible to have it schedule other kinds of workloads, this is not an ideal model. … uses a two-level scheduling mechanism where resource offers are made to frameworks (applications that run on top of Mesos). The Mesos master node decides how many resources to offer each framework, while each framework determines the resources it accepts and what application to execute on those resources. This method of resource allocation allows near-optimal data locality when sharing a cluster of nodes amongst diverse frameworks. Myriad, an open source software project, is both a Mesos framework and a YARN scheduler that enables Mesos to manage YARN resource requests. When a job comes into YARN, it will schedule it via the Myriad Scheduler, which will match the request to incoming Mesos resource offers. Mesos, in turn, will pass it on to the Mesos worker nodes. The Mesos nodes will then communicate the request to a Myriad executor which is running the YARN node manager. Myriad launches YARN node managers on Mesos resources, which then communicate to the YARN resource manager what resources are available to them. YARN can then consume the resources as it sees fit. Myriad provides a seamless bridge from the pool of resources available in Mesos to the YARN tasks that want those resources.
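The framework side of that exchange, matching a pending request against incoming offers, could be sketched as follows. This is a toy first-fit matcher, not Myriad's actual code: `Offer`, `NodeManagerProfile`, and `match_offers` are hypothetical names, and the host names and resource sizes are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Resources advertised by one worker node (hypothetical shape)."""
    offer_id: str
    host: str
    cpus: float
    mem_mb: int


@dataclass(frozen=True)
class NodeManagerProfile:
    """Resources one YARN node manager would need (hypothetical shape)."""
    cpus: float
    mem_mb: int


def match_offers(offers, profile, wanted):
    """First-fit: accept up to `wanted` offers large enough for the profile, decline the rest."""
    accepted, declined = [], []
    for offer in offers:
        fits = offer.cpus >= profile.cpus and offer.mem_mb >= profile.mem_mb
        if fits and len(accepted) < wanted:
            accepted.append(offer)
        else:
            declined.append(offer)
    return accepted, declined


offers = [
    Offer("o1", "worker-1", cpus=4.0, mem_mb=8192),
    Offer("o2", "worker-2", cpus=1.0, mem_mb=2048),
    Offer("o3", "worker-3", cpus=8.0, mem_mb=16384),
]
profile = NodeManagerProfile(cpus=2.0, mem_mb=4096)

accepted, declined = match_offers(offers, profile, wanted=1)
print([o.offer_id for o in accepted], [o.offer_id for o in declined])
```

Declined offers go back to the pool for other frameworks, which is what keeps the tenants decoupled: the framework only ever reasons about the offers in front of it.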
  13. Developers can push their code and Dockerfile to Git, as they usually do. From there, Jenkins can build a container from the Dockerfile and then publish it to a registry.
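Once an image is in the registry, a CI job could hand Marathon an app definition that points at it. The sketch below only builds and prints the JSON document; it does not call any API. The field names follow the Marathon app definition format as I understand it from this era, so check them against your Marathon version. The app id, image name, sizes, and upgrade-strategy values are hypothetical.

```python
import json


def marathon_app(app_id, image, instances, cpus, mem_mb, container_port):
    """Build a Marathon-style app definition for a Docker image (illustrative values)."""
    return {
        "id": app_id,
        "instances": instances,
        "cpus": cpus,
        "mem": mem_mb,
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                # hostPort 0 asks for a port from the range offered by the worker.
                "portMappings": [{"containerPort": container_port, "hostPort": 0}],
            },
        },
        # Rolling restart behaviour: keep at least half the instances healthy,
        # and allow 20% extra capacity while the new version starts.
        "upgradeStrategy": {
            "minimumHealthCapacity": 0.5,
            "maximumOverCapacity": 0.2,
        },
    }


app = marathon_app(
    app_id="/pipeline/ingest",                      # hypothetical app id
    image="registry.example.com/team/ingest:1.4.2", # hypothetical image and tag
    instances=4,
    cpus=0.5,
    mem_mb=512,
    container_port=8080,
)
print(json.dumps(app, indent=2))
```

Keeping the definition as a plain text document next to the Dockerfile fits the infrastructure-as-text idea: a version bump is a one-line change to the image tag, and reverting is a revert of that line.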
  14. As is typical, will there be template-creep? Container-creep? Image curation and testing are necessary, but hopefully this fits into your CI/CD methodology.
  15. Working with Docker should feel very familiar for developers: docker push, pull, commit; version dependency and tag-based search verbs. They can choose from Marathon, YARN 2.7.0. CI/CD with CloudBees, Shippable, Drone, Jenkins, on and on.
  16. Logging is key, of course. Best to test and iterate, since stuff will break, and pick a method that allows you to revert easily. Decouple! Be ready to pull in network teams and security teams early and often. The SDN decoupling is in progress, but for now infra should be ready to be explicit so devs don’t have to be. Don’t just shift complexity, abstract. Security, SDLC and infrastructure and ops and…
  17. We often need to change as we scale. Remove the guest OS as much as possible; options are multiplying: CoreOS, LXD, Microsoft Nano Server, Red Hat Atomic, VMware Photon. We don’t know which will work better, so we need to test and iterate; ultimately we want to be decoupled so it doesn’t, or shouldn’t, matter. There is a lot of maturation in the SDN space; controllers are just reaching scalability of thousands of VMs, so what happens when I throw a million containers at them? Test and iterate.
  18. YARN can be a first-class citizen, which avoids siloing the datacenter. Avoid siloing devs into specific frameworks. Docker is the new currency for continuous test and deployment of code, in an infrastructure-as-text form factor for CI/CD.