SlideShare a Scribd company logo
1 of 38
Download to read offline
Honeycomb
Scalable Clusters on Demand
December 1st, 2017
Simplifying life’s most meaningful transaction.
Intro
Bogdan & Gustavo
● Data & ML engineers at Opendoor
● Building ETL and ML infrastructure
● Prior experience building serving and data infrastructure at Google & Airbnb
Curious to learn more about us?
● https://medium.com/opendoor-labs
● https://blog.opendoor.com
Outline
1. Motivation
2. Existings Solutions
3. Honeycomb Architecture
4. Operators
5. Autoscaling
6. Pros & Cons
7. Learnings
8. Demo
Motivation
● Cluster
○ Docker / Kubernetes
○ Datadog
○ Scalyr and Stackdriver for logs
● ETL
○ Parquet on S3
○ Spark and Dask computing engine
○ Airflow / Luigi for ETL
● Databases:
○ Postgres for serving data
○ BigQuery for analytics
Stack
Kubernetes
Kubernetes is an open-source platform designed to automate deploying,
scaling, and operating application containers.
Legacy architecture
● Multiple schedulers (1 per team)
○ Lack of visibility into dependencies
○ Consistency issues
● Configuration is tightly coupled with code
○ Dependencies are not clear
○ Configuration deploys restart all running pods
● Statically allocated clusters resources
○ Operationally expensive
○ Always overprovisioned
Cluster utilization
Existing Solutions
Single Airflow with monorepo for DAGs
Pros:
● Easy to set up
● Works well with when compute is delegates to other services (Hive,
Presto, Spark, etc)
Cons:
● Different teams need different library versions
● One team becomes a bottleneck for trying out new libraries / tools
● Every new dependency / library upgrade requires airflow restart
● Using airflow workers for computationally extensive jobs is considered
antipattern
Pros:
● Easy to setup and run
● No need to maintain
Cons:
● Not frequent spark upgrades
● Installing native dependencies in python may take hours
● Python packaging is not an easy problem
● Does not work with our secrets management
● Is not integrated with out logging
EMR or DataProc
Honeycomb Architecture
Honeycomb Deep Dive
How everything fits
Project Goal
Build company wide data processing system with low
maintenance cost
● Support data science and data engineering requirements
○ Spark and Dask clusters
○ Flexibility and freedom for teams to use different tools and libraries
● Efficient cluster utilization
○ No idle resources
○ Spot pricing
● Universal scheduling system
○ Central store for ETL configuration
○ Easy to use abstractions to schedule jobs
Inspiration
Bloomberg Airflow fork:
● Bloomberg fork
● Airflow Kubernetes talk
Airflow roadmap:
● Kubernetes Executor/Operator
● PR: Kubernetes Operators
● Kubernetes docker operator
Honeycomb Architecture
Airflow
Operators
Honeycomb Deep Dive
Run your task on your custom
environment
OdPodOperator
OdPodOperator
market_buybox =
od_pod_operator.OdPodOperator(
dag=dag,
task_id='market-buybox',
milli_cores=100,
image='opendoor/dwellings:master'
memory_mb=256,
command='run_task mls_buybox_task',
)
OdPodOperator: Template
OdPodOperator
apiVersion: v1
kind: Pod
spec:
restartPolicy: Never
containers:
- name: pod-airflow-k8s-container
image: $(IMAGE)
command:
- bash
args:
- -c
- $(RUN_CMD)
env: $(UNIX_ENV)
resources: $(RESOURCES)
OdPodOperator: Monitoring
OdPodOperator
- Pod is monitored for exit status of 0 (success) and non-zero (failure)
- Logs are redirected from the Pod to the Airflow worker
SparkSubmitOperator
Cluster Operator
spark_task = spark_submit_operator.SparkSubmitOperator(
dag=dag,
task_id='test-spark',
milli_cores=500,
memory_mb=1024,
image='opendoor/dwellings:master-latest',
worker_count=2,
worker_milli_cores=2500,
worker_memory_mb=13 * 1024,
spark_program_path='/opt/spark/src/main/python/pi.py',
spark_program_args='1000'
)
SparkSubmitOperator
Cluster Operator
Autoscaling
Honeycomb Deep Dive
All the resources that you
need when you need them
Autoscaling: Exclusive Nodes (Instances)
Autoscaling
Autoscaling: Pod Affinity (bin packing)
Autoscaling
Autoscaling
Autoscaling
AWS Honeycomb
Instance Group
AWS Master k8s Instance
Group
Cluster Autoscaler
Pod
AWS Honeycomb
Instance Group
Taint: honeycomb
Honeycomb Pod
Honeycomb Pod
Honeycomb Pod
Pending
Honeycomb Pod
Monitors
unscheduled Pods
Spin up a new
spot instance
Cost: autoscaling + spot pricing
Pros & Cons
Honeycomb Deep Dive
Pros
Pros & Cons
+ Empowered users via full stack freedom
+ Low maintenance for infrastructure teams
+ Visibility into company data processing
+ Cloud independence
+ Low cost
Cons
Pros & Cons
- Requires kubernetes cluster
- Stateless spark
- Autoscaling brings some complexity to the kubernetes
configuration
Current big data open source tools allows us to build
scalable data processing infrastructure at fairly low cost
Demo
Honeycomb Deep Dive
Questions
Honeycomb Deep Dive
Extras
Example Pipelines
Honeycomb Deep Dive
Market Buybox Predictions
Example Pipeline
Market Buybox Predictions
Example Pipeline
Event Flow

More Related Content

What's hot

Netflix Data Benchmark @ HPTS 2017
Netflix Data Benchmark @ HPTS 2017Netflix Data Benchmark @ HPTS 2017
Netflix Data Benchmark @ HPTS 2017
Ioannis Papapanagiotou
 
OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...
OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...
OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...
OpenNebula Project
 

What's hot (20)

Intro to creating kubernetes operators
Intro to creating kubernetes operators Intro to creating kubernetes operators
Intro to creating kubernetes operators
 
Introducing MagnetoDB, a key-value storage sevice for OpenStack
Introducing MagnetoDB, a key-value storage sevice for OpenStackIntroducing MagnetoDB, a key-value storage sevice for OpenStack
Introducing MagnetoDB, a key-value storage sevice for OpenStack
 
Netflix Data Benchmark @ HPTS 2017
Netflix Data Benchmark @ HPTS 2017Netflix Data Benchmark @ HPTS 2017
Netflix Data Benchmark @ HPTS 2017
 
SC4 Hangout - Luigi Selmi, Transport pilot architecture
SC4 Hangout - Luigi Selmi, Transport pilot architectureSC4 Hangout - Luigi Selmi, Transport pilot architecture
SC4 Hangout - Luigi Selmi, Transport pilot architecture
 
Taking Your Database Global with Kubernetes
Taking Your Database Global with KubernetesTaking Your Database Global with Kubernetes
Taking Your Database Global with Kubernetes
 
Tim Hall and Ryan Betts [InfluxData] | InfluxDB Roadmap and Engineering Updat...
Tim Hall and Ryan Betts [InfluxData] | InfluxDB Roadmap and Engineering Updat...Tim Hall and Ryan Betts [InfluxData] | InfluxDB Roadmap and Engineering Updat...
Tim Hall and Ryan Betts [InfluxData] | InfluxDB Roadmap and Engineering Updat...
 
Graph Databases at Netflix
Graph Databases at NetflixGraph Databases at Netflix
Graph Databases at Netflix
 
OpenNebulaConf2017EU: Growing into the Petabytes for Fun and Profit by Michal...
OpenNebulaConf2017EU: Growing into the Petabytes for Fun and Profit by Michal...OpenNebulaConf2017EU: Growing into the Petabytes for Fun and Profit by Michal...
OpenNebulaConf2017EU: Growing into the Petabytes for Fun and Profit by Michal...
 
InfluxData Architecture for IoT | Noah Crowley | InfluxData
InfluxData Architecture for IoT | Noah Crowley | InfluxDataInfluxData Architecture for IoT | Noah Crowley | InfluxData
InfluxData Architecture for IoT | Noah Crowley | InfluxData
 
Head in the clouds @ bol.com
Head in the clouds @ bol.comHead in the clouds @ bol.com
Head in the clouds @ bol.com
 
OpenNebulaConf2017EU: Enabling Dev and Infra teams by Lodewijk De Schuyter,De...
OpenNebulaConf2017EU: Enabling Dev and Infra teams by Lodewijk De Schuyter,De...OpenNebulaConf2017EU: Enabling Dev and Infra teams by Lodewijk De Schuyter,De...
OpenNebulaConf2017EU: Enabling Dev and Infra teams by Lodewijk De Schuyter,De...
 
Google Cloud platform: GKE with CI/CD using CircleCI and Flux
Google Cloud platform: GKE with CI/CD using CircleCI and FluxGoogle Cloud platform: GKE with CI/CD using CircleCI and Flux
Google Cloud platform: GKE with CI/CD using CircleCI and Flux
 
OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...
OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...
OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...
 
IoT Event Processing and Analytics with InfluxDB in Google Cloud | Christoph ...
IoT Event Processing and Analytics with InfluxDB in Google Cloud | Christoph ...IoT Event Processing and Analytics with InfluxDB in Google Cloud | Christoph ...
IoT Event Processing and Analytics with InfluxDB in Google Cloud | Christoph ...
 
Data Engineer’s Lunch #41: PygramETL
Data Engineer’s Lunch #41: PygramETLData Engineer’s Lunch #41: PygramETL
Data Engineer’s Lunch #41: PygramETL
 
Hello, Docker!
Hello, Docker!Hello, Docker!
Hello, Docker!
 
Architectural caching patterns for kubernetes
Architectural caching patterns for kubernetesArchitectural caching patterns for kubernetes
Architectural caching patterns for kubernetes
 
Peanut Butter and jelly: Mapping the deep Integration between Ceph and OpenStack
Peanut Butter and jelly: Mapping the deep Integration between Ceph and OpenStackPeanut Butter and jelly: Mapping the deep Integration between Ceph and OpenStack
Peanut Butter and jelly: Mapping the deep Integration between Ceph and OpenStack
 
OpenNebulaConf2017EU: Welcome Talk State and Future of OpenNebula by Ignacio ...
OpenNebulaConf2017EU: Welcome Talk State and Future of OpenNebula by Ignacio ...OpenNebulaConf2017EU: Welcome Talk State and Future of OpenNebula by Ignacio ...
OpenNebulaConf2017EU: Welcome Talk State and Future of OpenNebula by Ignacio ...
 
MongoDB SF Ruby
MongoDB SF RubyMongoDB SF Ruby
MongoDB SF Ruby
 

Similar to Scalable Clusters On Demand

SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
Chester Chen
 

Similar to Scalable Clusters On Demand (20)

Kubernetes Forum Seoul 2019: Re-architecting Data Platform with Kubernetes
Kubernetes Forum Seoul 2019: Re-architecting Data Platform with KubernetesKubernetes Forum Seoul 2019: Re-architecting Data Platform with Kubernetes
Kubernetes Forum Seoul 2019: Re-architecting Data Platform with Kubernetes
 
DevEx | there’s no place like k3s
DevEx | there’s no place like k3sDevEx | there’s no place like k3s
DevEx | there’s no place like k3s
 
Greenplum for Kubernetes - Greenplum Summit 2019
Greenplum for Kubernetes - Greenplum Summit 2019Greenplum for Kubernetes - Greenplum Summit 2019
Greenplum for Kubernetes - Greenplum Summit 2019
 
Pydata 2020 containers meetup
Pydata  2020 containers meetup Pydata  2020 containers meetup
Pydata 2020 containers meetup
 
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOpsDevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
 
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
 
Kubernetes - how to orchestrate containers
Kubernetes - how to orchestrate containersKubernetes - how to orchestrate containers
Kubernetes - how to orchestrate containers
 
Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016
 
Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016
 
[WSO2Con EU 2018] Deploying Applications in K8S and Docker
[WSO2Con EU 2018] Deploying Applications in K8S and Docker[WSO2Con EU 2018] Deploying Applications in K8S and Docker
[WSO2Con EU 2018] Deploying Applications in K8S and Docker
 
Improving Apache Spark Downscaling
 Improving Apache Spark Downscaling Improving Apache Spark Downscaling
Improving Apache Spark Downscaling
 
[scala.by] Launching new application fast
[scala.by] Launching new application fast[scala.by] Launching new application fast
[scala.by] Launching new application fast
 
AirBNB's ML platform - BigHead
AirBNB's ML platform - BigHeadAirBNB's ML platform - BigHead
AirBNB's ML platform - BigHead
 
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
 Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa... Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
 
MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...
 
6 Months Sailing with Docker in Production
6 Months Sailing with Docker in Production 6 Months Sailing with Docker in Production
6 Months Sailing with Docker in Production
 
Truemotion Adventures in Containerization
Truemotion Adventures in ContainerizationTruemotion Adventures in Containerization
Truemotion Adventures in Containerization
 
Introduction to Apache Airflow
Introduction to Apache AirflowIntroduction to Apache Airflow
Introduction to Apache Airflow
 
introduction to micro services
introduction to micro servicesintroduction to micro services
introduction to micro services
 
Session 8 - Creating Data Processing Services | Train the Trainers Program
Session 8 - Creating Data Processing Services | Train the Trainers ProgramSession 8 - Creating Data Processing Services | Train the Trainers Program
Session 8 - Creating Data Processing Services | Train the Trainers Program
 

Recently uploaded

introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
VishalKumarJha10
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
masabamasaba
 

Recently uploaded (20)

Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Generic or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsGeneric or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisions
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 

Scalable Clusters On Demand