SlideShare a Scribd company logo
1 of 15
Download to read offline
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 1
Auto Scaling Systems With Elastic Spark Streaming
PhuDuc Nguyen
Consulting Engineer @ Oracle Data Cloud
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 2
Auto Scaling Systems With Elastic Spark Streaming
Problem - streaming jobs with variable traffic pattern
● Our traffic pattern has natural peaks and valleys.
● We also experience unpredictable traffic spikes.
○ Breaking world news events - eg Brexit.
○ Some of our partners run batch jobs and send data at unknown times.
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 3
Auto Scaling Systems With Elastic Spark Streaming
Various scenarios that result in “catch up mode”
● Streaming job has fallen behind real-time. In order to catch up, throughput must be > 1x to process backlog +
real-time.
● Catch up from failure: hardware, software, network, etc. Data doesn’t stop just because your streaming job has.
Deal with it. Be ready to catch up.
● Catch up from replay: pushing historical data back through the pipeline to fix a bug or apply new
feature/transformation retroactively.
● Unforeseen spikes in traffic.
● Underestimating the size of a streaming job.
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 4
Auto Scaling Systems With Elastic Spark Streaming
Problems aren’t new - can use traditional methods to cope…
● Statically size the cluster/job to max servers needed to handle peak rate + some percentage for extra wiggle room.
○ Works but wastes resources when not at peak traffic; larger difference between peaks/valleys = larger waste
when idle.
● Manually restart job adjusting spark.cores.max=n
○ Manually babysitting cluster = pain. No one wants to wake up at 3am to grow or shrink a cluster.
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 5
Auto Scaling Systems With Elastic Spark Streaming
What about backpressure?
● Async consumer signalling upstream producer to slow down. Reactive Manifesto is brilliantly simple, elegant, and
powerful API.
● Impl in Spark Streaming is effective and works very well.
● Can use “enough” servers and allow backpressure to kick in during peaks/spikes and will catch up later during valley.
○ But can result in falling behind or a lag in real-time data during backpressure. If job is required to maintain
real-time, buffering upstream may not be an option.
○ Once caught up, during traffic valley, you’ll likely have idle/wasted resources.
○ Prolonged sustained spikes > than “normal peak” for many days may exceed your upstream buffer - eg Brexit
caused a very large spike in our traffic for about 10 days. Our Kafka retention window is only 5 days.
○ Ultimately, max throughput is capped by the amount of servers it has...and sometimes we simply need more
servers.
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 6
Auto Scaling Systems With Elastic Spark Streaming
A solution: Elastic Spark Streaming
● Auto scale - ability to add/remove servers dynamically without restarting the stream.
● Streaming job must automatically adjust to the demands of traffic.
● Maximize resource utilization - use servers when you need them, release them when you don’t. Don’t pay for idle
resources.
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 7
Auto Scaling Systems With Elastic Spark Streaming
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 8
Auto Scaling Systems With Elastic Spark Streaming
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 9
Auto Scaling Systems With Elastic Spark Streaming
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 10
Auto Scaling Systems With Elastic Spark Streaming
Implementation
● We collect “processing %” = totalProcessingTime / microbatchInterval
● If “processing %” > 100%, that microbatch fell behind
● If “processing %” < 100%, that microbatch is stable
● Don’t act on a single datapoint. Collect a sliding window and act on trend.
● Careful to avoid creating a system that constantly scales up and down within seconds or minutes - appears as if
the streaming job is indecisive and thrashing/flapping.
● Each streaming job has a unique workload, traffic pattern, and micro batch interval and thus requires
tunable/configurable thresholds.
○ Max threshold default is 100%; trigger scale up after N datapoints have median > max threshold.
○ Min threshold default is 75%; trigger scale down after N datapoints have medan < min threshold and queued
batches are empty.
○ N is configurable.
○ Scale up and down by configurable percentage of current servers.
● sparkContext.requestExecutors(count) and sparkContext.killExecutor(executorId).
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 11
Auto Scaling Systems With Elastic Spark Streaming
Testing Elastic Streams
● Invariant - elasticity affects performance and resource utilization, it cannot impact data integrity.
● Impl random scaling - WTF why?!?! Before using any heuristics to trigger elasticity, we want to prove that no
matter how good/bad your heuristics are, elasticity can only affect performance. Even with the worst heuristics,
we want to prove that no data loss or corruption occurs. Random scaling intentionally forces the streaming job
into a state of thrashing/flapping. We can observe the performance is much worse than no scaling at all.
● Single command in testing project does the following…
○ Build new mesos cluster.
○ Use 1 static docker image that contains all input/test data.
○ Start 2 spark streaming jobs that ingest same dataset and share same processing logic: one job runs with
no-scaling, other job randomly scales and thrashes.
○ Start last spark job that ingests output from other 2 jobs, perform distinct and diff against the datasets for
every record and field. They must be equivalent otherwise you’ve violated the invariant.
○ Tear down mesos cluster.
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 12
Auto Scaling Systems With Elastic Spark Streaming
Where do the spark jobs add servers from and release to?
● Deploy our streaming jobs on mesos cluster and launched via marathon.
● Each streaming job
○ Expands/contracts independently from one another - it has no knowledge of its neighbor.
○ Can gain more servers from the mesos cluster and releases them back to the mesos cluster on-demand.
● A lightweight app monitors used/unused resources on mesos. If the mesos cluster needs to grow, the app automatically triggers
the automation to add mesos workers from the cloud. As the mesos cluster needs to shrink, the app automatically releases the
servers back to the cloud. Like nested balloons that can expand/contract on demand...
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 13
Auto Scaling Systems With Elastic Spark Streaming
Metrics
● A single ingest streaming job ingests over 5TB/day.
● Output is fed into many other streaming jobs (over 30). Results of processing generates > 2x ingest rate.
● At peak times, single ingest job processes > 130k events/second.
● Roughly 500-800 servers in the cloud for our team alone - elasticity changes this number frequently.
● Our team alone spends $250k/month = $3mln/year on infrastructure.
● 8 engineers are responsible for all of it: infrastructure, development, testing.
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 14
Auto Scaling Systems With Elastic Spark Streaming
Impact of Elastic Spark Streaming
● Operational gains
○ Minimize manual developer intervention
○ Devs focus on features not babysitting
● Financial impact
○ Maximize resource utilization
○ If your cloud bill is in the millions, your savings can be in the millions!
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 15
Auto Scaling Systems With Elastic Spark Streaming
Thanks for listening. Q&A?

More Related Content

What's hot

Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsRunning Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsDatabricks
 
Hive on Spark の設計指針を読んでみた
Hive on Spark の設計指針を読んでみたHive on Spark の設計指針を読んでみた
Hive on Spark の設計指針を読んでみたRecruit Technologies
 
Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Databricks
 
Hive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilHive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilDatabricks
 
A Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLA Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLDatabricks
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesDatabricks
 
Deep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache SparkDeep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache SparkDatabricks
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseAltinity Ltd
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesNishith Agarwal
 
Optimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsOptimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsDatabricks
 
Dynamic filtering for presto join optimisation
Dynamic filtering for presto join optimisationDynamic filtering for presto join optimisation
Dynamic filtering for presto join optimisationOri Reshef
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayC4Media
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkFlink Forward
 
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Databricks
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroDatabricks
 
Running Airflow Workflows as ETL Processes on Hadoop
Running Airflow Workflows as ETL Processes on HadoopRunning Airflow Workflows as ETL Processes on Hadoop
Running Airflow Workflows as ETL Processes on Hadoopclairvoyantllc
 
Stream de dados e Data Lake com Debezium, Delta Lake e EMR
Stream de dados e Data Lake com Debezium, Delta Lake e EMRStream de dados e Data Lake com Debezium, Delta Lake e EMR
Stream de dados e Data Lake com Debezium, Delta Lake e EMRCicero Joasyo Mateus de Moura
 
How We Optimize Spark SQL Jobs With parallel and sync IO
How We Optimize Spark SQL Jobs With parallel and sync IOHow We Optimize Spark SQL Jobs With parallel and sync IO
How We Optimize Spark SQL Jobs With parallel and sync IODatabricks
 

What's hot (20)

Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsRunning Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
 
Hive on Spark の設計指針を読んでみた
Hive on Spark の設計指針を読んでみたHive on Spark の設計指針を読んでみた
Hive on Spark の設計指針を読んでみた
 
Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0
 
Hive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilHive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas Patil
 
A Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLA Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQL
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
Deep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache SparkDeep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache Spark
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
 
Spark SQL - The internal -
Spark SQL - The internal -Spark SQL - The internal -
Spark SQL - The internal -
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilities
 
Airflow 101
Airflow 101Airflow 101
Airflow 101
 
Optimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsOptimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL Joins
 
Dynamic filtering for presto join optimisation
Dynamic filtering for presto join optimisationDynamic filtering for presto join optimisation
Dynamic filtering for presto join optimisation
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
 
Running Airflow Workflows as ETL Processes on Hadoop
Running Airflow Workflows as ETL Processes on HadoopRunning Airflow Workflows as ETL Processes on Hadoop
Running Airflow Workflows as ETL Processes on Hadoop
 
Stream de dados e Data Lake com Debezium, Delta Lake e EMR
Stream de dados e Data Lake com Debezium, Delta Lake e EMRStream de dados e Data Lake com Debezium, Delta Lake e EMR
Stream de dados e Data Lake com Debezium, Delta Lake e EMR
 
How We Optimize Spark SQL Jobs With parallel and sync IO
How We Optimize Spark SQL Jobs With parallel and sync IOHow We Optimize Spark SQL Jobs With parallel and sync IO
How We Optimize Spark SQL Jobs With parallel and sync IO
 

Viewers also liked

Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...Spark Summit
 
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...Spark Summit
 
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...Spark Summit
 
Improving Python and Spark Performance and Interoperability: Spark Summit Eas...
Improving Python and Spark Performance and Interoperability: Spark Summit Eas...Improving Python and Spark Performance and Interoperability: Spark Summit Eas...
Improving Python and Spark Performance and Interoperability: Spark Summit Eas...Spark Summit
 
Kerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-Malla
Kerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-MallaKerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-Malla
Kerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-MallaSpark Summit
 
Unlocking Value in Device Data Using Spark: Spark Summit East talk by John La...
Unlocking Value in Device Data Using Spark: Spark Summit East talk by John La...Unlocking Value in Device Data Using Spark: Spark Summit East talk by John La...
Unlocking Value in Device Data Using Spark: Spark Summit East talk by John La...Spark Summit
 
Accelerating Machine Learning and Deep Learning At Scale...With Apache Spark:...
Accelerating Machine Learning and Deep Learning At Scale...With Apache Spark:...Accelerating Machine Learning and Deep Learning At Scale...With Apache Spark:...
Accelerating Machine Learning and Deep Learning At Scale...With Apache Spark:...Spark Summit
 
Scalable Data Science with SparkR: Spark Summit East talk by Felix Cheung
Scalable Data Science with SparkR: Spark Summit East talk by Felix CheungScalable Data Science with SparkR: Spark Summit East talk by Felix Cheung
Scalable Data Science with SparkR: Spark Summit East talk by Felix CheungSpark Summit
 
Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli
Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli
Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli Spark Summit
 
Scaling Through Simplicity—How a 300 million User Chat App Reduced Data Engin...
Scaling Through Simplicity—How a 300 million User Chat App Reduced Data Engin...Scaling Through Simplicity—How a 300 million User Chat App Reduced Data Engin...
Scaling Through Simplicity—How a 300 million User Chat App Reduced Data Engin...Spark Summit
 
Apache Spark in Cloud and Hybrid: Why Security and Governance Become More Imp...
Apache Spark in Cloud and Hybrid: Why Security and Governance Become More Imp...Apache Spark in Cloud and Hybrid: Why Security and Governance Become More Imp...
Apache Spark in Cloud and Hybrid: Why Security and Governance Become More Imp...Spark Summit
 
Sparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan SharmaSparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan SharmaSpark Summit
 
Real-time Platform for Second Look Business Use Case Using Spark and Kafka: S...
Real-time Platform for Second Look Business Use Case Using Spark and Kafka: S...Real-time Platform for Second Look Business Use Case Using Spark and Kafka: S...
Real-time Platform for Second Look Business Use Case Using Spark and Kafka: S...Spark Summit
 
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Spark Summit
 
Artificial Intelligence: How Enterprises Can Crush It With Apache Spark: Keyn...
Artificial Intelligence: How Enterprises Can Crush It With Apache Spark: Keyn...Artificial Intelligence: How Enterprises Can Crush It With Apache Spark: Keyn...
Artificial Intelligence: How Enterprises Can Crush It With Apache Spark: Keyn...Spark Summit
 
Exceptions are the Norm: Dealing with Bad Actors in ETL
Exceptions are the Norm: Dealing with Bad Actors in ETLExceptions are the Norm: Dealing with Bad Actors in ETL
Exceptions are the Norm: Dealing with Bad Actors in ETLDatabricks
 
Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...
Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...
Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...Spark Summit
 
Building Real-Time BI Systems with Kafka, Spark, and Kudu: Spark Summit East ...
Building Real-Time BI Systems with Kafka, Spark, and Kudu: Spark Summit East ...Building Real-Time BI Systems with Kafka, Spark, and Kudu: Spark Summit East ...
Building Real-Time BI Systems with Kafka, Spark, and Kudu: Spark Summit East ...Spark Summit
 
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...Spark Summit
 
Sparkler—Crawler on Apache Spark: Spark Summit East talk by Karanjeet Singh a...
Sparkler—Crawler on Apache Spark: Spark Summit East talk by Karanjeet Singh a...Sparkler—Crawler on Apache Spark: Spark Summit East talk by Karanjeet Singh a...
Sparkler—Crawler on Apache Spark: Spark Summit East talk by Karanjeet Singh a...Spark Summit
 

Viewers also liked (20)

Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
 
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
 
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
Optimizing Spark Deployments for Containers: Isolation, Safety, and Performan...
 
Improving Python and Spark Performance and Interoperability: Spark Summit Eas...
Improving Python and Spark Performance and Interoperability: Spark Summit Eas...Improving Python and Spark Performance and Interoperability: Spark Summit Eas...
Improving Python and Spark Performance and Interoperability: Spark Summit Eas...
 
Kerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-Malla
Kerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-MallaKerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-Malla
Kerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-Malla
 
Unlocking Value in Device Data Using Spark: Spark Summit East talk by John La...
Unlocking Value in Device Data Using Spark: Spark Summit East talk by John La...Unlocking Value in Device Data Using Spark: Spark Summit East talk by John La...
Unlocking Value in Device Data Using Spark: Spark Summit East talk by John La...
 
Accelerating Machine Learning and Deep Learning At Scale...With Apache Spark:...
Accelerating Machine Learning and Deep Learning At Scale...With Apache Spark:...Accelerating Machine Learning and Deep Learning At Scale...With Apache Spark:...
Accelerating Machine Learning and Deep Learning At Scale...With Apache Spark:...
 
Scalable Data Science with SparkR: Spark Summit East talk by Felix Cheung
Scalable Data Science with SparkR: Spark Summit East talk by Felix CheungScalable Data Science with SparkR: Spark Summit East talk by Felix Cheung
Scalable Data Science with SparkR: Spark Summit East talk by Felix Cheung
 
Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli
Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli
Virtualizing Analytics with Apache Spark: Keynote by Arsalan Tavakoli
 
Scaling Through Simplicity—How a 300 million User Chat App Reduced Data Engin...
Scaling Through Simplicity—How a 300 million User Chat App Reduced Data Engin...Scaling Through Simplicity—How a 300 million User Chat App Reduced Data Engin...
Scaling Through Simplicity—How a 300 million User Chat App Reduced Data Engin...
 
Apache Spark in Cloud and Hybrid: Why Security and Governance Become More Imp...
Apache Spark in Cloud and Hybrid: Why Security and Governance Become More Imp...Apache Spark in Cloud and Hybrid: Why Security and Governance Become More Imp...
Apache Spark in Cloud and Hybrid: Why Security and Governance Become More Imp...
 
Sparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan SharmaSparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan Sharma
 
Real-time Platform for Second Look Business Use Case Using Spark and Kafka: S...
Real-time Platform for Second Look Business Use Case Using Spark and Kafka: S...Real-time Platform for Second Look Business Use Case Using Spark and Kafka: S...
Real-time Platform for Second Look Business Use Case Using Spark and Kafka: S...
 
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
 
Artificial Intelligence: How Enterprises Can Crush It With Apache Spark: Keyn...
Artificial Intelligence: How Enterprises Can Crush It With Apache Spark: Keyn...Artificial Intelligence: How Enterprises Can Crush It With Apache Spark: Keyn...
Artificial Intelligence: How Enterprises Can Crush It With Apache Spark: Keyn...
 
Exceptions are the Norm: Dealing with Bad Actors in ETL
Exceptions are the Norm: Dealing with Bad Actors in ETLExceptions are the Norm: Dealing with Bad Actors in ETL
Exceptions are the Norm: Dealing with Bad Actors in ETL
 
Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...
Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...
Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...
 
Building Real-Time BI Systems with Kafka, Spark, and Kudu: Spark Summit East ...
Building Real-Time BI Systems with Kafka, Spark, and Kudu: Spark Summit East ...Building Real-Time BI Systems with Kafka, Spark, and Kudu: Spark Summit East ...
Building Real-Time BI Systems with Kafka, Spark, and Kudu: Spark Summit East ...
 
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
 
Sparkler—Crawler on Apache Spark: Spark Summit East talk by Karanjeet Singh a...
Sparkler—Crawler on Apache Spark: Spark Summit East talk by Karanjeet Singh a...Sparkler—Crawler on Apache Spark: Spark Summit East talk by Karanjeet Singh a...
Sparkler—Crawler on Apache Spark: Spark Summit East talk by Karanjeet Singh a...
 

Similar to Auto Scaling Systems With Elastic Spark Streaming: Spark Summit East talk by PhuDuc Nguyen

Oracle super cluster for oracle e business suite
Oracle super cluster for oracle e business suiteOracle super cluster for oracle e business suite
Oracle super cluster for oracle e business suiteOTN Systems Hub
 
Alta Disponibilidade no MySQL 5.7
Alta Disponibilidade no MySQL 5.7Alta Disponibilidade no MySQL 5.7
Alta Disponibilidade no MySQL 5.7MySQL Brasil
 
How Workload Prioritization Reduces Your Datacenter Footprint
How Workload Prioritization Reduces Your Datacenter FootprintHow Workload Prioritization Reduces Your Datacenter Footprint
How Workload Prioritization Reduces Your Datacenter FootprintScyllaDB
 
GLOC Keynote 2014 - In-memory
GLOC Keynote 2014 - In-memoryGLOC Keynote 2014 - In-memory
GLOC Keynote 2014 - In-memoryConnor McDonald
 
Ground Breakers Romania: Oracle Autonomous Database
Ground Breakers Romania: Oracle Autonomous DatabaseGround Breakers Romania: Oracle Autonomous Database
Ground Breakers Romania: Oracle Autonomous DatabaseMaria Colgan
 
Anna Vergeles, Nataliia Manakova "Unsupervised Real-Time Stream-Based Novelty...
Anna Vergeles, Nataliia Manakova "Unsupervised Real-Time Stream-Based Novelty...Anna Vergeles, Nataliia Manakova "Unsupervised Real-Time Stream-Based Novelty...
Anna Vergeles, Nataliia Manakova "Unsupervised Real-Time Stream-Based Novelty...Fwdays
 
Oracle Management Cloud - HybridCloud Café - May 2016
Oracle Management Cloud - HybridCloud Café - May 2016Oracle Management Cloud - HybridCloud Café - May 2016
Oracle Management Cloud - HybridCloud Café - May 2016Bastien Leblanc
 
Oracle Cloud Café hybrid Cloud 19 mai 2016
Oracle Cloud Café hybrid Cloud 19 mai 2016Oracle Cloud Café hybrid Cloud 19 mai 2016
Oracle Cloud Café hybrid Cloud 19 mai 2016Sorathaya Sirimanotham
 
Meetup Oracle Database MAD_BCN: 1.2 Oracle Database 18c (autonomous database)
Meetup Oracle Database MAD_BCN: 1.2 Oracle Database 18c (autonomous database)Meetup Oracle Database MAD_BCN: 1.2 Oracle Database 18c (autonomous database)
Meetup Oracle Database MAD_BCN: 1.2 Oracle Database 18c (autonomous database)avanttic Consultoría Tecnológica
 
Oracle RAC 12c Rel. 2 for Continuous Availability
Oracle RAC 12c Rel. 2 for Continuous AvailabilityOracle RAC 12c Rel. 2 for Continuous Availability
Oracle RAC 12c Rel. 2 for Continuous AvailabilityMarkus Michalewicz
 
Disaster Recovery for Multi-Region Apache Kafka Ecosystems at Uber
Disaster Recovery for Multi-Region Apache Kafka Ecosystems at UberDisaster Recovery for Multi-Region Apache Kafka Ecosystems at Uber
Disaster Recovery for Multi-Region Apache Kafka Ecosystems at Uberconfluent
 
Key considerations in productionizing streaming applications
Key considerations in productionizing streaming applicationsKey considerations in productionizing streaming applications
Key considerations in productionizing streaming applicationsKafkaZone
 
Exadata x4 for_sap
Exadata x4 for_sapExadata x4 for_sap
Exadata x4 for_sapFran Navarro
 
Oracle ADF Architecture TV - Development - Performance & Tuning
Oracle ADF Architecture TV - Development - Performance & TuningOracle ADF Architecture TV - Development - Performance & Tuning
Oracle ADF Architecture TV - Development - Performance & TuningChris Muir
 
A5 oracle exadata-the game changer for online transaction processing data w...
A5   oracle exadata-the game changer for online transaction processing data w...A5   oracle exadata-the game changer for online transaction processing data w...
A5 oracle exadata-the game changer for online transaction processing data w...Dr. Wilfred Lin (Ph.D.)
 
CON6492 - Oracle Database Public Cloud Services v1 1
CON6492 - Oracle Database Public Cloud Services v1 1CON6492 - Oracle Database Public Cloud Services v1 1
CON6492 - Oracle Database Public Cloud Services v1 1David van Schalkwyk
 
Netherlands Tech Tour 02 - MySQL Fabric
Netherlands Tech Tour 02 -   MySQL FabricNetherlands Tech Tour 02 -   MySQL Fabric
Netherlands Tech Tour 02 - MySQL FabricMark Swarbrick
 
Performance tuning intro
Performance tuning introPerformance tuning intro
Performance tuning introaioughydchapter
 

Similar to Auto Scaling Systems With Elastic Spark Streaming: Spark Summit East talk by PhuDuc Nguyen (20)

Oracle super cluster for oracle e business suite
Oracle super cluster for oracle e business suiteOracle super cluster for oracle e business suite
Oracle super cluster for oracle e business suite
 
Alta Disponibilidade no MySQL 5.7
Alta Disponibilidade no MySQL 5.7Alta Disponibilidade no MySQL 5.7
Alta Disponibilidade no MySQL 5.7
 
Enterprise manager 13c
Enterprise manager 13cEnterprise manager 13c
Enterprise manager 13c
 
How Workload Prioritization Reduces Your Datacenter Footprint
How Workload Prioritization Reduces Your Datacenter FootprintHow Workload Prioritization Reduces Your Datacenter Footprint
How Workload Prioritization Reduces Your Datacenter Footprint
 
GLOC Keynote 2014 - In-memory
GLOC Keynote 2014 - In-memoryGLOC Keynote 2014 - In-memory
GLOC Keynote 2014 - In-memory
 
Ground Breakers Romania: Oracle Autonomous Database
Ground Breakers Romania: Oracle Autonomous DatabaseGround Breakers Romania: Oracle Autonomous Database
Ground Breakers Romania: Oracle Autonomous Database
 
Anna Vergeles, Nataliia Manakova "Unsupervised Real-Time Stream-Based Novelty...
Anna Vergeles, Nataliia Manakova "Unsupervised Real-Time Stream-Based Novelty...Anna Vergeles, Nataliia Manakova "Unsupervised Real-Time Stream-Based Novelty...
Anna Vergeles, Nataliia Manakova "Unsupervised Real-Time Stream-Based Novelty...
 
Oracle Management Cloud - HybridCloud Café - May 2016
Oracle Management Cloud - HybridCloud Café - May 2016Oracle Management Cloud - HybridCloud Café - May 2016
Oracle Management Cloud - HybridCloud Café - May 2016
 
Oracle Cloud Café hybrid Cloud 19 mai 2016
Oracle Cloud Café hybrid Cloud 19 mai 2016Oracle Cloud Café hybrid Cloud 19 mai 2016
Oracle Cloud Café hybrid Cloud 19 mai 2016
 
Meetup Oracle Database MAD_BCN: 1.2 Oracle Database 18c (autonomous database)
Meetup Oracle Database MAD_BCN: 1.2 Oracle Database 18c (autonomous database)Meetup Oracle Database MAD_BCN: 1.2 Oracle Database 18c (autonomous database)
Meetup Oracle Database MAD_BCN: 1.2 Oracle Database 18c (autonomous database)
 
Oracle RAC 12c Rel. 2 for Continuous Availability
Oracle RAC 12c Rel. 2 for Continuous AvailabilityOracle RAC 12c Rel. 2 for Continuous Availability
Oracle RAC 12c Rel. 2 for Continuous Availability
 
Disaster Recovery for Multi-Region Apache Kafka Ecosystems at Uber
Disaster Recovery for Multi-Region Apache Kafka Ecosystems at UberDisaster Recovery for Multi-Region Apache Kafka Ecosystems at Uber
Disaster Recovery for Multi-Region Apache Kafka Ecosystems at Uber
 
Key considerations in productionizing streaming applications
Key considerations in productionizing streaming applicationsKey considerations in productionizing streaming applications
Key considerations in productionizing streaming applications
 
Exadata x4 for_sap
Exadata x4 for_sapExadata x4 for_sap
Exadata x4 for_sap
 
Oracle ADF Architecture TV - Development - Performance & Tuning
Oracle ADF Architecture TV - Development - Performance & TuningOracle ADF Architecture TV - Development - Performance & Tuning
Oracle ADF Architecture TV - Development - Performance & Tuning
 
A5 oracle exadata-the game changer for online transaction processing data w...
A5   oracle exadata-the game changer for online transaction processing data w...A5   oracle exadata-the game changer for online transaction processing data w...
A5 oracle exadata-the game changer for online transaction processing data w...
 
CON6492 - Oracle Database Public Cloud Services v1 1
CON6492 - Oracle Database Public Cloud Services v1 1CON6492 - Oracle Database Public Cloud Services v1 1
CON6492 - Oracle Database Public Cloud Services v1 1
 
AWR, ASH with EM13 at HotSos 2016
AWR, ASH with EM13 at HotSos 2016AWR, ASH with EM13 at HotSos 2016
AWR, ASH with EM13 at HotSos 2016
 
Netherlands Tech Tour 02 - MySQL Fabric
Netherlands Tech Tour 02 -   MySQL FabricNetherlands Tech Tour 02 -   MySQL Fabric
Netherlands Tech Tour 02 - MySQL Fabric
 
Performance tuning intro
Performance tuning introPerformance tuning intro
Performance tuning intro
 

More from Spark Summit

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang Spark Summit
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...Spark Summit
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang WuSpark Summit
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya RaghavendraSpark Summit
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...Spark Summit
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingSpark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingSpark Summit
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...Spark Summit
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakSpark Summit
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimSpark Summit
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraSpark Summit
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Spark Summit
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...Spark Summit
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spark Summit
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovSpark Summit
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Spark Summit
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkSpark Summit
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Spark Summit
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...Spark Summit
 

More from Spark Summit (20)

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
 

Recently uploaded

Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Guido X Jansen
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionajayrajaganeshkayala
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxDwiAyuSitiHartinah
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxVenkatasubramani13
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best PracticesDataArchiva
 
MEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptMEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptaigil2
 
AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)Data & Analytics Magazin
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationGiorgio Carbone
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introductionsanjaymuralee1
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...PrithaVashisht1
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?sonikadigital1
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerPavel Šabatka
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructuresonikadigital1
 
SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024Becky Burwell
 
YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.JasonViviers2
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Vladislav Solodkiy
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityAggregage
 

Recently uploaded (17)

Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual intervention
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptx
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices
 
MEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptMEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .ppt
 
AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - Presentation
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introduction
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayer
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructure
 
SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024
 
YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
 

Auto Scaling Systems With Elastic Spark Streaming: Spark Summit East talk by PhuDuc Nguyen

  • 1. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 1 Auto Scaling Systems With Elastic Spark Streaming PhuDuc Nguyen Consulting Engineer @ Oracle Data Cloud
  • 2. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 2 Auto Scaling Systems With Elastic Spark Streaming Problem - streaming jobs with variable traffic pattern ● Our traffic pattern has natural peaks and valleys. ● We also experience unpredictable traffic spikes. ○ Breaking world news events - eg Brexit. ○ Some of our partners run batch jobs and send data at unknown times.
  • 3. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 3 Auto Scaling Systems With Elastic Spark Streaming Various scenarios that result in “catch up mode” ● Streaming job has fallen behind real-time. In order to catch up, throughput must be > 1x to process backlog + real-time. ● Catch up from failure: hardware, software, network, etc. Data doesn’t stop just because your streaming job has. Deal with it. Be ready to catch up. ● Catch up from replay: pushing historical data back through the pipeline to fix a bug or apply new feature/transformation retroactively. ● Unforeseen spikes in traffic. ● Underestimating the size of a streaming job.
  • 4. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 4 Auto Scaling Systems With Elastic Spark Streaming Problems aren’t new - can use traditional methods to cope… ● Statically size the cluster/job to max servers needed to handle peak rate + some percentage for extra wiggle room. ○ Works but wastes resources when not at peak traffic; larger difference between peaks/valleys = larger waste when idle. ● Manually restart job adjusting spark.cores.max=n ○ Manually babysitting cluster = pain. No one wants to wake up at 3am to grow or shrink a cluster.
  • 5. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 5 Auto Scaling Systems With Elastic Spark Streaming What about backpressure? ● Async consumer signalling upstream producer to slow down. Reactive Manifesto is brilliantly simple, elegant, and powerful API. ● Impl in Spark Streaming is effective and works very well. ● Can use “enough” servers and allow backpressure to kick in during peaks/spikes and will catch up later during valley. ○ But can result in falling behind or a lag in real-time data during backpressure. If job is required to maintain real-time, buffering upstream may not be an option. ○ Once caught up, during traffic valley, you’ll likely have idle/wasted resources. ○ Prolonged sustained spikes > than “normal peak” for many days may exceed your upstream buffer - eg Brexit caused a very large spike in our traffic for about 10 days. Our Kafka retention window is only 5 days. ○ Ultimately, max throughput is capped by the amount of servers it has...and sometimes we simply need more servers.
  • 6. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 6 Auto Scaling Systems With Elastic Spark Streaming A solution: Elastic Spark Streaming ● Auto scale - ability to add/remove servers dynamically without restarting the stream. ● Streaming job must automatically adjust to the demands of traffic. ● Maximize resource utilization - use servers when you need them, release them when you don’t. Don’t pay for idle resources.
  • 7. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 7 Auto Scaling Systems With Elastic Spark Streaming
  • 8. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 8 Auto Scaling Systems With Elastic Spark Streaming
  • 9. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 9 Auto Scaling Systems With Elastic Spark Streaming
  • 10. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 10 Auto Scaling Systems With Elastic Spark Streaming Implementation ● We collect “processing %” = totalProcessingTime / microbatchInterval ● If “processing %” > 100%, that microbatch fell behind ● If “processing %” < 100%, that microbatch is stable ● Don’t act on a single datapoint. Collect a sliding window and act on trend. ● Careful to avoid creating a system that constantly scales up and down within seconds or minutes - appears as if the streaming job is indecisive and thrashing/flapping. ● Each streaming job has a unique workload, traffic pattern, and micro batch interval and thus requires tunable/configurable thresholds. ○ Max threshold default is 100%; trigger scale up after N datapoints have median > max threshold. ○ Min threshold default is 75%; trigger scale down after N datapoints have medan < min threshold and queued batches are empty. ○ N is configurable. ○ Scale up and down by configurable percentage of current servers. ● sparkContext.requestExecutors(count) and sparkContext.killExecutor(executorId).
  • 11. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 11 Auto Scaling Systems With Elastic Spark Streaming Testing Elastic Streams ● Invariant - elasticity affects performance and resource utilization, it cannot impact data integrity. ● Impl random scaling - WTF why?!?! Before using any heuristics to trigger elasticity, we want to prove that no matter how good/bad your heuristics are, elasticity can only affect performance. Even with the worst heuristics, we want to prove that no data loss or corruption occurs. Random scaling intentionally forces the streaming job into a state of thrashing/flapping. We can observe the performance is much worse than no scaling at all. ● Single command in testing project does the following… ○ Build new mesos cluster. ○ Use 1 static docker image that contains all input/test data. ○ Start 2 spark streaming jobs that ingest same dataset and share same processing logic: one job runs with no-scaling, other job randomly scales and thrashes. ○ Start last spark job that ingests output from other 2 jobs, perform distinct and diff against the datasets for every record and field. They must be equivalent otherwise you’ve violated the invariant. ○ Tear down mesos cluster.
  • 12. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 12 Auto Scaling Systems With Elastic Spark Streaming Where do the spark jobs add servers from and release to? ● Deploy our streaming jobs on mesos cluster and launched via marathon. ● Each streaming job ○ Expands/contracts independently from one another - it has no knowledge of its neighbor. ○ Can gain more servers from the mesos cluster and releases them back to the mesos cluster on-demand. ● A lightweight app monitors used/unused resources on mesos. If the mesos cluster needs to grow, the app automatically triggers the automation to add mesos workers from the cloud. As the mesos cluster needs to shrink, the app automatically releases the servers back to the cloud. Like nested balloons that can expand/contract on demand...
  • 13. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 13 Auto Scaling Systems With Elastic Spark Streaming Metrics ● A single ingest streaming job ingests over 5TB/day. ● Output is fed into many other streaming jobs (over 30). Results of processing generates > 2x ingest rate. ● At peak times, single ingest job processes > 130k events/second. ● Roughly 500-800 servers in the cloud for our team alone - elasticity changes this number frequently. ● Our team alone spends $250k/month = $3mln/year on infrastructure. ● 8 engineers are responsible for all of it: infrastructure, development, testing.
  • 14. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 14 Auto Scaling Systems With Elastic Spark Streaming Impact of Elastic Spark Streaming ● Operational gains ○ Minimize manual developer intervention ○ Devs focus on features not babysitting ● Financial impact ○ Maximize resource utilization ○ If your cloud bill is in the millions, your savings can be in the millions!
  • 15. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 15 Auto Scaling Systems With Elastic Spark Streaming Thanks for listening. Q&A?