Benchmarking Elastic Cloud Big Data
Services under SLA Constraints
Nicolas Poggi, Victor Cuevas-Vicenttín, David Carrera, Josep Lluis Berral, Thomas Fenech,
Gonzalo Gomez, Davide Brini, Alejandro Montero
Umar Farooq Minhas, Jose A. Blakeley, Donald Kossmann, Raghu Ramakrishnan and
Clemens Szyperski.
TPCTC - August 2019
Outline
1. Intro to TPCx-BB
a. Limitations for cloud systems
b. Contributions
2. Realistic workload generation
a. Production datasets
b. Job arrival rates
3. Elasticity Test
a. Current metric
b. SLA-based addition
4. Experimental evaluation
a. Elasticity Test
b. Load, Power, Throughput tests
c. Metric evaluation
5. Conclusions
a. Future directions
Benchmarking and TPCx-BB
• Benchmarks capture the solution to a problem and guide decisions.
• Widely used in development, configuration, and testing.
• TPCx-BB (BigBench) is the first standardized big data benchmark
• Collaboration between industry and academia
• Follows the retailer model of TPC-DS
• Adds:
• Semi-structured and unstructured data
• SQL, UDF, ML, and NLP queries
Retailer data model
TPCx-BB benchmark workflow
• Similar to previous TPC database benchmarks:
• Load Test (TLD):
• Generates the DB
• Imports raw data and creates the metastore, stats, and columnar format
• Power Test (TPT)
• Runs queries sequentially
• Throughput Test (TTT)
• Runs queries concurrently
• Includes a data refresh stage
• Produces a final performance metric
• BB queries per minute
[Workflow diagram]
Load Test: load data into DB @ SF
Power Test: Seq q1 … q30
Throughput Test: User1 q15 q21 … q16
User2 q12 q18 … q2
UserN …
Result: Metric
Limitations of the concurrency test
Drawback 1:
• Constant concurrency workloads
at the same scale
Drawback 2:
• Does not consider QoS (isolation)
• Query time degradation is not obvious
from the final metric
• We found poor scalability under
concurrency in BB [1]
Stream1 q15 q21 … q16
Stream2 q12 q18 … q2
Stream3 q16 q30 … q19
…
[1] Characterizing BigBench queries, Hive, and Spark in multi-cloud environments TPCTC'17
Q4 from 10 to 100GB: over 15X slower
Proposal and contributions
1. Build a realistic big data workload generator
• Based on production workloads
2. Measure QoS in the form of per-query SLAs
• Apply the results in a new metric
• With minimal parameters
3. Extend TPCx-BB with a new concurrency test and metric
• Implement a driver and evaluate differences
Realistic workload generation
Analyzing production big data workloads
• Cosmos cluster operated within Microsoft
• Sample of 350,000 job submissions
• Over a month of data in 2017
• Objectives:
1. Model job submission patterns
2. Workload characterization
[Figure: job submissions over time, with peaks and valleys marked]
Modeling arrival rates
• Use a Hidden Markov Model (HMM) to
model temporal patterns in the workload
• Transition probabilities between a finite number of states
• The HMM allows scaling the workload
• Fluctuations are captured by 4 states and the transitions between them
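A minimal sketch of how a 4-state model of this kind could drive arrivals per interval. The state names, transition probabilities, and per-state emission ranges below are hypothetical placeholders, not values fitted to the Cosmos trace; `max_concurrency` plays the role of the scaling knob mentioned above.

```python
import random

# Hypothetical 4-state model: names and numbers are illustrative only,
# they are not fitted to any production trace.
STATES = ["valley", "low", "high", "peak"]
TRANSITIONS = {
    "valley": [0.70, 0.25, 0.05, 0.00],
    "low":    [0.20, 0.55, 0.20, 0.05],
    "high":   [0.05, 0.20, 0.55, 0.20],
    "peak":   [0.00, 0.05, 0.25, 0.70],
}
# Fraction of the maximum concurrency emitted while in each state (assumed).
EMISSION_RANGE = {
    "valley": (0.00, 0.25),
    "low":    (0.25, 0.50),
    "high":   (0.50, 0.75),
    "peak":   (0.75, 1.00),
}

def generate_arrivals(intervals, max_concurrency, seed=0):
    """Return the number of job submissions for each time interval."""
    rng = random.Random(seed)
    state = "valley"
    arrivals = []
    for _ in range(intervals):
        lo, hi = EMISSION_RANGE[state]
        # Observed value: submissions drawn from the current state's range,
        # scaled by the requested maximum concurrency.
        arrivals.append(round(rng.uniform(lo, hi) * max_concurrency))
        # Hidden state moves according to the transition probabilities.
        state = rng.choices(STATES, weights=TRANSITIONS[state])[0]
    return arrivals

arrivals = generate_arrivals(intervals=12, max_concurrency=8, seed=42)
print(arrivals)
```

Changing `max_concurrency` rescales the whole sequence while keeping the same peak and valley pattern for a given seed.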
Job input data size
• No general temporal pattern was found
• The cumulative distribution is sufficient for
modeling SF
• The CDF is used to generate random
variates mapped to SF
• 1, 10, 100, 1000 GB
• Studied further in [2]
• Findings:
• 55% < 1GB
• 90% < 1TB
CDF of the job’s input data size
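Mapping random variates to scale factors through a CDF can be sketched with inverse transform sampling. Only the 55% and 90% points come from the findings above; the intermediate 0.75 value and the bucketing of job sizes into the four scale factors are assumptions made for illustration.

```python
import bisect
import random

SCALE_FACTORS_GB = [1, 10, 100, 1000]
# Cumulative probability up to each scale factor.
# 0.55 (jobs under 1GB) and 0.90 (jobs under 1TB) are from the slide;
# 0.75 is a made-up intermediate point.
CDF = [0.55, 0.75, 0.90, 1.00]

def sample_scale_factor(rng):
    """Draw a uniform variate and map it to a scale factor via the CDF."""
    u = rng.random()  # uniform in [0, 1)
    return SCALE_FACTORS_GB[bisect.bisect_right(CDF, u)]

rng = random.Random(7)
samples = [sample_scale_factor(rng) for _ in range(10_000)]
share_sf1 = samples.count(1) / len(samples)
print(f"share of SF 1: {share_sf1:.2f}")
```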
[2] Big Data Data Management Systems performance analysis using Aloja and BigBench. Master thesis
Elasticity Test
Methodology for generating workloads
1. Set scale (max concurrent submissions)
• Defaults to n (number of streams)
• Total queries = n × queries per stream
2. Generate model (queries per interval)
a. Assign queries to each batch randomly
• Query repetition avoided within a batch
b. Multiple scale factors can be set
• Include all standard smaller SF
3. Define granularity
a. Set time between batches
b. Defaults to 60 s
t1 q17
t2 q7
t3 q15 q21
t4 q6 q9 q14
t5 q9 q14
t6 q11 q22 q21
t7 q16 q15
t8 q24
…
Elasticity Test sequence (figure axes: time intervals, # queries / batch)
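The batch assignment step could look roughly like the sketch below. It is not the benchmark driver: `build_batches`, its arguments, and the handling of leftover queries are hypothetical. The batch sizes mirror the t1 to t8 example, and the 30 query IDs and 2 streams are illustrative inputs.

```python
import random
from collections import Counter

def build_batches(batch_sizes, query_ids, n_streams, seed=0):
    """Assign queries to batches at random, never repeating a query
    inside a batch; each query is scheduled at most n_streams times."""
    rng = random.Random(seed)
    remaining = Counter({q: n_streams for q in query_ids})
    batches = []
    for size in batch_sizes:
        # Only queries that still have executions left are candidates.
        available = sorted(q for q, left in remaining.items() if left > 0)
        # sample() picks distinct items, so a batch has no repeats.
        batch = rng.sample(available, min(size, len(available)))
        for q in batch:
            remaining[q] -= 1
        batches.append(batch)
    return batches

batch_sizes = [1, 1, 2, 3, 2, 3, 2, 1]   # queries per interval, t1..t8
batches = build_batches(batch_sizes, range(1, 31), n_streams=2, seed=3)

interval_s = 60  # time between batches (the default granularity)
schedule = [(i * interval_s, batch) for i, batch in enumerate(batches)]
for start, batch in schedule:
    print(start, ["q%d" % q for q in batch])
```

Batch sizes that sum to less than n × queries per stream leave some queries unscheduled; a real driver would keep generating intervals until the pool is empty.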
New SLA-aware benchmark metric
• Query-specific SLAs on concurrency
• Sets a limit for query completion time
• Measures
• Number of misses
• Distance to SLA
• Indirectly, isolation and dependencies
• Currently defined ad hoc
• Uses Power Test times for the SUT(s)
• Adds a 25% margin tolerance
• Benefits
• Works on all SF and is future-proof to technology changes
Example:
q1 took 38 s in isolation
SLA for q1 = 38 s × 1.25 = 47.5 s
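The example can be reproduced in a couple of lines. The function names are hypothetical; the 25% margin and the 38 s Power Test time are the values given above.

```python
def sla_limit(power_test_seconds, margin=0.25):
    """SLA = time of the query in isolation plus a tolerance margin."""
    return power_test_seconds * (1 + margin)

def misses_sla(elapsed_seconds, power_test_seconds, margin=0.25):
    """True when a query run under concurrency exceeds its SLA."""
    return elapsed_seconds > sla_limit(power_test_seconds, margin)

q1_sla = sla_limit(38.0)
print(q1_sla)                   # 47.5
print(misses_sla(45.0, 38.0))   # False: 45 s is within the 47.5 s limit
print(misses_sla(60.0, 38.0))   # True
```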
Current TPCx-BB performance metric
[Formula shown as an image on the slide: BBQpm@SF. Annotated terms: scale factor; total number of queries]
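The formula image did not survive in this transcript, so the sketch below follows my reading of the TPCx-BB specification rather than the slide: BBQpm@SF = SF × 60 × M / (T_LD + sqrt(T_PT × T_TT)), with T_LD = 0.1 × load time and T_TT = throughput time / number of streams. As a simplifying assumption, T_PT is approximated by the total Power Test time (the specification uses M times the geometric mean of the query times). The inputs are the 10TB times reported in the results slides of this deck, with SF taken as 10,000.

```python
import math

def bbqpm(sf, num_queries, load_s, power_s, throughput_s, streams):
    """Approximate BBQpm@SF from test times in seconds (see assumptions)."""
    t_ld = 0.1 * load_s            # load time is weighted at 10%
    t_pt = power_s                 # approximation of M * geometric mean
    t_tt = throughput_s / streams  # throughput time normalized per stream
    return (sf * 60 * num_queries) / (t_ld + math.sqrt(t_pt * t_tt))

hive = bbqpm(sf=10_000, num_queries=14, load_s=5_124,
             power_s=5_036, throughput_s=12_878, streams=2)
spark = bbqpm(sf=10_000, num_queries=14, load_s=5_124,
              power_s=5_520, throughput_s=6_496, streams=2)
print(round(hive), round(spark))
```

With these inputs the results land within about 1% of the BBQpm scores reported later in the deck (1,352 for Hive and 1,767 for Spark), which suggests the approximation is reasonable for these runs.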
New SLA-aware benchmark metric
[Formula shown as an image on the slide: BB++. Annotated terms: interval between each batch of queries; SLA distance; SLA factor; total execution time of the elasticity test]
SLA distance
• Distance between the actual execution time and the specified SLA
• Queries that complete within their SLA do not contribute to the sum
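One way to read this definition is sketched below. Expressing each miss relative to its SLA and averaging over all queries are assumptions, since the exact formula is not recoverable from the transcript; only the rule that queries within their SLA contribute nothing comes from the slide.

```python
def sla_distance(runs):
    """runs: list of (elapsed_seconds, sla_seconds) pairs.

    Queries within their SLA contribute 0; a miss contributes its
    overshoot relative to the SLA (assumed normalization).
    """
    total = sum(max(0.0, (elapsed - sla) / sla) for elapsed, sla in runs)
    return total / len(runs)

# Hypothetical timings: only the second query misses, by 100% of its SLA.
runs = [(40.0, 47.5), (95.0, 47.5), (30.0, 40.0)]
distance = sla_distance(runs)
print(round(distance, 3))
```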
SLA factor
• < 1 when less than 25% of the queries fail their SLA
• > 1 when more than 25% of the queries fail their SLA
• Based on the number of queries that fail to meet their SLA
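A ratio with the behavior described here can be sketched as the share of failed queries divided by the tolerated 25%. This normalization is inferred from the description (below 1 under 25% of misses, above 1 over 25%) and is not copied from the slide's formula.

```python
def sla_factor(runs, tolerated_fraction=0.25):
    """runs: list of (elapsed_seconds, sla_seconds) pairs."""
    failed = sum(1 for elapsed, sla in runs if elapsed > sla)
    return failed / (tolerated_fraction * len(runs))

# Hypothetical runs of 8 queries with a 10 s SLA each.
one_miss = [(12.0, 10.0)] + [(8.0, 10.0)] * 7
four_misses = [(12.0, 10.0)] * 4 + [(8.0, 10.0)] * 4
print(sla_factor(one_miss))     # 0.5: 12.5% of queries failed
print(sla_factor(four_misses))  # 2.0: 50% of queries failed
```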
Experimental evaluation
• Experiments performed on Apache Hive (2.2/2.3) and Spark (2.1/2.2)
• Benchmark runs limited to the 14 SQL queries of TPCx-BB
• Due to errors and scalability limitations
• Using a fixed scale factor
• Total of 512 cores and 2TB of RAM
• 32 workers, each with 16 vCPUs and 64GB RAM
• Ran on 3 major cloud providers using block storage
• Results anonymized
• (Only results for Provider A at 10TB are presented)
Elasticity Test at 10TB and 2 streams
[Figure panels: Provider A: Hive; Provider A: Spark]
Complete TPCx-BB test times at 10TB
Test                   Provider A: Hive   Provider A: Spark
Load Time (s)          5,124              5,124
Power Time (s)         5,036              5,520
Throughput Time (s)    12,878             6,496
Elasticity Time (s)    7,084              6,603
Total Time (s)         30,122             23,743
Comparison of the two scores at 10TB
System                 BBQpm (old)   BB++Qpm (new)
Provider A: Hive       1,352         295
Provider A: Spark      1,767         1,286
• Hive gets 4.3x lower score in the new metric
• 30% diff
• Spark also gets a lower score
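The annotations on this slide can be checked against the reported scores. In the sketch below, reading "30% diff" as the Spark-to-Hive gap under BBQpm and "4.3x" as the same gap under BB++Qpm is an interpretation that matches the numbers, not something the slide states explicitly.

```python
scores = {
    "Hive":  {"BBQpm": 1352, "BB++Qpm": 295},
    "Spark": {"BBQpm": 1767, "BB++Qpm": 1286},
}

# Gap between the two systems under each metric.
old_gap = scores["Spark"]["BBQpm"] / scores["Hive"]["BBQpm"]
new_gap = scores["Spark"]["BB++Qpm"] / scores["Hive"]["BB++Qpm"]

# Drop of each system when moving from the old to the new metric.
hive_drop = scores["Hive"]["BBQpm"] / scores["Hive"]["BB++Qpm"]
spark_drop = scores["Spark"]["BBQpm"] / scores["Spark"]["BB++Qpm"]

print(f"old gap {old_gap:.2f}x, new gap {new_gap:.2f}x")
print(f"Hive drop {hive_drop:.2f}x, Spark drop {spark_drop:.2f}x")
```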
Summary and future directions
Summary
• The throughput test under TPC DB benchmarks provides limited signal
• Closed loop system (constant load)
• Does not consider temporal patterns
• Limited test of load balancers and schedulers (no queueing)
• Modeling a real-world big data cluster we have produced:
• A workload generator with job arrival rates
• Multi-data-scales test
• Extended TPCx-BB with the Elasticity Test
• Incorporating SLAs and proposing a new metric
• Evaluated its applicability to cloud big data systems
• And how scores differ from the current metric
Conclusions and future work
• The Elasticity Test considers aspects crucial for the cloud
• Dynamic workloads in accordance with real-world behavior
• QoS at the query level (isolation)
• The ET can improve the development of elastic cloud systems
• By rewarding systems that can keep QoS under concurrency
• While saving costs in periods of low intensity
Future directions
• Test elastic DBaaS / QaaS under concurrency
• Specification of SLAs needs to be studied further
• Work with this community and gather feedback and next steps
Thanks, questions?
Follow up / feedback : Npoggi@ac.upc.edu
Extra slides
Elasticity Test at 1TB Hive: Prov A and B
SLA tester (sample)
Sample total queries and arrivals
Workload parameters:
• 10 TB scale factor
• 2 streams of 14 SQL queries
• total of 28 queries
• λbatch = 240 sec (4 min)
Experiments at 100GB with 8 streams (112 total queries)
[Figure panels: fast system; slow system showing queueing and degraded performance]

꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 

Recently uploaded (20)

Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 

Benchmarking Elastic Cloud Big Data Services under SLA Constraints

  • 1. Benchmarking Elastic Cloud Big Data Services under SLA Constraints Nicolas Poggi, Victor Cuevas-Vicenttín, David Carrera, Josep Lluis Berral, Thomas Fenech, Gonzalo Gomez, Davide Brini, Alejandro Montero Umar Farooq Minhas, Jose A. Blakeley, Donald Kossmann, Raghu Ramakrishnan and Clemens Szyperski. TPCTC - August 2019
  • 2. Outline 1. Intro to TPCx-BB a. Limitations for cloud systems b. Contributions 2. Realistic workload generation a. Production datasets b. Job arrival rates 3. Elasticity Test a. Current metric b. SLA-based addition 4. Experimental evaluation a. Elasticity Test b. Load, Power, Throughput tests c. Metric evaluation 5. Conclusions a. Future directions
  • 3. Benchmarking and TPCx-BB • Benchmarks capture the solution to a problem and guide decisions. • Widely used in development, configuration, and testing. • TPCx-BB (BigBench) is the first standardized big data benchmark • Collaboration between industry and academia • Follows the retailer model of TPC-DS • Adds: • Semi and unstructured data • SQL, UDF, ML, and NLP queries Retailer data model
  • 4. TPCx-BB benchmark workflow • Similar to previous TPC database benchmarks: • Load Test (TLD): • Generates the DB • imports raw data, metastore, stats, columnar • Power Test (TPT) • Runs queries sequentially • Throughput Test (TTT) • Runs queries concurrently • Includes a data refresh stage • Produces a final performance metric • BB queries per minute DB @ SF Load data Seq q1 … q30 User1 q15 q21 … q16 User2 q12 q18 … q2 UserN … Metric
  • 5. Limitations of the concurrency test Drawback 1: • Constant concurrency workloads at the same scale Drawback 2: • Does not consider QoS (isolation) • Query time degradation is not obvious from the final metric • We found poor scalability under concurrency in BB [1] • Example: Q4 from 10 to 100GB is over 15X slower (Diagram: Stream1 q15 q21 … q16; Stream2 q12 q18 … q2; Stream3 q16 q30 … q19; …) [1] Characterizing BigBench queries, Hive, and Spark in multi-cloud environments, TPCTC'17
  • 6. Proposal and contributions 1. Build a realistic big data workload generator • Based on production workloads 2. Measure QoS in the form of per-query SLAs • Apply the results in a new metric • With minimal parameters 3. Extend TPCx-BB with a new concurrency test and metric • Implement a driver and evaluate differences
  • 8. Analyzing production big data workloads • Cosmos cluster operated within Microsoft • Sample of 350,000 job submissions • Over a month of data in 2017 • Objectives: 1. Model job submission patterns 2. Workload characterization Peaks Valleys
  • 10. Modeling arrival rates • Use Hidden Markov Model (HMM) to model temporal pattern in the workload • Probabilities between finite number of states • HMM allows scaling the workload Fluctuations are captured by 4 states and the transitions between them Peaks Valleys
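A minimal sketch of how a four-state model of this kind could generate job arrivals per interval. The transition matrix and the per-state mean arrivals are invented for illustration and are not the values fitted to the Cosmos trace; the Poisson emission and the `scale` multiplier are likewise assumptions about how the model might be scaled.

```python
import math
import random

# Hypothetical 4-state model (valley, low, high, peak); values are illustrative only.
TRANSITIONS = [
    [0.70, 0.25, 0.05, 0.00],
    [0.20, 0.55, 0.20, 0.05],
    [0.05, 0.20, 0.55, 0.20],
    [0.00, 0.10, 0.30, 0.60],
]
MEAN_ARRIVALS = [0.5, 1.5, 3.0, 6.0]  # assumed mean jobs per interval in each state


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_arrivals(intervals, scale=1.0, seed=0):
    """Return the number of job submissions in each simulated interval."""
    rng = random.Random(seed)
    state = 0  # start in the valley state
    arrivals = []
    for _ in range(intervals):
        # Emit arrivals for the current hidden state, then move to the next state.
        arrivals.append(poisson(rng, MEAN_ARRIVALS[state] * scale))
        state = rng.choices(range(4), weights=TRANSITIONS[state])[0]
    return arrivals
```

Raising `scale` multiplies the expected arrivals in every state while keeping the same peak-and-valley pattern, which is one way to read "HMM allows scaling the workload".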
  • 11. Job input data size • As no general temporal pattern was found, a cumulative distribution is sufficient for modeling SF • CDF used to generate random variates mapped to SF • 1, 10, 100, 1000 GB • Studied further in [2] • Findings: • 55% < 1GB • 90% < 1TB (Figure: CDF of the job's input data size) [2] Big Data Data Management Systems performance analysis using Aloja and BigBench. Master thesis
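Drawing scale factors from a CDF can be sketched with inverse-transform sampling. Only the 55% and 90% points come from the slide; the 0.75 breakpoint and the mapping of each size bucket to a scale factor are assumptions made for this example.

```python
import bisect
import random

SCALE_FACTORS_GB = [1, 10, 100, 1000]
# Cumulative probability of drawing each scale factor or a smaller one.
# 0.55 and 0.90 follow the slide; 0.75 is an assumed intermediate point.
CUMULATIVE = [0.55, 0.75, 0.90, 1.00]


def sample_scale_factor(rng):
    """Map a uniform variate in [0, 1) to a scale factor through the CDF."""
    u = rng.random()
    return SCALE_FACTORS_GB[bisect.bisect_right(CUMULATIVE, u)]


rng = random.Random(42)
sample = [sample_scale_factor(rng) for _ in range(10_000)]
share_sf1 = sample.count(1) / len(sample)  # should be close to 0.55
```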
  • 14. Methodology for generating workloads 1. Set scale (max concurrent submissions) • Defaults to n • Total queries = n * number of queries per stream 2. Generate model (queries per interval) 1. Assign queries to each batch randomly • Query repetition avoided within a batch 2. Multiple scale factors can be set • Include all standard smaller SF 3. Define granularity 1. Set time between batches 2. Defaults to 60s. Elasticity Test sequence (time interval: queries in batch): t1: q17; t2: q7; t3: q15 q21; t4: q6 q9 q14; t5: q9 q14; t6: q11 q22 q21; t7: q16 q15; t8: q24; …
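The batch assignment step can be sketched as follows. The function name `build_schedule` and its parameters are hypothetical; the batch sizes would come from the arrival model, and the sketch assumes the full set of 30 TPCx-BB queries is available.

```python
import random


def build_schedule(batch_sizes, query_ids, interval_s=60, seed=0):
    """Return a list of (start_time_s, [query ids]) batches.

    Queries are drawn at random without repetition inside a batch;
    the same query may still appear in different batches.
    """
    rng = random.Random(seed)
    schedule = []
    for index, size in enumerate(batch_sizes):
        size = min(size, len(query_ids))  # a batch cannot repeat a query
        batch = rng.sample(query_ids, size)
        schedule.append((index * interval_s, batch))
    return schedule


# Batch sizes matching the example sequence t1..t8 on the slide.
schedule = build_schedule([1, 1, 2, 3, 2, 3, 2, 1], list(range(1, 31)))
```

Sampling without replacement per batch is what enforces "query repetition avoided within a batch"; the default of 60 seconds mirrors the stated default granularity.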
  • 17. New SLA-aware benchmark metric • Query-specific SLAs on concurrency • Sets a limit for query completion time • Measures • Number of misses • Distance to SLA • Indirectly isolation and dependencies • Currently defined ad-hoc • Uses Power Test times for the SUT(s) • Adds a 25% margin tolerance • Benefits • Works on all SF and future proof to tech. • Example: q1 took 38s. in isolation, so the SLA for q1 = 47.5s. (Figure: Elasticity Test sequence, queries per batch over time, with the SLA distance marked)
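The ad-hoc SLA definition can be sketched in a few lines. The function names and data shapes are illustrative assumptions; only the 25% margin and the q1 example come from the slide.

```python
SLA_MARGIN = 0.25  # 25% tolerance over the Power Test (isolated) time


def sla_from_power_test(power_times_s, margin=SLA_MARGIN):
    """Derive a per-query SLA from the time each query took in isolation."""
    return {query: t * (1.0 + margin) for query, t in power_times_s.items()}


def count_misses(elasticity_runs, slas_s):
    """Count runs that exceeded their SLA; runs are (query, elapsed_s) pairs."""
    return sum(1 for query, elapsed in elasticity_runs if elapsed > slas_s[query])


slas = sla_from_power_test({"q1": 38.0})
misses = count_misses([("q1", 40.0), ("q1", 60.0)], slas)
```

With the slide's example, `slas["q1"]` is 47.5 seconds, so the 40-second run passes and the 60-second run counts as one miss.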
  • 19. Current TPCx-BB performance metric Scale factor Total number of queries
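The formula on this slide did not survive extraction. The sketch below follows the BBQpm definition as commonly stated for TPCx-BB (load time weighted by 0.1, power time as the query count times the geometric mean of query times, throughput time divided by the number of streams); treat it as an assumption to be checked against the specification, not as the slide's exact content.

```python
import math


def bbqpm(scale_factor, load_s, power_query_times_s, throughput_s, streams):
    """Sketch of the BBQpm@SF metric; all times are in seconds."""
    m = len(power_query_times_s)  # total number of queries
    t_ld = 0.1 * load_s
    geo_mean = math.exp(sum(math.log(t) for t in power_query_times_s) / m)
    t_pt = m * geo_mean
    t_tt = throughput_s / streams
    return scale_factor * 60 * m / (t_ld + math.sqrt(t_pt * t_tt))


# Toy numbers: T_LD = 10, T_PT = 2 * 6 = 12, T_TT = 12, so 120 / 22.
score = bbqpm(1, 100.0, [4.0, 9.0], 24.0, 2)
```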
  • 23-28. New SLA-aware benchmark metric BB++ (formula shown on the slides; terms annotated in turn) • Interval between each batch of queries • SLA distance • SLA factor • Total execution time of the elasticity test
  • 29-32. SLA distance • Distance between the actual execution time and the specified SLA • Queries that complete within their SLA do not contribute to the sum
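A sketch of an SLA distance under one possible reading of the description above. The slide's actual formula is not recoverable here; measuring the overshoot relative to the SLA, and summing without further normalization, are assumptions.

```python
def sla_distance(runs):
    """Sum the relative overshoot of every run that missed its SLA.

    runs: iterable of (elapsed_s, sla_s) pairs. Runs that finish within
    their SLA contribute nothing to the sum.
    """
    return sum((elapsed - sla) / sla for elapsed, sla in runs if elapsed > sla)


# One run within its SLA and one that takes twice its SLA.
distance = sla_distance([(40.0, 47.5), (95.0, 47.5)])
```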
  • 33-35. SLA factor • < 1 when less than 25% of the queries fail their SLA, > 1 if more than 25% of the queries fail their SLA • Based on the number of queries that fail to meet their SLA
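One formula that matches this description is the failing fraction divided by the 25% tolerance. This is an illustrative assumption, since the formula on the slides is not available in the transcript.

```python
def sla_factor(failed, total, tolerated_fraction=0.25):
    """Return a factor below 1 when fewer than 25% of queries miss their SLA."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (failed / total) / tolerated_fraction


# With 28 queries, 7 failures sit exactly at the 25% threshold.
at_threshold = sla_factor(7, 28)
```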
  • 37. Experimental evaluation • Experiments performed on Apache Hive (2.2/2.3) and Spark (2.1/2.2) • Benchmark runs limited to the 14 SQL queries of TPCx-BB • Due to errors and scalability limitations • Using a fixed scale factor • Total 512-cores and 2TB of RAM • 32 workers: 16 vcpus and 64GB RAM • Ran on 3 major cloud providers using block storage • Results anonymized • (Only results for Provider1 at 10TB presented)
  • 39. Elasticity Test at 10TB and 2 streams Provider A: Hive Provider A: Spark
  • 40. Complete TPCx-BB test times at 10TB (seconds)
      Test         Provider A: Hive   Provider A: Spark
      Load         5,124              5,124
      Power        5,036              5,520
      Throughput   12,878             6,496
      Elasticity   7,084              6,603
      Total        30,122             23,743
  • 41. Comparison of the two scores at 10TB
      Metric score     Provider A: Hive   Provider A: Spark
      BBQpm (old)      1,352              1,767
      BB++Qpm (new)    295                1,286
      Annotations: 30% diff; Hive gets 4.3x lower score in the new metric; Spark also gets a lower score
  • 43. Summary and future directions
  • 44. Summary • The throughput test under TPC DB benchmarks provides limited signal • Closed loop system (constant load) • Does not consider temporal patterns • Limited test of load balancers and schedulers (no queueing) • Modeling a real-world big data cluster we have produced: • A workload generator with job arrival rates • Multi-data-scales test • Extended TPCx-BB with the Elasticity Test • Incorporating SLAs and proposing a new metric • Evaluated its applicability to cloud big data systems • And how scores differ from the current metric
  • 45. Conclusions and future work • The Elasticity Test considers aspects crucial for the cloud • Dynamic workloads in accordance to real-world behavior • QoS at the query-level or isolation • The ET can improve the development of elastic cloud systems • By rewarding systems that can keep QoS under concurrency • While saving costs in periods of low intensity Future directions • Test elastic DBaaS / QaaS under concurrency • Specification of SLAs needs to be studied further • Work with this community and gather feedback and next steps
  • 46. Thanks, questions? Follow up / feedback : Npoggi@ac.upc.edu Benchmarking Elastic Cloud Big Data Services under SLA Constraints TPCTC - August 2019
  • 48. Elasticity Test at 1TB Hive: Prov A and B SLA tester (sample)
  • 49. Sample total queries and arrivals Workload parameters: • 10 TB scale factor • 2 streams of 14 SQL queries • total of 28 queries • λbatch = 240 sec (4 min)
  • 50. Experiments at 100GB with 8-streams (112 total queries) Fast system Slow system showing queueing and degraded performance