SlideShare a Scribd company logo
1 of 32
Air Traffic Controller
Using Samza to manage communications with members
By: Cameron Lee and Shubhanshu Nagar
Outline
Problem Statement
How ATC Solves it
Implementation
Interesting Features
What problem are we trying to solve?
In the past, LinkedIn provided a poor communications experience to some of
its members.
Too much email, low quality email, fired on multiple channels at once
Our goal was to build a system which could apply some common
functionality across many different communication types and use cases in
order to improve the member experience.
Handle thousands of communications per second
Good understanding of state of members on the site in near-real-time
How does ATC think about
creating
a delightful member experience?
5 Rights
Right member
Right message
Useful to member
Shouldn’t have seen it before
Right frequency
Right channel
Filtering
Don’t send stale messages
Don’t send spammy messages
Don’t send duplicate messages
Aggregation and Capping
Don’t flood me. Consolidate if you have too much to say.
Channel Selection
“Don’t blast all channels at the same time”
Delivery-time Optimization
● Hold on to a message and deliver it at the right
moment.
● Ex: Don’t buzz my phone at 2 AM.
● I like to read my daily digests every day after work.
How did we build this
thing?
Requirements for ATC
● Highly-scalable
● Nearline (but close to real-time!)
● Ingest data from many sources
● Persist some data, but most needs are low TTL
What’s ATC built on?
Ecosystem
Message
Delivery Service
Offline
apps
Online apps
ATCRelevance
scores
User action
data
Persistence: RocksDB
Out-of-the-box storage layer
Write-optimized for high performance on SSDs.
Changelogs provide fault tolerance and bootstrapping capabilities
ATC
Pipeline
instance 1
ATC
Repartitioner
Re-partitioning of events
External
services
ATC
Pipeline
instance n
ATC
task
External
Requests
Channel
Selection
Message Delivery
Service
Scheduler
Filtering
Message Data
Tree
Generation
Aggregation
& Capping
Hipster Stream Processing
Implementation Details
Streaming Technologies
Kafka: publish-subscribe messaging system
Used to send input to ATC to trigger communications
Many actions and signals in the LinkedIn ecosystem are tracked in kafka events. We can
consume these signals to better understand the state of the ecosystem.
Databus: change capture system for databases
Produces an event whenever an entry in a database changes
Host affinity
By default, whenever a Samza app is deployed, the task instances can be
moved to any host in the cluster, regardless of where the instances were
previously deployed.
If there was any state saved (e.g. RocksDB), then the new instances would
have to rebuild that state off of the changelog. This bootstrapping can take
some time depending on the amount of data to reload. Task instances
can’t process new input until bootstrapping is complete.
We have some use cases which can’t be delayed for the amount of time it
Host affinity (continued)
Host affinity is a Samza feature which allows us to deploy task instances
back to the same hosts from the previous deployment, so state does not
need to be reloaded.
In case of failures for individual instances, Samza can fallback to moving the
instance elsewhere and bootstrapping off of the changelog.
Multiple datacenters
Samza does not currently support replicating persistent application state
(e.g. RocksDB) across multiple clusters which are running the same app.
We need ATC to run in multiple datacenters for redundancy.
We need to have state in each datacenter so that if we have to move
processing between datacenters, then we can continue to properly handle
input.
Multiple datacenters
We rely on the input streams to replicate the main input so that we can do
processing and build up state in all datacenters.
The side effects (trigger the actual email send) then will only get emitted by
one of the datacenters. We can dynamically choose where side effects are
triggered.
Multiple datacenters (continued)
Deployments
When we deploy changes to ATC, we can deploy to a single datacenter at a
time in order to test new versions on only a fraction of traffic.
In some cases, we shift all side effects out of a datacenter to do an upgrade.
Since we still process all input, we can validate almost all of our
functionality and ensure performance doesn’t take an unexpected hit.
Store migrations
In some cases, we need to migrate our system to use a new instance of a
store.
For example, when support was added to use RocksDB TTL, we needed to migrate some of
our stores.
Since we only needed the last X days of data, we could use the following
strategy for the migration:
Write to both the old and new store for X days, but continue to read from the old store.
After X days, read from the new store, but continue writing both stores so we could fall back
Personalization through relevance
We work closely with a relevance team in order to make better decisions
about the communications we send out.
e.g. channel selection, delivery time, aggregation thresholds
Every day, scores for different decisions are computed offline (Hadoop) by the
relevance team. Those scores are pushed to ATC through Kafka, and then
ATC stores the scores in RocksDB.
Scores are generated for each member, so we can personalize the
experience.
Interesting features
Remote calls
Some data is not available on a Kafka stream in a pragmatic way
We make REST requests to fetch that data
Done at the beginning of pipeline
Extract event
Make remote calls and decorate event
Process decorated event
Remote calls - Efficiently
Use ParSeq
Framework to write asynchronous code in Java
Open Sourced
ParSeq uses a thread pool for making remote calls
Rest of processing happens serially
Checkpointing handled by application
Real-time Processing
Some messages require real-time latency
Tuned Kafka’s batching configuration to achieve sub-second of pre-ATC
latency
Can be tuned even more aggressively!
ATC/Samza processes most events in 2-3 ms
No remote calls for these messages
Scheduler
Scheduler RocksDB
Scheduled requests
(from aggregation,
follow-up, etc.)
Window task
(periodic)
Other
processing
Message
Delivery Service
Questions?

More Related Content

What's hot

Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...HostedbyConfluent
 
Instana - ClickHouse presentation
Instana - ClickHouse presentationInstana - ClickHouse presentation
Instana - ClickHouse presentationMiel Donkers
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsAlluxio, Inc.
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeDatabricks
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architectureAdam Doyle
 
Splunk Search Optimization
Splunk Search OptimizationSplunk Search Optimization
Splunk Search OptimizationSplunk
 
PySpark dataframe
PySpark dataframePySpark dataframe
PySpark dataframeJaemun Jung
 
Diving into Delta Lake: Unpacking the Transaction Log
Diving into Delta Lake: Unpacking the Transaction LogDiving into Delta Lake: Unpacking the Transaction Log
Diving into Delta Lake: Unpacking the Transaction LogDatabricks
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiDatabricks
 
Top 10 Mistakes When Migrating From Oracle to PostgreSQL
Top 10 Mistakes When Migrating From Oracle to PostgreSQLTop 10 Mistakes When Migrating From Oracle to PostgreSQL
Top 10 Mistakes When Migrating From Oracle to PostgreSQLJim Mlodgenski
 
[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...
[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...
[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...Amazon Web Services
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
Apache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage ServiceApache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage ServiceSijie Guo
 
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Databricks
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for ExperimentationGleb Kanterov
 
Building Event-Driven (Micro) Services with Apache Kafka
Building Event-Driven (Micro) Services with Apache KafkaBuilding Event-Driven (Micro) Services with Apache Kafka
Building Event-Driven (Micro) Services with Apache KafkaGuido Schmutz
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks
 

What's hot (20)

Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
 
Instana - ClickHouse presentation
Instana - ClickHouse presentationInstana - ClickHouse presentation
Instana - ClickHouse presentation
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data Analytics
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
 
Splunk Search Optimization
Splunk Search OptimizationSplunk Search Optimization
Splunk Search Optimization
 
PySpark dataframe
PySpark dataframePySpark dataframe
PySpark dataframe
 
Diving into Delta Lake: Unpacking the Transaction Log
Diving into Delta Lake: Unpacking the Transaction LogDiving into Delta Lake: Unpacking the Transaction Log
Diving into Delta Lake: Unpacking the Transaction Log
 
Fluent Bit: Log Forwarding at Scale
Fluent Bit: Log Forwarding at ScaleFluent Bit: Log Forwarding at Scale
Fluent Bit: Log Forwarding at Scale
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
Top 10 Mistakes When Migrating From Oracle to PostgreSQL
Top 10 Mistakes When Migrating From Oracle to PostgreSQLTop 10 Mistakes When Migrating From Oracle to PostgreSQL
Top 10 Mistakes When Migrating From Oracle to PostgreSQL
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...
[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...
[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Apache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage ServiceApache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage Service
 
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
Building Event-Driven (Micro) Services with Apache Kafka
Building Event-Driven (Micro) Services with Apache KafkaBuilding Event-Driven (Micro) Services with Apache Kafka
Building Event-Driven (Micro) Services with Apache Kafka
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 

Similar to ATC Communication System

Shattering The Monolith(s) (Martin Kess, Namely) Kafka Summit SF 2019
Shattering The Monolith(s) (Martin Kess, Namely) Kafka Summit SF 2019 Shattering The Monolith(s) (Martin Kess, Namely) Kafka Summit SF 2019
Shattering The Monolith(s) (Martin Kess, Namely) Kafka Summit SF 2019 confluent
 
The Art of Message Queues - TEKX
The Art of Message Queues - TEKXThe Art of Message Queues - TEKX
The Art of Message Queues - TEKXMike Willbanks
 
Designing distributed systems
Designing distributed systemsDesigning distributed systems
Designing distributed systemsMalisa Ncube
 
Cluster_Performance_Apache_Kafak_vs_RabbitMQ
Cluster_Performance_Apache_Kafak_vs_RabbitMQCluster_Performance_Apache_Kafak_vs_RabbitMQ
Cluster_Performance_Apache_Kafak_vs_RabbitMQShameera Rathnayaka
 
Building and Scaling a WebSockets Pubsub System
Building and Scaling a WebSockets Pubsub SystemBuilding and Scaling a WebSockets Pubsub System
Building and Scaling a WebSockets Pubsub SystemKapil Reddy
 
Handling Data in Mega Scale Systems
Handling Data in Mega Scale SystemsHandling Data in Mega Scale Systems
Handling Data in Mega Scale SystemsDirecti Group
 
High volume real time contiguous etl and audit
High volume real time contiguous etl and auditHigh volume real time contiguous etl and audit
High volume real time contiguous etl and auditRemus Rusanu
 
Spark Streaming Recipes and "Exactly Once" Semantics Revised
Spark Streaming Recipes and "Exactly Once" Semantics RevisedSpark Streaming Recipes and "Exactly Once" Semantics Revised
Spark Streaming Recipes and "Exactly Once" Semantics RevisedMichael Spector
 
High powered messaging with RabbitMQ
High powered messaging with RabbitMQHigh powered messaging with RabbitMQ
High powered messaging with RabbitMQJames Carr
 
ReactiveSummeriserAkka-ScalaByBay2016
ReactiveSummeriserAkka-ScalaByBay2016ReactiveSummeriserAkka-ScalaByBay2016
ReactiveSummeriserAkka-ScalaByBay2016Ho Tien VU
 
[ScalaByTheBay2016] Implement a scalable statistical aggregation system using...
[ScalaByTheBay2016] Implement a scalable statistical aggregation system using...[ScalaByTheBay2016] Implement a scalable statistical aggregation system using...
[ScalaByTheBay2016] Implement a scalable statistical aggregation system using...Stanley Nguyen Xuan Tuong
 
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...GeeksLab Odessa
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Anton Nazaruk
 
Talon systems - Distributed multi master replication strategy
Talon systems - Distributed multi master replication strategyTalon systems - Distributed multi master replication strategy
Talon systems - Distributed multi master replication strategySaptarshi Chatterjee
 
Apache samza past, present and future
Apache samza  past, present and futureApache samza  past, present and future
Apache samza past, present and futureEd Yakabosky
 
Kafka streams decoupling with stores
Kafka streams decoupling with storesKafka streams decoupling with stores
Kafka streams decoupling with storesYoni Farin
 
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudRick Bilodeau
 

Similar to ATC Communication System (20)

Shattering The Monolith(s) (Martin Kess, Namely) Kafka Summit SF 2019
Shattering The Monolith(s) (Martin Kess, Namely) Kafka Summit SF 2019 Shattering The Monolith(s) (Martin Kess, Namely) Kafka Summit SF 2019
Shattering The Monolith(s) (Martin Kess, Namely) Kafka Summit SF 2019
 
The Art of Message Queues - TEKX
The Art of Message Queues - TEKXThe Art of Message Queues - TEKX
The Art of Message Queues - TEKX
 
Designing distributed systems
Designing distributed systemsDesigning distributed systems
Designing distributed systems
 
Cluster_Performance_Apache_Kafak_vs_RabbitMQ
Cluster_Performance_Apache_Kafak_vs_RabbitMQCluster_Performance_Apache_Kafak_vs_RabbitMQ
Cluster_Performance_Apache_Kafak_vs_RabbitMQ
 
Building and Scaling a WebSockets Pubsub System
Building and Scaling a WebSockets Pubsub SystemBuilding and Scaling a WebSockets Pubsub System
Building and Scaling a WebSockets Pubsub System
 
Handling Data in Mega Scale Systems
Handling Data in Mega Scale SystemsHandling Data in Mega Scale Systems
Handling Data in Mega Scale Systems
 
High volume real time contiguous etl and audit
High volume real time contiguous etl and auditHigh volume real time contiguous etl and audit
High volume real time contiguous etl and audit
 
Csc concepts
Csc conceptsCsc concepts
Csc concepts
 
Spark Streaming Recipes and "Exactly Once" Semantics Revised
Spark Streaming Recipes and "Exactly Once" Semantics RevisedSpark Streaming Recipes and "Exactly Once" Semantics Revised
Spark Streaming Recipes and "Exactly Once" Semantics Revised
 
High powered messaging with RabbitMQ
High powered messaging with RabbitMQHigh powered messaging with RabbitMQ
High powered messaging with RabbitMQ
 
ReactiveSummeriserAkka-ScalaByBay2016
ReactiveSummeriserAkka-ScalaByBay2016ReactiveSummeriserAkka-ScalaByBay2016
ReactiveSummeriserAkka-ScalaByBay2016
 
[ScalaByTheBay2016] Implement a scalable statistical aggregation system using...
[ScalaByTheBay2016] Implement a scalable statistical aggregation system using...[ScalaByTheBay2016] Implement a scalable statistical aggregation system using...
[ScalaByTheBay2016] Implement a scalable statistical aggregation system using...
 
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?
 
Event driven-arch
Event driven-archEvent driven-arch
Event driven-arch
 
Talon systems - Distributed multi master replication strategy
Talon systems - Distributed multi master replication strategyTalon systems - Distributed multi master replication strategy
Talon systems - Distributed multi master replication strategy
 
Apache samza past, present and future
Apache samza  past, present and futureApache samza  past, present and future
Apache samza past, present and future
 
Kafka streams decoupling with stores
Kafka streams decoupling with storesKafka streams decoupling with stores
Kafka streams decoupling with stores
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
 

Recently uploaded

247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingrknatarajan
 

Recently uploaded (20)

247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 

ATC Communication System

  • 1. Air Traffic Controller Using Samza to manage communications with members By: Cameron Lee and Shubhanshu Nagar
  • 2. Outline Problem Statement How ATC Solves it Implementation Interesting Features
  • 3. What problem are we trying to solve? In the past, LinkedIn provided a poor communications experience to some of its members. Too much email, low quality email, fired on multiple channels at once Our goal was to build a system which could apply some common functionality across many different communication types and use cases in order to improve the member experience. Handle thousands of communications per second Good understanding of state of members on the site in near-real-time
  • 4. How does ATC think about creating a delightful member experience?
  • 5. 5 Rights Right member Right message Useful to member Shouldn’t have seen it before Right frequency Right channel
  • 6. Filtering Don’t send stale messages Don’t send spammy messages Don’t send duplicate messages
  • 7. Aggregation and Capping Don’t flood me. Consolidate if you have too much to say.
  • 8. Channel Selection “Don’t blast all channels at the same time”
  • 9. Delivery-time Optimization ● Hold on to a message and deliver it at the right moment. ● Ex: Don’t buzz my phone at 2 AM. ● I like to read my daily digests every day after work.
  • 10. How did we build this thing?
  • 11. Requirements for ATC ● Highly-scalable ● Nearline (but close to real-time!) ● Ingest data from many sources ● Persist some data, but most needs are low TTL
  • 14. Persistence: RocksDB Out-of-the-box storage layer Write-optimized for high performance on SSDs. Changelogs provide fault tolerance and bootstrapping capabilities
  • 15. ATC Pipeline instance 1 ATC Repartitioner Re-partitioning of events External services ATC Pipeline instance n
  • 18. Streaming Technologies Kafka: publish-subscribe messaging system Used to send input to ATC to trigger communications Many actions and signals in the LinkedIn ecosystem are tracked in kafka events. We can consume these signals to better understand the state of the ecosystem. Databus: change capture system for databases Produces an event whenever an entry in a database changes
  • 19. Host affinity By default, whenever a Samza app is deployed, the task instances can be moved to any host in the cluster, regardless of where the instances were previously deployed. If there was any state saved (e.g. RocksDB), then the new instances would have to rebuild that state off of the changelog. This bootstrapping can take some time depending on the amount of data to reload. Task instances can’t process new input until bootstrapping is complete. We have some use cases which can’t be delayed for the amount of time it
  • 20. Host affinity (continued) Host affinity is a Samza feature which allows us to deploy task instances back to the same hosts from the previous deployment, so state does not need to be reloaded. In case of failures for individual instances, Samza can fallback to moving the instance elsewhere and bootstrapping off of the changelog.
  • 21. Multiple datacenters Samza does not currently support replicating persistent application state (e.g. RocksDB) across multiple clusters which are running the same app. We need ATC to run in multiple datacenters for redundancy. We need to have state in each datacenter so that if we have to move processing between datacenters, then we can continue to properly handle input.
  • 22. Multiple datacenters We rely on the input streams to replicate the main input so that we can do processing and build up state in all datacenters. The side effects (trigger the actual email send) then will only get emitted by one of the datacenters. We can dynamically choose where side effects are triggered.
  • 24. Deployments When we deploy changes to ATC, we can deploy to a single datacenter at a time in order to test new versions on only a fraction of traffic. In some cases, we shift all side effects out of a datacenter to do an upgrade. Since we still process all input, we can validate almost all of our functionality and ensure performance doesn’t take an unexpected hit.
  • 25. Store migrations In some cases, we need to migrate our system to use a new instance of a store. For example, when support was added to use RocksDB TTL, we needed to migrate some of our stores. Since we only needed the last X days of data, we could use the following strategy for the migration: Write to both the old and new store for X days, but continue to read from the old store. After X days, read from the new store, but continue writing both stores so we could fall back
  • 26. Personalization through relevance We work closely with a relevance team in order to make better decisions about the communications we send out. e.g. channel selection, delivery time, aggregation thresholds Every day, scores for different decisions are computed offline (Hadoop) by the relevance team. Those scores are pushed to ATC through Kafka, and then ATC stores the scores in RocksDB. Scores are generated for each member, so we can personalize the experience.
  • 28. Remote calls Some data is not available on a Kafka stream in a pragmatic way We make REST requests to fetch that data Done at the beginning of pipeline Extract event Make remote calls and decorate event Process decorated event
  • 29. Remote calls - Efficiently Use ParSeq Framework to write asynchronous code in Java Open Sourced ParSeq uses a thread pool for making remote calls Rest of processing happens serially Checkpointing handled by application
  • 30. Real-time Processing Some messages require real-time latency Tuned Kafka’s batching configuration to achieve sub-second of pre-ATC latency Can be tuned even more aggressively! ATC/Samza processes most events in 2-3 ms No remote calls for these messages
  • 31. Scheduler Scheduler RocksDB Scheduled requests (from aggregation, follow-up, etc.) Window task (periodic) Other processing Message Delivery Service