Real Time Processing Using Twitter Heron by Karthik Ramasamy

Low Latency Streaming Using Heron
Karthik Ramasamy
@karthikz
Co-founder of Streamlio

2
Real-time is key
Information Age

3
Real Time Connected World
Internet of Things: 30 B connected devices by 2020
Health Care: 153 Exabytes (2013) -> 2314 Exabytes (2020)
Machine Data: 40% of digital universe by 2020
Connected Vehicles: data transferred per vehicle per month, 4 MB -> 5 GB
Digital Assistants (Predictive Analytics): $2B (2012) -> $6.5B (2019) [1], Siri/Cortana/Google Now
Augmented/Virtual Reality: $150B by 2020 [2], Oculus/HoloLens/Magic Leap
[1] http://www.siemens.com/innovation/en/home/pictures-of-the-future/digitalization-and-software/digital-assistants-trends.html
[2] http://techcrunch.com/2015/04/06/augmented-and-virtual-reality-to-hit-150-billion-by-2020/#.7q0heh:oABw

4
Value of Data
[Figure: Value of data to decision-making plotted against time. Value levels, from highest to lowest: preventive/predictive, actionable, reactive, historical. Time axis: real time, seconds, minutes, hours, days, months. Annotations: time-critical decisions, traditional "batch" business intelligence, information half-life in decision-making.]
[1] Courtesy Michael Franklin, BIRTE, 2015.

5
Introducing Heron
Issues with Apache Storm
- Scaling
- Debugging
- Consistent performance
- Yet another system to manage
Heron Design Goals
- Consistent performance at scale
- Easy to debug and tune
- Fast/efficient general-purpose streaming engine
- Storm API compatible
- Latency/throughput configurability
- Library, not a service

6
Heron in Production @ Twitter
Completely replaced Storm 3 years ago
3x reduction in cores and memory
Significantly reduced operational overhead
10x reduction in production incidents

7
Heron Use Cases
Real-time ETL
Real-time BI
Spam detection
Real-time trends
Real-time ML
Real-time ops

8
Open Sourcing
https://github.com/twitter/heron
http://heron.io
Apache 2.0 License
Contributions from Microsoft, Machine Zone, Mesosphere, Google, Wanda Group, WeChat, Fitbit and growing
Open sourced May 2016

9
Heron Core Concepts
Topology
  Directed acyclic graph
  vertices = computation, and edges = streams of data tuples
Spouts
  Sources of data tuples for the topology
  Examples - Kafka/Kestrel/MySQL/Postgres
Bolts
  Process incoming tuples, and emit outgoing tuples
  Examples - filtering/aggregation/join/any function
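The three concepts above can be illustrated with a small in-process sketch. This is not the Heron or Storm API: the names `word_spout`, `split_bolt`, `CountBolt` and `run_topology` are hypothetical, and the whole graph runs in a single Python process purely to show how spouts, bolts and edges relate.

```python
from collections import Counter, deque


def word_spout(sentences):
    """Spout: a source that emits one tuple per input sentence."""
    for sentence in sentences:
        yield (sentence,)


def split_bolt(tup):
    """Bolt: consumes a sentence tuple and emits one tuple per word."""
    (sentence,) = tup
    return [(word,) for word in sentence.split()]


class CountBolt:
    """Bolt: aggregates incoming word tuples into running counts."""

    def __init__(self):
        self.counts = Counter()

    def __call__(self, tup):
        (word,) = tup
        self.counts[word] += 1
        return []  # terminal bolt, emits nothing downstream


def run_topology(spout, edges, bolts):
    """Push every spout tuple through the DAG described by `edges`.

    `edges` maps a component name to the names of its downstream bolts
    (hypothetical representation chosen for this sketch).
    """
    pending = deque(("spout", tup) for tup in spout)
    while pending:
        source, tup = pending.popleft()
        for target in edges.get(source, []):
            for out in bolts[target](tup):
                pending.append((target, out))


# Usage: spout -> split -> count
counter = CountBolt()
run_topology(
    word_spout(["heron is fast", "heron is efficient"]),
    edges={"spout": ["split"], "split": ["count"]},
    bolts={"split": split_bolt, "count": counter},
)
```

In a real deployment each vertex would run as many parallel instances spread across containers; the sketch only shows the logical graph.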

10
Sample Heron Topology
[Figure: sample topology graph with two spouts (Spout 1, Spout 2) and five bolts (Bolt 1 to Bolt 5).]

11
Topology Architecture
[Figure: topology architecture. A Topology Master keeps the logical plan, physical plan and execution state in a ZK cluster and syncs the physical plan to the Stream Manager in each container. Each of the two containers shown holds a Stream Manager, a Metrics Manager and instances I1 to I4.]

12
Stream Manager - Design Goals
Core logic in one centralized place
Super efficient
Pluggable
  Transport (TCP sockets, Unix sockets, shared memory)
  Interlanguage data format (Protobufs, Cap'n Proto, etc.)
  Protocol (HTTP, gRPC, custom, etc.)
Multilanguage instances (C++, Java, Python)

13
Stream Manager - Current Implementation
Implements at most once and at least once
Written in C++ for efficiency
Custom protocol (very similar to gRPC)
Transport using TCP Sockets
Protobuf data format

14
Stream Manager - Shortcomings
01 Transport shortcomings
  TCP overhead
  Multiple memory copies
02 Protobuf shortcomings
  Serde is very expensive
  Full deserialization necessary to access any field
  Creation/deletion is very expensive
03 Core logic implementation
  Followed immutable pattern
  Easy to reason about but inefficient

15
Stream Manager - Performance Analysis
Valgrind
  Too slow
  Too much overhead
  Changes what we are trying to observe
Google Perftools
  Very fast
  Doesn't do code instrumentation
  CPU profiling/memory profiling in one tool

16
Stream Manager - Performance Analysis
17% in new/delete
15% immutable pattern
15% eager deserialization
12% protobuf size collection

17
Stream Manager - Optimization 1
new/delete overhead
- Problem: a new protobuf object was created and deleted every time we read or wrote something.
- Protobuf sacrifices speed for safety
- Solution
  - Create protobuf pools at startup
  - We do a "new" only when the pool is exhausted
  - The pool is bounded in size to avoid running out of memory
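The pooling idea can be sketched as follows. This is an illustration in Python, not the Stream Manager's C++ code: `TupleMessage` stands in for a protobuf message, and `MessagePool`, `acquire`, `release` and the size parameters are hypothetical names chosen for the example.

```python
class TupleMessage:
    """Stand-in for a protobuf message (hypothetical, for illustration)."""

    def __init__(self):
        self.fields = {}

    def clear(self):
        # Reset state so a recycled object never leaks old data.
        self.fields.clear()


class MessagePool:
    """Bounded pool: objects are created up front and reused afterwards."""

    def __init__(self, factory, initial_size, max_size):
        self._factory = factory
        self._max_size = max_size
        # Pre-allocate at startup so the hot path rarely allocates.
        self._free = [factory() for _ in range(initial_size)]
        self.allocations = initial_size

    def acquire(self):
        if self._free:
            return self._free.pop()
        # Pool exhausted: fall back to a fresh allocation.
        self.allocations += 1
        return self._factory()

    def release(self, msg):
        msg.clear()
        # Keep at most max_size idle objects; extras are dropped so the
        # pool cannot grow without bound.
        if len(self._free) < self._max_size:
            self._free.append(msg)


# Usage: the third acquire exhausts the pool and triggers one allocation;
# after a release, the next acquire reuses the returned object.
pool = MessagePool(TupleMessage, initial_size=2, max_size=4)
a, b, c = pool.acquire(), pool.acquire(), pool.acquire()
allocations_after_exhaustion = pool.allocations
a.fields["word"] = "heron"
pool.release(a)
d = pool.acquire()
```

The bound on idle objects is the design choice that trades a few extra allocations under bursty load for a predictable memory ceiling.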

18
Stream Manager - Optimization 2
Immutable pattern
- Problem: in the general case, one tuple can fan out to multiple downstream instances
- For each downstream instance, we made a new copy
- Solution
  - Do early serialization to create an immutable byte array
  - Just copy the raw bytes
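A minimal sketch of early serialization, assuming JSON as a stand-in for protobuf encoding. The class `FanOut` and its `emit` method are hypothetical names; the point is only that serialization happens once per tuple, regardless of how many downstream instances receive it.

```python
import json


class FanOut:
    """Serialize each tuple once, then hand the same bytes to every target."""

    def __init__(self, downstream_queues):
        self.downstream = downstream_queues
        self.serializations = 0

    def _serialize(self, tup):
        self.serializations += 1
        return json.dumps(tup).encode("utf-8")

    def emit(self, tup):
        payload = self._serialize(tup)  # immutable bytes, built once
        for queue in self.downstream:
            # Python shares the immutable bytes object; a C++
            # implementation would copy the raw buffer instead.
            queue.append(payload)


# Usage: one tuple fans out to three downstream instances.
queues = [[], [], []]
fan_out = FanOut(queues)
fan_out.emit(["heron", 1])
```

Here `fan_out.serializations` stays at 1 even though three queues each received the payload.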

19
Stream Manager - Optimization 3
Eager deserialization
- Problem: protobuf deserializes the entire message even if we access just the 'header'
- Solution
  - Change the protobuf message to carry the payload as raw bytes to avoid expensive deserialization
  - Lazy deserialization is done manually, only when needed
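The header-versus-payload split can be sketched like this. The wire layout (two unsigned 32-bit integers followed by a JSON payload) and the names `pack` and `LazyMessage` are assumptions made for the example, not Heron's actual message format.

```python
import json
import struct

# Hypothetical header: destination task id, payload length.
_HEADER = struct.Struct("!II")


def pack(dest_task, values):
    """Build a message: fixed-size header followed by opaque payload bytes."""
    payload = json.dumps(values).encode("utf-8")
    return _HEADER.pack(dest_task, len(payload)) + payload


class LazyMessage:
    """Parses only the header eagerly; the payload is decoded on demand."""

    def __init__(self, raw):
        self._raw = raw
        self.dest_task, self._length = _HEADER.unpack_from(raw, 0)
        self._values = None
        self.decoded = False

    @property
    def payload_bytes(self):
        # Routing code can forward these bytes without ever decoding them.
        start = _HEADER.size
        return self._raw[start:start + self._length]

    @property
    def values(self):
        if not self.decoded:
            self._values = json.loads(self.payload_bytes)
            self.decoded = True
        return self._values


# Usage: routing reads only the header; decoding happens on first access.
msg = LazyMessage(pack(7, ["heron", 3]))
route_to = msg.dest_task
decoded_before_access = msg.decoded
tuple_values = msg.values
```

A component that only routes messages never touches `values`, so it never pays the decoding cost.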

20
Stream Manager - Optimization 4
Calculation of protobuf ByteSize
- Problem: ByteSize computation is expensive, and it is recomputed from scratch every time
- Solution
  - Use CachedByteSize when possible
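The caching idea can be sketched generically: compute the encoded size once, reuse it until the message changes. `SizedMessage`, `byte_size` and `set` are hypothetical names, and JSON length stands in for the protobuf size computation.

```python
import json


class SizedMessage:
    """Caches its encoded size and invalidates the cache on mutation."""

    def __init__(self, fields):
        self._fields = dict(fields)
        self._cached_size = None
        self.size_computations = 0

    def set(self, key, value):
        self._fields[key] = value
        self._cached_size = None  # any change makes the cached size stale

    def byte_size(self):
        if self._cached_size is None:
            # Expensive path: walk the whole message to measure it.
            self.size_computations += 1
            encoded = json.dumps(self._fields).encode("utf-8")
            self._cached_size = len(encoded)
        return self._cached_size


# Usage: repeated size queries hit the cache until the message is modified.
message = SizedMessage({"word": "heron"})
first = message.byte_size()
second = message.byte_size()
computations_before_change = message.size_computations
message.set("count", 1)
third = message.byte_size()
```

The cache is only safe when every mutation path invalidates it, which is why the sketch routes all changes through `set`.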

21
Benchmark Settings
Components          Expt 1  Expt 2  Expt 3
Spout               25      100     200
Bolt                25      100     200
# Heron containers  25      100     200
Hardware: Dual Intel Xeon E5645 @ 2.4 GHz, 72 GB RAM, 500 GB disk
Workload: Word Count topology, 175K random words generated

22
Benchmark - At most once throughput
5 - 6x

23
Benchmark - At least once throughput
4 - 5x

24
Benchmark - At least once latency
2 - 4x

25
Real-Time is Messy, Unpredictable and Hard
Messaging Systems
Aggregation Systems
Result Engine
HDFS
Queryable Engines

26
Real Time - End to End
Storm API, DSL, SQL
Application Builder
Ingestion API
Query API

27
Curious to Learn More?
Twitter Heron: Stream Processing at Scale
Sanjeev Kulkarni, Nikunj Bhagat, Maosong Fu, Vikas Kedigehalli, Christopher Kellogg,
Sailesh Mittal, Jignesh M. Patel*, Karthik Ramasamy, Siddarth Taneja
@sanjeevrk, @challenger_nik, @Louis_Fumaosong, @vikkyrk, @cckellogg,
@saileshmittal, @pateljm, @karthikz, @staneja
Twitter, Inc., *University of Wisconsin – Madison
ABSTRACT
Storm has long served as the main platform for real-time analytics
at Twitter. However, as the scale of data being processed in real-
time at Twitter has increased, along with an increase in the
diversity and the number of use cases, many limitations of Storm
have become apparent. We need a system that scales better, has
better debug-ability, has better performance, and is easier to
manage – all while working in a shared cluster infrastructure. We
considered various alternatives to meet these needs, and in the end
concluded that we needed to build a new real-time stream data
processing system. This paper presents the design and
implementation of this new system, called Heron. Heron is now
the de facto stream data processing engine inside Twitter, and in
this paper we also share our experiences from running Heron in
production. In this paper, we also provide empirical evidence
demonstrating the efficiency and scalability of Heron.
ACM Classification
H.2.4 [Information Systems]: Database Management—systems
Keywords
Stream data processing systems; real-time data processing.
1. INTRODUCTION
Twitter, like many other organizations, relies heavily on real-time [...]
[...] system process, which makes debugging very challenging. Thus, we
needed a cleaner mapping from the logical units of computation to
each physical process. The importance of such clean mapping for
debug-ability is really crucial when responding to pager alerts for a
failing topology, especially if it is a topology that is critical to the
underlying business model.
In addition, Storm needs dedicated cluster resources, which requires
special hardware allocation to run Storm topologies. This approach
leads to inefficiencies in using precious cluster resources, and also
limits the ability to scale on demand. We needed the ability to work
in a more flexible way with popular cluster scheduling software that
allows sharing the cluster resources across different types of data
processing systems (and not just a stream processing system).
Internally at Twitter, this meant working with Aurora [1], as that is
the dominant cluster management system in use.
With Storm, provisioning a new production topology requires
manual isolation of machines, and conversely, when a topology is
no longer needed, the machines allocated to serve that topology
now have to be decommissioned. Managing machine provisioning
in this way is cumbersome. Furthermore, we also wanted to be far
more efficient than the Storm system in production, simply
because at Twitter’s scale, any improvement in performance
translates into significant reduction in infrastructure costs and also
significant improvements in the productivity of our end users.
Storm @Twitter
Ankit Toshniwal, Siddarth Taneja, Amit Shukla, Karthik Ramasamy, Jignesh M. Patel*, Sanjeev Kulkarni,
Jason Jackson, Krishna Gade, Maosong Fu, Jake Donham, Nikunj Bhagat, Sailesh Mittal, Dmitriy Ryaboy
@ankitoshniwal, @staneja, @amits, @karthikz, @pateljm, @sanjeevrk,
@jason_j, @krishnagade, @Louis_Fumaosong, @jakedonham, @challenger_nik, @saileshmittal, @squarecog
Twitter, Inc., *University of Wisconsin – Madison
Streaming@Twitter
Maosong Fu, Sailesh Mittal, Vikas Kedigehalli, Karthik Ramasamy, Michael Barry,
Andrew Jorgensen, Christopher Kellogg, Neng Lu, Bill Graham, Jingwei Wu
Twitter, Inc.
Abstract
Twitter generates tens of billions of events per hour when users interact with it. Analyzing these
events to surface relevant content and to derive insights in real time is a challenge. To address this, we
developed Heron, a new real time distributed streaming engine. In this paper, we first describe the design
goals of Heron and show how the Heron architecture achieves task isolation and resource reservation
to ease debugging, troubleshooting, and seamless use of shared cluster infrastructure with other critical
Twitter services. We subsequently explore how a topology self adjusts using back pressure so that the
pace of the topology goes as its slowest component. Finally, we outline how Heron implements at most
once and at least once semantics and we describe a few operational stories based on running Heron in
production.
1 Introduction
Stream processing platforms enable enterprises to extract business value from data in motion similar to batch
processing platforms that facilitated the same with data at rest [42]. The goal of stream processing is to enable
real time or near real time decision making by providing capabilities to inspect, correlate and analyze data as

29
WHAT WHY WHERE WHEN WHO HOW
Any Questions?
