Zipkin is a distributed tracing system that helps us gather timing data for all the disparate services at Twitter, and manages collection and lookup of data through a Collector and a Query service. With Zipkin, we can trace a subset of all requests made to the site, and collect detailed data about the path taken through our systems, as well as timings. Then, we can visualize and ultimately pinpoint where and possibly why a response took longer than expected.
13. • Collects traces from production requests
• Low overhead
• Minimum of extra work for developers
@skr | @thisisfranklin 9
14. Finagle
“Finagle is an asynchronous network stack for the JVM that you can use to build
asynchronous Remote Procedure Call (RPC) clients and servers in Java, Scala, or
any JVM-hosted language.”
github.com/twitter/finagle
25. Zipkin terminology
‣ Annotation: string data associated with a particular timestamp, service, and host
    time: 2012-01-21 22:37:01
    value: “something happened”
    server: 135.34.53.2
    service: “timelineservice”
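As a rough illustration of the annotation described above, here is a minimal Python sketch. The class and field names simply mirror the labels on the slide; they are assumptions for illustration, not Zipkin's actual data model or API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Annotation:
    """String data tied to a timestamp, a service, and a host (illustrative only)."""
    time: datetime   # when the event happened
    value: str       # free-form description of the event
    server: str      # host that recorded the annotation
    service: str     # name of the service that recorded it


# The example values shown on the slide.
example = Annotation(
    time=datetime(2012, 1, 21, 22, 37, 1),
    value="something happened",
    server="135.34.53.2",
    service="timelineservice",
)
```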
31. ‣ Span: represents one specific method call; made up of a set of annotations. Has a name and an id.
    T:0ms Client Send
    T:10ms Server Receive
    T:20ms Read 30 kbytes from file
    T:90ms Server Send
    T:100ms Client Receive
‣ Trace: a set of spans all associated with the same request
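The span timeline above can be sketched in Python to show how the annotations yield timings. This is a simplified model under stated assumptions: the `Span` and `Trace` classes, the `offset_of` helper, the span name `get_timeline`, and the id values are all hypothetical, and annotations are reduced to (offset in ms, value) pairs.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """One method call: a name, an id, and a set of annotations (illustrative)."""
    name: str
    span_id: int
    # Each annotation is simplified to (offset in ms, value).
    annotations: list = field(default_factory=list)

    def offset_of(self, value):
        """Return the offset in ms of the first annotation with this value."""
        for offset_ms, annotation_value in self.annotations:
            if annotation_value == value:
                return offset_ms
        raise KeyError(value)


@dataclass
class Trace:
    """All spans associated with the same request."""
    trace_id: int
    spans: list = field(default_factory=list)


# The timeline from the slide; name and ids are made up for the example.
span = Span(name="get_timeline", span_id=1, annotations=[
    (0, "Client Send"),
    (10, "Server Receive"),
    (20, "Read 30 kbytes from file"),
    (90, "Server Send"),
    (100, "Client Receive"),
])
trace = Trace(trace_id=42, spans=[span])

# Time as seen by the caller, time spent inside the server, and the remainder.
client_ms = span.offset_of("Client Receive") - span.offset_of("Client Send")
server_ms = span.offset_of("Server Send") - span.offset_of("Server Receive")
outside_server_ms = client_ms - server_ms
```

With the timings from the slide, the caller sees 100 ms, the server accounts for 80 ms of that, and the remaining 20 ms is spent outside the server (for example on the network).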
39. • Generate a random i64 trace id
• Decide if we should sample the trace or not
• Generate new span id
• Pass trace header
• Thrift service adopts trace id from header if it exists

Trace header:
    struct RequestHeader {
      i64 trace_id,
      i64 span_id,
      optional i64 parent_span_id,
      optional bool sampled
    }

[Diagram: Finagle http service calling a Finagle thrift service]
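The propagation steps above could be sketched roughly as follows. This is not Finagle's implementation: the function names, the `sample_rate` parameter, the use of a uniform random draw for the sampling decision, and giving the first span a random id are all assumptions made for illustration. Only the header fields come from the `RequestHeader` struct on the slide.

```python
import random
from dataclasses import dataclass
from typing import Optional

I64_MIN, I64_MAX = -(2 ** 63), 2 ** 63 - 1


def random_i64(rng):
    """Random signed 64-bit integer (randint is inclusive on both ends)."""
    return rng.randint(I64_MIN, I64_MAX)


@dataclass(frozen=True)
class RequestHeader:
    """Mirrors the fields of the RequestHeader struct on the slide."""
    trace_id: int
    span_id: int
    parent_span_id: Optional[int] = None
    sampled: Optional[bool] = None


def start_trace(rng, sample_rate):
    """At the entry point: generate a trace id and decide whether to sample."""
    return RequestHeader(
        trace_id=random_i64(rng),
        span_id=random_i64(rng),             # assumption: first span gets a random id
        sampled=rng.random() < sample_rate,  # assumption: simple probabilistic sampling
    )


def child_header(parent, rng):
    """For a downstream call: keep the trace id, generate a new span id."""
    return RequestHeader(
        trace_id=parent.trace_id,
        span_id=random_i64(rng),
        parent_span_id=parent.span_id,
        sampled=parent.sampled,
    )


def adopt_or_start(incoming, rng, sample_rate):
    """On the receiving service: adopt the header if one exists, else start a trace."""
    return incoming if incoming is not None else start_trace(rng, sample_rate)


rng = random.Random(0)
root = start_trace(rng, sample_rate=1.0)    # http service
outgoing = child_header(root, rng)          # header passed on the thrift call
received = adopt_or_start(outgoing, rng, sample_rate=1.0)  # thrift service
```

Carrying the `sampled` flag in the header means the sampling decision is made once, at the entry point, and every downstream service follows it.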
45. [Diagram: a Finagle http service and three Finagle thrift services, each with an attached "S" component; Zipkin collector; Cassandra; Zipkin Query; Zipkin UI]
Before we get into what Zipkin is: why we created it.
Shit is slow, you lose users and money.
Simplifying wildly, there are two parts to web performance. Front end: the order assets are loaded, minifying and other tricks. Back end: how quickly we can generate and push out the HTML/JSON/whatever.
For the front end we have these nice development tools. They show us how the assets loaded, what each was waiting for, and so on. We want a fancy tool like this for the back end. Picked a page where the server side was unusually bad; normally it’s mostly front end.
For the back end we only had these per-service graphs to look at. Nothing ties them together.
Capture information about how services in a datacentre are working together to respond to a request.
Read this paper from Google called Dapper.
Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?
Mention dependency on finagle-zipkin.
Trace view. Services on the left, time scale on top. Least impactful parts of the trace are collapsed automatically. Mention bootstrap and dj.
It’s all open source, check it out now.