LinkedIn started in May 2003; I started in August 2011. Over eight years of cruft and confusion piled up before we even considered moving to Apache Traffic Server. This talk will focus on the journey and what we learned along the way:
* What LinkedIn is doing with ATS to effect change across the entire stack with an infrastructure tier
* Building automation and tooling
* Bizarre scenarios in how users query the site
* Metrics and monitoring
* Patches we contributed
Really, this talk is about how introducing ATS into the LinkedIn stack completely changed how we tackled several complex problems.
I started working at LinkedIn in August 2011; I manage the SRE team responsible for Core, Security and Presentation Infrastructure. We support Identity infrastructure, Growth/Registration, Engagement and several other systems whose details I'll spare you, since you're here to learn about ATS… so what is it?
* Badass HTTP proxy
* Multi-threaded
* Non-blocking I/O
* Pluggable
* Well known for caching
Inktomi wrote it in the mid-to-late '90s and Yahoo open-sourced it in 2010.
So a few companies are using it… and these are just the people that bothered to put their logo on the Customers page
If you wanted a feature, it would be built as a Tomcat filter and deployed out to the majority of the site. For anything not running in Tomcat, there was no solution. Lots of frontends on lots of hosts.
LinkedIn started acquiring companies, and it was really difficult to integrate their stacks into our own. We're in a heterogeneous environment; supporting features across multiple platforms is a requirement, and abstracting that from the frontend itself into an infrastructure tier completely changes the game.
Centralize the effort of these features. Acquisitions become first-ish-class citizens. Need to make a change? Push out the plugin to the ATS tier instead of coordinating with all the service owners for weeks to update/deploy their code. Reduce the time to deploy.
HA Proxy and Varnish were not considered. We evaluated Nginx and ATS on maturity, scalability, modularity and in-house knowledge. The ability to dynamically load plugins without having to recompile a new server binary gave us the flexibility we needed to enable/disable features quickly.
Before we even started, we had patches for ATS addressing an issue with keep-alive handling and adding support for remap_with_recv_port, to allow routing requests from different incoming ports to different origins. There was no good source of truth for how to route requests, so we had to audit configs and access logs to build the config (a sketch of that audit follows). Metrics for non-Java services at LinkedIn really didn't exist, so we built a Python-based framework to seamlessly fit into our monitoring model. After all this, we could start migrating!
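The audit itself was mostly grepping and tallying. Here's a sketch of the access-log side; the log format and regex are illustrative, not our real format:

```python
import gzip
import re
from collections import Counter

# Tally which URL prefixes show up in an access log, so we know what
# routes the proxy config needs to cover.
PATH_RE = re.compile(r'"(?:GET|POST) (/[^/ ?]*)')

prefixes = Counter()
with gzip.open("access.log.gz", "rt") as f:
    for line in f:
        m = PATH_RE.search(line)
        if m:
            prefixes[m.group(1)] += 1

for prefix, count in prefixes.most_common(20):
    print(f"{prefix}\t{count}")
```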
We were a little ambitious
Ok, very ambitious
L1 Proxy will be ATS with a few plugins
First, we migrated our SEO-optimized Profile pages to L1 Proxy. This allowed us to support routing unauthenticated users out of ECH3.
Certain requests are able to be served out of different data centers. So for public profile requests from our signed-out users, we can route them to our Chicago data center for improved RTT.
If there is a cookie named foo and it starts with bar, route it to the moon.
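As a toy Python rendering of that rule's semantics (the real logic lives in an ATS plugin, and the pool names are made up):

```python
from http.cookies import SimpleCookie

# If cookie "foo" starts with "bar", pick the special pool; otherwise default.
def pick_pool(cookie_header: str) -> str:
    cookies = SimpleCookie(cookie_header)
    if "foo" in cookies and cookies["foo"].value.startswith("bar"):
        return "moon"       # route it to the moon
    return "default"

assert pick_pool("foo=barbecue; other=1") == "moon"
assert pick_pool("foo=baz") == "default"
```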
Drop the request at L1 Proxy instead of wasting cycles on the frontend. This allowed us to automatically deny requests based on limits, without the need for people to scan access logs and manually block IP addresses. We were prepared with our new Sentinel plugin before Christmas, but decided to delay enabling it until after the holidays. Since scrapers do not seem to celebrate New Year's, we were forced to enable it on New Year's Day 2012, and it worked!
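To make the idea concrete, here's a toy sliding-window limiter in Python; Sentinel itself is an ATS plugin, and the window and limit below are placeholders, not our production values:

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 60   # placeholder window
LIMIT = 300           # placeholder per-IP request limit

hits = defaultdict(list)  # ip -> timestamps of recent requests

def allow(ip: str) -> bool:
    # Keep only requests inside the window, add this one, then check the limit.
    now = time.time()
    recent = [t for t in hits[ip] if now - t < WINDOW_SECONDS]
    recent.append(now)
    hits[ip] = recent
    return len(recent) <= LIMIT
```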
Hopefully you had a chance to attend Veena’s talk yesterday on The Curious Case of Dust JavaScript and Performance. In case you didn’t make it, I’ll explain what it does
At a high level, this allows your app to do specifically what it's supposed to do. Before Fizzy, LinkedIn would ship code into multiple frontends so they could render the module within the various services. Now, People You May Know's module can be fetched and embedded into the Profile page, while an Ad can be pulled in from another frontend.
Up until this point, configs were manually edited and deployed. This sucked, big time. When we started, there was no great source of truth for the data we needed. By this point, LinkedIn SRE had a metadata store in place with most of the data, and we just needed to fill the gaps. This significantly improved managing configs and reduced the amount of human error.
Teams across the company wanted to build ATS plugins. This was both good and bad: good that we were able to solve some difficult problems, bad that our proxy tier was becoming more complicated. The Mobile team wrote a plugin to detect when to issue redirects to Mobile pages, instead of handling it in the Tomcat frontends. Security started manipulating and enforcing cookies, addressing legacy issues that were traditionally difficult to track down. This also led to the development of QD Proxy…
The key design behind Quick Deploy is that if you develop a service locally, you can initiate a request to that service, either directly or indirectly, by using QD Proxy in LinkedIn's staging environment. All other components of the request go to the Staging environment.
If I have a minor tweak to make and I'm not ready to commit, I can set up a QD Proxy profile to route the request to my dev box for the Frontend request, and my Frontend will talk back to QD Proxy for all the backend calls, which will be sent to the backends in Staging. Really freakin' sweet.
I can also have a frontend in Staging send the backend requests to my dev box based on my QD Proxy profile. Testing features against a complete environment before committing is now possible.
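A toy sketch of the QD Proxy routing decision: a per-user profile maps service names to overrides such as a dev box, and anything without an override falls through to Staging. All names here are illustrative:

```python
STAGING = "staging.linkedin.biz"  # hypothetical staging VIP

def resolve(service: str, profile: dict) -> str:
    # If the profile overrides this service, send it to the override;
    # otherwise the request goes to Staging like everything else.
    return profile.get(service, STAGING)

profile = {"frontend": "mydevbox.corp:8080"}
assert resolve("frontend", profile) == "mydevbox.corp:8080"
assert resolve("backend-people", profile) == STAGING
```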
Not because of Fizzy, but from a compounding loop through a single pair of load balancers, causing the LB pair's CPU to spike and drop requests. This sucks! We were so close to finishing the migration, and now we'll have to buy new load balancers to handle the load… or will we?
That's when HA Proxy came into our lives: The Reliable, High Performance TCP/HTTP Load Balancer. Since we already had all the data in Range, we could generate our HA Proxy configs in seconds and deploy them in minutes (a sketch of the generator follows). This allows us to automate load balancing changes without having to make changes on network gear.
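A sketch of the config generation idea, with a stand-in for the Range lookup (the real query syntax and host data are omitted):

```python
# Render an HA Proxy backend block from host data.
def get_hosts(cluster):
    # Stand-in for the Range query; returns hostnames for the cluster.
    return ["fizzy01.prod", "fizzy02.prod", "fizzy03.prod"]

def render_backend(name, port=8080):
    lines = [f"backend {name}", "    balance roundrobin"]
    for host in get_hosts(name):
        lines.append(f"    server {host} {host}:{port} check")
    return "\n".join(lines)

print(render_backend("fizzy"))
```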
Moving to HA Proxy gave SRE:
* complete control over how we handled the load balancing
* reduced requests to NetOps, which in turn reduced our turnaround time
* reduced network hops between ATS and the Frontend
* removed single points of failure between L1 Proxy and Fizzy by eliminating the load balancer
Month 10: we did it! www.linkedin.com migrated behind L1 Proxy and Fizzy.
Started out at 4,000 QPS and 120M members; today we're nearing 70,000 QPS at 225M members. That's approximately a 4x increase in QPS, year over year. NetScaler is still in play, but now only provides load balancing to L1 Proxy as well as SSL termination. At this point, we're able to consider the possibility of removing the NetScaler altogether. Bug fixes for features in ATS can be rolled out in hours, not days, and our acquisitions get all the goodness the rest of the site does.
Stability:
* When you're introducing a critical tier and forcing everyone onto it, customer service is key. We spent many hours debugging invalid (and some valid) escalations to help build confidence.
Invalid requests:
* POST requests with no Content-Length and no body
Connection failed:
* Clients using CONNECT for no reason!
Here are 15 of the 30 outages since we set up L1 Proxy and Fizzy. Each outage reminded us of the impact of even the smallest of changes. There will be mistakes; there will be unexpected surprises. If you're going to fail, do it quickly and recover fast. Learn from the mistakes, and avoid repeating them.
We're doing more with ATS than ever before, and the outage rate is not affected by it. Issues with plugins are now caught earlier in the development process: they're performance tested before going to staging, and the deployment schedule has strict guidelines to ensure testing/verification is done before promoting to production. And you can see a downward trend with the Human factor.
We strive to keep our graphs looking good, even when they're bad… so much so that my team will draw on post-it notes to cover up nasty outages. So how did we do this? With a few different tools.
I suggest summarizing some of the data, unless you’re prepared to consume all the metrics
Great for reading variables (core + plugins) to monitor.
We don't want to shell out to gather metrics; there's an HTTP endpoint! Awesome, right?
We take start_time and calculate uptime by subtracting it from time.time().
Tracking start_time helps highlight crashes, deployments, and people doing things to the service that they shouldn't be (sketch below).
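A minimal sketch of how this polling works, assuming the stats_over_http plugin is serving JSON at /_stats (the endpoint path and exact metric name may differ in your build):

```python
import json
import time
import urllib.request

STATS_URL = "http://localhost:8080/_stats"  # assumed stats_over_http mount point

def fetch_stats():
    # stats_over_http returns a JSON document with the metrics under "global".
    with urllib.request.urlopen(STATS_URL, timeout=5) as resp:
        return json.loads(resp.read())["global"]

stats = fetch_stats()
start_time = float(stats["proxy.node.restarts.proxy.start_time"])  # assumed metric name
uptime = time.time() - start_time
print(f"uptime: {uptime:.0f}s")
# A sudden drop in uptime flags a crash, a deployment, or someone
# restarting the service when they shouldn't be.
```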
monitor trends coming in and going out
We track how close we’re getting to the throttle limit.
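A sketch of that check against the stats endpoint; the metric name is from the ATS build we ran and may differ in yours, and the throttle value really comes from records.config (hard-coded here for the sketch):

```python
import json
import urllib.request

CONNECTIONS_THROTTLE = 30000  # placeholder for proxy.config.net.connections_throttle

# Pull current open connections from the stats endpoint and compare.
with urllib.request.urlopen("http://localhost:8080/_stats", timeout=5) as resp:
    stats = json.loads(resp.read())["global"]

open_conns = int(stats["proxy.process.net.connections_currently_open"])
pct = 100.0 * open_conns / CONNECTIONS_THROTTLE
print(f"{open_conns} open connections ({pct:.0f}% of throttle limit)")
```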
that’s bad
* Core dump rate: monitor the file system for core dumps < 24 hours old, alert if > N
* TCP states: captured from netstat, watching for a spike in TIME_WAIT
* Proc: memory usage, swap usage, file descriptor usage
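As a sketch, the TCP-state check can be as simple as counting states from netstat output (the threshold below is a placeholder, not our alerting value):

```python
import subprocess
from collections import Counter

def tcp_state_counts():
    # Count TCP connection states from `netstat -ant` output.
    out = subprocess.run(["netstat", "-ant"], capture_output=True, text=True).stdout
    return Counter(line.split()[-1] for line in out.splitlines()
                   if line.startswith("tcp"))

TIME_WAIT_THRESHOLD = 20000  # placeholder value
states = tcp_state_counts()
if states.get("TIME_WAIT", 0) > TIME_WAIT_THRESHOLD:
    print("ALERT: TIME_WAIT spike:", states["TIME_WAIT"])
```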
They don't give enough of a picture for the requests we're processing. If you need to debug a problem, you need a combination of these familiar logging formats… fortunately, there's custom logging.
We log request headers, response headers, timing, and origin. We tail -f the log, then aggregate and report timing for given paths (something we don't get with traffic_logstats).
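Here's a rough sketch of that tail-and-aggregate reporting, assuming a custom log format that carries the total request time in milliseconds (the ttms= field and regex are illustrative, not our real format). Pipe it like: tail -f custom.log | python timing_report.py

```python
import re
import sys
from collections import defaultdict

LINE_RE = re.compile(r'\s(GET|POST|PUT|DELETE)\s+(\S+).*?ttms=(\d+)')

totals = defaultdict(lambda: [0, 0])  # path -> [request count, total ms]

for line in sys.stdin:
    m = LINE_RE.search(line)
    if not m:
        continue
    _, path, ms = m.groups()
    bucket = totals[path.split("?")[0]]
    bucket[0] += 1
    bucket[1] += int(ms)
    if bucket[0] % 1000 == 0:  # report every 1000 requests per path
        print(f"{path}: avg {bucket[1] / bucket[0]:.1f}ms over {bucket[0]} reqs")
```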
If someone adds or removes a host from the deployment system's topology, our config generators will pick it up. We even have some of these configs ready to be headless, so changes will be automatically propagated.
Salt is an open-source remote execution framework written in Python. Since we can write Python modules to do whatever we want, we're able to create the pre/post hooks necessary for rolling out changes (a sketch follows this list):
* take host out of rotation
* bleed traffic
* confirm it's out of rotation
* upgrade packages
* install configs
* restart trafficserver
* verify process is running
* review log files
* go back into rotation
Before, these steps were all done by a human, which ultimately led to mistakes. We now automate these tasks and iterate on them every time we learn how to better the process.
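A minimal sketch of what one of these rolling-upgrade hooks looks like as plain Python (Salt execution modules are ordinary Python functions). The load balancer CLI and the package/config commands below are placeholders, not our actual tooling:

```python
import subprocess
import time

def _run(cmd):
    # Run a shell command and fail loudly if it does.
    subprocess.run(cmd, shell=True, check=True)

def roll(host):
    _run(f"lb-ctl disable {host}")          # take host out of rotation (hypothetical CLI)
    time.sleep(30)                           # bleed traffic
    _run(f"lb-ctl status {host} | grep -q DOWN")   # confirm it's out of rotation
    _run("yum -y upgrade trafficserver")     # upgrade packages
    _run("install-configs trafficserver")    # install configs (placeholder)
    _run("service trafficserver restart")    # restart trafficserver
    _run("pgrep traffic_server")             # verify process is running
    _run("tail -n 50 /var/log/trafficserver/error.log")  # review log files
    _run(f"lb-ctl enable {host}")            # go back into rotation
```

Quick plug on inFormed…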
inFormed is our in-house report of things happening in production. It's fed through multiple bridges:
* JIRA ticketing system
* IRC
* deployments
* whatever
available in the experimental section of plugins
This is an example
Google DWR was updated in the last couple of weeks and caused Chrome to bomb out on one of our JavaScript files. Within 20 minutes, we had a temporary fix deployed to production to issue the correct Content-Type header.
Why would you need Boom?
This enabled anyone to debug production issues against a single host instead of scanning for your request across 50+ servers. I can pin my requests to a specific L1 Proxy host, through a specific Fizzy host, and then to a specific Profile host. Hell yes!
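From the client side, pinning looks something like the sketch below; the header name and value format are invented for illustration, since the real plugin defines its own convention:

```python
import urllib.request

# Send one request pinned through specific hosts at each tier.
req = urllib.request.Request(
    "https://www.linkedin.com/profile/view",
    headers={"X-Boom-Route": "l1proxy=host17;fizzy=host03;profile=host42"},  # hypothetical header
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.getheader("Via"))
```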
This is even more awesome thanks to ATS's non-blocking I/O, which avoids burning up threads on the frontends. Who has looked at LinkedIn's "View as Source"?
Saving 10% per request at the expense of idle CPU is a huge win!
Being tested in our staging environment as we speak. Potential CDN savings still to be calculated.
traffic_manager was unable to communicate with traffic_server because of a hard-coded file descriptor limit of 32 for the internal health check, so traffic_server would restart every ~2 minutes and you'd see an uptime graph like this…
When ATS hit the connection_throttle limit, it would never get out of the throttle until ATS was restarted.
As we started adding our own stats, there was no check in place to prevent a plugin from creating too many variables/metrics, and the {stat} endpoint was not able to return the results within the given buffer.
https://git-wip-us.apache.org/repos/asf?p=trafficserver.git&a=search&h=HEAD&st=author&s=briang
https://git-wip-us.apache.org/repos/asf?p=trafficserver.git&a=search&h=HEAD&st=author&s=manjeshnilange
https://issues.apache.org/jira/issues/?jql=project%20%3D%20TS%20AND%20reporter%20in%20(manjeshnilange%2C%20briang%2C%20manjesh)%20AND%20updated%20%3E%202011-06-01
We have rewritten a few of our plugins to use this new API; one of them is literally half the code it was before. This has enabled us to grow from 2 engineers working on ATS plugins to 6, and the ramp-up time for plugin development is dramatically reduced.
Doug's comment on atscppapi: "I'd say the main feedback is that it's *really* easy to use compared to the raw API. Hides all the grunge, and just lets you focus on your logic. I wrote a transform plugin that would probably have taken me weeks of struggling with virtual I/O buffers and so on in just a few hours, and that included learning the basics of the API. Now that I've done it once, it would be even faster. So far I haven't hit any limitations of the abstraction. It does an excellent job of providing the functionality of ATS in a way that matches the plugin developer's mental view of the tasks to be performed, rather than going from the mindset of internal ATS implementation. As long as you understand the basic concept of the ATS state machine, writing a new plugin is almost trivial."
Earlier this year, our Media origin for the CDN was nearing capacity. The NetApp filer's CPUs were over 50%, and if we needed to fail over, we would not have been able to serve Media requests (profile pictures, cached external content). Since we had so much success with ATS as a reverse proxy, why not try using it for its bread and butter… caching.
After a couple weeks of tweaking configs and $30,000 in gear later, a caching layer was built on top of our Media origin. We had a 98% cache hit rate, serving requests in < 2ms. This reduced our NetApp filer's CPU to less than 1%. The team responsible for the Media origin thought the NetApp CPU graphs were broken (we kind of forgot to mention we finished migrating the traffic over to the new cache), and it saved the company $400,000 by avoiding having to upgrade our filers.
Recap: $30,000 of commodity gear + ATS saved LinkedIn $400,000.
Thank you for your time! Come meet the team behind ATS @ LinkedIn during our office hours at 1:15PM. We’re interested in answering any questions around our experiences of solving problems at LinkedIn with Apache Traffic Server.