0.5mln packets per second with Erlang

LINCX is an OpenFlow switch written in Erlang and running on LING (Erlang on Xen). It shows some remarkable performance. The presentation discusses various speed-related optimizations.


  1. 0.5 mln packets per second with Erlang
     Nov 22, 2014
     Maxim Kharchenko, CTO/Cloudozer LLP
  2. The road map
     • Erlang on Xen intro
     • LINCX project overview
     • Speed-related notes
       – Arguments are registers
       – ETS tables are (mostly) ok
       – Do not overuse records
       – GC is key to speed
       – gen_server vs. barebone process
       – NIFs: more pain than gain
       – Fast counters
       – Static compiler?
     • Q&A
  3. Erlang on Xen a.k.a. LING
     • A new Erlang platform that runs without an OS
     • Conceived in 2009
     • Highly compatible with Erlang/OTP
     • Built from scratch, not a “port”
     • Optimized for low startup latency
     • Open sourced in 2014 (github.com/cloudozer/ling)
     • Local and remote builds
     Go to erlangonxen.org
  4. Zerg demo: zerg.erlangonxen.org
  5. The road map
     • Erlang on Xen intro
     • LINCX project overview
     • Speed-related notes
       – Arguments are registers
       – ETS tables are (mostly) ok
       – Do not overuse records
       – GC is key to speed
       – gen_server vs. barebone process
       – NIFs: more pain than gain
       – Fast counters
       – Static compiler?
     • Q&A
  6. LINCX: project overview
     • Started in December 2013
     • Initial scope = porting LINC-Switch to LING
     • High degree of compatibility demonstrated for LING
     • Extended scope = fix the LINC-Switch fast path
     • Beta version of LINCX open sourced on March 3, 2014
     • LINCX runs 100x faster than the old code
     LINCX repository: github.com/FlowForwarding/lincx
  7. Raw network interfaces in Erlang
     • LING adds raw network interfaces:
         Port = net_vif:open("eth1", []),
         port_command(Port, <<1,2,3>>),
         receive {Port,{data,Frame}} -> ... end
     • A raw interface receives whole Ethernet frames
     • LINCX uses standard gen_tcp for the control connection and net_vif for data ports
     • Raw interfaces support a mailbox_limit option - packets get dropped if the mailbox of the receiving process overflows:
         Port = net_vif:open("eth1", [{mailbox_limit,16384}]),
         ...
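
(Note, not from the deck: a minimal sketch of how the net_vif calls above fit together - a bare two-port frame forwarder. The module and function names are invented, and no claim is made that LINCX's own pipeline looks like this; only the net_vif/port_command/receive API shown on the slide is assumed.)

    %% raw_forward.erl - illustrative only
    -module(raw_forward).
    -export([start/2]).

    %% Open two raw interfaces and shuttle whole Ethernet frames between them.
    start(IfA, IfB) ->
        PortA = net_vif:open(IfA, [{mailbox_limit,16384}]),
        PortB = net_vif:open(IfB, [{mailbox_limit,16384}]),
        loop(PortA, PortB).

    loop(PortA, PortB) ->
        receive
            {PortA, {data, Frame}} ->
                port_command(PortB, Frame),   %% frames arrive and leave as complete binaries
                loop(PortA, PortB);
            {PortB, {data, Frame}} ->
                port_command(PortA, Frame),
                loop(PortA, PortB)
        end.
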
  8. Testbed configuration
     • Test traffic goes between vm1 and vm2
     • LINCX runs as a separate Xen domain
     • Virtual interfaces are bridged in Dom0
  9. IXIA confirms 460kpps peak rate
     • 1GbE hw NICs, 128-byte packets
     • IXIA packet generator/analyzer
  10. Processing delay and low-level stats
      • LING can measure the processing delay for a packet:
          1> ling:experimental(processing_delay, []).
          Processing delay statistics:
          Packets: 2000
          Delay: 1.342us +- 0.143 (95%)
      • LING can collect low-level stats for a network interface:
          1> ling:experimental(llstat, 1).   %% stop/display
          Duration: 4868.6ms
          RX: interrupts: 69170 (0 kicks 0.0%) (freq 14207.4/s period 70.4us)
          RX: reqs per int: 0/0.0/0
          RX: tx buf freed per int: 0/8.5/234
          TX: outputs: 1479707 (112263 kicks 7.6) (freq 303928.8/s period 3.3us)
          TX: tx buf freed per int: 0/0.6/113
          TX: rates: 303.9kpps 3622.66Mbps avg pkt size 1489.9B
          TX: drops: 12392 (freq 2545.3/s period 392.9us)
          TX: drop rates: 2.5kpps 30.26Mbps avg pkt size 1486.0B
  11. The road map
      • Erlang on Xen intro
      • LINCX project overview
      • Speed-related notes
        – Arguments are registers
        – ETS tables are (mostly) ok
        – Do not overuse records
        – GC is key to speed
        – gen_server vs. barebone process
        – NIFs: more pain than gain
        – Fast counters
        – Static compiler?
      • Q&A
  12. Arguments are registers
          animal(batman = Cat, Dog, Horse, Pig, Cow, State) ->
              feed(Cat, Dog, Horse, Pig, Cow, State);
          animal(Cat, deli = Dog, Horse, Pig, Cow, State) ->
              pet(Cat, Dog, Horse, Pig, Cow, State);
          ...
      • Many arguments do not make a function any slower
      • But do not reshuffle arguments:
          %% SLOW
          animal(batman = Cat, Dog, Horse, Pig, Cow, State) ->
              feed(Goat, Cat, Dog, Horse, Pig, Cow, State);
          ...
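
(Note, not from the deck: a tiny illustration of the point above. In a hot loop, thread each value through the same argument position on every call so it can stay in the same register; the function below is invented for the example.)

    %% Arguments keep their positions across recursive calls: List, Seen, Bytes.
    count_frames([Frame | Rest], Seen, Bytes) ->
        count_frames(Rest, Seen + 1, Bytes + byte_size(Frame));
    count_frames([], Seen, Bytes) ->
        {Seen, Bytes}.
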
  13. ETS tables are (mostly) ok
      • A small ETS table lookup costs about as much as 10 function activations
      • Do not use ets:tab2list() inside tight loops
      • Treat ETS as a database, not a pool of global variables
      • 1-2 ETS lookups on the fast path are ok
      • Beware that ets:lookup() etc. create a copy of the data on the heap of the caller, similarly to message passing
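
(Note, not from the deck: a sketch of the "database, not global variables" advice, assuming a hypothetical named table flow_tab created elsewhere with ets:new(flow_tab, [named_table, set, protected]). On the fast path, do one keyed lookup; avoid dumping the table.)

    %% One keyed lookup: only the matching entry is copied onto the caller's heap.
    lookup_action(Key) ->
        case ets:lookup(flow_tab, Key) of
            [{_Key, Action}] -> Action;
            []               -> miss
        end.

    %% Avoid in tight loops: ets:tab2list/1 copies the entire table on every call.
    %% all_actions() -> [A || {_K, A} <- ets:tab2list(flow_tab)].
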
  14. Do not overuse records
      • setelement() creates a copy of the tuple
      • State#state{foo=Foo1,bar=Bar1,baz=Baz1} creates 3(?) copies of the tuple
      • Use tuples explicitly in performance-critical sections to control the heap footprint of the code:
          %% from 9p.erl
          mixer({rauth,_,_}, {tauth,_,AFid,_,_}, _) -> {write_auth,AFid};
          mixer({rauth,_,_}, {tauth,_,AFid,_,_,_}, _) -> {write_auth,AFid};
          mixer({rwrite,_,_}, _, initial) -> start_attaching;
          mixer({rerror,_,_}, _, initial) -> auth_failed;
          mixer({rlerror,_,_}, _, initial) -> auth_failed;
          mixer({rattach,_,Qid}, {tattach,_,Fid,_,_,AName,_}, initial) ->
              {attach_more,Fid,AName,qid_type(Qid)};
          mixer({rclunk,_}, {tclunk,_,Fid}, initial) -> {forget,Fid};
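
(Note, not from the deck: one way to read the last bullet. A record update rebuilds the whole underlying tuple, so on a hot path a small, explicitly written tuple keeps the per-packet copy obvious and minimal. The #flow{} record and all field names below are invented.)

    %% Record style: every packet rebuilds the full (possibly wide) record tuple.
    %% handle(Pkt, S = #flow{rx = Rx, bytes = B}) ->
    %%     S#flow{rx = Rx + 1, bytes = B + byte_size(Pkt)}.

    %% Explicit tuple: the copy is small and its size is visible in the code.
    handle(Pkt, {Rx, Bytes}) ->
        {Rx + 1, Bytes + byte_size(Pkt)}.
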
  15. Garbage collection is key to speed
      • The heap is a list of chunks
      • The 'new heap' is close to its head, the 'old heap' to its tail
      • A GC run takes 10μs on average
      • GC may run 1000s of times per second
      (The slide also shows a diagram of the proc_t heap layout and the HTOP pointer.)
  16. How to tackle GC-related issues
      • (Priority 1) Call erlang:garbage_collect() at strategic points
      • (Priority 2) For the fastest code, avoid GC completely - restart the fast process regularly:
          spawn(F, [{suppress_gc,true}]),   %% LING-only
      • (Priority 3) Use the fullsweep_after option
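
(Note, not from the deck: a sketch of "Priority 1", calling erlang:garbage_collect() at a strategic point - here between batches of packets rather than mid-packet. The module name, the batch size of 10000 and handle_frame/1 are invented placeholders.)

    -module(gc_points).
    -export([loop/2]).

    %% Placeholder for the real per-packet work.
    handle_frame(_Frame) -> ok.

    %% Collect garbage on our own schedule, between batches, not in the middle of one.
    loop(Port, N) when N >= 10000 ->
        erlang:garbage_collect(),
        loop(Port, 0);
    loop(Port, N) ->
        receive
            {Port, {data, Frame}} ->
                handle_frame(Frame),
                loop(Port, N + 1)
        end.
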
  17. gen_server vs. barebone process
      • Message passing via gen_server:call() is 2x slower than Pid ! Msg
      • For speedy code, prefer barebone processes to gen_servers
      • The OTP Design Principles are about high availability, not high performance
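
(Note, not from the deck: a minimal barebone counterpart to a gen_server - a bare spawned loop addressed with Pid ! Msg. The {get, From} / {value, N} protocol is invented for the example; it trades OTP's supervision and call semantics for less per-message overhead.)

    -module(bare_counter).
    -export([start/0, bump/1, value/1]).

    start() -> spawn(fun() -> loop(0) end).

    %% Plain asynchronous send; no gen_server:call round-trip machinery.
    bump(Pid) -> Pid ! bump, ok.

    %% Hand-rolled synchronous request/reply.
    value(Pid) ->
        Pid ! {get, self()},
        receive {value, N} -> N end.

    loop(N) ->
        receive
            bump        -> loop(N + 1);
            {get, From} -> From ! {value, N}, loop(N)
        end.
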
  18. NIFs: more pain than gain
      • A new principle of Erlang development: do not use NIFs
      • For a small performance boost, NIFs undermine key properties of Erlang: reliability and soft-realtime guarantees
      • Most of the time Erlang code can be made as fast as C
      • Most performance problems of Erlang are traceable to NIFs or external C libraries, which are similar
      • Erlang on Xen does not have NIFs and we do not plan to add them
  19. Fast counters
      • 32-bit or 64-bit unsigned integer counters with overflow - trivial in C, not easy in Erlang
      • FIXNUMs are signed 29-bit integers; BIGNUMs consume heap and are 10-100x slower
      • Use two variables for a counter?
          foo(C1, 16#ffffff, ...) -> foo(C1+1, 0, ...);
          foo(C1, C2, ...)        -> foo(C1, C2+1, ...);
          ...
      • LING has a new experimental feature, fast counters:
          erlang:new_counter(Bits) -> Ref
          erlang:increment_counter(Ref, Incr)
          erlang:read_counter(Ref)
          erlang:release_counter(Ref)
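
(Note, not from the deck: a self-contained version of the "two variables" idea above. Each half stays far below the fixnum limit, so increments never allocate a bignum; the full value is assembled only when read. The 24-bit split and all names are illustrative.)

    -module(split_counter).
    -export([new/0, incr/1, value/1]).

    -define(LIMIT, 16#ffffff).   %% low half rolls over at 2^24 - 1

    new() -> {0, 0}.             %% {High, Low}

    incr({High, ?LIMIT}) -> {High + 1, 0};
    incr({High, Low})    -> {High, Low + 1}.

    %% Reading may build a bignum, but that cost is paid only on reads.
    value({High, Low}) -> High * (?LIMIT + 1) + Low.
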
  20. Future: static compiler for Erlang
      • Scalars and algebraic types
      • Structural types only - no nominal types
      • Target compiler efficiency, not static type checking
      • A middle ground between:
        – “Type is a first class citizen” (Haskell)
        – “A single type is good enough” (Python, Erlang)
  21. Future: static compiler for Erlang - 2
      • Challenges:
        – Pattern matching compilation
        – Type inference for recursive types:
            y = {(unit | y), x, (unit | y)}
            y = nil | {x, y}
      • Work started in 2013
      • Currently the compiler is at the proof-of-concept stage
  22. Questions???
      e-mail: maxim.kharchenko@gmail.com