0.5 mln packets per second with Erlang 
Nov 22, 2014 
Maxim Kharchenko 
CTO/Cloudozer LLP

LINCX is an OpenFlow switch written in Erlang and running on LING (Erlang on Xen). It shows some remarkable performance. The presentation discusses various speed-related optimizations.

  1. 0.5 mln packets per second with Erlang
     Nov 22, 2014
     Maxim Kharchenko, CTO/Cloudozer LLP
  2. The road map
     * Erlang on Xen intro
     * LINCX project overview
     * Speed-related notes
       - Arguments are registers
       - ETS tables are (mostly) ok
       - Do not overuse records
       - GC is key to speed
       - gen_server vs. barebone process
       - NIFs: more pain than gain
       - Fast counters
       - Static compiler?
     * Q&A
  3. Erlang on Xen a.k.a. LING
     * A new Erlang platform that runs without an OS
     * Conceived in 2009
     * Highly compatible with Erlang/OTP
     * Built from scratch, not a "port"
     * Optimized for low startup latency
     * Open sourced in 2014 (github.com/cloudozer/ling)
     * Local and remote builds
     * Go to erlangonxen.org
  4. Zerg demo: zerg.erlangonxen.org
  6. LINCX: project overview
     * Started in December 2013
     * Initial scope = porting LINC-Switch to LING
     * High degree of compatibility demonstrated for LING
     * Extended scope = fix LINC-Switch fast path
     * Beta version of LINCX open sourced on March 3, 2014
     * LINCX runs 100x faster than the old code
     * LINCX repository: github.com/FlowForwarding/lincx
  7. Raw network interfaces in Erlang
     * LING adds raw network interfaces:
         Port = net_vif:open("eth1", []),
         port_command(Port, <<1,2,3>>),
         receive
             {Port,{data,Frame}} -> ...
     * A raw interface receives whole Ethernet frames
     * LINCX uses standard gen_tcp for the control connection and net_vif for data ports
     * Raw interfaces support the mailbox_limit option: packets get dropped if the mailbox of the receiving process overflows:
         Port = net_vif:open("eth1", [{mailbox_limit,16384}]),
         ...
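
Since a raw interface hands whole Ethernet frames to the receiving process, the first step on the data path is usually to split a frame into its header fields. A minimal sketch, assuming the Frame binary is an untagged Ethernet II frame that starts at the destination MAC address; the module and function names are hypothetical and are not taken from LINCX:

```erlang
-module(frame_sketch).
-export([parse/1]).

%% Split a raw Ethernet II frame into header fields and payload.
%% Assumption: no 802.1Q VLAN tag is present; a tagged frame would
%% show up here with EtherType 16#8100 and needs further parsing.
parse(<<Dst:6/binary, Src:6/binary, EtherType:16, Payload/binary>>) ->
    {ok, Dst, Src, EtherType, Payload};
parse(_TooShort) ->
    %% Fewer than 14 bytes: not a complete Ethernet header.
    {error, too_short}.
```

The binary pattern sits in the function head, so the match is a single step and the payload is not copied into a new binary by hand.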
  8. Testbed configuration
     * Test traffic goes between vm1 and vm2
     * LINCX runs as a separate Xen domain
     * Virtual interfaces are bridged in Dom0
  9. IXIA confirms 460kpps peak rate
     * 1GbE hw NICs/128 byte packets
     * IXIA packet generator/analyzer
  10. Processing delay and low-level stats
      * LING can measure a processing delay for a packet:
          1> ling:experimental(processing_delay, []).
          Processing delay statistics:
          Packets: 2000 Delay: 1.342us +- 0.143 (95%)
      * LING can collect low-level stats for a network interface:
          1> ling:experimental(llstat, 1). %% stop/display
          Duration: 4868.6ms
          RX: interrupts: 69170 (0 kicks 0.0%) (freq 14207.4/s period 70.4us)
          RX: reqs per int: 0/0.0/0
          RX: tx buf freed per int: 0/8.5/234
          TX: outputs: 1479707 (112263 kicks 7.6) (freq 303928.8/s period 3.3us)
          TX: tx buf freed per int: 0/0.6/113
          TX: rates: 303.9kpps 3622.66Mbps avg pkt size 1489.9B
          TX: drops: 12392 (freq 2545.3/s period 392.9us)
          TX: drop rates: 2.5kpps 30.26Mbps avg pkt size 1486.0B
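
The packet rates and bit rates quoted in the slides are related by simple arithmetic: bits per second = packets per second x bytes per packet x 8. A small sketch of that conversion; the module name is hypothetical, and the calculation ignores Ethernet framing overhead (preamble, inter-frame gap), which the slides do not discuss:

```erlang
-module(rate_sketch).
-export([mbps/2]).

%% Convert a packet rate (packets/s) and an average packet size (bytes)
%% into megabits per second, with 1 Mbps taken as 1.0e6 bit/s.
mbps(PacketsPerSec, AvgBytes) ->
    PacketsPerSec * AvgBytes * 8 / 1.0e6.
```

With these assumptions, `rate_sketch:mbps(460000, 128)` puts the 460kpps peak with 128-byte packets at about 471 Mbps, and `rate_sketch:mbps(303928.8, 1489.9)` comes out close to the 3622.66Mbps that llstat reports; the small difference comes from the rounded average packet size.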
  12. Arguments are registers
          animal(batman = Cat, Dog, Horse, Pig, Cow, State) ->
              feed(Cat, Dog, Horse, Pig, Cow, State);
          animal(Cat, deli = Dog, Horse, Pig, Cow, State) ->
              pet(Cat, Dog, Horse, Pig, Cow, State);
          ...
      * Many arguments do not make a function any slower
      * But do not reshuffle arguments:
          %% SLOW
          animal(batman = Cat, Dog, Horse, Pig, Cow, State) ->
              feed(Goat, Cat, Dog, Horse, Pig, Cow, State);
          ...
  13. ETS tables are (mostly) ok
      * A small ETS table lookup = 10x function activations
      * Do not use ets:tab2list() inside tight loops
      * Treat ETS as a database, not a pool of global variables
      * 1-2 ETS lookups on the fast path are ok
      * Beware that ets:lookup() and similar calls create a copy of the data on the heap of the caller, similarly to message passing
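
The difference between a keyed lookup and ets:tab2list() on the fast path can be sketched as follows. The table layout (MAC address mapped to a port number) and all names are hypothetical, chosen only for illustration:

```erlang
-module(ets_sketch).
-export([new/0, port_for/2, slow_port_for/2]).

%% Hypothetical forwarding table: {MacAddress, PortNo} rows keyed by MAC.
new() ->
    Tab = ets:new(fwd_table, [set]),
    true = ets:insert(Tab, [{<<1,1,1,1,1,1>>, 1}, {<<2,2,2,2,2,2>>, 2}]),
    Tab.

%% Fast path: one keyed lookup; only the matching row is copied
%% onto the heap of the caller.
port_for(Tab, Mac) ->
    case ets:lookup(Tab, Mac) of
        [{_, PortNo}] -> {ok, PortNo};
        [] -> not_found
    end.

%% Anti-pattern: ets:tab2list/1 copies the whole table on every call.
slow_port_for(Tab, Mac) ->
    case lists:keyfind(Mac, 1, ets:tab2list(Tab)) of
        {_, PortNo} -> {ok, PortNo};
        false -> not_found
    end.
```

Both functions return the same answers; the second one copies every row for each packet, so its cost grows with the table size.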
  14. Do not overuse records
      * setelement() creates a copy of the tuple
      * State#state{foo=Foo1,bar=Bar1,baz=Baz1} creates 3(?) copies of the tuple
      * Use tuples explicitly in performance-critical sections to control the heap footprint of the code:
          %% from 9p.erl
          mixer({rauth,_,_}, {tauth,_,AFid,_,_}, _) -> {write_auth,AFid};
          mixer({rauth,_,_}, {tauth,_,AFid,_,_,_}, _) -> {write_auth,AFid};
          mixer({rwrite,_,_}, _, initial) -> start_attaching;
          mixer({rerror,_,_}, _, initial) -> auth_failed;
          mixer({rlerror,_,_}, _, initial) -> auth_failed;
          mixer({rattach,_,Qid}, {tattach,_,Fid,_,_,AName,_}, initial) -> {attach_more,Fid,AName,qid_type(Qid)};
          mixer({rclunk,_}, {tclunk,_,Fid}, initial) -> {forget,Fid};
  15. Garbage collection is key to speed
      * Heap is a list of chunks
      * 'new heap' is close to its head, 'old heap' is close to its tail
      * A GC run takes 10μs on average
      * GC may run 1000s of times per second
      [Figure labels: proc_t, HTOP ...]
  16. How to tackle GC-related issues
      * (Priority 1) Call erlang:garbage_collect() at strategic points
      * (Priority 2) For the fastest code avoid GC completely; restart the fast process regularly:
          spawn(F, [{suppress_gc,true}]), %% LING-only
      * (Priority 3) Use the fullsweep_after option
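
Priorities 1 and 3 can be tried with standard Erlang/OTP calls alone. A minimal sketch, assuming a worker that processes messages in batches; the module name, the message shapes, the batch logic, and the fullsweep_after value of 10 are all illustrative assumptions, and the LING-only suppress_gc option is deliberately left out:

```erlang
-module(gc_sketch).
-export([start/1, loop/2]).

%% Spawn a worker with an explicit fullsweep_after setting (Priority 3).
%% The value 10 is an arbitrary placeholder, not a recommendation.
start(BatchSize) ->
    spawn_opt(?MODULE, loop, [BatchSize, 0], [{fullsweep_after, 10}]).

%% After BatchSize messages, collect garbage explicitly (Priority 1).
%% The idea is to pick a point between batches, where little live
%% data is expected to remain on the heap.
loop(BatchSize, Count) when Count >= BatchSize ->
    true = erlang:garbage_collect(),
    loop(BatchSize, 0);
loop(BatchSize, Count) ->
    receive
        {work, From, Bin} ->
            From ! {done, byte_size(Bin)},
            loop(BatchSize, Count + 1);
        stop ->
            ok
    end.
```

Whether an explicit collection helps depends on the workload, so the batch size is something to measure rather than guess.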
  17. gen_server vs barebone process
      * Message passing using gen_server:call() is 2x slower than Pid ! Msg
      * For speedy code prefer barebone processes to gen_servers
      * Design Principles are about high availability, not high performance
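
What a barebone process looks like in place of a gen_server can be sketched as below. The accumulator protocol and all names are hypothetical; the sketch also gives up what gen_server:call() provides for free, such as monitoring the callee:

```erlang
-module(bare_sketch).
-export([start/0, add/2, total/1]).

start() ->
    spawn(fun() -> loop(0) end).

%% Fire-and-forget update: a plain send, no reply is awaited.
add(Pid, N) ->
    Pid ! {add, N},
    ok.

%% Hand-rolled synchronous request: the unique reference pairs the
%% reply with this particular request.
total(Pid) ->
    Ref = make_ref(),
    Pid ! {total, self(), Ref},
    receive
        {Ref, Total} -> Total
    after 1000 ->
        timeout
    end.

loop(Total) ->
    receive
        {add, N} ->
            loop(Total + N);
        {total, From, Ref} ->
            From ! {Ref, Total},
            loop(Total)
    end.
```

The fast path (`add/2`) is a single send, while the occasional `total/1` query pays for a round trip.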
  18. NIFs: more pain than gain
      * A new principle of Erlang development: do not use NIFs
      * For a small performance boost, NIFs undermine key properties of Erlang: reliability and soft-realtime guarantees
      * Most of the time Erlang code can be made as fast as C
      * Most performance problems of Erlang are traceable to NIFs or to external C libraries, which are similar
      * Erlang on Xen does not have NIFs and we do not plan to add them
  19. Fast counters
      * 32-bit or 64-bit unsigned integer counters with overflow: trivial in C, not easy in Erlang
      * FIXNUMs are signed 29-bit integers; BIGNUMs consume heap and are 10-100x slower
      * Use two variables for a counter?
          foo(C1, 16#ffffff, ...) -> foo(C1+1, 0, ...);
          foo(C1, C2, ...) -> foo(C1, C2+1, ...);
          ...
      * LING has a new experimental feature, fast counters:
          erlang:new_counter(Bits) -> Ref
          erlang:increment_counter(Ref, Incr)
          erlang:read_counter(Ref)
          erlang:release_counter(Ref)
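
The two-variable idea can be written out as a complete function. A minimal sketch in standard Erlang, assuming two 24-bit halves that together wrap at 48 bits; the module and function names are hypothetical, and the sketch does not use the experimental LING counter API:

```erlang
-module(counter_sketch).
-export([count/3, value/2]).

%% Each half holds 24 bits, the same limit as in the slide.
-define(HALF_MAX, 16#ffffff).

%% Increment the (Hi, Lo) pair N times. Both halves stay in function
%% arguments and never exceed 24 bits, so each stays a small integer.
count(0, Hi, Lo) ->
    {Hi, Lo};
count(N, Hi, ?HALF_MAX) ->
    %% Low half is full: carry into the high half, wrapping it as well.
    count(N - 1, (Hi + 1) band ?HALF_MAX, 0);
count(N, Hi, Lo) ->
    count(N - 1, Hi, Lo + 1).

%% Combine the halves for reporting. The result can exceed the small
%% integer range, so this belongs off the fast path.
value(Hi, Lo) ->
    (Hi bsl 24) bor Lo.
```

For example, `counter_sketch:count(1, 0, 16#ffffff)` carries into the high half and returns `{1, 0}`.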
  20. Future: static compiler for Erlang
      * Scalars and algebraic types
      * Structural types only, no nominal types
      * Target compiler efficiency, not static type checking
      * A middle ground between:
        - "Type is a first class citizen" (Haskell)
        - "A single type is good enough" (Python, Erlang)
  21. Future: static compiler for Erlang - 2
      * Challenges:
        - Pattern matching compilation
        - Type inference for recursive types
            y = {(unit | y), x, (unit | y)}
            y = nil | {x, y}
      * Work started in 2013
      * Currently the compiler is at the proof-of-concept stage
  22. Questions?
      e-mail: maxim.kharchenko@gmail.com