5. BGP zombie / ghost route
„an active routing table entry for a prefix that has been withdrawn
by its origin network”
source: https://labs.ripe.net/Members/romain_fontugne/bgp-zombies (2019)
see also: „BGP Zombies: an Analysis of Beacons Stuck Routes” (2019),
https://www.iij-ii.co.jp/en/members/romain/pdf/romain_pam2019.pdf
not a new phenomenon
Ghost Route Hunter (2003): https://www.sixxs.net/tools/grh/what/
„An overview of the global IPv6 routing table” (2005):
https://meetings.ripe.net/ripe-50/presentations/ripe50-plenary-tue-ipv6-routing.pdf
may take hours/days to „expire”
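The definition above can be turned into a simple check. This is a minimal sketch, not the method used in the cited studies: the function name, the peer names and the 90-minute grace period are all assumptions chosen for illustration. A route counts as a zombie only if a peer still holds it after ordinary convergence should have finished.

```python
from datetime import datetime, timedelta

def find_zombies(withdrawn_at, still_seen, now, grace=timedelta(minutes=90)):
    """Return peers that still hold a route `grace` after its withdrawal.

    withdrawn_at: when the origin withdrew the prefix
    still_seen:   dict peer name -> True if the peer still has the route at `now`
    grace:        assumed allowance for normal BGP convergence (hypothetical value)
    """
    if now - withdrawn_at < grace:
        return []  # too early to tell a zombie from slow convergence
    return sorted(peer for peer, present in still_seen.items() if present)

# Hypothetical observations from three route collector peers.
withdrawn = datetime(2020, 1, 1, 12, 0)
rib = {"peer-a": False, "peer-b": True, "peer-c": True}

print(find_zombies(withdrawn, rib, datetime(2020, 1, 1, 12, 30)))  # []
print(find_zombies(withdrawn, rib, datetime(2020, 1, 1, 15, 0)))   # ['peer-b', 'peer-c']
```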
6. BGP zombie / ghost route
Who cares?
It was withdrawn anyway!
Unless we are talking about:
a partial withdrawal, where some ingress traffic takes a different path
than you expect, does not converge, or even loops
a more-specific route, where the zombie sits in Tier1/Tier2/NSP/IXP
infrastructure and causes a partial or complete outage
7. More-specific prefix usage examples
Traffic engineering
Announce 10.0.0.0/23 into global table
Announce 10.0.0.0/24 to some IXP peers to override their local prefs
Customer delegation
ISP1 announces 10.1.0.0/16 PA block
ISP1 delegates 10.1.2.0/24 to customer
Customer runs its own BGP, announces 10.1.2.0/24 via ISP1, ISP2 and IXP
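Both use cases rely on longest-prefix match: a router forwards on the most specific covering prefix before any BGP attribute such as local preference is compared. A minimal sketch of that lookup, using the traffic engineering prefixes above and hypothetical next-hop labels ("transit", "ixp-peer"):

```python
import ipaddress

def best_route(table, dst):
    """Return (prefix, next_hop) for the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, nh) for net, nh in table.items() if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen) if matches else None

# Next-hop labels are illustrative assumptions, not taken from the slides.
table = {
    ipaddress.ip_network("10.0.0.0/23"): "transit",
    ipaddress.ip_network("10.0.0.0/24"): "ixp-peer",
}

print(best_route(table, "10.0.0.10")[1])  # ixp-peer (the /24 wins)
print(best_route(table, "10.0.1.10")[1])  # transit  (only the /23 covers it)
```

The same rule is what makes a zombie more-specific dangerous: as long as the stale /24 stays in a table, it wins over the valid /23 for every address it covers.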
9. 2016 (TPNET-OTI loop)
Orange PL (5617) – Opentransit (5511)
Zombie AS path: 5511 1299 24724 57811 201029 x
Looking glass:
TPNET sees the (zombie) more-specific via OTI
OTI has the less-specific via TPNET
I gave up after a 20-minute outage and re-announced
the more-specific to save „x”
Withdrawn later with no issues
11. 2016 (Interoute/AS8928 hijack)
• a zombie /24 route via NTT at the former
Interoute/Madrid hijacked a significant part of
ingress traffic
• luckily, no loop; the trace reaches the customer in
Warsaw
• lasted many hours, finally „fixed” by
announce/withdraw flaps
14. 2018 (Telia loop)
• 1299 announces zombie route
• hijacks and loops large portion of ingress traffic
• we reproduced this problem with another, non-production prefix
• ~two days of disaster!
• „Routeprocessor Switchover in one of our backbone router in Chicago
solved the issue”
15. 2020 (TATA-Level3 loop)
Router: gin-n0v-tcore1
Site: US, New York, N0V
Command: traceroute inet4 x as-number-lookup
traceroute to x (x), 30 hops max, 52 byte packets
1 if-ae-7-5.tcore1.nto-newyork.as6453.net (63.243.128.141) 2.990 ms 1.545 ms 1.369 ms
MPLS Label=415563 CoS=0 TTL=1 S=1
2 if-ae-9-2.tcore1.n75-newyork.as6453.net (63.243.128.122) 1.653 ms 1.704 ms 1.439 ms
3 ae-7.edge2.NewYorkCity6.Level3.net (4.68.39.49) [AS 3356] 3.038 ms 1.118 ms 3.086 ms
4 ae-1-3103.ear3.Frankfurt1.Level3.net (4.69.163.86) [AS 3356] 82.672 ms 81.989 ms 82.221 ms
5 ix-ae-18-0.tcore1.fr0-frankfurt.as6453.net (195.219.50.49) 82.072 ms 81.949 ms 81.731 ms
6 if-ae-4-2.tcore2.fnm-frankfurt.as6453.net (195.219.87.17) 87.154 ms if-ae-59-2.tcore2.fnm-frankfurt.as6453.net (195.219.87.194) 87.064 ms 87.038 ms
MPLS Label=486720 CoS=0 TTL=1 S=1
7 if-ae-30-2.tcore1.pvu-paris.as6453.net (80.231.153.89) 86.645 ms if-ae-9-3.tcore1.pvu-paris.as6453.net (195.219.87.14) 87.036 ms if-ae-9-2.tcore1.pvu-paris.as6453.net (195.219.87.10) 87.412 ms
MPLS Label=345609 CoS=0 TTL=1 S=1
8 if-ae-11-2.tcore1.pye-paris.as6453.net (80.231.153.50) 87.357 ms 87.522 ms 86.774 ms
MPLS Label=525823 CoS=0 TTL=1 S=1
9 if-ae-3-2.tcore1.l78-london.as6453.net (80.231.154.143) 87.089 ms 86.984 ms 87.120 ms
MPLS Label=558832 CoS=0 TTL=1 S=1
10 if-ae-66-2.tcore2.nto-newyork.as6453.net (80.231.130.106) 86.711 ms 86.872 ms 87.689 ms
MPLS Label=300093 CoS=0 TTL=1 S=1
11 if-ae-12-2.tcore1.n75-newyork.as6453.net (66.110.96.5) 86.838 ms 86.749 ms 86.667 ms
12 ae-7.edge2.NewYorkCity6.Level3.net (4.68.39.49) [AS 3356] 87.039 ms 86.777 ms 108.465 ms
13 ae-1-3103.ear3.Frankfurt1.Level3.net (4.69.163.86) [AS 3356] 167.903 ms 167.436 ms 167.919 ms
14 ix-ae-18-0.tcore1.fr0-frankfurt.as6453.net (195.219.50.49) 167.316 ms 167.016 ms 167.156 ms
15 if-ae-4-2.tcore2.fnm-frankfurt.as6453.net (195.219.87.17) 172.082 ms 172.347 ms if-ae-59-2.tcore2.fnm-frankfurt.as6453.net (195.219.87.194) 172.688 ms
MPLS Label=486720 CoS=0 TTL=1 S=1
16 if-ae-9-3.tcore1.pvu-paris.as6453.net (195.219.87.14) 172.403 ms if-ae-9-2.tcore1.pvu-paris.as6453.net (195.219.87.10) 177.623 ms 172.588 ms
MPLS Label=345609 CoS=0 TTL=1 S=1
17 if-ae-11-2.tcore1.pye-paris.as6453.net (80.231.153.50) 173.956 ms 176.402 ms 172.581 ms
MPLS Label=525823 CoS=0 TTL=1 S=1
18 if-ae-3-2.tcore1.l78-london.as6453.net (80.231.154.143) 172.784 ms 172.592 ms 172.921 ms
MPLS Label=558832 CoS=0 TTL=1 S=1
19 if-ae-66-2.tcore2.nto-newyork.as6453.net (80.231.130.106) 172.660 ms 172.503 ms 172.937 ms
MPLS Label=300093 CoS=0 TTL=1 S=1
20 if-ae-12-2.tcore1.n75-newyork.as6453.net (66.110.96.5) 172.258 ms 172.540 ms 171.995 ms
21 ae-7.edge2.NewYorkCity6.Level3.net (4.68.39.49) [AS 3356] 183.732 ms 171.950 ms 172.068 ms
22 ae-1-3103.ear3.Frankfurt1.Level3.net (4.69.163.86) [AS 3356] 252.748 ms 252.855 ms 252.719 ms
23 ix-ae-18-0.tcore1.fr0-frankfurt.as6453.net (195.219.50.49) 253.215 ms 253.049 ms 252.474 ms
24 if-ae-59-2.tcore2.fnm-frankfurt.as6453.net (195.219.87.194) 258.598 ms if-ae-4-2.tcore2.fnm-frankfurt.as6453.net (195.219.87.17) 258.467 ms 257.584 ms
MPLS Label=486720 CoS=0 TTL=1 S=1
25 if-ae-9-3.tcore1.pvu-paris.as6453.net (195.219.87.14) 257.906 ms 257.857 ms if-ae-9-2.tcore1.pvu-paris.as6453.net (195.219.87.10) 258.308 ms
MPLS Label=345609 CoS=0 TTL=1 S=1
26 if-ae-11-2.tcore1.pye-paris.as6453.net (80.231.153.50) 257.546 ms 257.812 ms 268.691 ms
MPLS Label=525823 CoS=0 TTL=1 S=1
27 if-ae-3-2.tcore1.l78-london.as6453.net (80.231.154.143) 261.149 ms 257.873 ms 258.124 ms
MPLS Label=558832 CoS=0 TTL=1 S=1
28 if-ae-66-2.tcore2.nto-newyork.as6453.net (80.231.130.106) 257.746 ms 257.491 ms 258.035 ms
MPLS Label=300093 CoS=0 TTL=1 S=1
29 if-ae-12-2.tcore1.n75-newyork.as6453.net (66.110.96.5) 257.737 ms 258.226 ms 257.614 ms
30 ae-7.edge2.NewYorkCity6.Level3.net (4.68.39.49) [AS 3356] 257.587 ms 259.322 ms 258.347 ms
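The repetition in the trace above can be spotted mechanically. A minimal sketch, assuming the hop addresses have already been extracted from the traceroute output (first responder per hop; the `find_loop` helper is hypothetical). A repeated address is only a strong hint of a loop, since ECMP or asymmetric replies can also produce odd traces.

```python
def find_loop(hops):
    """Return (first_hop, repeat_hop), 1-based, for the first repeated address."""
    seen = {}
    for n, ip in enumerate(hops, start=1):
        if ip in seen:
            return seen[ip], n
        seen[ip] = n
    return None

# First responder of hops 1..13 from the trace above.
hops = [
    "63.243.128.141", "63.243.128.122", "4.68.39.49", "4.69.163.86",
    "195.219.50.49", "195.219.87.17", "80.231.153.89", "80.231.153.50",
    "80.231.154.143", "80.231.130.106", "66.110.96.5", "4.68.39.49",
    "4.69.163.86",
]

first, again = find_loop(hops)
print(first, again, again - first)  # 3 12 9
```

Hop 12 revisits the router first seen at hop 3, so each pass through the loop costs nine hops, which matches the round-trip time stepping up by roughly 85 ms per pass.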
16. 2020 (TATA-Level3 loop)
…
20 if-ae-12-2.tcore1.n75-newyork.as6453.net (66.110.96.5) 172.258 ms 172.540 ms 171.995 ms
21 ae-7.edge2.NewYorkCity6.Level3.net (4.68.39.49) [AS 3356] 183.732 ms 171.950 ms 172.068 ms
22 ae-1-3103.ear3.Frankfurt1.Level3.net (4.69.163.86) [AS 3356] 252.748 ms 252.855 ms 252.719 ms
23 ix-ae-18-0.tcore1.fr0-frankfurt.as6453.net (195.219.50.49) 253.215 ms 253.049 ms 252.474 ms
24 if-ae-59-2.tcore2.fnm-frankfurt.as6453.net (195.219.87.194) 258.598 ms if-ae-4-2.tcore2.fnm-frankfurt.as6453.net (195.219.87.17) 258.467 ms 257.584 ms
MPLS Label=486720 CoS=0 TTL=1 S=1
25 if-ae-9-3.tcore1.pvu-paris.as6453.net (195.219.87.14) 257.906 ms 257.857 ms if-ae-9-2.tcore1.pvu-paris.as6453.net (195.219.87.10) 258.308 ms
MPLS Label=345609 CoS=0 TTL=1 S=1
26 if-ae-11-2.tcore1.pye-paris.as6453.net (80.231.153.50) 257.546 ms 257.812 ms 268.691 ms
MPLS Label=525823 CoS=0 TTL=1 S=1
27 if-ae-3-2.tcore1.l78-london.as6453.net (80.231.154.143) 261.149 ms 257.873 ms 258.124 ms
MPLS Label=558832 CoS=0 TTL=1 S=1
28 if-ae-66-2.tcore2.nto-newyork.as6453.net (80.231.130.106) 257.746 ms 257.491 ms 258.035 ms
MPLS Label=300093 CoS=0 TTL=1 S=1
29 if-ae-12-2.tcore1.n75-newyork.as6453.net (66.110.96.5) 257.737 ms 258.226 ms 257.614 ms
…
17. 2020 (TATA-Level3 loop)
1. TATA/US „sees” the more-specific via Level3/US
2. Level3/US does not have this zombie route and uses „cold potato” routing to reach Level3/Frankfurt
3. Level3 passes packets to TATA in Frankfurt (less-specific route; the destination is TATA's customer in Poland)
4. once the packets are back in TATA, the „zombie more-specific via Level3” kicks in: traffic goes to TATA/US, where it is passed to Level3/US once again…
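The four steps can be replayed with toy forwarding tables. This is a simplified sketch under stated assumptions: one table per site, the 10.0.0.0/23 and 10.0.0.0/24 prefixes stand in for the real (undisclosed) ones, and the intermediate TATA hops through Paris and London are collapsed into a single step.

```python
import ipaddress

ZOMBIE = "10.0.0.0/24"  # stand-in for the withdrawn more-specific

def lookup(table, dst):
    """Longest-prefix match; returns the next hop or None."""
    addr = ipaddress.ip_address(dst)
    nets = [p for p in table if addr in ipaddress.ip_network(p)]
    if not nets:
        return None
    return table[max(nets, key=lambda p: ipaddress.ip_network(p).prefixlen)]

def trace(tables, start, dst, max_hops=30):
    """Follow next hops from `start`; report delivery, missing route or loop."""
    path, node = [start], start
    for _ in range(max_hops):
        nxt = lookup(tables[node], dst)
        if nxt is None:
            return path, "no route"
        if nxt == "local":
            return path, "delivered"
        path.append(nxt)
        if path.count(nxt) > 1:
            return path, "loop"
        node = nxt
    return path, "max hops exceeded"

tables = {
    "TATA/US": {"10.0.0.0/23": "TATA/Frankfurt", ZOMBIE: "Level3/US"},
    "Level3/US": {"10.0.0.0/23": "Level3/Frankfurt"},
    "Level3/Frankfurt": {"10.0.0.0/23": "TATA/Frankfurt"},
    "TATA/Frankfurt": {"10.0.0.0/23": "customer", ZOMBIE: "TATA/US"},
    "customer": {"10.0.0.0/23": "local"},
}
# Same topology with the zombie entries removed everywhere.
healthy = {n: {p: nh for p, nh in t.items() if p != ZOMBIE} for n, t in tables.items()}

print(trace(tables, "TATA/US", "10.0.0.10"))
print(trace(healthy, "TATA/US", "10.0.0.10"))
```

With the zombie present the packet returns to TATA/US after visiting Level3/US, Level3/Frankfurt and TATA/Frankfurt; with it removed, the less-specific delivers the packet to the customer.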
18. 2020 (Level3 loop and zombie resurrection)
• First outage directly after withdrawal
• Finally BGP converges
• However, a few hours later the zombie route resurrects in the AS3356 core and causes
another 1h outage
20. 2020 Aug (well-known Centurylink/Level3-related outage)
NANOG mailing list threads:
„Centurylink having a bad morning?”
„[outages] Major Level3 (CenturyLink) Issues”
https://mailman.nanog.org/pipermail/nanog/2020-August/thread.html
https://mailman.nanog.org/pipermail/nanog/2020-September/thread.html
https://puck.nether.net/pipermail/outages/2020-August/013204.html
21. 2020 Aug (well-known Centurylink/Level3-related outage)
Analysis:
https://blog.thousandeyes.com/centurylink-level-3-outage-analysis/
„Level 3 continues to advertise stale routes despite services withdrawing routes”
https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
https://radar.qrator.net/blog/another-centurylink-bgp-incident
34. Zombie risk mitigation
Fix all Tier1 routers
Gradual more specific withdrawal
stage 1: withdraw from distant locations and transits
stage 2: withdraw from local/national peerings
Selective more specific announcements
by continent/peer
no transit, just peerings
bonus: faster convergence!
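The two-stage withdrawal can be written down as a plan before touching any router. A minimal sketch: the session names and the scope labels ("transit", "distant", "local") are hypothetical, and the actual withdrawal and the wait between stages are left to the operator.

```python
# Stage order follows the slide: far away first, close to home last.
STAGES = [
    ("stage 1", {"transit", "distant"}),
    ("stage 2", {"local"}),
]

def withdrawal_plan(sessions):
    """Group BGP sessions (name -> scope label) into ordered withdrawal stages."""
    plan = []
    for stage, scopes in STAGES:
        names = sorted(n for n, scope in sessions.items() if scope in scopes)
        plan.append((stage, names))
    return plan

# Hypothetical session inventory.
sessions = {
    "transit-a": "transit",
    "ixp-remote": "distant",
    "ixp-home": "local",
    "peer-national": "local",
}

for stage, names in withdrawal_plan(sessions):
    print(stage, names)
# stage 1 ['ixp-remote', 'transit-a']
# stage 2 ['ixp-home', 'peer-national']
```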
35. Selective announcements / traffic steering
Use the communities, Luke!
Features
excellent customer BGP communities (NTT, Telia, GTT, DE-CIX)
good enough
~nothing (HE)
secret
Transition
transparent
partial clear/override
full clear
overlap risk! (EC/LC still not widely adopted)
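One reading of the overlap risk: standard communities (RFC 1997) are a single 32-bit value, conventionally written as two 16-bit halves, so nothing stops two networks from assigning different meanings to the same value when communities pass through transparently. Large communities (RFC 8092) carry a 32-bit Global Administrator field that namespaces values by ASN. A minimal sketch of an overlap check; all provider names and community values below are invented for illustration.

```python
from collections import defaultdict

def overlapping(actions):
    """Return community values that more than one provider acts on."""
    owners = defaultdict(set)
    for provider, communities in actions.items():
        for community in communities:
            owners[community].add(provider)
    return {c: sorted(p) for c, p in owners.items() if len(p) > 1}

# Hypothetical action communities as (high 16 bits, low 16 bits) pairs.
actions = {
    "transit-a": {(65000, 100), (64500, 1)},
    "transit-b": {(65000, 100), (64501, 2)},
}

print(overlapping(actions))  # {(65000, 100): ['transit-a', 'transit-b']}
```

If transit-a forwards (65000, 100) untouched, transit-b may act on a tag that was never meant for it, which is the argument for clearing or overriding on transition.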
36. Example: add GTT leak to the mix (via RETN)
Note: covers all RETN, Telia, GTT and
TATA customers (not visible here)
37. Example: leak to Telia (via Level3)
Note: leaks to all Level3 customers
(incl. RETN) and Telia customers
38. Per customer announcement tailoring (BIRD filter syntax)
case bgp_path.last {
    # ASx Customer Foo (uses: Level3, Telia)
    x:
        if pop = "PLIX" then bgp_community.add(level3_yes_telia);
        if pop = "THINX" then bgp_community.add(retn_yes_telia);
        if pop = "LINX" then {…}
    # ASy Customer Bar (uses: GTT, Cogent)
    y:
        if pop = "PLIX" then bgp_community.add(level3_yes_cogent);
        if pop = "THINX" then bgp_community.add(retn_yes_gtt);
        if pop = "LINX" then {…}
    # ASz Customer Baz...
}
docs: https://bird.network.cz/?get_doc&v=20&f=bird-5.html#ss5.4
39. Summary
Still not well understood
BGP update queueing, races/reordering, losses?
BGP optimizers/stabilizers, broken damping?
In $vendors we trust
Avoid more-specifics in global table
Monitor your reachability/visibility