3. High Availability and Fast Convergence
This section analyzes the problems you could face when trying to deploy fast convergence.
Several techniques have been developed to allow high availability
and fast convergence, including:
§ Graceful restart
§ Fast down detection
§ Exponential backoff
§ Speeding up route selection
4. Considerations in Fast Convergence
§ Scale and speed are contradictory goals.
§ The faster a network converges, the less stable it is likely to be.
Fast reactions to changes in the network topology tend to create positive
feedback loops, which result in a network that simply will not converge.
The pieces of a network you need to be concerned about when considering
subsecond (fast) convergence:
§ The physical layer: how fast can a down link be detected?
§ Routing protocol convergence: how fast can a routing protocol react to the
topology change?
§ Forwarding: how fast can the forwarding engine on each router in the network
adjust to the new paths that the routing protocol calculates?
5. Network Meltdown Definition
A state in which a network grinds to a halt due to excessive traffic.
A network meltdown generally starts as a broadcast storm that gets out of control,
but even legitimate network messages can cause a meltdown
if the network has not been designed to accommodate that level of traffic.
6. Network Meltdowns
The link between Routers D and G flaps: it cycles between the "down" and "up" states
slowly enough
§ for a routing adjacency to be formed
§ for the new link to be advertised as part of the topology
but too quickly
§ for the link to actually be used
The adjacency between D and G forms and tears down as quickly as the routing protocol allows.
[Figure: topology of Routers A through G; the link between D and G is flapping]
7. Slow down
How do you work around this sort of problem in the routing protocol?
The answer is simple: slow down!
Methods of slowing down:
§ Do not report all interface transitions from the physical layer up to the routing
protocol. This is called debouncing the interface.
§ Slow down neighbor timers.
§ Slow down the distribution of information about topology changes.
§ Slow down the speed at which the routing protocol reacts to information about
topology changes.
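The debouncing idea in the first bullet can be sketched in a few lines of Python. This is a minimal illustration, not how any router implements it: the function name `debounce`, the `hold_down` parameter, and the sample transition times are all assumptions made for the example. A state change is passed up to the routing protocol only if the interface stays in that state for at least `hold_down` seconds.

```python
def debounce(transitions, hold_down, initial_state="up"):
    """Filter raw interface transitions (hypothetical helper).

    transitions: list of (time_in_seconds, state) tuples, sorted by time.
    A transition is reported only if the interface then stays in that
    state for at least hold_down seconds and the state differs from the
    last one reported to the routing protocol.
    """
    reported = []
    last_reported = initial_state
    for i, (t, state) in enumerate(transitions):
        is_last = i + 1 == len(transitions)
        stable = is_last or transitions[i + 1][0] - t >= hold_down
        if stable and state != last_reported:
            # The protocol learns of the change once the hold-down expires.
            reported.append((t + hold_down, state))
            last_reported = state
    return reported


# A link that flaps four times in under a second, then fails for good.
flaps = [(0.0, "down"), (0.2, "up"), (0.4, "down"), (0.6, "up"), (5.0, "down")]
print(debounce(flaps, hold_down=1.0))
```

With these sample values the four rapid flaps are suppressed and only the final, persistent failure reaches the routing protocol, at the cost of a one-second delay in reporting it.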
8. To provide stability within a routing system
Several methods are typically used in routing protocol design and implementation to provide
stability within a routing system:
§ IS-IS
§ a timer regulates how often a router can originate new routing information
lsp-gen-interval { level-1 | level-2 } lsp-max-wait [ lsp-initial-wait lsp-second-wait ]
lsp-max-wait maximum interval between two consecutive occurrences of an LSP being generated
lsp-initial-wait initial LSP generation delay
lsp-second-wait hold time between the first and second LSP generation
§ how often a router can run the shortest path first (SPF) algorithm
that calculates the best paths through the network
spf-interval [level-1 | level-2] spf-max-wait [spf-initial-wait spf-second-wait]
spf-max-wait maximum interval between two consecutive SPF calculations
spf-initial-wait initial SPF calculation delay after a topology change
spf-second-wait hold time between the first and second SPF calculation
9. To provide stability within a routing system (cont)
§ OSPF
§ similar timers regulate the rate at which topology information can be
transmitted and the frequency at which the shortest path first algorithm
can be run.
§ EIGRP
§ the simple rule “No route may be advertised until it is installed in
the local routing table” dampens the speed at which routing
information is propagated through the network.
§ routing information is also paced when being transmitted through the
network, based on the bandwidth between two routers. By default, EIGRP uses
up to 50% of the bandwidth configured on the interface.
10. Do not report everything
Reporting changes more slowly when they occur quickly, or not reporting some
events at all, makes routing converge much faster while providing the expected stability
§ Router should not immediately report all the events of which it is aware:
§ link failure
§ neighbor failures
§ Let’s sort out which events are in some sense
§ important
§ not
§ Example:
§ if a router loses contact with an adjacent router because the adjacent
router restarted for some reason, do not report the resulting change in
topology until it’s clear the neighbor is not coming back
11. The classic questions
§ How long do you wait before deciding the problem is real ?
§ What happens to traffic you would normally forward to that neighbor
while you are waiting ?
§ How do you reconnect in a way that allows the network to continue
operating correctly ?
Two technologies incorporated in routing protocols can answer
these questions:
§ Graceful Restart (GR)
§ Non-Stop Forwarding (NSF)
12. Control plane / forwarding plane
What happens to traffic received by a router while it is restarting?
Normally:
§ this traffic is dropped
§ any applications that are impacted must retransmit lost data
You can prevent this by taking advantage of the separation between
the control plane and the forwarding plane:
if the control plane fails or restarts for any reason, the data plane can continue
forwarding traffic based on the last known good information.
14. Non-Stop Forwarding
NSF is implemented through Stateful Switchover (SSO) in Cisco products.
NSF allows continuous forwarding to take place regardless of the state of the
control plane.
Normally, when the control plane resets, it signals the data plane to
clear its tables and reset.
With NSF enabled, this signal from the control plane instead tells the data plane to mark
its current data as stale and to begin aging out the information.
15. Non-Stop Forwarding (cont)
At this point, the Route Processor (RP) should be able to
§ bring the control plane back up
§ resynchronize the routing protocol databases
§ rebuild the routing table
without disturbing the packets that are still being switched by the data plane on
the router.
This is accomplished through Graceful Restart.
18. Graceful Restart for any routing protocol (cont)
§ Routers A and B exchange some form of signaling to indicate that they
understand GR signaling and will respond to it correctly.
§ This signaling does not imply that the router is capable of restarting gracefully
or of forwarding traffic through a local failure,
only that it can support a neighboring router performing Graceful Restart.
§ A router whose control and data planes are not cleanly separated
cannot fully support GR itself, but it can still support the signaling that a
neighboring router needs to restart gracefully.
21. OSPF Graceful Restart
Two styles of OSPF Graceful Restart are available:
§ Graceful Restart using link local signaling
§ Graceful Restart using opaque link-state advertisements (LSAs)
23. OSPF Graceful Restart using Link Local Signalling
This method of signaling GR, described in the IETF Internet-Draft,
“OSPF Restart Signaling,” (draft-nguyen-ospf-restart-04.txt)
relies on two mechanisms:
§ Link Local Signaling (LLS)
a mechanism described in the IETF Internet-Draft,
“OSPF Link-local Signaling” (draft-nguyen-ospf-lls-02.txt).
This draft extends the OSPF hello packet format to include TLVs, which can then
carry additional signaling of various types, such as advertising graceful restart
capability and signaling a graceful restart.
§ Out of Band Resynchronization
a mechanism described in the IETF Internet-Draft,
“OSPF Out-of-Band LSDB Resynchronization” (draft-nguyen-ospf-oob-resync-04.txt).
This draft describes a mechanism through which two OSPF routers can resynchronize
their link-state databases at any point.
27. Fast Down Detection
Before you can route around a failed link or device,
you need to detect its failure.
Detecting failure is a major concern in the highly available network.
You can detect a neighbor or link failure in two ways:
§ Polling through fast hellos or other packets, transmitted at Layer 2 or Layer 3
§ Event-driven notification through monitoring some link property, such as the
link carrier
28. Detecting a Link or Adjacency Failure Using Polling
One common method to detect a link or adjacency failure is polling,
or periodically sending hello packets to the adjacent device and
expecting a periodic hello packet in return.
The two determining factors in the speed at which polling can discover a failed
link or device are as follows:
§ The rate at which hello packets are transmitted
§ The number of hello packets missed before declaring a link or adjacency as
failed
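The effect of these two factors can be worked out with simple arithmetic. The sketch below assumes the neighbor restarts its hold timer each time a hello arrives. The function name `detection_window` and the sample values are illustrative only.

```python
def detection_window(hello_interval, missed_hellos):
    """Return (fastest, slowest) failure detection time in seconds.

    Assumes the hold timer restarts whenever a hello is received:
    - slowest: the neighbor fails just after sending a hello, so the
      full hold time must expire;
    - fastest: the neighbor fails just before its next hello was due,
      so one hello interval of the hold time has already elapsed.
    """
    hold_time = hello_interval * missed_hellos
    return hold_time - hello_interval, hold_time


# 10-second hellos, declared down after 3 missed hellos (30-second hold time)
print(detection_window(10, 3))
# sub-second hellos with the same multiplier
print(detection_window(0.33, 3))
```

With a 10-second hello and a 30-second hold time, a failure is detected between 20 and 30 seconds after it occurs, which is why faster hellos are needed for subsecond convergence.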
29. How Fast Does Polling Detect a Down Neighbor ?
[Figure: hellos transmitted between Routers A and B, with timeline points A through F
marking the last hellos transmitted; 10-second hello interval, 30-second hold interval]
30. Fast hellos
Most protocols can use faster timers than their defaults:
§ OSPF can transmit a hello every 330 milliseconds and set the dead interval to
1 second
ip ospf dead-interval minimal hello-multiplier multiplier
§ IS-IS can transmit a hello every 330 milliseconds and set the dead interval to
1 second
isis hello-interval minimal [level-1 | level-2]
isis hello-multiplier multiplier [level-1 | level-2]
the hello multiplier is set to 3 by default.
§ EIGRP can transmit a hello every second and set the dead interval to 3 sec
ip hello-interval eigrp [autonomous system] [seconds]
ip hold-time eigrp [autonomous system] [seconds]
31. Bidirectional Forwarding Detection - BFD
What's BFD ?
§ Lightweight hello protocol designed to run over multiple transport protocols
§ Designed for sub-second Layer 3 failure detection
§ Any interested client (EIGRP, IS-IS, OSPF, etc.) registers with BFD and is
notified as soon as BFD detects a neighbor loss
§ All registered clients benefit from uniform failure detection
§ Runs on physical, virtual and bundle interfaces
§ Uses UDP port 3784 (control packets) and 3785 (echo packets)
33. Event-driven notification through monitoring link
Rather than polling periodically, rely on event-driven notification of link failures:
lower-layer devices monitor the link status and notify the routing
protocol when the link fails.
§ SONET/SDH
probably the best known of the fast convergence technologies available;
it not only allows the fast detection of down links and devices, but also provides
link protection, which allows traffic to be switched quickly to a backup fiber link
if the primary path fails.
§ DWDM
[Figure: APS protected link compared with an unprotected link]
35. Exponential Backoff in Link-State Protocols
[Figure: Routers A, B, and C with a flapping link]
§ step 1, 1st link flap: the initial timer is set to 1 sec; send the notification,
then add an increment of 1 sec and set the timer
§ step 2, 2nd link flap: send the notification, then double the time and set the timer
§ step 3, 3rd link flap: send the notification, then set the timer to the maximum of 5 sec
§ step 4: after 2x maximum (10 seconds), set the timer back to initial
36. Exponential Backoff in Link-State Protocols (cont)
The exponential backoff mechanism can be applied to two different timers
in link-state protocols:
§ The Link-state generation timer, the case just examined
§ The SPF timer, which determines how often a router runs the SPF algorithm
in response to changes in the network
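The backoff behaviour of such a timer can be sketched in Python. This is a simplified model, not vendor code. It uses the slide's example values (initial 1 second, increment 1 second, maximum 5 seconds) and assumes that the reset to the initial value happens once no event has been seen for twice the maximum. The class name `BackoffTimer` and its method are hypothetical.

```python
class BackoffTimer:
    """Simplified exponential backoff timer (illustrative sketch only)."""

    def __init__(self, initial=1.0, increment=1.0, maximum=5.0):
        self.initial = initial
        self.increment = increment
        self.maximum = maximum
        self.current = initial
        self.events_since_reset = 0
        self.last_event = None

    def on_event(self, now):
        """Return the wait applied to the event occurring at time `now`."""
        # Assumption: a quiet period of 2 x maximum resets the timer.
        quiet = self.last_event is not None and now - self.last_event >= 2 * self.maximum
        if quiet:
            self.current = self.initial
            self.events_since_reset = 0
        wait = self.current
        if self.events_since_reset == 0:
            # After the first event, add the increment.
            self.current = min(self.current + self.increment, self.maximum)
        else:
            # After later events, double the timer, capped at the maximum.
            self.current = min(self.current * 2, self.maximum)
        self.events_since_reset += 1
        self.last_event = now
        return wait


timer = BackoffTimer()
waits = [timer.on_event(t) for t in (0, 1, 2, 3, 20)]
print(waits)
```

Four flaps one second apart see waits of 1, 2, 4, and 5 seconds. The event at 20 seconds arrives after a long quiet period, so it is handled with the initial 1-second wait again.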
37. OSPF Exponential Backoff for LSA Generation
OSPF exponential backoff for LSA generation is called LSA throttling
Two configuration commands are related to this capability:
§ timers throttle lsa all [start-interval] [hold-interval] [max-interval]
start-interval is the initial time
hold-interval is the increment
max-interval is the maximum time
§ timers lsa arrival [milliseconds]
the minimum interval at which a router accepts LSAs with the same LSA ID
38. OSPF Exponential Backoff for Running SPF
OSPF exponential backoff for SPF is implemented as OSPF SPF throttling
§ timers throttle spf spf-start spf-hold spf-max-wait
§ spf-start is the initial SPF schedule delay in milliseconds
§ spf-hold is the minimum hold time between two consecutive SPF calculations
§ spf-max-wait is the maximum wait time between two consecutive SPF calculations
39. IS-IS Exponential Backoff for Running SPF
IS-IS also implements exponential backoff as throttling
Three commands are used to configure:
§ LSP generation
lsp-gen-interval [level-1 | level-2] lsp-max-wait [lsp-initial-wait lsp-second-wait]
§ SPF run
spf-interval [level-1 | level-2] spf-max-wait [spf-initial-wait spf-second-wait]
§ PRC throttling
prc-interval prc-max-wait [prc-initial-wait prc-second-wait]
41. Calculating the Route Faster
Another area where the convergence time of a network can be reduced is
route calculation.
How long does it take to calculate the best path to a destination in the network
after you have detected and reported an event ?
Consider tuning:
§ feasible successors in EIGRP
§ link-state partial SPF
§ link-state incremental SPF
42. EIGRP Feasible Successors
EIGRP calculates not only the best path to each reachable destination but
also feasible successors, which are alternate routes to the same destination that are
known to be loop free.
The route to 172.17.1.0/24
§ through 172.17.3.1 has a reported distance of 2167296
§ through 172.18.8.4 has a feasible distance of 2172416
router#show ip eigrp topo 172.17.1.0
IP-EIGRP (AS 100): Topology entry for 172.17.1.0/24
State is Passive, Query origin flag is 1, 1 Successor(s), FD is 2172416
Routing Descriptor Blocks:
172.17.2.1 (Serial0/0), from 172.18.8.4, Send flag is 0x0
Composite metric is (2172416/18944), Route is Internal
....
172.17.1.0 (Serial0/3), from 172.17.3.1, Send flag is 0x0
Composite metric is (2684416/2167296), Route is Internal
Because the reported distance through 172.17.3.1 is less than the feasible distance
through 172.18.8.4, the route through 172.17.3.1 must be loop free.
It is a feasible successor.
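The comparison EIGRP makes here, the feasibility condition, is a single strict inequality. The short Python sketch below applies it to the distances quoted above. The function name is illustrative.

```python
def is_feasible_successor(reported_distance, feasible_distance):
    """EIGRP feasibility condition: a neighbor's route is a feasible
    successor only if its reported distance is strictly less than the
    local feasible distance."""
    return reported_distance < feasible_distance


FD = 2172416  # feasible distance of the current best path

print(is_feasible_successor(2167296, FD))  # reported distance below FD
print(is_feasible_successor(2172416, FD))  # reported distance equal to FD
```

The first call returns True and the second returns False: a reported distance equal to the feasible distance does not satisfy the condition, so such a route is not a feasible successor.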
43. How EIGRP determines that a nonfeasible successor is loop free
When no feasible successor exists, EIGRP must query its neighbors to find a loop-free
alternate path. Querying neighbors and waiting for their replies always takes time,
which slows down network convergence.
Apply this knowledge to network design by considering not only the best path to each
destination from a given area in the network,
but also where the feasible successors are and how to tweak the metrics so that you
have a feasible successor where possible.
44. How EIGRP determines that a nonfeasible successor is loop free (cont)
One situation where no feasible successor exists
involves a pair of equal-cost links:
§ A to B link
§ A to C link
router-b#show ip eigrp topo 172.17.1.0
IP-EIGRP (AS 100): Topology entry for 172.17.1.0/24
State is Passive, Query origin flag is 1, 1 Successor(s), FD is 2172416
Routing Descriptor Blocks:
10.1.1.1 (Serial0/0), from 10.1.1.1, Send flag is 0x0
Composite metric is (2172416/18944), Route is Internal
....
10.1.3.1 (Serial0/3), from 10.1.3.1, Send flag is 0x0
Composite metric is (2684416/2172416), Route is Internal
The feasible distance through Router A is equal to the reported distance through Router C,
so the route through Router C is not considered a feasible successor. If the Router A to B
link or the Router A to C link fails, at least one query is required to re-converge.
[Figure: Routers A, B, and C with network 172.17.1.0/24; addresses 10.1.1.1, 10.1.2.1, 10.1.3.1]
45. Modifying the Delay to Create an EIGRP-Feasible Successor
Modifying the metrics on the Router A to C link
by decreasing the delay slightly
produces the following results:
router-b#show ip eigrp topo 172.17.1.0
IP-EIGRP (AS 100): Topology entry for 172.17.1.0/24
State is Passive, Query origin flag is 1, 1 Successor(s), FD is 2172416
Routing Descriptor Blocks:
10.1.1.1 (Serial0/0), from 10.1.1.1, Send flag is 0x0
Composite metric is (2172416/18944), Route is Internal
....
10.1.3.1 (Serial0/3), from 10.1.3.1, Send flag is 0x0
Composite metric is (2684416/2167296), Route is Internal
The reported distance through Router C is now lower than the feasible distance through
Router A, so the path through Router C is considered a feasible successor.
46. Link-State Partial SPF
Three types of objects exist in the directed graph
built using SPF:
§ Nodes
§ Edges
§ Leaves
IS-IS treats all IP subnets as leaves off the SPF tree
§ 172.17.1.0/24 leaf
§ 172.17.2.0/24 leaf
OSPF treats external (redistributed) routes as leaves
§ 172.17.1.0/24 leaf
§ 172.17.2.0/24 treated as a node in OSPF (network statement)
[Figure: Routers A, B, C, and D; 172.17.1.0/24 is redistributed, 172.17.2.0/24 is
brought into OSPF through a network statement]
47. Node and a Leaf in the SPF
Removing and adding leaves without recalculating
the entire SPF tree is called Partial SPF
§ a feature of OSPF and IS-IS implementations
§ the distinction between a node and a leaf in the SPF tree matters!
§ changes in the leaves of the SPF tree do not cause a complete recalculation
of the SPF tree
§ if 172.17.1.0/24 fails, it is simply removed from the SPF tree
§ the parts of the tree that contain nodes A, B, C, and D
are not impacted by this change
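A small Python sketch can make the node/leaf distinction concrete. It is an illustration under stated assumptions, not a protocol implementation: the link costs, the routers to which the two prefixes attach, and the function names `spf` and `routes` are all hypothetical. The shortest-path tree is computed once over the router nodes, and leaf prefixes are attached afterwards, so removing a leaf never requires rerunning SPF.

```python
import heapq


def spf(graph, root):
    """Dijkstra over router nodes only; returns the cost from root to each node."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def routes(node_dist, leaves):
    """Attach leaf prefixes to the already computed tree (no SPF run)."""
    return {
        prefix: node_dist[node] + cost
        for prefix, (node, cost) in leaves.items()
        if node in node_dist
    }


# Hypothetical topology and costs for Routers A, B, C, and D.
graph = {
    "A": {"B": 10, "C": 10},
    "B": {"A": 10, "D": 10},
    "C": {"A": 10, "D": 10},
    "D": {"B": 10, "C": 10},
}
# Hypothetical leaf attachments: prefix -> (attached router, cost).
leaves = {"172.17.1.0/24": ("D", 1), "172.17.2.0/24": ("B", 1)}

node_dist = spf(graph, "A")          # full SPF, run once
table = routes(node_dist, leaves)

del leaves["172.17.1.0/24"]          # the leaf fails
table_after = routes(node_dist, leaves)  # partial update: node_dist is reused
print(table_after)
```

Only the cheap `routes` step runs after the leaf disappears. The distances to A, B, C, and D computed by `spf` are reused unchanged.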
48. Link-State Incremental SPF
Incremental SPF takes the concept of a partial SPF one step further.
If a specific piece of the SPF tree changes, rather than recalculating the entire tree,
recompute just a section of the tree
§ link to router B fails
§ no alternate path exists to router B
§ it is unnecessary to recalculate the entire SPF tree
§ Instead, SPF can safely remove the branch behind router B
§ adjust the routing table accordingly without further calculations
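The pruning case in this example can be sketched as follows. This covers only the simple situation described above, where no alternate path exists. A real incremental SPF must also handle reattaching a branch through another path. The parent map and the function name `prune_branch` are hypothetical.

```python
def prune_branch(parents, failed):
    """Remove `failed` and every node behind it from an SPF tree.

    parents maps each non-root node to its parent in the tree.
    Returns (remaining_tree, removed_nodes). Illustrative sketch only.
    """
    removed = {failed}
    changed = True
    while changed:
        changed = False
        for child, parent in parents.items():
            if parent in removed and child not in removed:
                removed.add(child)
                changed = True
    remaining = {c: p for c, p in parents.items() if c not in removed}
    return remaining, removed


# Hypothetical SPF tree rooted at A: B and C hang off A, E sits behind B, D behind C.
parents = {"B": "A", "C": "A", "D": "C", "E": "B"}

remaining, removed = prune_branch(parents, "B")
print(remaining, removed)
```

The branch through C and D is left untouched. Only routes that depended on B and the nodes behind it need to be withdrawn from the routing table.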
[Figure: Routers A, B, C, D, and E with network 172.17.1.0/24]
49. Link-State Incremental SPF (cont)
In summary:
§ iSPF is more efficient than the full SPF algorithm thereby allowing OSPF/IS-IS
to converge faster
§ iSPF also provides a significant advantage when the changes in the network
topology are further away from the root of the SPT - the larger the network the
more significant the impact
§ iSPF provides greater improvements in convergence time for networks with a high
number of nodes and links
a segment of 400-1000 nodes should see improvements