Cisconinja’s Blog

Static Routing – differences between specifying an exit interface, an IP address, and both

Posted by Andy on April 5, 2009

In this post we will look at some of the differences between configuring static routes with an exit interface, an IP address, and both an exit interface and an IP address.  The topology and initial configurations are shown below:

static-routing-topology

R1:
ipv6 unicast-routing
!
interface FastEthernet0/0
 mac-address 0000.0000.0001
 ip address 10.0.13.1 255.255.255.0
 ipv6 address 2001:DB8:0:13::1/64
 ipv6 address FE80::1 link-local
!
interface Serial0/0
 ip address 10.0.12.1 255.255.255.0
 ipv6 address 2001:DB8:0:12::1/64
 ipv6 address FE80::1 link-local

R2:
ipv6 unicast-routing
!
interface Serial0/0
 ip address 10.0.12.2 255.255.255.0
 ipv6 address 2001:DB8:0:12::2/64
 ipv6 address FE80::2 link-local
!
interface Serial0/1
 ip address 10.0.23.2 255.255.255.0
 ipv6 address 2001:DB8:0:23::2/64
 ipv6 address FE80::2 link-local

R3:
ipv6 unicast-routing
!
interface FastEthernet0/0
 mac-address 0000.0000.0003
 ip address 10.0.13.3 255.255.255.0
 ipv6 address 2001:DB8:0:13::3/64
 ipv6 address FE80::3 link-local
!
interface Serial0/0
 ip address 10.0.23.3 255.255.255.0
 ipv6 address 2001:DB8:0:23::3/64
 ipv6 address FE80::3 link-local
!
interface Loopback0
 ip address 3.3.3.3 255.255.255.0
 ipv6 address 2001:DB8:0:3::3/64
 ipv6 address FE80::3 link-local

First we will configure a static route to R3’s loopback on R1 by specifying an outgoing interface:

R1:
ip route 3.3.3.0 255.255.255.0 FastEthernet0/0
ipv6 route 2001:DB8:0:3::/64 FastEthernet0/0

R1#debug ip routing
R1#debug ipv6 routing

Mar 1 00:08:07.339: RT: SET_LAST_RDB for 3.3.3.0/24
NEW rdb: is directly connected

Mar 1 00:08:07.343: RT: add 3.3.3.0/24 via 0.0.0.0, static metric [1/0]
Mar 1 00:08:07.343: RT: NET-RED 3.3.3.0/24

Mar 1 00:08:55.963: IPv6RT0: static, Route add 2001:DB8:0:3::/64 [new]
Mar 1 00:08:55.963: IPv6RT0: static, Add 2001:DB8:0:3::/64 to table
Mar 1 00:08:55.967: IPv6RT0: static, Adding next-hop :: over FastEthernet0/0 for 2001:DB8:0:3::/64, [1/0]
Mar 1 00:08:55.967: IPv6RT0: Event: 2001:DB8:0:3::/64, Add, owner static, previous None

The IPv4 route shows up in the routing table as directly connected to F0/0:

1-r1-show-ip-route

Some sources say that the default administrative distance (AD) for IPv4 static routes with only an exit interface specified is 0, but as shown below the AD is 1, so this may have been changed at some point in IOS:

1-r1-show-ip-route-3

The IPv6 route is listed in the routing table as pointing to an unspecified address with an exit interface of F0/0:

1-r1-show-ipv6-route

IPv4 static routes configured with only an exit interface on a broadcast network rely on proxy-ARP to determine the next hop.  If we attempt to send traffic to 5 different addresses on the 3.3.3.0/24 subnet, R1 sends an ARP for each of them and R3 replies using proxy-ARP:

R1#ping 3.3.3.1
R1#ping 3.3.3.2
R1#ping 3.3.3.3
R1#ping 3.3.3.4
R1#ping 3.3.3.5

1-r1-show-arp

This could result in a large amount of broadcast traffic and a large ARP cache on R1, especially if the static route were a default route used for internet traffic.

IPv6 uses NDP for address resolution, which does not have a similar function to proxy-ARP for answering requests on behalf of other nodes.  Therefore, R3 does not reply to neighbor solicitations sent by R1 for addresses on the 2001:db8:0:3::/64 subnet.  Even if the NS is for 2001:db8:0:3::3 (R3’s loopback address), R3 does not send an NA because NDP messages are link-local in scope and the NS arrives on the wrong interface:

R1#ping 2001:db8:0:3::3
. . . . .

If we add a static neighbor cache entry on R1 for the address, R1 does not need to send an NS to determine the layer-2 address and R3’s loopback is now reachable:

R1:
ipv6 neighbor 2001:DB8:0:3::3 FastEthernet0/0 0000.0000.0003

1-r1-show-neighbor

R1#ping 2001:db8:0:3::3
!!!!!

Obviously this is not scalable as it would require a static neighbor cache entry using R3’s layer-2 address for every IPv6 address that could potentially be reached by the static route on R1.

If F0/0 goes down on R1, both static routes are removed from the routing table:

R1:
interface FastEthernet0/0
 shutdown

Mar 1 00:25:01.907: IPv6RT0: connected, Delete 2001:DB8:0:13::1/128 from table
Mar 1 00:25:01.911: IPv6RT0: connected, Delete 2001:DB8:0:13::/64 from table
Mar 1 00:25:01.915: IPv6RT0: static, Delete 2001:DB8:0:3::/64 from table
Mar 1 00:25:01.919: RT: is_up: FastEthernet0/0 0 state: 6 sub state: 1 line: 1 has_route: True
Mar 1 00:25:01.923: RT: interface FastEthernet0/0 removed from routing table
Mar 1 00:25:01.923: RT: del 10.0.13.0/24 via 0.0.0.0, connected metric [0/0]
Mar 1 00:25:01.923: RT: delete subnet route to 10.0.13.0/24
Mar 1 00:25:01.927: RT: NET-RED 10.0.13.0/24
Mar 1 00:25:01.927: RT: Pruning routes for FastEthernet0/0 (1)
Mar 1 00:25:01.959: RT: delete route to 3.3.3.0 via 0.0.0.0, FastEthernet0/0
Mar 1 00:25:01.959: RT: no routes to 3.3.3.0, flushing
Mar 1 00:25:01.959: RT: NET-RED 3.3.3.0/24
Mar 1 00:25:01.963: RT: delete network route to 3.0.0.0
Mar 1 00:25:01.963: RT: NET-RED 3.0.0.0/8
Mar 1 00:25:01.967: IPv6RT0: Event: 2001:DB8:0:13::1/128, Del, owner connected, previous None
Mar 1 00:25:01.967: IPv6RT0: Event: 2001:DB8:0:13::/64, Del, owner connected, previous None
Mar 1 00:25:01.967: IPv6RT0: Event: 2001:DB8:0:3::/64, Del, owner static, previous None

2-r1-show-ip-route

2-r1-show-ipv6-route

 

Next we will change the static route statements to use a next-hop address instead of an exit interface:

R1:
interface FastEthernet0/0
 no shutdown
!
no ip route 3.3.3.0 255.255.255.0 FastEthernet0/0
no ipv6 route 2001:DB8:0:3::/64 FastEthernet0/0
ip route 3.3.3.0 255.255.255.0 10.0.13.3
ipv6 route 2001:DB8:0:3::/64 2001:DB8:0:13::3

The IPv4 and IPv6 routes are both listed in the routing table with a next-hop address only.  In order to find the outgoing interface, a recursive lookup is performed on the next-hop address:

3-r1-show-ip-route

3-r1-show-ipv6-route1

If we send traffic to 5 different addresses on the 3.3.3.0/24 subnet again, R1 only needs to send an ARP request for the first one to determine the layer-2 address for the specified next-hop, 10.0.13.3:

R1#ping 3.3.3.1
R1#ping 3.3.3.2
R1#ping 3.3.3.3
R1#ping 3.3.3.4
R1#ping 3.3.3.5

3-r1-show-arp

Likewise, if we send traffic to 5 different addresses on the 2001:db8:0:3::/64 subnet, R1 only needs to determine the L2 address for the specified next-hop, 2001:db8:0:13::3:

R1#ping 2001:db8:0:3::1
R1#ping 2001:db8:0:3::2
R1#ping 2001:db8:0:3::3
R1#ping 2001:db8:0:3::4
R1#ping 2001:db8:0:3::5

3-r1-show-neighbor
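
The difference in address resolution between the two kinds of static route can be modeled with a few lines of Python.  This is only a sketch of the behavior observed above, not of how IOS works internally; the function name resolution_targets and its parameters are made up for illustration:

```python
def resolution_targets(destinations, next_hop=None):
    """Return the set of addresses the router must resolve to a layer-2 address.

    next_hop=None models a static route with only an exit interface on a
    broadcast network: every destination is treated as on-link, so each
    destination address is resolved individually (ARP, answered by proxy-ARP).
    With a next-hop configured, only the next-hop itself is resolved.
    """
    if next_hop is None:
        return set(destinations)
    return {next_hop}


pings = ["3.3.3.%d" % i for i in range(1, 6)]

# Exit interface only: one ARP cache entry per destination pinged.
print(len(resolution_targets(pings)))          # 5
# Next-hop address: a single entry, regardless of how many destinations.
print(resolution_targets(pings, "10.0.13.3"))  # {'10.0.13.3'}
```

With a default route in place of 3.3.3.0/24, the first case grows with every distinct destination that traffic is sent to, while the second stays at one entry.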

The static routes remain in the routing table as long as the next-hop can be resolved to a valid route in the routing table.  In this case, if F0/0 goes down on R1, the directly connected routes to 10.0.13.0/24 and 2001:db8:0:13::/64 are removed.  Since these are the only routes that can be used to resolve the next hop addresses, the static routes are also removed:

R1:
interface FastEthernet0/0
 shutdown

4-r1-show-ip-route2

4-r1-show-ipv6-route4

However, consider what would happen instead if RIP were being used on the 2 serial links and the fast ethernet link.  We will enable RIP and RIPng for each of these links on each router:

R1:
router rip
 network 10.0.0.0
!
ipv6 router rip test
!
interface FastEthernet0/0
 ipv6 rip test enable
!
interface Serial0/0
 ipv6 rip test enable

R2:
router rip
 network 10.0.0.0
!
ipv6 router rip test
!
interface Serial0/0
 ipv6 rip test enable
!
interface Serial0/1
 ipv6 rip test enable

R3:
router rip
 network 10.0.0.0
!
ipv6 router rip test
!
interface FastEthernet0/0
 ipv6 rip test enable
!
interface Serial0/0
 ipv6 rip test enable

Now we will shut down R1’s F0/0 again:

R1:
interface FastEthernet0/0
 shutdown

R1 deletes the routing table entries for 3.3.3.0/24 and 2001:db8:0:3::/64 like before:

R1#debug ip routing
R1#debug ipv6 routing

Mar 1 01:02:07.947: IPv6RT0: connected, Delete 2001:DB8:0:13::1/128 from table
Mar 1 01:02:07.951: IPv6RT0: static, Route add 2001:DB8:0:3::/64 [owner]
Mar 1 01:02:07.951: IPv6RT0: connected, Delete 2001:DB8:0:13::/64 from table
Mar 1 01:02:07.951: IPv6RT0: rip test, Backup call for 2001:DB8:0:13::/64
Mar 1 01:02:07.955: IPv6RT0: rip test, Route add 2001:DB8:0:13::/64 [new]
Mar 1 01:02:07.955: IPv6RT0: rip test, Add 2001:DB8:0:13::/64 to table
Mar 1 01:02:07.955: IPv6RT0: rip test, Adding next-hop FE80::3 over FastEthernet0/0 for 2001:DB8:0:13::/64, [120/2]
Mar 1 01:02:07.963: IPv6RT0: rip test, Delete 2001:DB8:0:13::/64 from table
Mar 1 01:02:07.963: IPv6RT0: Delete next-hop FE80::3/FastEthernet0/0 for 2001:DB8:0:23::/64
Mar 1 01:02:07.963: IPv6RT0: static, Delete 2001:DB8:0:3::/64 from table
Mar 1 01:02:07.971: RT: is_up: FastEthernet0/0 0 state: 6 sub state: 1 line: 1 has_route: True
Mar 1 01:02:07.971: RT: del 10.0.23.0/24 via 10.0.13.3, rip metric [120/1]
Mar 1 01:02:07.975: RT: NET-RED 10.0.23.0/24
Mar 1 01:02:07.975: RT: interface FastEthernet0/0 removed from routing table
Mar 1 01:02:07.975: RT: del 10.0.13.0/24 via 0.0.0.0, connected metric [0/0]
Mar 1 01:02:07.979: RT: delete subnet route to 10.0.13.0/24
Mar 1 01:02:07.979: RT: NET-RED 10.0.13.0/24
Mar 1 01:02:07.983: IPv6RT0: Event: 2001:DB8:0:13::1/128, Del, owner connected, previous None
Mar 1 01:02:07.983: IPv6RT0: Event: 2001:DB8:0:13::/64, Del, owner connected, previous None
Mar 1 01:02:07.983: IPv6RT0: Event: 2001:DB8:0:23::/64, Path, owner rip, previous None
Mar 1 01:02:07.987: IPv6RT0: Event: 2001:DB8:0:3::/64, Del, owner static, previous None
Mar 1 01:02:08.607: %SYS-5-CONFIG_I: Configured from console by console
Mar 1 01:02:08.979: RT: del 3.3.3.0/24 via 10.0.13.3, static metric [1/0]
Mar 1 01:02:08.979: RT: delete subnet route to 3.3.3.0/24
Mar 1 01:02:08.983: RT: NET-RED 3.3.3.0/24
Mar 1 01:02:08.983: RT: delete network route to 3.0.0.0
Mar 1 01:02:08.983: RT: NET-RED 3.0.0.0/8
Mar 1 01:02:09.943: %LINK-5-CHANGED: Interface FastEthernet0/0, changed state to administratively down
Mar 1 01:02:09.947: RT: is_up: FastEthernet0/0 0 state: 6 sub state: 1 line: 1 has_route: False
Mar 1 01:02:10.943: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/0, changed state to down
Mar 1 01:02:10.947: RT: is_up: FastEthernet0/0 0 state: 6 sub state: 1 line: 1 has_route: False

A few seconds later, R1 receives a RIPng update from R2 which includes a route to 2001:db8:0:13::/64 that was learned from R3.  R1 also receives a RIP update from R2 which includes a route to 10.0.13.0/24 that was learned from R3.  R1 adds these routes to its routing table:


Mar 1 01:02:19.919: IPv6RT0: rip test, Route add 2001:DB8:0:13::/64 [new]
Mar 1 01:02:19.919: IPv6RT0: rip test, Add 2001:DB8:0:13::/64 to table
Mar 1 01:02:19.919: IPv6RT0: rip test, Adding next-hop FE80::2 over Serial0/0 for 2001:DB8:0:13::/64, [120/3]
Mar 1 01:02:19.919: IPv6RT0: Event: 2001:DB8:0:13::/64, Add, owner rip, previous None

Mar 1 01:02:24.587: RT: SET_LAST_RDB for 10.0.13.0/24
NEW rdb: via 10.0.12.2

Mar 1 01:02:24.591: RT: add 10.0.13.0/24 via 10.0.12.2, rip metric [120/2]
Mar 1 01:02:24.591: RT: NET-RED 10.0.13.0/24

A few seconds later, R1 realizes that it now has a valid route to the next hop for each of its static routes and adds both routes to the routing table:

Mar 1 01:02:56.135: RT: SET_LAST_RDB for 3.3.3.0/24
NEW rdb: via 10.0.13.3

Mar 1 01:02:56.135: RT: add 3.3.3.0/24 via 10.0.13.3, static metric [1/0]
Mar 1 01:02:56.139: RT: NET-RED 3.3.3.0/24

Mar 1 01:03:00.223: IPv6RT0: static, Route add 2001:DB8:0:3::/64 [new]
Mar 1 01:03:00.223: IPv6RT0: static, Add 2001:DB8:0:3::/64 to table
Mar 1 01:03:00.223: IPv6RT0: static, Adding next-hop 2001:DB8:0:13::3 over Null for 2001:DB8:0:3::/64, [1/0]
Mar 1 01:03:00.227: IPv6RT0: Event: 2001:DB8:0:3::/64, Add, owner static, previous None

5-r1-show-ip-route

5-r1-show-ipv6-route1

R2 does not have a route to 3.3.3.0/24 or 2001:db8:0:3::/64, so traffic to these subnets from R1 will fail.  Even if R2 did have a route, it may be undesirable for these routes to be installed on R1 when F0/0 is down, because traffic must traverse 2 slow serial links to reach its destination.  A similar situation could occur if R1 had a default route pointing out of a different interface to an ISP.  If F0/0 went down, the next-hop for the static route would resolve to the default route, so R1 would keep the static routes in the routing table and send traffic destined for R3’s subnet out the interface connected to the ISP.
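
The recursive lookup described above can be sketched in Python as a longest-prefix match of the next-hop address against the other routes in the table.  This is a simplified model that follows the behavior shown in this post, not an implementation of the IOS routing table; the function name resolve_next_hop and the table contents (including the default route) are assumptions for illustration:

```python
import ipaddress


def resolve_next_hop(next_hop, table):
    """Return (prefix, source) of the longest match for next_hop, or None.

    table maps a prefix string to a label describing where the route came from.
    A static route with only a next-hop stays installed while this is not None.
    """
    addr = ipaddress.ip_address(next_hop)
    best = None
    for prefix, source in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, source)
    return None if best is None else (str(best[0]), best[1])


f0_up = {"10.0.13.0/24": "connected, FastEthernet0/0",
         "10.0.12.0/24": "connected, Serial0/0"}
f0_down = {"10.0.12.0/24": "connected, Serial0/0"}
f0_down_rip = {**f0_down, "10.0.13.0/24": "rip via 10.0.12.2"}
f0_down_default = {**f0_down, "0.0.0.0/0": "static default to ISP"}  # hypothetical

for name, table in [("F0/0 up", f0_up), ("F0/0 down", f0_down),
                    ("F0/0 down + RIP", f0_down_rip),
                    ("F0/0 down + default", f0_down_default)]:
    print(name, "->", resolve_next_hop("10.0.13.3", table))
```

Only the second case returns None, which corresponds to the static route being removed.  In the last two cases the next-hop still resolves, so the static route stays installed even though traffic no longer leaves through F0/0.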

 

By specifying both an exit interface and a next-hop address, this behavior can be changed.  We will remove both static routes on R1 and add new static routes specifying both:

R1:
interface FastEthernet0/0
 no shutdown
!
no ip route 3.3.3.0 255.255.255.0 10.0.13.3
no ipv6 route 2001:DB8:0:3::/64 2001:DB8:0:13::3
ip route 3.3.3.0 255.255.255.0 FastEthernet0/0 10.0.13.3
ipv6 route 2001:DB8:0:3::/64 FastEthernet0/0 2001:DB8:0:13::3

The IPv4 and IPv6 static routes are listed in the routing table with both their outgoing interface and next-hop address:

6-r1-show-ip-route

6-r1-show-ipv6-route1

Using this method avoids the problems that we saw occur when specifying only the exit interface on a broadcast network.  It also will only keep the static routes in the routing table if the exit interface is up (which may or may not be desirable, depending on the situation).  If we shut down F0/0 on R1 again, R1 learns the routes to 10.0.13.0/24 and 2001:db8:0:13::/64 again via RIP and RIPng.  However, R1 does not install the static routes even though their next-hop addresses resolve to known routes because F0/0 is down:

R1:
interface FastEthernet0/0
 shutdown

7-r1-show-ip-route

7-r1-show-ipv6-route

Having reachability to the next-hop address does not actually have anything to do with the static routes being installed or not when both an exit interface and an IP address are used.  We will bring F0/0 back up on R1 and then change the static routes to use a next-hop that is known out of a different interface:

R1:
interface FastEthernet0/0
 no shutdown
!
no ip route 3.3.3.0 255.255.255.0 FastEthernet0/0 10.0.13.3
no ipv6 route 2001:DB8:0:3::/64 FastEthernet0/0 2001:DB8:0:13::3
ip route 3.3.3.0 255.255.255.0 FastEthernet0/0 10.0.12.2
ipv6 route 2001:DB8:0:3::/64 FastEthernet0/0 2001:DB8:0:12::2

The IP address in the static route statements points to a directly connected network known out of S0/0.  Nevertheless, both routes are installed in the routing table and traffic matching the route will use F0/0:

8-r1-show-ip-route

8-r1-show-ipv6-route

Even if the next-hop is an address for which R1 has no matching route, the static route is still installed in the routing table:

R1:
no ip route 3.3.3.0 255.255.255.0 FastEthernet0/0 10.0.12.2
no ipv6 route 2001:DB8:0:3::/64 FastEthernet0/0 2001:DB8:0:12::2
ip route 3.3.3.0 255.255.255.0 FastEthernet0/0 5.5.5.5
ipv6 route 2001:DB8:0:3::/64 FastEthernet0/0 5555:5555:5555:5555::5

9-r1-show-ip-route

9-r1-show-ipv6-route

Traffic matching these static routes will fail, since R1 attempts to use ARP/NDP to resolve the invalid next-hop address out of the configured exit interface.  If the exit interface were a point-to-point interface instead, the configuration would work even though the next-hop address is incorrect, because no address resolution is needed.
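
The install conditions observed for the three forms of static route can be summarized in a short Python sketch.  It only encodes what the tests in this post showed on this IOS version; the function static_installed and its parameters are hypothetical names:

```python
def static_installed(exit_if_up=None, next_hop_resolves=None):
    """Decide whether a static route would be in the routing table.

    exit_if_up:        True/False if the route names an exit interface, else None.
    next_hop_resolves: True/False if the route names a next-hop address and that
                       address matches some other route in the table, else None.
    """
    if exit_if_up is not None:
        # Exit interface only, or exit interface plus next-hop:
        # only the state of the interface mattered in the tests above.
        return exit_if_up
    # Next-hop only: installed while the recursive lookup succeeds.
    return bool(next_hop_resolves)


# Exit interface + unreachable next-hop 5.5.5.5, F0/0 up: installed.
print(static_installed(exit_if_up=True, next_hop_resolves=False))   # True
# Exit interface + next-hop, F0/0 down, next-hop known via RIP: not installed.
print(static_installed(exit_if_up=False, next_hop_resolves=True))   # False
# Next-hop only, F0/0 down, next-hop known via RIP: installed.
print(static_installed(next_hop_resolves=True))                     # True
```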

Posted in IPv6, Static Routing

IPv6 Default Router Preference

Posted by Andy on March 12, 2009

In my post on IPv6 Neighbor Discovery, we saw that a Cisco router acting as a host would, by default, use the first router that it learned of as its default router for sending traffic to off-link destinations.  The default router preference (DRP) extension allows routers to use previously unused bits in the RA message to tell hosts what preference they should use for that router.  The DRP extension was not introduced until IOS 12.4(2)T.  We will look at what happens when one router (R1) implements the DRP extension and another router (R2) does not.  We will also look at how 3 hosts using various IOS versions react.  The topology is shown below:

ipv6-default-router-preference

First we will configure link-local addresses on each device, global addresses on R1 and R2, and configure R2 to route IPv6 unicast packets:

R1:
interface FastEthernet0/0
 ipv6 address FE80::1 link-local
 ipv6 address 2001:DB8:0:1::1/64

R2:
interface FastEthernet0/0
 ipv6 address FE80::2 link-local
 ipv6 address 2001:DB8:0:1::2/64
!
ipv6 unicast-routing

HostA:
interface FastEthernet0/0
 ipv6 address FE80::A link-local

HostB:
interface FastEthernet0/0
 ipv6 address FE80::B link-local

HostC:
interface FastEthernet0/0
 ipv6 address FE80::C link-local

Next we will configure each host to obtain an address through stateless autoconfiguration:

HostA:
interface FastEthernet0/0
 ipv6 address autoconfig

HostB:
interface FastEthernet0/0
 ipv6 address autoconfig

HostC:
interface FastEthernet0/0
 ipv6 address autoconfig

Each host autoconfigures itself an address on the 2001:db8:0:1::/64 subnet and learns R2 as the default router:

1-hosta-show-int

2-hostb-show-int1

3-hostc-show-int

Now we will enable R1 to begin sending RA messages on the link:

R1:
ipv6 unicast-routing

The hosts learn of R1 as a router, but still prefer R2 as their default router because it was learned first:

4-hosta-show-routers

5-hostb-show-routers

6-hostc-show-routers

7-hosta-show-int1

8-hostb-show-int1

9-hostc-show-int

Next we will try to change the DRP so that R1 is preferred.  R2 does not support the DRP extension so we will have to make the change on R1 by setting the router preference to high:

R1:
interface FastEthernet0/0
 ipv6 nd router-preference high

After changing the router preference on R1, R1’s RA messages look like this:

10-r1-ra-wireshark1

R2’s RA messages look like this:

11-r2-ra-wireshark

The 4th and 5th bits of the flags are used to carry the router preference.  R1 sets the bits to 01 to indicate a preference of high.  A router that does not implement DRP sets the bits to 00, which is interpreted by devices that understand DRP as a preference of medium.  A preference of low would set the bits to 11 (the field is a 2-bit signed integer, and 10 is reserved).  HostA understands the router preference bits and changes its default router to be R1:

12-hosta-show-routers

HostA#debug ipv6 nd
Mar 1 00:17:55.879: ICMPv6-ND: Received RA from FE80::1 on FastEthernet0/0
Mar 1 00:17:55.879: ICMPv6-ND: Selected new default router FE80::1 on FastEthernet0/0

13-hosta-show-int

HostB understands the router preference, but does not change its default router:

14-hostb-show-routers

15-hostb-show-int

HostC has no knowledge of the router preference bits or their meaning and also does not change its default router:

16-hostc-show-routers

17-hostc-show-int
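
For reference, the router preference bits discussed above can be pulled out of the RA flags octet with a small Python helper.  This is a sketch based on the RFC 4191 encoding (01 high, 00 medium, 11 low, 10 reserved); the function name router_preference is made up for this example:

```python
PRF_NAMES = {0b01: "high", 0b00: "medium", 0b11: "low", 0b10: "reserved"}


def router_preference(flags_octet):
    """Decode the 2-bit Prf field from the RA flags octet.

    The octet is laid out as M, O, H, Prf (2 bits), then 3 reserved bits,
    so Prf occupies the 4th and 5th bits counting from the most significant.
    """
    return PRF_NAMES[(flags_octet >> 3) & 0b11]


print(router_preference(0b00001000))  # high   (what R1 sends)
print(router_preference(0b00000000))  # medium (a router without DRP, such as R2)
print(router_preference(0b00011000))  # low
```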

Posted in IPv6

IPv6 General Prefixes

Posted by Andy on March 10, 2009

This post will look at how general prefixes in IPv6 work.  We will use 1 router configured with 2 different subnets X:X:X:Y::/64, where X is the 48-bit prefix assigned by the ISP and Y is the 16-bit subnet ID/Site-Level Aggregator.  We will also configure 2 hosts to use stateless autoconfiguration, 1 for each subnet:

ipv6-general-prefix-topology

First we will configure link-local addresses on each device:

R1:
interface FastEthernet0/0
 ipv6 address FE80::1 link-local
!
interface FastEthernet0/1
 ipv6 address FE80::2 link-local

HostA:
interface FastEthernet0/0
 ipv6 address FE80::A link-local

HostB:
interface FastEthernet0/0
 ipv6 address FE80::B link-local

Suppose that we’ve been given the prefix 2001:db8:aaaa::/48 from an ISP.  We will define this prefix globally as a general prefix:

R1:
ipv6 general-prefix ISP-prefix 2001:DB8:AAAA::/48

Next, we will configure global unicast addresses on R1 by using the ISP prefix that we just defined:

R1:
interface FastEthernet0/0
 ipv6 address ISP-prefix ::1:0:0:0:1/64

R1#debug ipv6 nd
*Mar 1 00:59:51.019: ICMPv6-ND: Adding prefix 2001:DB8:AAAA:1::1/64 to FastEthernet0/0
*Mar 1 00:59:51.019: ICMPv6-ND: Sending NS for 2001:DB8:AAAA:1::1 on FastEthernet0/0
*Mar 1 00:59:52.019: ICMPv6-ND: DAD: 2001:DB8:AAAA:1::1 is unique.
*Mar 1 00:59:52.019: ICMPv6-ND: Sending NA for 2001:DB8:AAAA:1::1 on FastEthernet0/0
*Mar 1 00:59:52.023: ICMPv6-ND: Address 2001:DB8:AAAA:1::1/64 is up on FastEthernet0/0

R1 prepends the general prefix to the address that was entered to obtain the full address.  We could enter anything in the first 48 bits of the interface address with the same result, since the first 48 bits will be overwritten by the general prefix:

R1:
interface FastEthernet0/1
 ipv6 address ISP-prefix 2345:6789:ABCD:2::2/64

R1#debug ipv6 nd
*Mar 1 01:00:27.099: ICMPv6-ND: Adding prefix 2001:DB8:AAAA:2::2/64 to FastEthernet0/1
*Mar 1 01:00:27.099: ICMPv6-ND: Sending NS for 2001:DB8:AAAA:2::2 on FastEthernet0/1
*Mar 1 01:00:28.099: ICMPv6-ND: DAD: 2001:DB8:AAAA:2::2 is unique.
*Mar 1 01:00:28.099: ICMPv6-ND: Sending NA for 2001:DB8:AAAA:2::2 on FastEthernet0/1
*Mar 1 01:00:28.103: ICMPv6-ND: Address 2001:DB8:AAAA:2::2/64 is up on FastEthernet0/1
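
The way the general prefix overwrites the leading bits of the configured address can be reproduced with Python's ipaddress module.  This is just a sketch of the bit arithmetic, not of anything IOS does internally, and apply_general_prefix is a made-up helper name:

```python
import ipaddress


def apply_general_prefix(general_prefix, address):
    """Overwrite the leading bits of address with the general prefix.

    Bits covered by the prefix length (48 here) come from the general prefix;
    the remaining bits (subnet ID and interface ID) come from the address.
    """
    net = ipaddress.IPv6Network(general_prefix)
    low_bits = int(ipaddress.IPv6Address(address)) & int(net.hostmask)
    return ipaddress.IPv6Address(int(net.network_address) | low_bits)


print(apply_general_prefix("2001:db8:aaaa::/48", "::1:0:0:0:1"))          # 2001:db8:aaaa:1::1
print(apply_general_prefix("2001:db8:aaaa::/48", "2345:6789:abcd:2::2"))  # 2001:db8:aaaa:2::2
```

Both results match the addresses in the debug output above, including the second one, where the first 48 bits that were entered are discarded.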

R1 now has the following addresses configured:

1-r1-show-int-brief

Before configuring R1 to act as a router, we will change a few default timers so that later on we can see the effects of changing prefixes faster.  By default, IPv6 prefixes are advertised in RA messages with a valid lifetime of 30 days and a preferred lifetime of 7 days.  Even if the prefix is removed from RAs, hosts will continue to use addresses that have been autoconfigured with the prefix until the lifetime expires.  We will reduce the valid lifetime to 300 seconds and the preferred lifetime to 200 seconds.  We will also reduce the interval at which unsolicited RAs are sent from the default of 200 seconds to 60 seconds:

R1:
interface FastEthernet0/0
 ipv6 nd ra-interval 60
 ipv6 nd prefix default 300 200
!
interface FastEthernet0/1
 ipv6 nd ra-interval 60
 ipv6 nd prefix default 300 200

Next, we will enable unicast IPv6 routing and sending of RA messages on R1:

R1:
ipv6 unicast-routing

R1 begins sending RA messages with the configured lifetime values in the prefix option TLV:

R1#debug ipv6 nd
*Mar 1 01:24:42.007: ICMPv6-ND: Sending RA to FF02::1 on FastEthernet0/0
*Mar 1 01:24:42.007: ICMPv6-ND: MTU = 1500
*Mar 1 01:24:42.007: ICMPv6-ND: prefix = 2001:DB8:AAAA:1::/64 onlink autoconfig
*Mar 1 01:24:42.011: ICMPv6-ND: 300/200 (valid/preferred)
*Mar 1 01:24:42.011: ICMPv6-ND: Sending RA to FF02::1 on FastEthernet0/1
*Mar 1 01:24:42.011: ICMPv6-ND: MTU = 1500
*Mar 1 01:24:42.011: ICMPv6-ND: prefix = 2001:DB8:AAAA:2::/64 onlink autoconfig
*Mar 1 01:24:42.015: ICMPv6-ND: 300/200 (valid/preferred)

Next, we will configure HostA and HostB to use stateless autoconfiguration:

HostA:
interface FastEthernet0/0
 ipv6 address autoconfig

HostB:
interface FastEthernet0/0
 ipv6 address autoconfig

Each host obtains the expected address and learns the lifetime of the prefix:

2-hosta-show-int

3-hostb-show-int

Now suppose that we have changed ISPs and our new ISP gives us the prefix 2001:db8:bbbb::/48.  We will add this prefix to the general prefix that we defined on R1:

R1:
ipv6 general-prefix ISP-prefix 2001:DB8:BBBB::/48

R1 configures itself an address in the new prefix as well and begins advertising both prefixes in RAs:

4-r1-show-int-brief

After hearing the next RA, both hosts configure themselves an address in the new prefix as well:

5-hosta-show-int1

6-hostb-show-int

Next we will remove the prefix of the first ISP on R1:

R1:
no ipv6 general-prefix ISP-prefix 2001:DB8:AAAA::/48

R1#debug ipv6 nd
*Mar 1 01:51:00.083: ICMPv6-ND: Deleting prefix 2001:DB8:AAAA:1::1/64 from FastEthernet0/0
*Mar 1 01:51:00.083: ICMPv6-ND: Address 2001:DB8:AAAA:1::1/64 is down on FastEthernet0/0
*Mar 1 01:51:00.083: ICMPv6-ND: Deleting prefix 2001:DB8:AAAA:2::2/64 from FastEthernet0/1
*Mar 1 01:51:00.087: ICMPv6-ND: Address 2001:DB8:AAAA:2::2/64 is down on FastEthernet0/1

R1 removes the address and stops advertising the prefix in RAs.  After 200 seconds, the preferred lifetime for the prefix expires on the hosts and the address is marked as deprecated:

HostA#debug ipv6 nd
*Mar 1 01:54:14.927: ICMPv6-ND: Deprecating 2001:DB8:AAAA:1::A from FastEthernet0/0

HostB#debug ipv6 nd
*Mar 1 01:53:50.035: ICMPv6-ND: Deprecating 2001:DB8:AAAA:2::B from FastEthernet0/0

7-hosta-show-int

8-hostb-show-int

300 seconds after the last RA containing the old prefix was received, the valid lifetime for the prefix expires and the old address is removed:

HostA#debug ipv6 nd
*Mar 1 01:55:54.927: ICMPv6-ND: Invalidating 2001:DB8:AAAA:1::A from FastEthernet0/0
*Mar 1 01:55:54.927: ICMPv6-ND: Address 2001:DB8:AAAA:1::A/64 is down on FastEthernet0/0

HostB#debug ipv6 nd
*Mar 1 01:55:30.035: ICMPv6-ND: Invalidating 2001:DB8:AAAA:2::B from FastEthernet0/0
*Mar 1 01:55:30.035: ICMPv6-ND: Address 2001:DB8:AAAA:2::B/64 is down on FastEthernet0/0

9-hosta-show-int

10-hostb-show-int
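
The lifecycle of the autoconfigured address on the hosts can be summarized as a function of the time since the last RA that carried the prefix.  The Python sketch below uses the 200 and 300 second lifetimes configured on R1; the function name address_state is hypothetical:

```python
def address_state(seconds_since_last_ra, preferred=200, valid=300):
    """State of an autoconfigured address after the prefix stops being advertised."""
    if seconds_since_last_ra < preferred:
        return "preferred"    # used normally
    if seconds_since_last_ra < valid:
        return "deprecated"   # still configured, but avoided for new traffic
    return "invalid"          # removed from the interface


for t in (0, 199, 200, 299, 300):
    print(t, address_state(t))
```

This agrees with the debug timestamps on HostA, where the address is deprecated at 01:54:14 and invalidated exactly 100 seconds later at 01:55:54.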

 

Although this method is very easy for changing prefixes, it could result in a period of unreachability during the transition.  Let’s look at what happens in the scenario we just used when a host tries to reach a remote network during the transition.  We will start with the configuration prior to removing the ISP A prefix on R1:

R1:
ipv6 unicast-routing
ipv6 general-prefix ISP-prefix 2001:DB8:AAAA::/48
ipv6 general-prefix ISP-prefix 2001:DB8:BBBB::/48
!
interface FastEthernet0/0
 ipv6 address ISP-prefix ::1:0:0:0:1/64
 ipv6 address FE80::1 link-local
 ipv6 nd ra-interval 60
 ipv6 nd prefix default 300 200
!
interface FastEthernet0/1
 ipv6 address ISP-prefix 2345:6789:ABCD:2::2/64
 ipv6 address FE80::2 link-local
 ipv6 nd ra-interval 60
 ipv6 nd prefix default 300 200

HostA:
interface FastEthernet0/0
 ipv6 address FE80::A link-local
 ipv6 address autoconfig

HostB:
interface FastEthernet0/0
 ipv6 address FE80::B link-local
 ipv6 address autoconfig

Next we will create a loopback on R1 to simulate a remote network:

R1:
interface Loopback0
 ipv6 address 2001:DB8:0:1::1/64

Now we will initiate a continuous ping to R1’s loopback from HostA:

HostA#ping 2001:db8:0:1::1 repeat 1000000
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
(continues)

Hosts (or at least Cisco routers acting as hosts) seem to prefer using the address that they autoconfigured first as the source address for reaching remote networks if multiple global addresses are configured. In this case, HostA uses the ISP A address, 2001:db8:aaaa:1::a, as the source address for traffic sent to R1’s loopback:

11-hosta-wireshark

Now we will remove the ISP A prefix from the general prefix on R1 like before:

R1:
no ipv6 general-prefix ISP-prefix 2001:DB8:AAAA::/48

The host immediately stops receiving ping replies:

HostA#............................... (continues)

Return traffic in this case is dropped at R1 because R1 no longer has a route back to the 2001:db8:aaaa:1::/64 network.  This continues until the preferred lifetime for the old prefix expires on the host and the host begins using the ISP B address to reach remote networks:

HostA#debug ipv6 nd
Mar 1 02:09:33.924: ICMPv6-ND: Deprecating 2001:DB8:AAAA:1::A from FastEthernet0/0
!!!!!!!!!!!!!!!!!!!!!!!!!!
Success rate is 98 percent (7799/7895), round-trip min/avg/max = 0/27/1852 ms

96 echo-requests timed out waiting for a reply.  With the default echo-request timeout of 2 seconds, this is roughly equal to the 200 seconds that it took for the preferred lifetime of the ISP A prefix to expire on the host.
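
The arithmetic behind that estimate, using the counts from the ping summary above, is shown here as a quick Python check:

```python
sent, received = 7895, 7799   # from "Success rate is 98 percent (7799/7895)"
timeout_s = 2                 # default echo-request timeout

lost = sent - received
outage_s = lost * timeout_s
print(lost, outage_s)         # 96 192
```

The result is a little under 200 seconds, which is plausible because the preferred lifetime counts down from the last RA the host received rather than from the moment the prefix was removed on R1.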

 

Let’s look at a way that this could be avoided if we needed to maintain full reachability during the transition period.  We will start from the same point again, with the ISP B prefix just added to R1:

R1:
ipv6 unicast-routing
ipv6 general-prefix ISP-prefix 2001:DB8:AAAA::/48
ipv6 general-prefix ISP-prefix 2001:DB8:BBBB::/48
!
interface FastEthernet0/0
 ipv6 address ISP-prefix ::1:0:0:0:1/64
 ipv6 address FE80::1 link-local
 ipv6 nd ra-interval 60
 ipv6 nd prefix default 300 200
!
interface FastEthernet0/1
 ipv6 address ISP-prefix 2345:6789:ABCD:2::2/64
 ipv6 address FE80::2 link-local
 ipv6 nd ra-interval 60
 ipv6 nd prefix default 300 200

HostA:
interface FastEthernet0/0
 ipv6 address FE80::A link-local
 ipv6 address autoconfig

HostB:
interface FastEthernet0/0
 ipv6 address FE80::B link-local
 ipv6 address autoconfig

We will initiate the ping from HostA to R1’s loopback again:

HostA#ping 2001:db8:0:1::1 repeat 1000000
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
(continues)

Before removing the prefix on R1 this time, we will prevent it from being advertised in RAs:

R1:
interface FastEthernet0/0
 ipv6 nd prefix 2001:DB8:AAAA:1::/64 no-advertise
!
interface FastEthernet0/1
 ipv6 nd prefix 2001:DB8:AAAA:2::/64 no-advertise

R1#debug ipv6 nd
*Mar 1 02:20:51.403: ICMPv6-ND: Updating prefix 2001:DB8:AAAA:1::/64 to FastEthernet0/0
*Mar 1 02:20:51.403: ICMPv6-ND: Prefix change 0x401 -> 0x104
*Mar 1 02:20:51.407: ICMPv6-ND: Prefix Information change 0xE0 -> 0xE0
*Mar 1 02:21:16.271: ICMPv6-ND: Updating prefix 2001:DB8:AAAA:2::/64 to FastEthernet0/1
*Mar 1 02:21:16.271: ICMPv6-ND: Prefix change 0x401 -> 0x104
*Mar 1 02:21:16.271: ICMPv6-ND: Prefix Information change 0xE0 -> 0xE0

R1 stops including a TLV for the ISP A prefixes in its RAs.  After approximately 200 seconds, HostA’s preferred lifetime for the prefix expires and the address is deprecated:

Mar 1 02:23:46.460: ICMPv6-ND: Deprecating 2001:DB8:AAAA:1::A from FastEthernet0/0

At this point, HostA immediately begins using the address from the ISP B prefix instead of the ISP A prefix as its source address for traffic sent to remote destinations:

12-hosta-wireshark

The host continues receiving ping replies with no problems:

HostA#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! (continues)
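
The source address behavior seen on HostA can be approximated with the following Python sketch.  It is a guess at the observed behavior only (first-configured preferred address wins, deprecated addresses are a last resort), not the actual selection algorithm used by IOS.  The function name pick_source is made up, and the ISP B address is assumed from HostA’s interface ID:

```python
def pick_source(addresses):
    """Pick a source address from (address, state) pairs in configuration order."""
    for addr, state in addresses:
        if state == "preferred":
            return addr
    for addr, state in addresses:
        if state == "deprecated":
            return addr
    return None


before = [("2001:db8:aaaa:1::a", "preferred"), ("2001:db8:bbbb:1::a", "preferred")]
after = [("2001:db8:aaaa:1::a", "deprecated"), ("2001:db8:bbbb:1::a", "preferred")]

print(pick_source(before))  # 2001:db8:aaaa:1::a
print(pick_source(after))   # 2001:db8:bbbb:1::a
```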

Now we will remove the ISP A prefix from R1:

R1:
no ipv6 general-prefix ISP-prefix 2001:DB8:AAAA::/48

R1 removes the addresses associated with the ISP A prefix from both interfaces:

13-r1-show-int

Not a single packet is lost during the transition:

HostA#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Success rate is 100 percent (9134/9134), round-trip min/avg/max = 0/28/224 ms

Posted in IPv6

IPv6 Neighbor Discovery

Posted by Andy on March 7, 2009

This post will look at how Neighbor Discovery works in IPv6.  The topology is shown below:

ipv6-nd-topology

First we will configure the MAC address of each router’s interface to 0000.0000.XXXX where X is the router number in order to make EUI-64 format addresses easier to work with:

R1:
interface FastEthernet0/0
 mac-address 0000.0000.1111

R2:
interface FastEthernet0/0
 mac-address 0000.0000.2222

R3:
interface FastEthernet0/0
 mac-address 0000.0000.3333

Next we will configure a global unicast address on R1:

R1:
interface FastEthernet0/0
 ipv6 address 2001:123::1/64

R1 assigns itself the configured address as well as a link-local address using the EUI-64 format:

1-r1-showint
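
The EUI-64 derivation of the link-local address can be reproduced with a short Python function.  This is a sketch of the standard modified EUI-64 procedure; the helper name eui64_link_local is made up for this example:

```python
import ipaddress


def eui64_link_local(mac):
    """Build the FE80::/64 link-local address from a Cisco-style MAC (xxxx.xxxx.xxxx)."""
    octets = bytearray.fromhex(mac.replace(".", ""))
    octets[0] ^= 0x02                                  # flip the universal/local bit
    interface_id = octets[:3] + b"\xff\xfe" + octets[3:]  # insert FFFE in the middle
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + bytes(interface_id))


print(eui64_link_local("0000.0000.1111"))  # fe80::200:ff:fe00:1111
```

The leading 200 in the interface ID comes from flipping the universal/local bit of the first MAC octet (00 becomes 02).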

We can see the sequence of events that takes place when the address configuration command is entered using debug ipv6 nd:


R1#debug ipv6 nd
*Mar 1 00:48:09.683: ICMPv6-ND: Adding prefix 2001:123::1/64 to FastEthernet0/0
*Mar 1 00:48:10.683: ICMPv6-ND: Sending NS for FE80::200:FF:FE00:1111 on FastEthernet0/0
*Mar 1 00:48:11.683: ICMPv6-ND: DAD: FE80::200:FF:FE00:1111 is unique.
*Mar 1 00:48:11.683: ICMPv6-ND: Sending NA for FE80::200:FF:FE00:1111 on FastEthernet0/0
*Mar 1 00:48:11.683: ICMPv6-ND: Address FE80::200:FF:FE00:1111/10 is up on FastEthernet0/0
*Mar 1 00:48:11.691: ICMPv6-ND: Sending NS for 2001:123::1 on FastEthernet0/0
*Mar 1 00:48:12.691: ICMPv6-ND: DAD: 2001:123::1 is unique.
*Mar 1 00:48:12.691: ICMPv6-ND: Sending NA for 2001:123::1 on FastEthernet0/0
*Mar 1 00:48:12.691: ICMPv6-ND: Address 2001:123::1/64 is up on FastEthernet0/0

We will examine this process step by step in a moment when we configure R2 to act as a host on the network.  For now, we will configure R1 to act as a router on the network:

R1:
ipv6 unicast-routing

After entering this, we can see that several new lines appear at the bottom of the show ipv6 interface output:

2-r1-showint

This information relates to the sending of Router Advertisement (RA) messages by R1.  As soon as ipv6 unicast-routing is configured, R1 begins sending unsolicited RA messages approximately every 200 seconds:

R1#debug ipv6 nd
*Mar 1 00:55:55.031: ICMPv6-ND: Sending RA to FF02::1 on FastEthernet0/0
*Mar 1 00:55:55.031: ICMPv6-ND: MTU = 1500
*Mar 1 00:55:55.035: ICMPv6-ND: prefix = 2001:123::/64 onlink autoconfig
*Mar 1 00:55:55.035: ICMPv6-ND: 2592000/604800 (valid/preferred)

These RAs are sourced from R1’s link-local address and sent to the all IPv6 nodes multicast group (FF02::1):

3-r1-wireshark-ra

The ICMPv6 Type value identifies the message as an RA.  There are several other pieces of information within the RA that could be important to hosts on the link.  The main one we are interested in right now is the first flag bit (the Managed address configuration, or M, flag), which has not been set.  This tells hosts they should attempt to use stateless autoconfiguration, rather than stateful DHCPv6, to obtain their IPv6 addresses.  We can also see that 3 optional TLVs have been included in the RA message in order to tell hosts on the link about the router’s MAC address, the MTU of the link, and the global prefix used on the link:

4-r1-wireshark-ra
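As a rough illustration of how a host might interpret these fields, the sketch below decodes the flags byte of an RA and converts the advertised lifetimes into days. The bit positions follow the Router Advertisement format in the Neighbor Discovery RFC; the function names and the sample flags value of 0x00 are assumptions made for this example, since the capture itself is not reproduced here:

```python
from datetime import timedelta

M_FLAG = 0x80  # Managed address configuration (first flag bit)
O_FLAG = 0x40  # Other configuration (second flag bit)

def decode_ra_flags(flags_byte: int) -> dict:
    """Report which of the M and O flags are set in the RA flags byte."""
    return {
        "managed": bool(flags_byte & M_FLAG),
        "other": bool(flags_byte & O_FLAG),
    }

def lifetime_in_days(seconds: int) -> int:
    """Convert a prefix lifetime from seconds to whole days."""
    return timedelta(seconds=seconds).days

# Assumed flags value: neither bit set, as described for R1's RA.
print(decode_ra_flags(0x00))  # {'managed': False, 'other': False}

# Valid/preferred lifetimes taken from the debug output above.
print(lifetime_in_days(2592000), lifetime_in_days(604800))  # 30 7
```

So the default valid and preferred lifetimes advertised by R1 work out to 30 days and 7 days.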

Next, let’s configure R2 to automatically obtain an IPv6 address through stateless autoconfiguration:

R2:
interface FastEthernet0/0
 ipv6 address autoconfig

After entering the address command, R2 goes through a sequence of events similar to what we saw earlier on R1:

R2#debug ipv6 nd
*Mar 1 01:45:52.175: ICMPv6-ND: Sending NS for FE80::200:FF:FE00:2222 on FastEthernet0/0
*Mar 1 01:45:53.175: ICMPv6-ND: DAD: FE80::200:FF:FE00:2222 is unique.
*Mar 1 01:45:53.175: ICMPv6-ND: Sending NA for FE80::200:FF:FE00:2222 on FastEthernet0/0
*Mar 1 01:45:53.175: ICMPv6-ND: Address FE80::200:FF:FE00:2222/10 is up on FastEthernet0/0
*Mar 1 01:45:55.179: ICMPv6-ND: Sending RS on FastEthernet0/0
*Mar 1 01:45:55.199: ICMPv6-ND: Received RA from FE80::200:FF:FE00:1111 on FastEthernet0/0
*Mar 1 01:45:55.203: ICMPv6-ND: DELETE -> INCMP: FE80::200:FF:FE00:1111
*Mar 1 01:45:55.203: ICMPv6-ND: INCMP -> STALE: FE80::200:FF:FE00:1111
*Mar 1 01:45:55.207: ICMPv6-ND: Sending NS for 2001:123::200:FF:FE00:2222 on FastEthernet0/0
*Mar 1 01:45:55.207: ICMPv6-ND: Autoconfiguring 2001:123::200:FF:FE00:2222 on FastEthernet0/0
*Mar 1 01:45:56.207: ICMPv6-ND: DAD: 2001:123::200:FF:FE00:2222 is unique.
*Mar 1 01:45:56.207: ICMPv6-ND: Sending NA for 2001:123::200:FF:FE00:2222 on FastEthernet0/0
*Mar 1 01:45:56.211: ICMPv6-ND: Address 2001:123::200:FF:FE00:2222/64 is up on FastEthernet0/0

Let’s look at this process step by step.  After telling R2 to use autoconfiguration to obtain an address, R2 first calculates its EUI-64 interface ID on F0/0 from its MAC address and appends this to the link-local prefix (FE80::/10).  R2 marks this address as tentative (shown as ‘TEN’) and sends a neighbor solicitation (NS) message for this address to verify that it isn’t a duplicate before R2 starts to use it:

6-r2-showint

*Mar 1 01:45:52.175: ICMPv6-ND: Sending NS for FE80::200:FF:FE00:2222 on FastEthernet0/0 

The NS message is sourced from an unspecified IPv6 address (::) since R2 does not have a valid IPv6 address yet and is sent to the solicited node multicast group for the tentative address (FF02::1:FF00:2222).

7-r2-wireshark-ns
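The solicited-node multicast group is formed by appending the low-order 24 bits of the unicast address to FF02::1:FF00:0/104. A minimal sketch of that calculation in Python (the function name is made up for this example):

```python
import ipaddress

def solicited_node_multicast(address: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast group for a unicast address."""
    low24 = int(ipaddress.IPv6Address(address)) & 0xFFFFFF  # last 24 bits
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

# R2's tentative link-local address and its later global address
# share the same last 24 bits, so they map to the same group.
print(solicited_node_multicast("fe80::200:ff:fe00:2222"))      # ff02::1:ff00:2222
print(solicited_node_multicast("2001:123::200:ff:fe00:2222"))  # ff02::1:ff00:2222
```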

Any node with an address that ended in the same 24 bits would be required to join this group and would hear the NS message.  The node would then examine the target address contained within the NS to see whether the full address matched its own, and if it did, would prevent R2 from using it.  Since no other nodes are using this address, R2 does not get a reply back.  After waiting 1 second without a reply, R2 determines that the address must be unique:

*Mar 1 01:45:53.175: ICMPv6-ND: DAD: FE80::200:FF:FE00:2222 is unique.

At the same time R2 begins using the address and removes the tentative marker from it.  R2 also sends an unsolicited neighbor advertisement (NA) to advertise its new link-local address:

*Mar 1 01:45:53.175: ICMPv6-ND: Sending NA for FE80::200:FF:FE00:2222 on FastEthernet0/0
*Mar 1 01:45:53.175: ICMPv6-ND: Address FE80::200:FF:FE00:2222/10 is up on FastEthernet0/0
 

8-r2-showint

 9-r2-wireshark-na1

The source address is set to R2’s link-local address and the destination address is the all IPv6 nodes multicast address.  The target address is set to R2’s link-local address so other nodes know which address the advertisement is for, and the NA message includes an option TLV to specify R2’s MAC address.  The override flag has also been set, telling receivers that this information should override the information in their neighbor cache for this entry.

After a couple of seconds, R2 sends a router solicitation (RS) to request that any routers on the link send an RA:

*Mar 1 01:45:55.179: ICMPv6-ND: Sending RS on FastEthernet0/0

R2 uses its link-local address as the source and the all routers IPv6 multicast address as the destination.  R2 also includes its MAC address as an option TLV:

10-r2-wireshark-rs

R1 receives the RS and sends a solicited RA in response rather than waiting up to 200 seconds for its next unsolicited RA.  Some sources say that this is unicast to the IPv6 address of the node that sent the RS, but my tests showed that the destination address (FF02::1) and every other part of the packet were identical to an unsolicited RA.  R2 receives the RA from R1 a few milliseconds after sending its RS:

*Mar 1 01:45:55.199: ICMPv6-ND: Received RA from FE80::200:FF:FE00:1111 on FastEthernet0/0

The next two debug messages relate to the neighbor cache, which we will ignore for now.  R2 learns R1’s link-local address and adds it as a default router.  R2 also learns the link prefix from the received RA and prepends it to its EUI-64 interface ID.  Just like with the link-local address, R2 marks the address as tentative and sends an NS as part of the Duplicate Address Detection procedure:

11-r2-showint
 
*Mar 1 01:45:55.207: ICMPv6-ND: Sending NS for 2001:123::200:FF:FE00:2222 on FastEthernet0/0
*Mar 1 01:45:55.207: ICMPv6-ND: Autoconfiguring 2001:123::200:FF:FE00:2222 on FastEthernet0/0

The NS packet is identical to the one shown earlier, other than the target address now being set to R2’s tentative global unicast address instead of the link-local address.  When no reply for the address comes back after 1 second, R2 determines that this address is also unique:

*Mar 1 01:45:56.207: ICMPv6-ND: DAD: 2001:123::200:FF:FE00:2222 is unique.

R2 then sends an unsolicited NA to advertise its new address.  The NA is identical to the one that R2 sent for its link-local address earlier, other than the source IPv6 address and the ICMPv6 target address being set to R2’s global unicast address instead of the link-local address:

*Mar 1 01:45:56.207: ICMPv6-ND: Sending NA for 2001:123::200:FF:FE00:2222 on FastEthernet0/0

Finally, R2 is able to remove the tentative marking on the address:

*Mar 1 01:45:56.211: ICMPv6-ND: Address 2001:123::200:FF:FE00:2222/64 is up on FastEthernet0/0

12-r2-showint

If multiple prefixes are configured on the link, R1 includes an additional TLV in its RA messages for each prefix:

R1:
interface FastEthernet0/0
 ipv6 address 3001:123::1/64

R1#debug ipv6 nd
*Mar 1 01:31:19.691: ICMPv6-ND: Sending RA to FF02::1 on FastEthernet0/0
*Mar 1 01:31:19.691: ICMPv6-ND: MTU = 1500
*Mar 1 01:31:19.691: ICMPv6-ND: prefix = 2001:123::/64 onlink autoconfig
*Mar 1 01:31:19.691: ICMPv6-ND: 2592000/604800 (valid/preferred)
*Mar 1 01:31:19.695: ICMPv6-ND: prefix = 3001:123::/64 onlink autoconfig
*Mar 1 01:31:19.695: ICMPv6-ND: 2592000/604800 (valid/preferred)

R2 goes through the same process detailed above for the new prefix and configures itself a global address using that prefix as well:

13-r2-showint
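The way a host turns each advertised prefix into an address can be sketched as follows. This is only an illustration in Python: the function name is hypothetical, and the interface ID constant is simply R2's EUI-64 interface ID as seen in the debug output above:

```python
import ipaddress

def slaac_addresses(prefixes, interface_id):
    """Build one address per advertised /64 prefix from a 64-bit interface ID."""
    addresses = []
    for prefix in prefixes:
        network = ipaddress.IPv6Network(prefix)
        if network.prefixlen != 64:
            raise ValueError(f"expected a /64 prefix, got {prefix}")
        # The prefix fills the high 64 bits, the interface ID the low 64 bits.
        addresses.append(
            ipaddress.IPv6Address(int(network.network_address) | interface_id)
        )
    return addresses

R2_INTERFACE_ID = 0x020000FFFE002222  # from FE80::200:FF:FE00:2222

for address in slaac_addresses(["2001:123::/64", "3001:123::/64"], R2_INTERFACE_ID):
    print(address)
# 2001:123::200:ff:fe00:2222
# 3001:123::200:ff:fe00:2222
```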

If another node on the link attempts to use the same IPv6 address, Duplicate Address Detection prevents this from happening.  Before testing this out, first we will get rid of the additional prefix that we created:

R1:
interface FastEthernet0/0
 no ipv6 address 3001:123::1/64

R2:
interface FastEthernet0/0
 no ipv6 address autoconfig

A few seconds later…

R2:
interface FastEthernet0/0
 ipv6 address autoconfig

Next we will configure R3 to use the same MAC address as R2 and to obtain an IPv6 address through stateless autoconfiguration:

R3:
interface FastEthernet0/0
 mac-address 0000.0000.2222
 ipv6 address autoconfig

R3#debug ipv6 nd
*Mar 1 01:48:21.775: ICMPv6-ND: Sending NS for FE80::200:FF:FE00:2222 on FastEthernet0/0
*Mar 1 01:48:21.907: ICMPv6-ND: Received NA for FE80::200:FF:FE00:2222 on FastEthernet0/0 from FE80::200:FF:FE00:2222
*Mar 1 01:48:21.915: ICMPv6-ND: DAD: duplicate link-local FE80::200:FF:FE00:2222 on FastEthernet0/0,interface stalled
*Mar 1 01:48:21.915: %IPV6-4-DUPLICATE: Duplicate address FE80::200:FF:FE00:2222 on FastEthernet0/0

 
R3 follows the same process shown earlier, calculating its EUI-64 interface ID and appending the interface ID to the link-local prefix.  R3 then sends an NS for the tentative address to the solicited-node multicast group.  Since R2 has joined this group, it receives the NS and finds its own address as the target.  R2 immediately sends an NA to the all nodes multicast address.  R3 receives the NA from R2 before its 1 second DAD timer expires and realizes the address is a duplicate.  R3 marks the address as a duplicate and does not begin using it:

14-r3-showint
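The decision R3 makes here can be modeled very roughly as below. This is a toy sketch in Python and not how IOS implements DAD: the link is reduced to a dictionary of addresses already in use, and the 1 second wait is represented only by the fact that either an NA arrives (duplicate) or nothing does (unique). All names are invented for the example:

```python
import ipaddress

def duplicate_address_detection(tentative, addresses_in_use):
    """Return ("duplicate", owner) if another node defends the tentative
    address with an NA, otherwise ("unique", None).

    addresses_in_use maps a node name to the address it already uses.
    """
    target = ipaddress.IPv6Address(tentative)
    for owner, address in addresses_in_use.items():
        # A node answers the NS only when the full target address is its own.
        if ipaddress.IPv6Address(address) == target:
            return ("duplicate", owner)
    # No NA was received before the DAD timer expired.
    return ("unique", None)

link = {"R1": "fe80::200:ff:fe00:1111", "R2": "fe80::200:ff:fe00:2222"}

print(duplicate_address_detection("FE80::200:FF:FE00:2222", link))  # ('duplicate', 'R2')
print(duplicate_address_detection("FE80::3", link))                 # ('unique', None)
```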

The MAC address would have to be changed on R3 or a different IPv6 address would have to be manually configured.  Notice also that R3 does not even attempt to assign itself a global address through autoconfiguration until it has a non-duplicate link-local address.  If we change the link-local address used on R3 manually and continue to use stateless autoconfiguration, this not only fixes the duplicate link-local address but also results in R3 assigning itself a non-duplicate global address:

R3:
interface FastEthernet0/0
 ipv6 address FE80::3 link-local

*Mar 1 02:10:27.483: ICMPv6-ND: Sending NS for FE80::3 on FastEthernet0/0
*Mar 1 02:10:28.483: ICMPv6-ND: DAD: FE80::3 is unique.
*Mar 1 02:10:28.483: ICMPv6-ND: Sending NA for FE80::3 on FastEthernet0/0
*Mar 1 02:10:28.483: ICMPv6-ND: Address FE80::3/10 is up on FastEthernet0/0
*Mar 1 02:10:30.483: ICMPv6-ND: Sending RS on FastEthernet0/0
*Mar 1 02:10:30.523: ICMPv6-ND: Received RA from FE80::200:FF:FE00:1111 on FastEthernet0/0
*Mar 1 02:10:30.523: ICMPv6-ND: DELETE -> INCMP: FE80::200:FF:FE00:1111
*Mar 1 02:10:30.527: ICMPv6-ND: INCMP -> STALE: FE80::200:FF:FE00:1111
*Mar 1 02:10:30.527: ICMPv6-ND: Sending NS for 2001:123::3 on FastEthernet0/0
*Mar 1 02:10:30.531: ICMPv6-ND: Autoconfiguring 2001:123::3 on FastEthernet0/0
*Mar 1 02:10:31.527: ICMPv6-ND: DAD: 2001:123::3 is unique.
*Mar 1 02:10:31.527: ICMPv6-ND: Sending NA for 2001:123::3 on FastEthernet0/0
*Mar 1 02:10:31.531: ICMPv6-ND: Address 2001:123::3/64 is up on FastEthernet0/0

This shows that the interface ID for the link-local address is derived from the MAC address unless manually configured.  The interface ID for a global address, however, is derived from the interface ID used for the link-local address – so if the link-local address has been manually configured, global addresses will use the same interface ID rather than applying the MAC to EUI-64 conversion to obtain an interface ID.  This applies to interfaces that are manually configured to use EUI-64 as well:

R3:
interface FastEthernet0/0
 no ipv6 address

A few seconds later…

R3:
interface FastEthernet0/0
 ipv6 address 2001:123::/64 eui-64

*Mar 1 02:26:49.463: ICMPv6-ND: Adding prefix 2001:123::200:FF:FE00:2222/64 to FastEthernet0/0
*Mar 1 02:26:50.463: ICMPv6-ND: Sending NS for FE80::200:FF:FE00:2222 on FastEthernet0/0
*Mar 1 02:26:50.515: ICMPv6-ND: Received NA for FE80::200:FF:FE00:2222 on FastEthernet0/0 from FE80::200:FF:FE00:2222
*Mar 1 02:26:50.523: ICMPv6-ND: DAD: duplicate link-local FE80::200:FF:FE00:2222 on FastEthernet0/0,interface stalled
*Mar 1 02:26:50.523: %IPV6-4-DUPLICATE: Duplicate address FE80::200:FF:FE00:2222 on FastEthernet0/0

15-r3-showint

R3:
interface FastEthernet0/0
 ipv6 address FE80::3 link-local

*Mar 1 02:30:37.027: ICMPv6-ND: Sending NS for FE80::3 on FastEthernet0/0
*Mar 1 02:30:38.027: ICMPv6-ND: DAD: FE80::3 is unique.
*Mar 1 02:30:38.027: ICMPv6-ND: Sending NA for FE80::3 on FastEthernet0/0
*Mar 1 02:30:38.027: ICMPv6-ND: Address FE80::3/10 is up on FastEthernet0/0
*Mar 1 02:30:38.031: ICMPv6-ND: Sending NS for 2001:123::3 on FastEthernet0/0
*Mar 1 02:30:39.031: ICMPv6-ND: DAD: 2001:123::3 is unique.
*Mar 1 02:30:39.031: ICMPv6-ND: Sending NA for 2001:123::3 on FastEthernet0/0
*Mar 1 02:30:39.031: ICMPv6-ND: Address 2001:123::3/64 is up on FastEthernet0/0

16-r3-showint
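The behavior observed in this test (the global address reusing the interface ID of the link-local address) amounts to copying the low 64 bits of the link-local address under the global prefix. A small sketch of that observation in Python, with a made-up function name:

```python
import ipaddress

IID_MASK = (1 << 64) - 1  # low 64 bits of an IPv6 address

def global_from_link_local(prefix: str, link_local: str) -> ipaddress.IPv6Address:
    """Reuse the interface ID of a link-local address under a /64 prefix."""
    interface_id = int(ipaddress.IPv6Address(link_local)) & IID_MASK
    network = ipaddress.IPv6Network(prefix)
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)

# MAC-derived link-local address: the global address ends in the EUI-64 ID.
print(global_from_link_local("2001:123::/64", "fe80::200:ff:fe00:2222"))
# 2001:123::200:ff:fe00:2222

# Manually configured link-local address: the global address follows it.
print(global_from_link_local("2001:123::/64", "fe80::3"))
# 2001:123::3
```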

Next, we will add a second router to the link.  R1 and R2 will be configured as routers:

R1:
ipv6 unicast-routing
!
interface FastEthernet0/0
 mac-address 0000.0000.1111
 ipv6 address 2001:123::1/64
 ipv6 address FE80::1 link-local

R2:
ipv6 unicast-routing
!
interface FastEthernet0/0
 mac-address 0000.0000.2222
 ipv6 address 2001:123::2/64
 ipv6 address FE80::2 link-local

R3 will be configured to use stateless autoconfiguration:

R3:
interface FastEthernet0/0
 mac-address 0000.0000.3333
 ipv6 address FE80::3 link-local
 ipv6 address autoconfig

When R3 sends out an RS, it receives an RA from both R1 and R2. In this case, the RA from R2 arrives a few milliseconds before the RA from R1:

R3#debug ipv6 nd
*Mar 1 02:45:26.139: ICMPv6-ND: Received RA from FE80::2 on FastEthernet0/0
*Mar 1 02:45:26.147: ICMPv6-ND: Received RA from FE80::1 on FastEthernet0/0

What a host does when there are multiple routers is implementation specific; for a Cisco router acting as a host, it appears that only the router whose RA is received first is used as the default router:

17-r3-showint

 

Next we will look at how Neighbor Unreachability Detection works.  The Layer-2 address for IPv6 nodes is stored in the neighbor cache, similar to the ARP cache of IPv4.  The current neighbor cache of R3 is shown below:

18-r3-neighbors1

R3 has entries for R1 and R2’s link-local addresses, both in a STALE state.  This means that these addresses were reachable previously, but the Reachable Time has expired since the last confirmation of their reachability.  Let’s try to ping R1’s link-local address on R3:

19-r3-ping

R3#debug ipv6 icmp
R3#debug ipv6 nd

Mar 1 03:47:37.073: ICMPv6: Sending echo request to FE80::1
Mar 1 03:47:37.073: ICMPv6-ND: STALE -> DELAY: FE80::1
Mar 1 03:47:37.117: ICMPv6: Received echo reply from FE80::1
Mar 1 03:47:42.073: ICMPv6-ND: DELAY -> PROBE: FE80::1
Mar 1 03:47:42.073: ICMPv6-ND: Sending NS for FE80::1 on FastEthernet0/0
Mar 1 03:47:42.169: ICMPv6: Received ICMPv6 packet from FE80::1, type 135
Mar 1 03:47:42.173: ICMPv6-ND: Received NS for FE80::3 on FastEthernet0/0 from FE80::1
Mar 1 03:47:42.173: ICMPv6-ND: Sending NA for FE80::3 on FastEthernet0/0
Mar 1 03:47:42.173: ICMPv6: Received ICMPv6 packet from FE80::1, type 136
Mar 1 03:47:42.173: ICMPv6-ND: Received NA for FE80::1 on FastEthernet0/0 from FE80::1
Mar 1 03:47:42.177: ICMPv6-ND: PROBE -> REACH: FE80::1
Mar 1 03:48:12.179: ICMPv6-ND: REACH -> STALE: FE80::1

R1#debug ipv6 icmp
R1#debug ipv6 nd

Mar 1 03:47:37.095: ICMPv6: Received echo request from FE80::3
Mar 1 03:47:37.095: ICMPv6: Sending echo reply to FE80::3
Mar 1 03:47:37.095: ICMPv6-ND: STALE -> DELAY: FE80::3
Mar 1 03:47:42.095: ICMPv6-ND: DELAY -> PROBE: FE80::3
Mar 1 03:47:42.095: ICMPv6-ND: Sending NS for FE80::3 on FastEthernet0/0
Mar 1 03:47:42.115: ICMPv6: Received ICMPv6 packet from FE80::3, type 135
Mar 1 03:47:42.119: ICMPv6-ND: Received NS for FE80::1 on FastEthernet0/0 from FE80::3
Mar 1 03:47:42.119: ICMPv6-ND: Sending NA for FE80::1 on FastEthernet0/0
Mar 1 03:47:42.215: ICMPv6: Received ICMPv6 packet from FE80::3, type 136
Mar 1 03:47:42.215: ICMPv6-ND: Received NA for FE80::3 on FastEthernet0/0 from FE80::3
Mar 1 03:47:42.215: ICMPv6-ND: PROBE -> REACH: FE80::3
Mar 1 03:48:12.215: ICMPv6-ND: REACH -> STALE: FE80::3

R3 sends an echo request to R1 immediately, using the Layer-2 address from the STALE entry in the neighbor cache.  R3 also changes the state of the entry from STALE to DELAY, meaning at least 1 packet has been sent to this address and R3 is now awaiting confirmation of reachability:

R3:
Mar 1 03:47:37.073: ICMPv6: Sending echo request to FE80::1
Mar 1 03:47:37.073: ICMPv6-ND: STALE -> DELAY: FE80::1

R1 receives the echo request a few milliseconds later and sends an echo reply back, also using the STALE entry that it has in the neighbor cache for R3.  Sending an echo reply causes the state to change to DELAY since R1 is now awaiting confirmation of reachability:

R1:
Mar 1 03:47:37.095: ICMPv6: Received echo request from FE80::3
Mar 1 03:47:37.095: ICMPv6: Sending echo reply to FE80::3
Mar 1 03:47:37.095: ICMPv6-ND: STALE -> DELAY: FE80::3

R3 receives the echo reply from R1 a few milliseconds later:

R3:
Mar 1 03:47:37.117: ICMPv6: Received echo reply from FE80::1

However, this does not count as a confirmation of reachability and the cache entry remains in the DELAY state.  The only ways that reachability can be confirmed are:

1.  Hints from an upper-layer protocol show that the connection is making forward progress – for example increasing, non-duplicate TCP acknowledgements are received.

2.  A solicited NA is received in response to an NS.  The NA must be solicited because an unsolicited NA would only confirm 1-way reachability.

Since reachability has not been confirmed, R3 stays in the DELAY state waiting for reachability confirmation for a total of 5 seconds.  After 5 seconds, R3 changes to the PROBE state and actively tries to confirm reachability by sending an NS to the address in question:

R3:
Mar 1 03:47:42.073: ICMPv6-ND: DELAY -> PROBE: FE80::1
Mar 1 03:47:42.073: ICMPv6-ND: Sending NS for FE80::1 on FastEthernet0/0

R1’s 5 second timer expires just a few milliseconds later, so it also changes the entry to the PROBE state and sends an NS to R3’s address:

R1:
Mar 1 03:47:42.095: ICMPv6-ND: DELAY -> PROBE: FE80::3
Mar 1 03:47:42.095: ICMPv6-ND: Sending NS for FE80::3 on FastEthernet0/0

R1 receives R3’s NS and sends an NA with the solicited flag set:

R1:
Mar 1 03:47:42.115: ICMPv6: Received ICMPv6 packet from FE80::3, type 135
Mar 1 03:47:42.119: ICMPv6-ND: Received NS for FE80::1 on FastEthernet0/0 from FE80::3
Mar 1 03:47:42.119: ICMPv6-ND: Sending NA for FE80::1 on FastEthernet0/0

R3 also receives R1’s NS and sends an NA with the solicited flag set:

R3:
Mar 1 03:47:42.169: ICMPv6: Received ICMPv6 packet from FE80::1, type 135
Mar 1 03:47:42.173: ICMPv6-ND: Received NS for FE80::3 on FastEthernet0/0 from FE80::1
Mar 1 03:47:42.173: ICMPv6-ND: Sending NA for FE80::3 on FastEthernet0/0

R3 receives the solicited NA from R1.  This confirms 2-way reachability, so R3 changes the state of the entry to REACH:

R3:
Mar 1 03:47:42.173: ICMPv6: Received ICMPv6 packet from FE80::1, type 136
Mar 1 03:47:42.173: ICMPv6-ND: Received NA for FE80::1 on FastEthernet0/0 from FE80::1
Mar 1 03:47:42.177: ICMPv6-ND: PROBE -> REACH: FE80::1

Likewise, R1 receives the solicited NA from R3 and now has confirmed 2-way reachability, so R1 changes the state of the entry to REACH:

R1:
Mar 1 03:47:42.215: ICMPv6: Received ICMPv6 packet from FE80::3, type 136
Mar 1 03:47:42.215: ICMPv6-ND: Received NA for FE80::3 on FastEthernet0/0 from FE80::3
Mar 1 03:47:42.215: ICMPv6-ND: PROBE -> REACH: FE80::3

After the Reachable Time expires (30 seconds by default) without further 2-way reachability confirmation, the state of the entry changes back to STALE:

R3:
Mar 1 03:48:12.179: ICMPv6-ND: REACH -> STALE: FE80::1

R1:
Mar 1 03:48:12.215: ICMPv6-ND: REACH -> STALE: FE80::3
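The state changes walked through above can be summarized as a small state machine. The sketch below is a simplification in Python covering only the transitions seen in this example; the event names are hypothetical labels, not IOS or RFC terminology:

```python
from enum import Enum

class State(Enum):
    STALE = "STALE"
    DELAY = "DELAY"
    PROBE = "PROBE"
    REACH = "REACH"

# (current state, event) -> next state
TRANSITIONS = {
    (State.STALE, "packet_sent"): State.DELAY,
    (State.DELAY, "delay_timer_expired"): State.PROBE,  # an NS is sent here
    (State.DELAY, "upper_layer_hint"): State.REACH,
    (State.PROBE, "solicited_na"): State.REACH,
    (State.REACH, "reachable_time_expired"): State.STALE,
}

def next_state(state, event):
    """Return the new state; events with no entry leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

def replay(events, start=State.STALE):
    """Apply a list of events and return the state after each one."""
    states = []
    state = start
    for event in events:
        state = next_state(state, event)
        states.append(state.value)
    return states

# R3's view of FE80::1 during the ping: the echo reply changes nothing.
ping_events = [
    "packet_sent",
    "echo_reply_received",
    "delay_timer_expired",
    "solicited_na",
    "reachable_time_expired",
]
print(replay(ping_events))
# ['DELAY', 'DELAY', 'PROBE', 'REACH', 'STALE']
```

Because the echo reply has no entry in the table, the entry stays in DELAY until the timer expires, which matches the debug output.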

For traffic that does not offer a positive confirmation of 2-way reachability, this results in each device sending an NS and solicited NA to each other approximately every 35 seconds (Reachable Time + Delay time):

R3#ping fe80::1 repeat 1000000

R1#debug ipv6 nd
Mar 1 04:36:48.634: ICMPv6-ND: STALE -> DELAY: FE80::3
Mar 1 04:36:53.622: ICMPv6-ND: Received NS for FE80::1 on FastEthernet0/0 from FE80::3
Mar 1 04:36:53.622: ICMPv6-ND: Sending NA for FE80::1 on FastEthernet0/0
Mar 1 04:36:53.634: ICMPv6-ND: DELAY -> PROBE: FE80::3
Mar 1 04:36:53.634: ICMPv6-ND: Sending NS for FE80::3 on FastEthernet0/0
Mar 1 04:36:53.710: ICMPv6-ND: Received NA for FE80::3 on FastEthernet0/0 from FE80::3
Mar 1 04:36:53.710: ICMPv6-ND: PROBE -> REACH: FE80::3
Mar 1 04:37:23.710: ICMPv6-ND: REACH -> STALE: FE80::3
Mar 1 04:37:23.714: ICMPv6-ND: STALE -> DELAY: FE80::3
Mar 1 04:37:28.666: ICMPv6-ND: Received NS for FE80::1 on FastEthernet0/0 from FE80::3
Mar 1 04:37:28.666: ICMPv6-ND: Sending NA for FE80::1 on FastEthernet0/0
Mar 1 04:37:28.710: ICMPv6-ND: DELAY -> PROBE: FE80::3
Mar 1 04:37:28.710: ICMPv6-ND: Sending NS for FE80::3 on FastEthernet0/0
Mar 1 04:37:28.738: ICMPv6-ND: Received NA for FE80::3 on FastEthernet0/0 from FE80::3
Mar 1 04:37:28.742: ICMPv6-ND: PROBE -> REACH: FE80::3
Mar 1 04:37:58.742: ICMPv6-ND: REACH -> STALE: FE80::3
Mar 1 04:37:58.762: ICMPv6-ND: STALE -> DELAY: FE80::3
Mar 1 04:38:03.754: ICMPv6-ND: Received NS for FE80::1 on FastEthernet0/0 from FE80::3
Mar 1 04:38:03.758: ICMPv6-ND: Sending NA for FE80::1 on FastEthernet0/0
Mar 1 04:38:03.762: ICMPv6-ND: DELAY -> PROBE: FE80::3
Mar 1 04:38:03.762: ICMPv6-ND: Sending NS for FE80::3 on FastEthernet0/0
Mar 1 04:38:03.818: ICMPv6-ND: Received NA for FE80::3 on FastEthernet0/0 from FE80::3
Mar 1 04:38:03.818: ICMPv6-ND: PROBE -> REACH: FE80::3
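The 35 second cycle can be checked directly against the timestamps of the PROBE -> REACH transitions in R1's debug output. A short sketch in Python (the helper name is made up):

```python
from datetime import datetime

def seconds_between(timestamps):
    """Return the gaps in seconds between consecutive HH:MM:SS.mmm timestamps."""
    parsed = [datetime.strptime(t, "%H:%M:%S.%f") for t in timestamps]
    return [round((b - a).total_seconds(), 3) for a, b in zip(parsed, parsed[1:])]

# Times of the three PROBE -> REACH transitions shown above on R1.
probe_to_reach = ["04:36:53.710", "04:37:28.742", "04:38:03.818"]
print(seconds_between(probe_to_reach))  # [35.032, 35.076]
```

Each cycle is just over 35 seconds: 30 seconds of Reachable Time plus the 5 second delay, plus a few milliseconds for the NS/NA exchange.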

The Reachable Time can be advertised in RAs so that hosts know what value to use.  However, Cisco routers that are acting as hosts do not seem to learn the timer from their default router.  R1 and R2 begin advertising a 10 second Reachable Time in RAs, but R3 continues using 30 seconds:

R1:
interface FastEthernet0/0
 ipv6 nd reachable-time 10000

R2:
interface FastEthernet0/0
 ipv6 nd reachable-time 10000

20-r1-showint

21-r2-showint

22-r3-showint

R3#ping fe80::1 repeat 1

R1:
Mar 1 05:04:55.730: ICMPv6-ND: PROBE -> REACH: FE80::3
Mar 1 05:05:05.730: ICMPv6-ND: REACH -> STALE: FE80::3

R3:
Mar 1 05:04:55.646: ICMPv6-ND: PROBE -> REACH: FE80::1
Mar 1 05:05:25.647: ICMPv6-ND: REACH -> STALE: FE80::1

The only way to get R3 to use a non-default Reachable Time is to manually configure it in the same way as R1 and R2.

 

Next, let’s try to ping R1’s global unicast address from R3.  R1 and R3 both start with STALE entries for the link-local addresses of the other 2 devices in their neighbor cache:

23-r1-neighbors

24-r3-neighbors

25-r3-ping

The following sequence of events takes place after sending the echo request from R3:

R3#debug ipv6 nd
Mar 1 06:01:00.486: ICMPv6-ND: DELETE -> INCMP: 2001:123::1
Mar 1 06:01:00.486: ICMPv6-ND: Sending NS for 2001:123::1 on FastEthernet0/0
Mar 1 06:01:00.642: ICMPv6-ND: Received NA for 2001:123::1 on FastEthernet0/0 from 2001:123::1
Mar 1 06:01:00.642: ICMPv6-ND: INCMP -> REACH: 2001:123::1
Mar 1 06:01:05.594: ICMPv6-ND: Received NS for 2001:123::3 on FastEthernet0/0 from FE80::1
Mar 1 06:01:05.594: ICMPv6-ND: Sending NA for 2001:123::3 on FastEthernet0/0
Mar 1 06:01:05.594: ICMPv6-ND: STALE -> DELAY: FE80::1
Mar 1 06:01:10.595: ICMPv6-ND: DELAY -> PROBE: FE80::1
Mar 1 06:01:10.595: ICMPv6-ND: Sending NS for FE80::1 on FastEthernet0/0
Mar 1 06:01:10.639: ICMPv6-ND: Received NA for FE80::1 on FastEthernet0/0 from FE80::1
Mar 1 06:01:10.639: ICMPv6-ND: PROBE -> REACH: FE80::1
Mar 1 06:01:15.615: ICMPv6-ND: Received NS for FE80::3 on FastEthernet0/0 from FE80::1
Mar 1 06:01:15.615: ICMPv6-ND: Sending NA for FE80::3 on FastEthernet0/0
Mar 1 06:01:30.716: ICMPv6-ND: REACH -> STALE: 2001:123::1
Mar 1 06:01:40.641: ICMPv6-ND: REACH -> STALE: FE80::1

R1#debug ipv6 nd
Mar 1 06:01:00.578: ICMPv6-ND: Received NS for 2001:123::1 on FastEthernet0/0 from 2001:123::3
Mar 1 06:01:00.578: ICMPv6-ND: DELETE -> INCMP: 2001:123::3
Mar 1 06:01:00.578: ICMPv6-ND: INCMP -> STALE: 2001:123::3
Mar 1 06:01:00.578: ICMPv6-ND: Sending NA for 2001:123::1 on FastEthernet0/0
Mar 1 06:01:00.582: ICMPv6-ND: STALE -> DELAY: 2001:123::3
Mar 1 06:01:05.582: ICMPv6-ND: DELAY -> PROBE: 2001:123::3
Mar 1 06:01:05.582: ICMPv6-ND: Sending NS for 2001:123::3 on FastEthernet0/0
Mar 1 06:01:05.626: ICMPv6-ND: Received NA for 2001:123::3 on FastEthernet0/0 from 2001:123::3
Mar 1 06:01:05.626: ICMPv6-ND: PROBE -> REACH: 2001:123::3
Mar 1 06:01:10.610: ICMPv6-ND: Received NS for FE80::1 on FastEthernet0/0 from FE80::3
Mar 1 06:01:10.610: ICMPv6-ND: Sending NA for FE80::1 on FastEthernet0/0
Mar 1 06:01:10.610: ICMPv6-ND: STALE -> DELAY: FE80::3
Mar 1 06:01:15.610: ICMPv6-ND: DELAY -> PROBE: FE80::3
Mar 1 06:01:15.610: ICMPv6-ND: Sending NS for FE80::3 on FastEthernet0/0
Mar 1 06:01:15.754: ICMPv6-ND: Received NA for FE80::3 on FastEthernet0/0 from FE80::3
Mar 1 06:01:15.754: ICMPv6-ND: PROBE -> REACH: FE80::3
Mar 1 06:01:35.626: ICMPv6-ND: REACH -> STALE: 2001:123::3
Mar 1 06:01:45.754: ICMPv6-ND: REACH -> STALE: FE80::3

There are a few differences here compared to when we pinged R1’s link-local address.  R3 does not know what Layer-2 address to use yet for 2001:123::1.  R3 creates an incomplete (INCMP) entry for the address in the neighbor cache and sends an NS for the address.  When R1 receives the NS, it creates an entry for R3’s global address and answers with a solicited NA.  R3 receives the solicited NA, which confirms reachability and tells R3 the Layer-2 address to use.  R3 can now send the echo request to R1.

R1 does not receive confirmation of 2-way reachability to R3’s global unicast address, however, so 5 seconds after sending the NA to R3, R1 sends an NS for R3’s global unicast address sourced from R1’s link-local address, which starts a chain reaction.  R3 replies to the NS with a solicited NA, which confirms reachability on R1 to R3’s global unicast address and causes the entry on R3 for R1’s link-local address to change to the DELAY state.  After another 5 seconds, R3 sends an NS sourced from its link-local address for R1’s link-local address since it has not received reachability confirmation yet.  R1 replies with a solicited NA, confirming reachability for R3 to R1’s link-local address and causing R1 to change the state of R3’s link-local entry to DELAY.  After 5 seconds, R1 sends an NS for R3’s link-local address and R3 answers with a solicited NA to confirm reachability.  In total, 4 NS and 4 solicited NA were sent between the 2 devices:

26-r3-wireshark

 

Yet another situation occurs when the devices already have an entry for the address in their neighbor cache.  In this case, R1 and R3 both have an entry for the other’s global unicast address since it was just pinged:

27-r1-neighbors

28-r3-neighbors

29-r3-ping

This time, the following sequence of events occurs:

R3#debug ipv6 nd
Mar 1 06:38:39.142: ICMPv6-ND: STALE -> DELAY: 2001:123::1
Mar 1 06:38:39.234: ICMPv6-ND: ULP indication 2001:123::1
Mar 1 06:38:39.234: ICMPv6-ND: DELAY -> REACH: 2001:123::1
Mar 1 06:38:44.146: ICMPv6-ND: Received NS for 2001:123::3 on FastEthernet0/0 from FE80::1
Mar 1 06:38:44.146: ICMPv6-ND: Sending NA for 2001:123::3 on FastEthernet0/0
Mar 1 06:38:44.146: ICMPv6-ND: STALE -> DELAY: FE80::1
Mar 1 06:38:49.146: ICMPv6-ND: DELAY -> PROBE: FE80::1
Mar 1 06:38:49.146: ICMPv6-ND: Sending NS for FE80::1 on FastEthernet0/0
Mar 1 06:38:49.182: ICMPv6-ND: Received NA for FE80::1 on FastEthernet0/0 from FE80::1
Mar 1 06:38:49.182: ICMPv6-ND: PROBE -> REACH: FE80::1
Mar 1 06:38:54.206: ICMPv6-ND: Received NS for FE80::3 on FastEthernet0/0 from FE80::1
Mar 1 06:38:54.206: ICMPv6-ND: Sending NA for FE80::3 on FastEthernet0/0
Mar 1 06:39:09.234: ICMPv6-ND: REACH -> STALE: 2001:123::1
Mar 1 06:39:19.182: ICMPv6-ND: REACH -> STALE: FE80::1

R1#debug ipv6 nd
Mar 1 06:38:39.130: ICMPv6-ND: STALE -> DELAY: 2001:123::3
Mar 1 06:38:44.130: ICMPv6-ND: DELAY -> PROBE: 2001:123::3
Mar 1 06:38:44.130: ICMPv6-ND: Sending NS for 2001:123::3 on FastEthernet0/0
Mar 1 06:38:44.166: ICMPv6-ND: Received NA for 2001:123::3 on FastEthernet0/0 from 2001:123::3
Mar 1 06:38:44.166: ICMPv6-ND: PROBE -> REACH: 2001:123::3
Mar 1 06:38:49.166: ICMPv6-ND: Received NS for FE80::1 on FastEthernet0/0 from FE80::3
Mar 1 06:38:49.166: ICMPv6-ND: Sending NA for FE80::1 on FastEthernet0/0
Mar 1 06:38:49.166: ICMPv6-ND: STALE -> DELAY: FE80::3
Mar 1 06:38:54.166: ICMPv6-ND: DELAY -> PROBE: FE80::3
Mar 1 06:38:54.166: ICMPv6-ND: Sending NS for FE80::3 on FastEthernet0/0
Mar 1 06:38:54.198: ICMPv6-ND: Received NA for FE80::3 on FastEthernet0/0 from FE80::3
Mar 1 06:38:54.198: ICMPv6-ND: PROBE -> REACH: FE80::3
Mar 1 06:39:14.166: ICMPv6-ND: REACH -> STALE: 2001:123::3
Mar 1 06:39:24.198: ICMPv6-ND: REACH -> STALE: FE80::3

This time, R3 does not have to send an NS before it can send the echo request because it already has a STALE entry for R1’s global address.  The most interesting difference, however, is in the 2nd and 3rd debug messages shown on R3:

R3:
Mar 1 06:38:39.234: ICMPv6-ND: ULP indication 2001:123::1
Mar 1 06:38:39.234: ICMPv6-ND: DELAY -> REACH: 2001:123::1

When R3 receives an echo reply from R1, it considers this to be confirmation of reachability (shown as ‘ULP indication’) and immediately changes the entry to the REACH state.  For whatever reason, an echo reply to an echo request from a global unicast address counts as upper-layer protocol confirmation, but an echo reply to an echo request from a link-local address does not (as we saw earlier).  In total 3 NS and 3 solicited NA are sent between the 2 devices:

30-r3-wireshark

 

Next let’s look at an example of redirects being used as part of NDP in IPv6.  R3 is still using R2 as its default router, so we will create a loopback on R1 and advertise it to R2 with OSPFv3:

R1:
ipv6 router ospf 1
 router-id 1.1.1.1
!
interface Loopback0
 ipv6 address 2001:1::1/64
 ipv6 ospf network point-to-point
 ipv6 ospf 1 area 0
!
interface FastEthernet0/0
 ipv6 ospf 1 area 0

R2:
ipv6 router ospf 1
 router-id 2.2.2.2
!
interface FastEthernet0/0
 ipv6 ospf 1 area 0

R2 now has a route to R1’s loopback using R1’s link-local as the next hop:

31-r2-showroute

On R3, we attempt to ping R1’s loopback:

32-r3-ping

R3 sends the packet to its default router, R2.  R2 sends the packet to R1 and also sends a redirect back to R3 telling it to use R1 in the future for that address:

R2#debug ipv6 icmp
Mar  1 07:00:27.290: ICMPv6: Sending REDIRECT for 2001:1::1, target FE80::1 on FastEthernet0/0

Several NS and solicited NA are also sent between the 3 devices to confirm reachability:

33-r3-wireshark

According to RFC 2461 on IPv6 Neighbor Discovery:

A host receiving a valid redirect SHOULD update its Destination Cache accordingly so that subsequent traffic goes to the specified target. If no Destination Cache entry exists for the destination, an implementation SHOULD create such an entry. 

However, a Cisco router acting as a host apparently does not follow this recommendation, as R3 continues trying to send all traffic for R1’s loopback to R2 and R2 continues sending redirects.
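For comparison, the behavior that the RFC recommends for a host could look roughly like the sketch below. This is a hypothetical Python model written for this post, not Cisco's implementation (which, as shown above, does not appear to do this); the class and method names are invented:

```python
import ipaddress

class DestinationCache:
    """Toy destination cache: per-destination next hops over a default router."""

    def __init__(self, default_router: str):
        self.default_router = ipaddress.IPv6Address(default_router)
        self.entries = {}

    def next_hop(self, destination: str) -> ipaddress.IPv6Address:
        """Use a cached next hop if one exists, else the default router."""
        key = ipaddress.IPv6Address(destination)
        return self.entries.get(key, self.default_router)

    def handle_redirect(self, destination: str, target: str) -> None:
        """Create or update the entry so later traffic goes to the target."""
        key = ipaddress.IPv6Address(destination)
        self.entries[key] = ipaddress.IPv6Address(target)

cache = DestinationCache(default_router="fe80::2")  # R3 uses R2 by default
print(cache.next_hop("2001:1::1"))  # fe80::2

# Redirect from R2: destination 2001:1::1, target FE80::1 (R1).
cache.handle_redirect("2001:1::1", "fe80::1")
print(cache.next_hop("2001:1::1"))  # fe80::1
```

A host behaving this way would send only the first packet through R2 and would trigger a single redirect.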

Posted in IPv6

HSRP/VRRP/GLBP Misconfigurations

Posted by Andy on February 24, 2009

In this post, we will look at what happens when HSRP, VRRP, and GLBP are misconfigured (or attacked) in different ways.  The three different misconfigurations we will look at for each protocol are mismatched virtual IP address, mismatched group numbers, and mismatched authentication.  The topology is shown below:

hsrp-misconfiguration-topology1

SW3 is a router with NM-16ESW acting as a layer-2 switch.  Host4 and Host5 are both routers acting as hosts on the network.  Each device will be configured with IP 10.1.1.X and MAC address 0000.0000.000X where X is the device number for the sake of clarity:

R1:
interface FastEthernet0/0
 mac-address 0000.0000.0001
 ip address 10.1.1.1 255.255.255.0

R2:
interface FastEthernet0/0
 mac-address 0000.0000.0002
 ip address 10.1.1.2 255.255.255.0

R4:
interface FastEthernet0/0
 mac-address 0000.0000.0004
 ip address 10.1.1.4 255.255.255.0
!
no ip routing
ip default-gateway 10.1.1.101

R5:
interface FastEthernet0/0
 mac-address 0000.0000.0005
 ip address 10.1.1.5 255.255.255.0
!
no ip routing
ip default-gateway 10.1.1.101

The MAC address table after generating some traffic from each device and prior to enabling any first hop redundancy protocols is shown below:

1-mac-table

Now we will look at each of the different scenarios, starting with HSRP.

 

 

HSRP IP Mismatch

 

Router  Virtual IP  Group #  Virtual MAC     Authentication
R1      10.1.1.101  1        0000.0c07.ac01  (not set)
R2      10.1.1.102  1        0000.0c07.ac01  (not set)

 

R1:
interface FastEthernet0/0
 standby 1 ip 10.1.1.101

R2:
interface FastEthernet0/0
 standby 1 ip 10.1.1.102

If both routers are configured at approximately the same time, R2 becomes active because of its higher IP address and R1 becomes standby:

1-hsrp-brief-r1

1-hsrp-brief-r2

R1 generates the following log message, but still continues to accept R2 as the active router:

R1:
*Mar 1 03:31:16.919: %HSRP-4-DIFFVIP1: FastEthernet0/0 Grp 1 active routers virtual IP address 10.1.1.102 is different to the locally configured address 10.1.1.101

SW3 learns the HSRP virtual MAC address on F1/2 since only R2 sources traffic from that address:

1-mac-table2

10.1.1.102 is reachable from any host on the network, but 10.1.1.101 is not, because R1 does not respond to it while in standby.

1-hsrp-ping-r2

1-hsrp-ping-r11

If 10.1.1.101 had been the correct gateway address for hosts to use, it would no longer be reachable.
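The election result above can be reproduced with a small sketch. It assumes both routers are left at the default HSRP priority of 100 (no priority command appears in the configuration) and that neither router was already active, so the tie is broken by the higher interface IP address. The function name and data layout are hypothetical.

```python
import ipaddress

def elect_active(routers):
    """Pick the router with the highest priority, then the highest interface IP."""
    return max(
        routers,
        key=lambda r: (r["priority"], ipaddress.ip_address(r["interface_ip"])),
    )

routers = [
    {"name": "R1", "priority": 100, "interface_ip": "10.1.1.1", "virtual_ip": "10.1.1.101"},
    {"name": "R2", "priority": 100, "interface_ip": "10.1.1.2", "virtual_ip": "10.1.1.102"},
]

active = elect_active(routers)
print(active["name"], "is active and answers for", active["virtual_ip"])
```

The virtual IP plays no part in the comparison, which is consistent with R1 accepting R2 as active even though their virtual IPs differ.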

 

 

HSRP Group Number Mismatch:

 

Router  Virtual IP  Group #  Virtual MAC     Authentication
R1      10.1.1.101  1        0000.0c07.ac01  (not set)
R2      10.1.1.101  2        0000.0c07.ac02  (not set)

 

R1:
interface FastEthernet0/0
 standby 1 ip 10.1.1.101

R2:
interface FastEthernet0/0
 standby 2 ip 10.1.1.101

This time R1 and R2 ignore each other’s hellos and both routers become active:

2-hsrp-brief-r1

2-hsrp-brief-r2

When transitioning to active, HSRP broadcasts gratuitous ARP messages:

R2#debug arp
*Mar 1 03:53:30.955: %HSRP-5-STATECHANGE: FastEthernet0/0 Grp 2 state Standby -> Active
*Mar 1 03:53:30.955: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 03:53:30.959: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0

After hearing each other’s gratuitous ARPs, each router generates a log message reporting a duplicate IP address and broadcasts another gratuitous ARP in an attempt to fix the ARP cache of other devices on the network which may have been changed by the other router’s gratuitous ARP:

R1#debug arp
*Mar 1 03:53:31.035: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac02, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:53:31.039: %IP-4-DUPADDR: Duplicate address 10.1.1.101 on FastEthernet0/0, sourced by 0000.0c07.ac02
*Mar 1 03:53:31.039: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac01,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 03:53:31.043: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac01,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0

This results in gratuitous ARP broadcasts continually being sent by R1 and R2 in response to receiving each other’s gratuitous ARPs.  Based on the ARP debug messages shown below, it appears that IOS limits gratuitous ARPs to 1 per second.  R2 receives a gratuitous ARP from R1 at 47.631.  Immediately after it, debug messages show that a gratuitous ARP was throttled and 10.1.1.101 was added to arp_defense_Q.  At 48.207, another debug message shows that 10.1.1.101 was removed from arp_defense_Q and the gratuitous ARP is broadcast using R2’s virtual MAC as the source.  Another gratuitous ARP is received from R1 shortly after that, and R2 again throttles its response and waits to send its own gratuitous ARP until 49.207 – exactly 1 second after it sent the previous one.  This pattern continues on each router:

R2:
*Mar 1 03:53:47.631: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac01, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:53:47.631: IP ARP: Gratuitous ARP throttled.
*Mar 1 03:53:47.635: IP ARP: 10.1.1.101 added to arp_defense_Q
*Mar 1 03:53:48.207: IP ARP: 10.1.1.101 removed from arp_defense_Q
*Mar 1 03:53:48.207: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 03:53:48.211: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0
*Mar 1 03:53:48.371: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac01, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:53:48.375: IP ARP: Gratuitous ARP throttled.
*Mar 1 03:53:48.375: IP ARP: 10.1.1.101 added to arp_defense_Q
*Mar 1 03:53:49.207: IP ARP: 10.1.1.101 removed from arp_defense_Q
*Mar 1 03:53:49.207: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 03:53:49.211: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0
*Mar 1 03:53:49.739: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac01, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:53:49.739: IP ARP: Gratuitous ARP throttled.
*Mar 1 03:53:49.739: IP ARP: 10.1.1.101 added to arp_defense_Q
*Mar 1 03:53:50.207: IP ARP: 10.1.1.101 removed from arp_defense_Q
*Mar 1 03:53:50.207: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 03:53:50.211: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0
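The 1-second spacing can be checked directly from the timestamps in R2's debug output. The short Python sketch below is only a convenience for reading the log: it copies R2's "sent rep" entries from above (each two-line entry joined onto one line), keeps the broadcast copies, and prints the gap between consecutive ones in milliseconds.

```python
import re

# R2's "sent rep" debug entries from above, each joined onto a single line.
DEBUG_OUTPUT = """\
*Mar 1 03:53:48.207: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02, dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 03:53:48.211: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02, dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0
*Mar 1 03:53:49.207: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02, dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 03:53:49.211: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02, dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0
*Mar 1 03:53:50.207: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02, dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 03:53:50.211: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02, dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0
"""

TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2})\.(\d{3})")

def to_ms(line):
    """Convert the hh:mm:ss.mmm timestamp on a debug line to milliseconds."""
    hours, minutes, seconds, millis = (int(x) for x in TIMESTAMP.search(line).groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis

# Keep only the broadcast copy of each gratuitous ARP.
broadcasts = [to_ms(line) for line in DEBUG_OUTPUT.splitlines() if "ffff.ffff.ffff" in line]
intervals_ms = [later - earlier for earlier, later in zip(broadcasts, broadcasts[1:])]
print(intervals_ms)  # [1000, 1000]
```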

As a result, hosts on the network continually update their ARP cache twice per second:

Host4#debug arp
*Mar 1 03:58:35.426: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac01, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:58:36.338: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac02, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:58:36.402: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac01, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:58:37.314: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac02, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:58:37.462: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac01, dst 10.1.1.101 FastEthernet0/0

2-hsrp-host4-sharp1

SW3 learns each of the virtual MACs on the interface where it is received, and the table stays the same over time since R1 and R2 use separate virtual MACs:

2-mac-table

Any traffic that hosts send to a remote network with the virtual IP configured as their gateway will end up being split between the 2 routers (whichever router’s virtual MAC is in the host’s ARP cache at that moment will receive the frame).

 

 

HSRP Authentication Mismatch:

 

Router  Virtual IP  Group #  Virtual MAC     Authentication
R1      10.1.1.101  1        0000.0c07.ac01  cisco1
R2      10.1.1.101  1        0000.0c07.ac01  cisco2

 

R1:
interface FastEthernet0/0
 standby 1 ip 10.1.1.101
 standby 1 authentication cisco1

R2:
interface FastEthernet0/0
 standby 1 ip 10.1.1.101
 standby 1 authentication cisco2

Again R1 and R2 both become active and do not recognize a standby router for the group:

3-hsrp-brief-r1

3-hsrp-brief-r2

Also the following log message shows up on each router:

R1:
*Mar 1 04:07:03.226: %HSRP-4-BADAUTH: Bad authentication from 10.1.1.2, group 1, remote state Active

Since both routers use the same virtual IP and virtual MAC, they do not generate gratuitous ARPs in response to receiving a gratuitous ARP from the other router.  However, since SW3 receives HSRP hellos sourced from the virtual MAC by both routers, the MAC address flaps back and forth between interfaces as often as R1 and R2 send HSRP hellos (or more if they send other traffic):

3-mac-table

Again, any traffic that hosts send to a remote network with the virtual IP configured as their gateway ends up being split between the 2 routers, depending on which interface SW3 most recently learned the virtual MAC address on at the moment a frame is received from a host.
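The flapping can be modeled with a toy MAC learning table. This is a simplified sketch of ordinary source-MAC learning rather than of SW3's actual implementation. The `learn` function is hypothetical, and the port names assume R1 is attached to F1/1 and R2 to F1/2.

```python
def learn(mac_table, src_mac, port):
    """Record src_mac on port; return True if an existing entry moved ports."""
    moved = mac_table.get(src_mac) not in (None, port)
    mac_table[src_mac] = port
    return moved

VMAC = "0000.0c07.ac01"

# Both routers believe they are active, so hellos sourced from the same
# virtual MAC arrive alternately on R1's port and R2's port.
hello_ports = ["F1/1", "F1/2"] * 3

mac_table = {}
moves = sum(learn(mac_table, VMAC, port) for port in hello_ports)

print("entry moved", moves, "times; a host frame sent now goes out", mac_table[VMAC])
```

A frame from a host is forwarded out of whichever port holds the entry at that instant, so the router that receives it depends on which hello arrived most recently.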

 

 

VRRP IP Mismatch

 

Router  Virtual IP  Group #  Virtual MAC     Authentication
R1      10.1.1.101  1        0000.5e00.0101  (not set)
R2      10.1.1.102  1        0000.5e00.0101  (not set)

 

R1:
interface FastEthernet0/0
 vrrp 1 ip 10.1.1.101

R2:
interface FastEthernet0/0
 vrrp 1 ip 10.1.1.102

Unlike when HSRP was configured with different virtual IPs and only 1 became active, both VRRP routers think that they are the master:

4-vrrp-brief-r1

4-vrrp-brief-r2

Since both use the same group number and therefore the same virtual MAC, the MAC flaps back and forth between interfaces on SW3:

4-mac-table

If a host attempts to send traffic to a remote network, all traffic makes it to the destination and back (as long as both routers have a route to it) even though the host is unknowingly sending half of the traffic to a router that is using a different IP than its configured gateway address:

4-vrrp-ping-remote

However if the host attempts to send traffic to the gateway address itself, some of the traffic will be dropped:

4-vrrp-ping-gateway

Some of the traffic will be sent to R2 because of the virtual MAC address flapping between interfaces on SW3.  R2 knows that the destination address (10.1.1.101) is on the same subnet where it was received, so R2 sends an ICMP redirect to Host4.  It also thinks that the MAC address being used for 10.1.1.101 by Host4 is incorrect since it is R2’s virtual MAC address, so it sends an ARP request for the destination address:


R2#debug ip icmp
R2#debug arp

*Mar 1 05:58:47.966: ICMP: redirect sent to 10.1.1.4 for dest 10.1.1.101, use gw 10.1.1.101
*Mar 1 05:58:47.966: IP ARP: sent req src 10.1.1.2 0000.0000.0002,
dst 10.1.1.101 0000.0000.0000 FastEthernet0/0

The ICMP redirect tells Host4 to use gateway 10.1.1.101 to reach 10.1.1.101 – not very helpful to Host4, but R2 doesn’t realize that Host4 already thought it was sending directly to 10.1.1.101.  After receiving the ICMP redirect from R2, Host4 also sends out an ARP request for 10.1.1.101.  R1 replies to both ARP requests:


R1#debug arp
*Mar 1 05:58:48.002: IP ARP: rcvd req src 10.1.1.2 0000.0000.0002, dst 10.1.1.101 FastEthernet0/0
*Mar 1 05:58:48.002: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
dst 10.1.1.2 0000.0000.0002 FastEthernet0/0
*Mar 1 05:58:49.934: IP ARP: rcvd req src 10.1.1.4 0000.0000.0004, dst 10.1.1.101 FastEthernet0/0
*Mar 1 05:58:49.938: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
dst 10.1.1.4 0000.0000.0004 FastEthernet0/0

Neither of the ARP replies is helpful.  R2 uses the same MAC for its own virtual MAC, so it does not add the entry to its ARP cache.  Host4 receives the same MAC address that it was already using for 10.1.1.101.  Host4 continues trying to use the same MAC to reach 10.1.1.101, and the cycle continues with only some of the traffic making it to R1 because the rest is switched to R2 by SW3.

 

 

VRRP Group Number Mismatch

 

Router  Virtual IP  Group #  Virtual MAC     Authentication
R1      10.1.1.101  1        0000.5e00.0101  (not set)
R2      10.1.1.101  2        0000.5e00.0102  (not set)

 

R1:
interface FastEthernet0/0
 vrrp 1 ip 10.1.1.101

R2:
interface FastEthernet0/0
 vrrp 2 ip 10.1.1.101

A group number mismatch in VRRP behaves identically to a group number mismatch in HSRP.  R1 and R2 ignore each other’s hellos and both routers become master:

 5-vrrp-brief-r1

5-vrrp-brief-r2

When transitioning to master, VRRP broadcasts gratuitous ARP messages:

R2#debug arp
*Mar 1 06:28:37.346: %VRRP-6-STATECHANGE: Fa0/0 Grp 2 state Backup -> Master
*Mar 1 06:28:37.346: IP ARP: sent rep src 10.1.1.101 0000.5e00.0102,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 06:28:37.350: IP ARP: sent rep src 10.1.1.101 0000.5e00.0102,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0

After hearing each other’s gratuitous ARPs, each router generates a log message reporting a duplicate IP address and broadcasts another gratuitous ARP in an attempt to fix the ARP cache of other devices on the network which may have been changed by the other router’s gratuitous ARP:

R1#debug arp
*Mar 1 06:28:37.366: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102, dst 10.1.1.101 FastEthernet0/0
*Mar 1 06:28:37.366: %IP-4-DUPADDR: Duplicate address 10.1.1.101 on FastEthernet0/0, sourced by 0000.5e00.0102
*Mar 1 06:28:37.370: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 06:28:37.370: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0

This results in gratuitous ARP broadcasts continually being sent by R1 and R2 in response to receiving each other’s gratuitous ARPs.  Based on the ARP debug messages shown below, it appears that IOS limits gratuitous ARPs to 1 per second.  R1 receives a gratuitous ARP from R2 at 39.290.  Immediately after it, a debug message shows that a gratuitous ARP was throttled.  At 39.350, another debug message shows that 10.1.1.101 was removed from arp_defense_Q and the gratuitous ARP is broadcast using R1’s virtual MAC as the source.  Another gratuitous ARP is received from R2 shortly after that, and R1 again throttles its response and waits to send its own gratuitous ARP until 40.350 – exactly 1 second after it sent the previous one.  This pattern continues on each router:

 
R1:
*Mar  1 06:28:39.290: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102, dst 10.1.1.101 FastEthernet0/0
*Mar  1 06:28:39.290: IP ARP: Gratuitous ARP throttled.
*Mar  1 06:28:39.350: IP ARP: 10.1.1.101 removed from arp_defense_Q
*Mar  1 06:28:39.350: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
                 dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar  1 06:28:39.354: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
                 dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0
*Mar  1 06:28:39.450: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102, dst 10.1.1.101 FastEthernet0/0
*Mar  1 06:28:39.454: IP ARP: Gratuitous ARP throttled.
*Mar  1 06:28:39.454: IP ARP: 10.1.1.101 added to arp_defense_Q
*Mar  1 06:28:40.278: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102, dst 10.1.1.101 FastEthernet0/0
*Mar  1 06:28:40.278: IP ARP: Gratuitous ARP throttled.
*Mar  1 06:28:40.350: IP ARP: 10.1.1.101 removed from arp_defense_Q
*Mar  1 06:28:40.350: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
                 dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar  1 06:28:40.354: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
                 dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0
*Mar  1 06:28:41.210: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102, dst 10.1.1.101 FastEthernet0/0
*Mar  1 06:28:41.210: IP ARP: Gratuitous ARP throttled.
*Mar  1 06:28:41.210: IP ARP: 10.1.1.101 added to arp_defense_Q
*Mar  1 06:28:41.350: IP ARP: 10.1.1.101 removed from arp_defense_Q
*Mar  1 06:28:41.350: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
                 dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar  1 06:28:41.354: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
                 dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0

As a result, hosts on the network continually update their ARP cache twice per second:


Host4#debug arp
*Mar 1 07:01:47.518: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0101, dst 10.1.1.101 FastEthernet0/0
*Mar 1 07:01:48.262: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102, dst 10.1.1.101 FastEthernet0/0
*Mar 1 07:01:48.482: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0101, dst 10.1.1.101 FastEthernet0/0
*Mar 1 07:01:49.258: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102, dst 10.1.1.101 FastEthernet0/0
*Mar 1 07:01:49.402: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0101, dst 10.1.1.101 FastEthernet0/0

5-hsrp-host4-sharp

SW3 learns each of the virtual MACs on the interface where it is received, and the table stays the same over time since R1 and R2 use separate virtual MACs:

 5-mac-table

Any traffic that hosts send to a remote network with the virtual IP configured as their gateway will end up being split between the 2 routers (whichever router’s virtual MAC is in the host’s ARP cache at that moment will receive the frame).

 

 

VRRP Authentication Mismatch:

 

Router  Virtual IP  Group #  Virtual MAC     Authentication
R1      10.1.1.101  1        0000.5e00.0101  cisco1
R2      10.1.1.101  1        0000.5e00.0101  cisco2

 

R1:
interface FastEthernet0/0
 vrrp 1 ip 10.1.1.101
 vrrp 1 authentication cisco1

R2:
interface FastEthernet0/0
 vrrp 1 ip 10.1.1.101
 vrrp 1 authentication cisco2

VRRP authentication mismatch also behaves like HSRP authentication mismatch.  R1 and R2 both think that they are the master for the group:

6-vrrp-brief-r14

6-vrrp-brief-r2

Debugs show that the authentication has failed:


R1#debug vrrp errors
*Mar 1 07:25:34.542: VRRP: Grp 1 Advertisement from 10.1.1.2 has FAILED TEXT authentication

Since both routers use the same virtual IP and virtual MAC, they do not generate gratuitous ARPs in response to receiving a gratuitous ARP from the other router.  However, since SW3 receives VRRP hellos sourced from the virtual MAC by both routers, the MAC address flaps back and forth between interfaces as often as R1 and R2 send VRRP hellos (or more if they send other traffic):

6-mac-table

Any traffic that hosts send to a remote network with the virtual IP configured as their gateway ends up being split between the 2 routers, depending on which interface SW3 most recently learned the virtual MAC address on at the moment a frame is received from a host.

 

 

GLBP IP Mismatch

 

Router  Virtual IP  Group #  Virtual MAC     Authentication
R1      10.1.1.101  1        0007.b400.0101  (not set)
R2      10.1.1.102  1        0007.b400.0102  (not set)

 

R1:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.101

R2:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.102

Like HSRP, GLBP accepts a router as active even though the virtual IP differs.  R1 was configured before R2, so it becomes active for the AVG and AVF #1.  R2 becomes standby for the AVG and active for AVF #2:

7-glbp-brief-r1

7-glbp-brief-r21

R2 generates the following log message, but still continues to accept R1 as the AVG:

R2:
*Mar 1 07:57:22.906: %GLBP-4-DIFFVIP1: FastEthernet0/0 Grp 1 active routers virtual
IP address 10.1.1.101 is different to the locally configured
address 10.1.1.102

SW3 learns the AVF #1 virtual MAC address on F1/1 and AVF #2 virtual MAC on F1/2 since R1 and R2 both source hellos from those MAC addresses:

7-mac-table1

Like HSRP, the configured virtual IP of the router that is in standby is not reachable because the standby router does not respond to it.  If this was supposed to be the correct gateway address, hosts on the subnet would no longer be able to reach it or remote destinations:

7-glbp-ping-r2

7-glbp-host5-ping-r2

Additionally, because of the way that GLBP load balances ARP replies, even if the router with the correct virtual IP becomes the AVG, hosts may still receive the virtual MAC of the router configured with the wrong virtual IP.  In this case, Host4 performs an ARP request first and receives the virtual MAC of R1.  Host5 performs an ARP request next and receives the virtual MAC of R2:

7-glbp-host4-sharp

7-glbp-host5-sharp

As we saw when VRRP was configured with an IP mismatch, this did not cause a problem for reaching remote networks because the host still uses the correct MAC address to reach one of the two routers.  Both Host4 and Host5 are able to reach an address on a different subnet:

7-glbp-host4-ping-remote1

7-glbp-host5-ping-remote

With mismatched IPs in VRRP, when hosts attempted to reach the gateway address itself, some traffic would make it and some would not due to the address flapping between interfaces, with all hosts being affected equally.  With mismatched IPs in GLBP, only some of the hosts are affected when trying to reach the gateway address.  Those that obtain the virtual MAC of a router that is configured with the same virtual IP as the AVG will always be able to reach the gateway, and those that obtain the virtual MAC of a router that is configured with a different virtual IP than the AVG will not be able to reach the gateway address at all:

7-glbp-host4-ping-gateway

7-glbp-host5-ping-gateway
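The per-host outcome can be sketched as follows, assuming the AVG hands out forwarder virtual MACs in simple round-robin order (the GLBP default load-balancing method) and that a forwarder only answers for its own configured virtual IP, as observed above. The dictionary layout and variable names are hypothetical.

```python
from itertools import cycle

AVG_VIP = "10.1.1.101"  # R1 is the AVG, so ARP replies are sent for its virtual IP

# Forwarder virtual MAC -> virtual IP configured on the router that owns it.
FORWARDERS = {
    "0007.b400.0101": "10.1.1.101",  # AVF #1 on R1
    "0007.b400.0102": "10.1.1.102",  # AVF #2 on R2
}

# The AVG answers each new ARP request with the next forwarder's virtual MAC.
round_robin = cycle(FORWARDERS)
arp_replies = {host: next(round_robin) for host in ("Host4", "Host5")}

# A host reaches the gateway address only if its forwarder is configured with it.
gateway_reachable = {
    host: FORWARDERS[vmac] == AVG_VIP for host, vmac in arp_replies.items()
}
print(arp_replies)
print(gateway_reachable)
```

Host4 is handed R1's virtual MAC and can ping the gateway. Host5 is handed R2's virtual MAC and cannot, which matches the ping results above.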

 

 

GLBP Group Number Mismatch

 

Router  Virtual IP  Group #  Virtual MAC     Authentication
R1      10.1.1.101  1        0007.b400.0101  (not set)
R2      10.1.1.101  2        0007.b400.0201  (not set)

 

R1:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.101

R2:
interface FastEthernet0/0
 glbp 2 ip 10.1.1.101

Like HSRP and VRRP, when 2 GLBP routers are configured with mismatched group numbers both become active:

8-glbp-brief-r1

8-glbp-brief-r2

Each router also uses a different virtual MAC.  GLBP virtual MACs by default are in the form of 0007.b40X.XXYY, where XXX is the group number in hexadecimal and YY is the forwarder number in hexadecimal.  The group numbers for R1 and R2 were set to 1 and 2 respectively, and since each router thinks that it is the only router in the group, they have both assigned themselves AVF #1.  This results in a virtual MAC of 0007.b400.0101 for R1 and 0007.b400.0201 for R2.  Unlike HSRP and VRRP, GLBP does not send gratuitous ARPs when becoming active, so gratuitous ARPs are not sent back and forth constantly.  When a host ARPs for its gateway, both R1 and R2 reply with their virtual MAC.  The first reply creates an entry in the ARP cache and the second overwrites it, so the host will use the last ARP reply to arrive as its gateway.  In this case, the reply from R1 comes last so Host4 uses R1’s virtual MAC:


Host4#debug arp
Host4#ping 10.1.1.101

*Mar 1 08:12:54.489: IP ARP: creating incomplete entry for IP address: 10.1.1.101 interface FastEthernet0/0
*Mar 1 08:12:54.489: IP ARP: sent req src 10.1.1.4 0000.0000.0004,
dst 10.1.1.101 0000.0000.0000 FastEthernet0/0
*Mar 1 08:12:54.573: IP ARP: rcvd rep src 10.1.1.101 0007.b400.0201, dst 10.1.1.4 FastEthernet0/0
*Mar 1 08:12:54.621: IP ARP: rcvd rep src 10.1.1.101 0007.b400.0101, dst 10.1.1.4 FastEthernet0/0

8-glbp-host4-sharp

Hosts will use either one gateway or the other, depending on which ARP reply they receive last.  As long as both routers have full connectivity to other networks, hosts should still be able to reach any address both local and remote.
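The virtual MAC formats seen throughout this post can be generated with a few helper functions. This is a sketch with hypothetical function names. It assumes the HSRP version 1 format (0000.0c07.acXX) that appears in the tables above, the standard VRRP format (0000.5e00.01XX), and the GLBP format described earlier in this section.

```python
def _dotted(value):
    """Format a 48-bit integer as a Cisco-style dotted MAC (xxxx.xxxx.xxxx)."""
    digits = f"{value:012x}"
    return ".".join(digits[i:i + 4] for i in (0, 4, 8))

def hsrp_v1_vmac(group):
    if not 0 <= group <= 255:
        raise ValueError("HSRP version 1 groups are 0-255")
    return _dotted(0x00000C07AC00 | group)

def vrrp_vmac(group):
    if not 1 <= group <= 255:
        raise ValueError("VRRP groups are 1-255")
    return _dotted(0x00005E000100 | group)

def glbp_vmac(group, forwarder):
    if not 0 <= group <= 1023 or not 0 <= forwarder <= 255:
        raise ValueError("group must be 0-1023 and forwarder must fit in one byte")
    # 0007.b40X.XXYY: XXX is the group number, YY is the forwarder number.
    return _dotted(0x0007B4000000 | (group << 8) | forwarder)

print(hsrp_v1_vmac(2))   # 0000.0c07.ac02
print(vrrp_vmac(2))      # 0000.5e00.0102
print(glbp_vmac(2, 1))   # 0007.b400.0201
print(glbp_vmac(1, 2))   # 0007.b400.0102
```

The last two lines show why a group number mismatch and a second forwarder in the same group produce different addresses: the group occupies the XXX digits and the forwarder the YY digits.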

 

 

GLBP Authentication Mismatch

 

Router  Virtual IP  Group #  Virtual MAC     Authentication
R1      10.1.1.101  1        0007.b400.0101  cisco1
R2      10.1.1.101  1        0007.b400.0101  cisco2

 

R1:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.101
 glbp 1 authentication text cisco1

R2:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.101
 glbp 1 authentication text cisco2

A GLBP authentication mismatch behaves just like HSRP and VRRP with an authentication mismatch.  R1 and R2 both think they are the AVG for group 1:

9-glbp-brief-r1

9-glbp-brief-r2

The following log message shows up on each router:

R1:
*Mar 1 08:29:17.833: %GLBP-4-BADAUTH: Bad authentication received from 10.1.1.2, group 1

SW3 receives GLBP hellos from the same virtual MAC by both routers, so the MAC address flaps back and forth between interfaces on SW3 as often as R1 and R2 send hellos (or more if they send other traffic):

9-mac-table

Any traffic that hosts send to a remote network with the virtual IP configured as their gateway ends up being split between the 2 routers, depending on which interface SW3 most recently learned the virtual MAC address on at the moment a frame is received from a host.

 

 

Summary:

A brief summary of the differences in each misconfiguration:

 

HSRP, IP Mismatch:
 Different vIP, same vMAC
 Only 1 becomes active
 Active vIP is reachable, non-active vIP is not reachable
 If correct vIP becomes active, other networks are reachable
 If incorrect vIP becomes active, hosts cannot reach other networks

HSRP, Group Mismatch:
 Same vIP, different vMAC
 Both become active
 G. ARPs constantly sent
 Hosts constantly update ARP cache
 Traffic from hosts split between both routers, based on current host ARP cache
 Active vIP and other networks are reachable

HSRP, Auth. Mismatch:
 Same vIP and vMAC
 Both become active
 vMAC flaps on SW3
 Traffic from hosts split between both routers, based on current SW3 MAC table
 Active vIP and other networks are reachable

VRRP, IP Mismatch:
 Different vIP, same vMAC
 Both become active
 vMAC flaps on SW3
 Both active vIPs are intermittently reachable, based on SW3 MAC table
 Other networks are reachable regardless of SW3 MAC table

VRRP, Group Mismatch:
 Same as HSRP

VRRP, Auth. Mismatch:
 Same as HSRP

GLBP, IP Mismatch:
 Different vIP and vMAC
 Only 1 becomes active
 Non-active vIP is not reachable
 If incorrect vIP becomes active, hosts cannot reach other networks
 Active vIP is reachable to some hosts but not others, based on vMAC received from GLBP load balancing
 If correct vIP is active, other networks are reachable regardless of vMAC that the host receives

GLBP, Group Mismatch:
 Same vIP, different vMAC
 Both become active
 G. ARPs not sent
 Hosts that ARP for gateway receive replies from both AVGs
 Traffic from hosts split between both routers, based on which ARP reply was received last
 Active vIP and other networks are reachable

GLBP, Auth. Mismatch:
 Same as HSRP and VRRP

Posted in HSRP VRRP and GLBP

Custom Queueing Byte Count Deficit

Posted by Andy on February 20, 2009

In older IOS versions, the custom queueing byte count deficit was not carried over to the next round robin pass, which could result in bandwidth sharing that was not proportional to the configured byte counts.  This example will demonstrate the behavior of custom queueing with the byte count deficit not being carried over in IOS 12.0(5), and then with the byte count deficit being carried over in IOS 12.4(18).  Since GTS and CB shaping do not support custom queueing on shaping queues, FRTS will be used.  The network topology and configurations are shown below:

topology1

R1:
interface Serial0/0
 no ip address
 encapsulation frame-relay
 load-interval 30
 no keepalive
!
interface Serial0/0.1 point-to-point
 ip address 10.1.12.1 255.255.255.0
 frame-relay interface-dlci 102
!
interface FastEthernet1/0
 ip address 10.1.1.1 255.255.255.0
 load-interval 30
 speed 100
 full-duplex
 no keepalive
 no mop enabled
!
no cdp run

R2:
interface Serial0/0
 no ip address
 encapsulation frame-relay
 load-interval 30
 no keepalive
!
interface Serial0/0.1 point-to-point
 ip address 10.1.12.2 255.255.255.0
 frame-relay interface-dlci 201
!
no cdp run

First we will use IOS 12.0(5) on R1.  PC will be used to generate 2 different types of traffic to UDP ports 1001 and 1002 (for more information on the method of generating traffic, see WFQ tests).  We will send both types of traffic at 8 packets/second with an L3 size of 996 bytes, which will give us 1000-byte frames with the frame-relay header added.  This will result in 64,000 bps of each traffic type.  The first traffic type (sent to UDP port 1001) will be classified into custom queue 1 and the second traffic type (sent to UDP port 1002) will be classified into custom queue 2.  Queue 1 will be configured with a byte count of 1000 per round robin pass and queue 2 will be configured with a byte count of 1001.  Configuration and verification on R1:

R1:
queue-list 1 protocol ip 1 udp 1001
queue-list 1 protocol ip 2 udp 1002
queue-list 1 queue 1 byte-count 1000
queue-list 1 queue 2 byte-count 1001

cq-config

Next, we will enable FRTS on R1 with a CIR of 64,000 bps and apply the custom queueing configuration to the shaping queues:

R1:
interface Serial0/0
 frame-relay traffic-shaping
!
map-class frame-relay CQ-shaping-policy
 frame-relay cir 64000
 frame-relay custom-queue-list 1
!
interface Serial0/0.1 point-to-point
 frame-relay interface-dlci 102
  class CQ-shaping-policy

We will also configure R2 to measure incoming traffic of each type:

R2:
access-list 101 permit udp any any eq 1001
access-list 102 permit udp any any eq 1002
!
class-map match-all 1001
 match access-group 101
class-map match-all 1002
 match access-group 102
!
policy-map Traffic-Meter
 class 1001
 class 1002
!
interface Serial0/0
 service-policy input Traffic-Meter

Now we’re ready to start the 2 traffic streams:

flood.pl --port=1001 --size=996 --delay=125 10.1.12.2
flood.pl --port=1002 --size=996 --delay=125 10.1.12.2
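As a quick check of the numbers, the offered load can be worked out in Python. The 4-byte frame-relay header is taken from the 996-byte and 1000-byte sizes given earlier. Reading --delay=125 as milliseconds is an assumption, chosen because it matches the stated rate of 8 packets/second.

```python
PACKETS_PER_SECOND = 8
L3_BYTES = 996
FR_HEADER_BYTES = 4    # 1000-byte frame minus the 996-byte L3 packet
CIR_BPS = 64_000

frame_bytes = L3_BYTES + FR_HEADER_BYTES
stream_bps = PACKETS_PER_SECOND * frame_bytes * 8   # bits per second per stream
delay_ms = 1000 / PACKETS_PER_SECOND                # gap between packets (assumed ms)
offered_bps = 2 * stream_bps                        # both streams together

print(stream_bps, delay_ms, offered_bps / CIR_BPS)  # 64000 125.0 2.0
```

The two streams together offer twice the CIR, so the shaping queues stay congested and custom queueing decides which packets are sent.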

On R1, we can see the FRTS and CQ information:

120-r1pvc

On R2 we can see the amount of each traffic type received:

120-r2pmap

Although the byte count allocated to queue 2 was only 0.1% more than queue 1 (1001/1000), queue 2 has sent exactly twice as many packets and bytes.  This is due to the byte count deficit not being carried over in 12.0(5).  The round robin cycle in 12.0(5) will go like this:

1. CQ takes a 1000-byte packet from queue 1.  The byte count for queue 1 has been met so CQ moves on to queue 2.

2. CQ takes a 1000-byte packet from queue 2.  The byte count for queue 2 has not been met (1000 < 1001), so CQ will take another packet from queue 2.

3. CQ takes another 1000-byte packet from queue 2.  The byte count for queue 2 has been met (2000 > 1001).  Since there are no other queues, return to Step #1 and service queue 1 again.
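The steps above can be simulated with a short Python sketch. It is a simplified model of the behavior described here, not IOS code: the function name is hypothetical, and it assumes every queue always has 1000-byte packets waiting.

```python
def cq_without_carry_over(byte_counts, packet_size, passes):
    """Round robin where each pass starts from the full configured byte count."""
    packets_sent = {queue: 0 for queue in byte_counts}
    for _ in range(passes):
        for queue, byte_count in byte_counts.items():
            bytes_this_pass = 0
            # Keep dequeuing whole packets until the byte count has been met.
            while bytes_this_pass < byte_count:
                bytes_this_pass += packet_size
                packets_sent[queue] += 1
    return packets_sent

result = cq_without_carry_over({1: 1000, 2: 1001}, packet_size=1000, passes=1000)
print(result)  # {1: 1000, 2: 2000}
```

One extra byte in the byte count costs queue 1 half of its share, because the 999 bytes that queue 2 overshoots on every pass are never charged back to it.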

 

Now we will replace R1 with a router running IOS 12.4(18) and put the exact same configuration on it.  On R1, we can see the FRTS and CQ information as well as the packets enqueued in each of the CQ queues:

124-r1pvc

124-r1queue

On R2, the amount of each type of traffic received is shown below:

124-r2pmap

The traffic generator was left running for several hours to show how accurately the proportions of actual traffic sent match the byte counts in more modern IOS versions.  With equal sized packets in each queue, a byte count of 1000 configured for queue 1, and a byte count of 1001 configured for queue 2, queue 2 should be able to send 1 extra packet for every 1000 packets sent by queue 1.  We can see that after a little over 150,000 packets sent by queue 1, queue 2 has sent exactly 150 more packets than queue 1.  The dramatic difference in results is due to the byte count deficit being carried over.  The first 3 rounds of the round robin cycle in 12.4(18) will go like this:

1. CQ takes a 1000-byte packet from queue 1.  The byte count for queue 1 has been met so CQ moves on to queue 2.

2. CQ takes a 1000-byte packet from queue 2.  The byte count for queue 2 has not been met (1000 < 1001), so CQ will take another packet from queue 2.

3. CQ takes another 1000-byte packet from queue 2.  The byte count for queue 2 has been met and exceeded (2000 > 1001).  Subtract the excess bytes (999) from the configured byte count in the next round robin pass through queue 2 (1001 – 999 = 2).  Since there are no other queues, service queue 1 again.

4. CQ takes a 1000-byte packet from queue 1.  The byte count for queue 1 has been met so CQ moves on to queue 2.

5. CQ takes a 1000-byte packet from queue 2.  The byte count for queue 2 has been met and exceeded (1000 > 2).  Subtract the excess bytes (998) from the configured byte count in the next round robin pass through queue 2 (1001 – 998 = 3).  Since there are no other queues, service queue 1 again.

6. CQ takes a 1000-byte packet from queue 1.  The byte count for queue 1 has been met so CQ moves on to queue 2.

7. CQ takes a 1000-byte packet from queue 2.  The byte count for queue 2 has been met and exceeded (1000 > 3).  Subtract the excess bytes (997) from the configured byte count in the next round robin pass through queue 2 (1001 – 997 = 4).  Since there are no other queues, service queue 1 again.

As you can see, with the deficit being carried over, queue 2 will only be allowed to send 2 packets in a single pass once every 1000 rounds, rather than every single pass like in older IOS versions.
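The same kind of simulation shows the effect of carrying the deficit over. Again, this is a simplified model of the steps described above rather than IOS code: the function name is hypothetical, and it assumes every queue always has 1000-byte packets waiting.

```python
def cq_with_carry_over(byte_counts, packet_size, passes):
    """Round robin where bytes sent in excess are subtracted from the next pass."""
    packets_sent = {queue: 0 for queue in byte_counts}
    deficit = {queue: 0 for queue in byte_counts}
    for _ in range(passes):
        for queue, byte_count in byte_counts.items():
            allowance = byte_count - deficit[queue]
            bytes_this_pass = 0
            # Keep dequeuing whole packets until this pass's allowance is met.
            while bytes_this_pass < allowance:
                bytes_this_pass += packet_size
                packets_sent[queue] += 1
            # Whatever was sent beyond the allowance is owed on the next pass.
            deficit[queue] = bytes_this_pass - allowance
    return packets_sent

result = cq_with_carry_over({1: 1000, 2: 1001}, packet_size=1000, passes=150_000)
print(result)  # {1: 150000, 2: 150150}
```

After 150,000 passes queue 2 is ahead by exactly 150 packets, which is the 1-in-1000 advantage that the byte counts call for.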

Posted in QoS

CBWFQ, Routing Protocols, and max-reserved-bandwidth

Posted by Andy on February 18, 2009

Numerous sources, including Cisco documentation, often say that the percentage of bandwidth excluded from max-reserved-bandwidth (25% by default) is reserved for either link queues (routing updates, keepalives, etc.), unclassified best-effort traffic (matched by class-default), or both.  The 12.4 Mainline command reference for max-reserved-bandwidth says:

The sum of all bandwidth allocation on an interface should not exceed 75 percent of the available bandwidth on an interface. The remaining 25 percent of bandwidth is used for overhead, including Layer 2 overhead, control traffic, and best-effort traffic.

As can be seen in the previous CBWFQ tests that were performed, this definitely does not hold true for best-effort traffic that is put into dynamic conversations.  What about routing updates and other important traffic that ends up in the link queues?  To test this out, we will use the same simple topology shown below in IOS 12.4(18):

topology

R1:
interface FastEthernet0/0
 ip address 10.1.1.1 255.255.255.0
 load-interval 30
 speed 100
 full-duplex
 no keepalive
 no mop enabled
!
interface Serial0/0
 ip address 10.1.12.1 255.255.255.0
 load-interval 30
 no keepalive
!
no cdp run

R2:
interface Serial0/0
 ip address 10.1.12.2 255.255.255.0
 load-interval 30
 no keepalive
!
no cdp run

Next we will configure EIGRP on R1 and R2 and decrease the hello and hold timers to give us a little bit more traffic:

R1:
router eigrp 1
 network 10.1.12.1 0.0.0.0
 no auto-summary
!
interface Serial0/0
 ip hello-interval eigrp 1 1
 ip hold-time eigrp 1 3

R2:
router eigrp 1
 network 10.1.12.2 0.0.0.0
 no auto-summary
!
interface Serial0/0
 ip hello-interval eigrp 1 1
 ip hold-time eigrp 1 3

Next we will configure R2 to measure incoming traffic.  A TFTP flow will be used to create congestion, so we will create 3 different classes to measure TFTP, EIGRP hellos, and EIGRP updates (we will see later on why it was a good idea to measure EIGRP hellos and updates separately):

R2:
ip access-list extended EIGRP-Hello
 permit eigrp any host 224.0.0.10
ip access-list extended EIGRP-Update
 permit eigrp any host 10.1.12.2
ip access-list extended TFTP
 permit udp any any eq tftp
!
class-map match-all EIGRP-Hello
 match access-group name EIGRP-Hello
class-map match-all EIGRP-Update
 match access-group name EIGRP-Update
class-map match-all TFTP
 match access-group name TFTP
!
policy-map Traffic-Meter
 class EIGRP-Hello
 class EIGRP-Update
 class TFTP
!
interface Serial0/0
 service-policy input Traffic-Meter

Now let’s shut down and re-enable R2’s S0/0 interface and examine how the EIGRP adjacency forms in the absence of congestion:

R2:
interface Serial0/0
 shutdown

A few seconds later…

R2:
interface Serial0/0
 no shutdown

wireshark1

cbwfq2-1-r2pmap

We can see that hello packets are being sent every second in each direction.  When the adjacency forms, 3 update packets are exchanged in each direction, followed by an acknowledgement from R1 to R2.  Each hello packet is 64 bytes, which is confirmed in both Wireshark and the policy-map on R2.  Each update and acknowledgement packet is 44 bytes.  Therefore we can expect EIGRP traffic to use about 512 bps for hellos (8 * 64) plus a small amount of additional bandwidth for updates and acknowledgements at the start.  Next we will configure CBWFQ on R1 and shape traffic to 32 kbps:

R1:
ip access-list extended TFTP
 permit udp any any eq tftp
!
class-map match-all TFTP
 match access-group name TFTP
!
policy-map CBWFQ
 class TFTP
  bandwidth percent 75
 class class-default
  fair-queue 4096
policy-map Shaper
 class class-default
  shape average 32000
  service-policy CBWFQ
!
interface Serial0/0
 service-policy output Shaper

TFTP has been given 75% bandwidth, and the max-reserved-bandwidth has not been changed from the default of 75%.  If the remaining 25% (8,000 bps) is actually used for link queues, EIGRP should have way more bandwidth than it needs.  Now we will generate 64 kbps of TFTP traffic, more than enough to saturate the link and cause CBWFQ to begin:

flood.pl --port=69 --size=996 --delay=125 10.1.12.2

Within a few seconds, the following log messages repeatedly show up:

R1:
*Mar 1 04:25:31.998: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is down: Interface Goodbye received
*Mar 1 04:25:32.930: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is up: new adjacency
*Mar 1 04:25:40.302: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is down: Interface Goodbye received
*Mar 1 04:25:41.242: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is up: new adjacency
*Mar 1 04:25:48.278: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is down: Interface Goodbye received
*Mar 1 04:25:49.222: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is up: new adjacency
*Mar 1 04:25:56.538: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is down: Interface Goodbye received
*Mar 1 04:25:57.530: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is up: new adjacency
*Mar 1 04:26:04.782: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is down: Interface Goodbye received
*Mar 1 04:26:05.730: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is up: new adjacency

R2:
*Mar 1 04:26:34.506: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 04:26:37.506: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: holding time expired
*Mar 1 04:26:42.774: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 04:26:45.770: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: holding time expired
*Mar 1 04:26:51.242: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 04:26:54.238: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: holding time expired
*Mar 1 04:26:59.298: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 04:27:02.298: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: holding time expired
*Mar 1 04:27:07.522: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 04:27:10.518: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: holding time expired

On R2, we can see that the hold timer keeps expiring and that the adjacency re-forms approximately every 8 to 8.5 seconds.  Next, take a look at the queues on R1 and the traffic received on R2:

cbwfq2-2-r1queue

cbwfq2-2-r2pmap

As we saw in previous CBWFQ tests, CBWFQ uses a constant that is based on the number of WFQ dynamic queues when calculating the weights for user defined conversations as shown in the following table:

Number of flows    Constant
16                 64
32                 64
64                 57
128                30
256                16
512                8
1024               4
2048               2
4096               1

We configured WFQ to use 4096 dynamic conversations, which results in the smallest possible constant.  Using the formula to calculate weights for user defined conversations, TFTP is assigned a weight of:

1  * (100 / 75)   = 1.33

When rounded up this becomes 2, as shown in the show traffic-shape queue output.  We also see 2 other conversations with packets enqueued.  Both are IP protocol 88, so we know they are both EIGRP.  One of them has destination address 224.0.0.10 and size 64 (EIGRP hellos) and the other has destination address 10.1.12.2 and size 44 (EIGRP updates).  Interestingly, they have been given very different weight values.  The conversation number for EIGRP hellos (4103) falls within the range of the link queues (N through N+7, where N is the number of WFQ queues) and the weight of 1024 is also the same as the weight for link queues.  The conversation number for EIGRP updates (137), however, falls within the range of dynamic queues (0 through N-1), and we can see that its weight of 4626 is consistent with a dynamic conversation that has IP Precedence of 6 (32384 / (6 + 1)).

Because the link queue’s weight of 1024 is 512 times larger than TFTP’s weight of 2, TFTP will be able to send 512 times as many bytes.  Looking at the byte count of received traffic on R2, we can see that the results match this (1,149,000 / 2,240 = 512.9).  This also explains why the EIGRP adjacency re-formed every 8 to 8.5 seconds.  For every 1 hello that EIGRP is allowed to send on R1 (512 bits), TFTP is allowed to send 262,144 bits (512 * 512).  The total time required for this is ((512 + 262,144) / 32,000) or about 8.2 seconds.

This is somewhat of an extreme example since we configured the maximum number of WFQ dynamic queues in order to minimize TFTP’s weight and also shaped to a very low rate, but the main point here is that the 25% unallocated bandwidth is not reserved for anything.  Any non-priority queue can consume as much of the bandwidth as its weight relative to the other queues allows it to.
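The arithmetic in this section can be reproduced in a few lines. This is a sketch that uses only the formulas and observed values quoted in this post (the constant from the table, 32384 / (precedence + 1) for dynamic conversations, and a link queue weight of 1024); the function names are made up for illustration, and the round-up of 1.33 to 2 simply mirrors what IOS displayed in this test.

```python
import math

LINK_QUEUE_WEIGHT = 1024   # weight observed for the link queue in this test
SHAPED_RATE_BPS = 32000    # shape average 32000


def class_weight(constant, bandwidth_percent):
    """Weight of a user-defined class: constant * (100 / bandwidth percent)."""
    return constant * 100 / bandwidth_percent


def dynamic_weight(precedence):
    """Weight of a dynamic conversation, truncated as in the displayed value."""
    return 32384 // (precedence + 1)


tftp_weight = math.ceil(class_weight(1, 75))   # 1.33 shown as 2
update_weight = dynamic_weight(6)              # EIGRP updates, IPP 6

ratio = LINK_QUEUE_WEIGHT / tftp_weight        # TFTP bytes per hello byte
hello_bits = 64 * 8
tftp_bits = hello_bits * ratio
cycle_seconds = (hello_bits + tftp_bits) / SHAPED_RATE_BPS

print(tftp_weight, update_weight)      # 2 4626
print(ratio, tftp_bits)                # 512.0 262144.0
print(round(cycle_seconds, 1))         # 8.2
```

With a hold time of 3 seconds and roughly 8.2 seconds between hellos, the repeated "holding time expired" messages on R2 are what we would expect.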

 

For one final test, let’s decrease the number of WFQ dynamic queues to a value such that the link queue can send at least 512 bps for EIGRP hellos:

R1:
policy-map CBWFQ
 class class-default
  fair-queue 256

Using 256 dynamic queues, the weight of TFTP will be:

16 * (100 / 75)   = 21.33

IOS rounds this to 21:

cbwfq2-3-r1queue1

The share that each queue receives is inversely proportional to its weight.  One way of finding the share that EIGRP hellos will receive is:

1 / ((1024 / 21) + (1024 / 1024) + (1024 / 4626))   =  2%, or about 640 bps
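The same calculation can be written as a small helper. This is an idealized sketch that assumes all three queues always have packets waiting and that bandwidth is split exactly in inverse proportion to weight; the function name `bandwidth_shares` and the queue labels are made up for illustration.

```python
def bandwidth_shares(weights, rate_bps):
    """Split rate_bps among queues in inverse proportion to their weights."""
    inverse = {name: 1 / weight for name, weight in weights.items()}
    total = sum(inverse.values())
    return {name: rate_bps * inv / total for name, inv in inverse.items()}


shares = bandwidth_shares(
    {"tftp": 21, "eigrp_hello": 1024, "eigrp_update": 4626},
    rate_bps=32000,
)
for name, bps in shares.items():
    print(f"{name}: {bps:.0f} bps")
```

For the link queue carrying EIGRP hellos this works out to roughly 640 bps of the 32 kbps shaped rate.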

This should be a little more than enough to allow a hello packet every second, and looking at the output above we can see that only 1 packet is in the queue.  However, now there is a new problem:

R2:
*Mar 1 05:45:01.462: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: retry limit exceeded
*Mar 1 05:45:01.478: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 05:46:20.998: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: retry limit exceeded
*Mar 1 05:46:21.226: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 05:47:40.746: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: retry limit exceeded
*Mar 1 05:47:41.242: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 05:49:00.758: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: retry limit exceeded
*Mar 1 05:49:01.502: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 05:50:21.014: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: retry limit exceeded
*Mar 1 05:50:21.226: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency

Approximately every 1 minute and 20 seconds, we see ‘retry limit exceeded’ followed by the adjacency coming back up less than 1 second later.  A debug eigrp packets update shows what is happening:

R2:
*Mar  1 06:08:22.550: EIGRP: Sending UPDATE on Serial0/0 nbr 10.1.12.1, retry 14, RTO 5000
*Mar  1 06:08:22.550:   AS 1, Flags 0x1, Seq 1186/0 idbQ 0/0 iidbQ un/rely 0/0 peerQ un/rely 0/2
*Mar  1 06:08:27.554: EIGRP: Sending UPDATE on Serial0/0 nbr 10.1.12.1, retry 15, RTO 5000
*Mar  1 06:08:27.554:   AS 1, Flags 0x1, Seq 1186/0 idbQ 0/0 iidbQ un/rely 0/0 peerQ un/rely 0/2
*Mar  1 06:08:32.558: EIGRP: Sending UPDATE on Serial0/0 nbr 10.1.12.1, retry 16, RTO 5000
*Mar  1 06:08:32.558:   AS 1, Flags 0x1, Seq 1186/0 idbQ 0/0 iidbQ un/rely 0/0 peerQ un/rely 0/2
*Mar  1 06:08:37.566: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: retry limit exceeded
*Mar  1 06:08:37.758: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency

R2 does not receive an acknowledgement from R1 for its update, so it retries a total of 16 times at 5 second intervals (about 80 seconds in total, which is consistent with the interval between the log messages above).  Remember that EIGRP updates and acknowledgements were assigned to a dynamic conversation and given a weight of 4,626 based on their IPP of 6.  As a result, they cannot be scheduled in time, and after the last retry R2 takes the adjacency down.  Since we manipulated the TFTP queue weight so that a link queue has just more than enough bandwidth to send an EIGRP hello every second, the adjacency comes back up less than a second later, resulting in very strange overall behavior.

Posted in EIGRP, QoS

Rate-limit ACLs

Posted by Andy on February 16, 2009

In this post we will examine how rate-limit ACLs work with CAR.  The topology and method of generating traffic will be the same as I used for testing WFQ.  The topology and initial configurations are shown below:

car-acl-topology

R1:
interface FastEthernet0/0
 ip address 10.1.1.1 255.255.255.0
 load-interval 30
 speed 100
 full-duplex
 no keepalive
 no mop enabled
!
interface Serial0/0
 ip address 10.1.12.1 255.255.255.0
 load-interval 30
 no keepalive
!
no cdp run

R2:
interface Serial0/0
 ip address 10.1.12.2 255.255.255.0
 load-interval 30
 no keepalive
!
no cdp run

The point of this example will be to examine how rate-limit ACLs work, and in particular the mask feature, so we will need traffic with various IP precedence values.  This particular traffic generator does not allow ToS byte values to be specified, so we will mark the traffic inbound on R1 F0/0.  We will generate 8 different traffic streams to ports 1000 – 1007, and each type of traffic will be marked with IPP X, where X is the last digit in the port number.  With an interpacket delay of 125 ms and packet size of 1500 bytes, this will give us 96 kbps of each IPP value (1500 * 8 * 8).  R2 will be configured to measure incoming traffic.  Let’s configure and verify this before setting up the rate-limit ACL:

R1:
access-list 100 permit udp any any eq 1000
access-list 101 permit udp any any eq 1001
access-list 102 permit udp any any eq 1002
access-list 103 permit udp any any eq 1003
access-list 104 permit udp any any eq 1004
access-list 105 permit udp any any eq 1005
access-list 106 permit udp any any eq 1006
access-list 107 permit udp any any eq 1007
!
class-map match-all Prec0
 match access-group 100
class-map match-all Prec1
 match access-group 101
class-map match-all Prec2
 match access-group 102
class-map match-all Prec3
 match access-group 103
class-map match-all Prec4
 match access-group 104
class-map match-all Prec5
 match access-group 105
class-map match-all Prec6
 match access-group 106
class-map match-all Prec7
 match access-group 107
!
policy-map Marker
 class Prec0
  set precedence 0
 class Prec1
  set precedence 1
 class Prec2
  set precedence 2
 class Prec3
  set precedence 3
 class Prec4
  set precedence 4
 class Prec5
  set precedence 5
 class Prec6
  set precedence 6
 class Prec7
  set precedence 7
!
interface FastEthernet0/0
 service-policy input Marker

R2:
class-map match-all Prec0
 match precedence 0
class-map match-all Prec1
 match precedence 1
class-map match-all Prec2
 match precedence 2
class-map match-all Prec3
 match precedence 3
class-map match-all Prec4
 match precedence 4
class-map match-all Prec5
 match precedence 5
class-map match-all Prec6
 match precedence 6
class-map match-all Prec7
 match precedence 7
!
policy-map Traffic-Meter
 class Prec0
 class Prec1
 class Prec2
 class Prec3
 class Prec4
 class Prec5
 class Prec6
 class Prec7
!
interface Serial0/0
 service-policy input Traffic-Meter

flood.pl --port=1000 --size=1496 --delay=125 10.1.12.2
flood.pl --port=1001 --size=1496 --delay=125 10.1.12.2
flood.pl --port=1002 --size=1496 --delay=125 10.1.12.2
flood.pl --port=1003 --size=1496 --delay=125 10.1.12.2
flood.pl --port=1004 --size=1496 --delay=125 10.1.12.2
flood.pl --port=1005 --size=1496 --delay=125 10.1.12.2
flood.pl --port=1006 --size=1496 --delay=125 10.1.12.2
flood.pl --port=1007 --size=1496 --delay=125 10.1.12.2

car-acl-1-r1f0

car-acl-1-r1s0

car-acl-1-r2s0

car-acl-1-r1pmap

car-acl-1-r2pmap

We can see that the input rate on R1 F0/0, the output rate on R1 S0/0, and the input rate on R2 S0/0 all roughly match the combined bandwidth of the 8 traffic streams.  We can also see that the traffic is being marked with the IPP values we specified on R1 and that R2 is receiving approximately 96 kbps of each type.  Now we can move on to configuring the rate-limit ACL.
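As a quick sanity check on the numbers above, the expected rates follow directly from the packet size and interpacket delay. This is a minimal sketch using the 1500-byte packet size and 125 ms delay stated earlier; the function name `stream_rate_bps` is made up for illustration.

```python
def stream_rate_bps(packet_bytes, delay_ms):
    """Offered rate of one stream sending a packet every delay_ms milliseconds."""
    packets_per_second = 1000 / delay_ms
    return packet_bytes * 8 * packets_per_second


per_stream = stream_rate_bps(packet_bytes=1500, delay_ms=125)
combined = per_stream * 8   # one stream per IP precedence value

print(per_stream)   # 96000.0
print(combined)     # 768000.0
```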

Rate-limit ACLs, when used with the mask option, allow a 1-byte mask value to be entered.  Each position in the mask corresponds to an IPP value (MPLS EXP values work the same way), with IPP 7 being the left most value and IPP 0 the right most value.  The values that should be matched are set to 1 in their respective positions and the resulting mask is entered as a hexadecimal value.  Let’s say that we want to limit all IPP 0-4 traffic to a combined rate of 128 kbps.  The mask to match these values will be 00011111, which is hexadecimal 0x1F.  The configuration for this is:

R1:
access-list rate-limit 0 mask 1F
!
interface Serial0/0
 rate-limit output access-group rate-limit 0 128000 8000 8000 conform-action transmit exceed-action drop
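The mask calculation described above can be sketched as follows. The helper names `precedence_mask` and `mask_matches` are made up for illustration; the only rule taken from the text is that bit position X of the 1-byte mask corresponds to IPP X.

```python
def precedence_mask(precedences):
    """Build the 1-byte rate-limit ACL mask for a set of IP precedence values."""
    mask = 0
    for precedence in precedences:
        if not 0 <= precedence <= 7:
            raise ValueError(f"invalid IP precedence: {precedence}")
        mask |= 1 << precedence   # IPP 0 is the rightmost bit, IPP 7 the leftmost
    return mask


def mask_matches(mask, precedence):
    """Return True if the mask selects the given IP precedence value."""
    return bool(mask & (1 << precedence))


mask = precedence_mask(range(5))   # IPP 0 through 4
print(f"{mask:08b}")               # 00011111
print(f"{mask:02X}")               # 1F
print(mask_matches(mask, 4), mask_matches(mask, 5))   # True False
```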

On R1, we can see that policing is taking place:

car-acl-2-r1car1

On R2, we can verify the amount of each IP Precedence value received:

car-acl-2-r2pmap

The combined 30 second offered rates for IPP values 0-4 equal roughly 128 kbps, while the other IPP values continued to send 96 kbps each.  This verifies that we have used the mask value correctly.

Posted in ACL, QoS

GLBP Weights, Load Balancing, and Redirection

Posted by Andy on February 11, 2009

This post will take a look at how weighting, load balancing, the redirect timer, and the timeout timer work in GLBP.  The topology for these tests is shown below:

glbp-topology2

All routers will be configured to preempt and each router will be configured with GLBP priority 100 + X where X is the router number so that the highest numbered router that is online will become the AVG for the group.  Each router’s interface MAC address will be changed to 0000.0000.000X where X is the router number for the sake of clarity.  We will configure each of the routers in order (the order is important for how forwarders are assigned) and will wait to configure GLBP on R5 for now.  The configuration is:

R1:
interface FastEthernet0/0
 ip address 10.1.1.1 255.255.255.0
 mac-address 0000.0000.0001
 glbp 1 preempt
 glbp 1 priority 101
 glbp 1 ip 10.1.1.100

R2:
interface FastEthernet0/0
 ip address 10.1.1.2 255.255.255.0
 mac-address 0000.0000.0002
 glbp 1 preempt
 glbp 1 priority 102
 glbp 1 ip 10.1.1.100

R3:
interface FastEthernet0/0
 ip address 10.1.1.3 255.255.255.0
 mac-address 0000.0000.0003
 glbp 1 preempt
 glbp 1 priority 103
 glbp 1 ip 10.1.1.100

R4:
interface FastEthernet0/0
 ip address 10.1.1.4 255.255.255.0
 mac-address 0000.0000.0004
 glbp 1 preempt
 glbp 1 priority 104
 glbp 1 ip 10.1.1.100

R5:
interface FastEthernet0/0
 ip address 10.1.1.5 255.255.255.0
 mac-address 0000.0000.0005

R4 has become the AVG because it has the highest priority and preemption is enabled.  We can also see that R1 is active for AVF #1, R2 for AVF #2, R3 for AVF #3, and R4 for AVF #4 since they were configured in that order:

glbp-1

Now we will bring R5 online:

R5:
interface FastEthernet0/0
 glbp 1 preempt
 glbp 1 priority 105
 glbp 1 ip 10.1.1.100

R5 becomes the AVG but remains in a Listen state for all 4 AVFs:

glbp-2

Since GLBP has a limit of 4 AVFs per group, no new ones were created for R5.  Let’s try increasing R5’s weighting value from the default of 100:

R5:
interface FastEthernet0/0
 glbp 1 weighting 200

We can see that the weighting on R5 is now 200 and that R5 knows each of the other AVFs are using the default value of 100.  We can also see that R5 is (by default) configured to preempt forwarders after a 30 second delay.  Despite these facts, R5 still remains in the Listen state for all 4 AVFs:

glbp-3

Let’s create a loopback interface on R1 and configure GLBP to track it and decrement the weight value when it goes down:

R1:
interface Loopback0
 ip address 1.1.1.1 255.255.255.0
!
track 1 interface Loopback0 line-protocol
!
interface FastEthernet0/0
 glbp 1 weighting 100 lower 90 upper 95
 glbp 1 weighting track 1 decrement 20

This configuration on R1 essentially says “Start with a weight value of 100.  If Loopback0 goes down, decrement the weight by 20.  If the weight falls below 90, this router is no longer allowed to be an AVF.  Once the weight has fallen below 90, do not allow the router to become the AVF again until the weight is at least 95.”  Now we will shut down Loopback0 on R1:

R1:
interface Loopback0
 shutdown

After R5’s 30 second AVF preemption timer expires, R5 takes over the role of active for AVF #1 and R1 transitions to listening:

glbp-5

glbp-4

Now let’s bring R1’s loopback interface back up:

R1:
interface Loopback0
 no shutdown

After 30 seconds, R1 preempts R5 and reclaims AVF #1 even though its weight (100) is lower than R5’s (200):

glbp-6

glbp-7

As we can see, the owner ID for AVF #1 is set to R1’s MAC address since it was the first GLBP router to come online.  As soon as R1’s weighting value is greater than or equal to the upper threshold, it is allowed to reclaim its AVF role as long as forwarder preemption is enabled on it.
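The lower and upper thresholds behave like a simple hysteresis check. The sketch below models only the behavior observed in this test on R1 (weight 100, lower 90, upper 95, decrement 20); the function name `forwarder_eligible` is made up for illustration, and it says nothing about preemption delays or which router takes over.

```python
def forwarder_eligible(was_eligible, weight, lower, upper):
    """Return whether the router may act as an AVF after a weight change."""
    if weight < lower:
        return False          # fell below the lower threshold
    if weight >= upper:
        return True           # recovered to at least the upper threshold
    return was_eligible       # between the thresholds: keep the previous state


MAX_WEIGHT, LOWER, UPPER, DECREMENT = 100, 90, 95, 20

eligible = True
states = []
for loopback_up in (True, False, True):
    weight = MAX_WEIGHT if loopback_up else MAX_WEIGHT - DECREMENT
    eligible = forwarder_eligible(eligible, weight, LOWER, UPPER)
    states.append((weight, eligible))

print(states)   # [(100, True), (80, False), (100, True)]
```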

Let’s look at another scenario where the AVF that we are trying to preempt is not the owner of that AVF.  We will shut down R5’s F0/0 interface and then shut down R1’s loopback interface.  This causes R1 to fall below the lower weighting threshold again, and since R5 is not available to take over AVF #1, one of the other routers must take over AVF #1 in addition to its own AVF:

R5:
interface FastEthernet0/0
 shutdown

R1:
interface Loopback0
 shutdown

After the 30 second preemption delay expires on R2, R3, and R4, each of them becomes active for AVF #1 for a few milliseconds before hearing each other’s hellos and deciding on one of them to remain active.  (I’m not sure how this is determined exactly; after shutting down and re-enabling R1’s loopback interface 5 different times, R3 took over AVF #1 two times and R4 took over AVF #1 three times.  Cisco does not seem to have much information available on GLBP so it is hard to know for sure, but it is unimportant for this test as long as 1 of the 3 remaining routers takes over AVF #1.  If we needed it to be deterministic, we could simply configure a lower forwarder preemption delay on 1 of the 3 routers.)  We can now see that R4 has taken over AVF #1 in addition to its own:

glbp-8

Now we will bring R5 back online:

R5:
interface FastEthernet0/0
 no shutdown

Even though R5 has a higher weight than R4 and R4 is not the owner of AVF #1, R5 still does not preempt R4 and take over AVF #1:

glbp-9

glbp-10

For the last test related to weights, we will look at what happens when the weighting value of a router falls below the lower threshold and no other routers in the group are configured for forwarder preemption.  We will re-enable R1’s loopback and verify that it becomes active for AVF #1 again:

R1:
interface Loopback0
 no shutdown

glbp-111

Next we will disable forwarder preemption on all other routers and then shut down R1’s loopback, causing the weight on R1 to fall below the lower threshold:

R2:
interface FastEthernet0/0
 no glbp 1 forwarder preempt

R3:
interface FastEthernet0/0
 no glbp 1 forwarder preempt

R4:
interface FastEthernet0/0
 no glbp 1 forwarder preempt

R5:
interface FastEthernet0/0
 no glbp 1 forwarder preempt

R1:
interface Loopback0
 shutdown

R1 indicates that it has crossed the lower threshold, but it still remains active for AVF #1:

glbp-12

Without forwarder preemption configured on any other routers in the group, crossing the lower threshold does not cause that router to lose its active state.

 

Next we will look at the different types of load balancing that can be used for ARP replies to clients.  First we will re-enable R1’s loopback and re-enable forwarder preemption on the other routers:

R1:
interface Loopback0
 no shutdown

R2:
interface FastEthernet0/0
 glbp 1 forwarder preempt

R3:
interface FastEthernet0/0
 glbp 1 forwarder preempt

R4:
interface FastEthernet0/0
 glbp 1 forwarder preempt

R5:
interface FastEthernet0/0
 glbp 1 forwarder preempt

At this point R5 is the AVG for the group and R1-4 are each AVFs for the forwarder that they own:

glbp-13 

There are three ways GLBP can be used to load balance ARP replies to clients: round-robin, weighted, and host dependent.  We will look at round-robin (the default) first.  We will also enable the GLBP client-cache, which keeps track of the IP and MAC received from a client in an ARP request and the forwarder to which the client was assigned based on the virtual MAC sent in the ARP reply:

R5:
interface FastEthernet0/0
 glbp 1 client-cache maximum 30

Now we will generate ARP requests from various MAC and IP addresses on a PC attached to the network using ColaSoft Packet Builder.  The ARP requests will use MAC address 0000.0000.00XX and IP address 10.1.1.XX, where XX is a number written in decimal starting at 10 and incrementing to 29.  This will give us a total of 20 unique ARP requests being sent to the GLBP virtual IP address at a rate of 1 per second.  After sending the 20 ARP requests, we can see the results on R5:

glbp-14

show glbp detail gives a lot of useful information (show glbp client-cache can also be used to view just the client cache).  We can see that round-robin is the load balancing method and that 20 of 30 cache entries are being used.  We can also see exactly how the load balancing of ARP replies has taken place: AVF #1 received the first client (10.1.1.10) and every fourth client after that.  Age shows the amount of time since an ARP request was last received and updates shows the number of ARP requests received from the client.  Next let’s send 20 duplicate ARP requests of the first entry only (MAC 0000.0000.0010, IP 10.1.1.10).  The results on R5 look like this:

glbp-15
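The two round-robin tests can be replayed with a small model. This is only a sketch of the behavior seen in the outputs above, with made-up names (`round_robin`, `selection_count`, `client_cache`); it assumes the AVG simply cycles through AVFs 1 to 4 for every ARP request and overwrites the cached AVF for a client it has already seen.

```python
from collections import Counter
from itertools import cycle


def round_robin(requests, forwarders):
    """Assign each ARP request to the next AVF in turn."""
    next_avf = cycle(forwarders)
    selection_count = Counter()   # ARP replies sent per AVF
    client_cache = {}             # latest AVF handed to each client
    for client in requests:
        avf = next(next_avf)
        selection_count[avf] += 1
        client_cache[client] = avf
    return selection_count, client_cache


unique = [f"10.1.1.{n}" for n in range(10, 30)]   # 20 unique clients
duplicates = ["10.1.1.10"] * 20                   # 20 repeats from one client

selection_count, client_cache = round_robin(unique + duplicates, [1, 2, 3, 4])
clients_per_avf = Counter(client_cache.values())

print(dict(selection_count))        # {1: 10, 2: 10, 3: 10, 4: 10}
print(client_cache["10.1.1.10"])    # 4
print(sorted(clients_per_avf.items()))   # [(1, 4), (2, 5), (3, 5), (4, 6)]
```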

The ‘Client selection count’ keeps track of the number of replies sent with that vMAC in response to ARP requests, whether those ARP requests are unique or not.  We can see that 10 replies with each vMAC have been sent, which equals the 20 unique ARP requests initially sent plus the 20 duplicates.  The client cache only contains unique client entries; if repeat ARP requests are received from a client, the age timer is reset and the update counter is incremented.  Since AVF#4 received the last client in the previous test, AVF#1 is next in line to have its vMAC used in the ARP reply.  After 20 requests, the round-robin process ends on AVF#4 again and 10.1.1.10 is now a client of AVF#4, resulting in an unequal number of clients per AVF.  Next we will change the load balancing method on R5 to weighted load balancing.  First the GLBP configuration on R5 is removed to clear the counters of ARP replies issued, then the load balancing method is changed to weighted.  We will also change the weights on some of the other routers since they are all currently using the default of 100:
 
R5:
interface FastEthernet0/0
 no glbp 1 ip
 no glbp 1 priority
 no glbp 1 preempt
 no glbp 1 weighting
 no glbp 1 client-cache

A few seconds later…

R5:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.100
 glbp 1 priority 105
 glbp 1 preempt
 glbp 1 client-cache maximum 30
 glbp 1 load-balancing weighted

R1:
interface FastEthernet0/0
 glbp 1 weighting 200
 

R2:
interface FastEthernet0/0
 glbp 1 weighting 100

R3:
interface FastEthernet0/0
 glbp 1 weighting 80

R4:
interface FastEthernet0/0
 glbp 1 weighting 20

After sending the same 20 unique ARP requests again, the results on R5 are:

glbp-16

Each AVF has its vMAC used in a percentage of the ARP replies equal to its weight divided by the sum of all AVF weights.  In this case, that means each AVF receives:

R1:    200 / (200 + 100 + 80 + 20)    = 50%, or 10 clients

R2:    100 / (200 + 100 + 80 + 20)    = 25%, or 5 clients

R3:    80 / (200 + 100 + 80 + 20)    = 20%, or 4 clients

R4:    20 / (200 + 100 + 80 + 20)    = 5%, or 1 client
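The split above can be computed for any set of weights. This is a minimal sketch of the proportional formula only; the function name `weighted_clients` is made up for illustration, and it does not model the order in which the AVG actually hands out the vMACs.

```python
def weighted_clients(weights, total_clients):
    """Expected number of clients per AVF: weight / sum of weights * clients."""
    total_weight = sum(weights.values())
    return {
        router: total_clients * weight / total_weight
        for router, weight in weights.items()
    }


expected = weighted_clients({"R1": 200, "R2": 100, "R3": 80, "R4": 20}, 20)
for router, clients in expected.items():
    print(f"{router}: {clients:g} clients")
```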

For the last GLBP load balancing method, we will change it to host dependent after clearing the GLBP counters on R5:

R5:
interface FastEthernet0/0
 no glbp 1 ip
 no glbp 1 priority
 no glbp 1 preempt
 no glbp 1 weighting
 no glbp 1 client-cache
 default glbp 1 load-balancing


 
A few seconds later…

R5:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.100
 glbp 1 priority 105
 glbp 1 preempt
 glbp 1 client-cache maximum 30
 glbp 1 load-balancing host-dependent

After sending the same 20 unique ARP requests again, the results on R5 are:

glbp-171

AVF #1 and #2 have each received 6 clients, while #3 and #4 have each received 4 clients.  Now let’s send 20 duplicate ARP requests like we did earlier from MAC: 0000.0000.0010, IP: 10.1.1.10.  This client is currently assigned to AVF #1.  After sending the ARP requests, R5 looks like this:

glbp-18

The ‘Client selection count’ has increased from 6 to 26 for AVF#1, while the others remained the same.  We can also see in the client cache that this client still uses AVF#1.  Now let’s look at another interesting aspect of host-dependent load balancing.  Notice that the clients were not completely evenly distributed to begin with; instead AVF#1 and #2 received 6 each while #3 and #4 received 4 each.  Also, there appears to be a pattern in the client addresses assigned to each AVF:

AVF#1 has all clients whose MAC/IP ends in 0, 4, or 8

AVF#2 has all clients whose MAC/IP ends in 1, 5, or 9

AVF#3 has all clients whose MAC/IP ends in 2 or 6

AVF#4 has all clients whose MAC/IP ends in 3 or 7

Let’s test this theory by resetting R5’s GLBP configuration to clear the counters and clients, and then sending ARP requests only from clients whose MAC/IP ends in 0, 4, or 8.  Just to be certain that there is no way R5 is somehow remembering which AVF the clients were assigned to before, we will use 10 MAC/IP addresses that the router has never seen before, starting at MAC: 0000.0000.0030, IP: 10.1.1.30:

R5:
interface FastEthernet0/0
 no glbp 1 ip
 no glbp 1 priority
 no glbp 1 preempt
 no glbp 1 client-cache
 default glbp 1 load-balancing

A few seconds later…

R5:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.100
 glbp 1 priority 105
 glbp 1 preempt
 glbp 1 client-cache maximum 30
 glbp 1 load-balancing host-dependent

Here are the results on R5:

glbp-19

All 10 new clients received the vMAC of AVF#1 in their ARP reply.  Therefore, GLBP does not need to have already received an ARP request from a client in order to know which AVF to assign it to – the results are deterministic even if it is a brand new client.  In fact, even if the GLBP client cache is disabled (tested but not shown), the results still come out the same every time when a new client generates an ARP request.  The assignment of clients to AVFs with host-dependent load balancing appears to be based on some function of the client’s MAC and/or IP address rather than a database of which AVF the client was previously assigned to.
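To illustrate what "some function of the client’s MAC and/or IP address" could look like, here is one purely hypothetical function that happens to reproduce the pattern seen in these tests: the last byte of the client MAC, modulo the number of AVFs. This is a guess, not the documented GLBP algorithm, and other functions (for example one based on the last decimal digit of the IP address) would fit the same observations. The names `guess_avf` and `mac_for`, and the exact list of 10 new host numbers, are assumptions made for this sketch.

```python
from collections import Counter


def mac_for(host_number):
    """MAC used in the tests: the decimal host number written into the last byte."""
    return f"0000.0000.00{host_number:02d}"


def guess_avf(mac, forwarder_count=4):
    """Hypothetical mapping: last MAC byte modulo the number of AVFs, 1-based."""
    last_byte = int(mac.replace(".", "")[-2:], 16)
    return last_byte % forwarder_count + 1


# First test: hosts 10 through 29.
first_test = Counter(guess_avf(mac_for(n)) for n in range(10, 30))
print(sorted(first_test.items()))   # [(1, 6), (2, 6), (3, 4), (4, 4)]

# Second test: assumed to be the first 10 hosts from 30 upward ending in 0, 4, or 8.
new_hosts = [30, 34, 38, 40, 44, 48, 50, 54, 58, 60]
print({guess_avf(mac_for(n)) for n in new_hosts})   # {1}
```

Distinguishing between the candidate functions would take additional tests in which the MAC and IP address of a client do not end in the same digits.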

 

Next we will look at how the redirect and timeout timers function in GLBP.  The redirect timer controls when the AVG stops replying to client ARP requests with the virtual MAC address of an AVF that has been taken over by a router other than its original owner.  The timeout timer controls when the AVG removes the AVF entirely if the owner has not reclaimed it.  At that point, clients that obtained that virtual MAC address through ARP (either before the takeover, or after the takeover but before the redirect timer expired) must obtain a different virtual MAC address to use for the gateway.  First we will look at what happens when there are 4 or fewer routers in the GLBP group.  We will take R5 offline so that R4 becomes the new AVG:

R5:
interface FastEthernet0/0
 shutdown

glbp-20

We will decrease the redirect and timeout timers to 60 and 660 seconds (the timeout must be at least 600 seconds more than the redirect) so that we can see the results more quickly:

R4:
interface FastEthernet0/0
 glbp 1 timers redirect 60 660

Now we will pretend R1 fails by shutting down its interface:

R1:
interface FastEthernet0/0
 shutdown

After the 10 second hold timer expires, R4 becomes active for AVF#1 in addition to its own.  The redirect timer for AVF#1 begins counting down from 60 seconds, starting from the time R4 last heard a hello from R1:

glbp-21

If we generate 20 ARP requests from the PC before the redirect timer expires, we find that R4 includes AVF#1 in its round-robin cycle and each AVF handles 5 client requests:

glbp-22

If we generate another 20 ARP requests after the redirection timer has expired for AVF#1, AVF#1 is no longer included in the round-robin cycle of ARP replies and the new clients are split up between the 3 remaining AVFs:

glbp-23
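
The two round-robin results can be modelled with a few lines of Python. This is a simplified sketch that assumes a strict rotation starting from the lowest-numbered AVF; the exact split shown in the screenshot may differ depending on where the AVG's rotation happened to be when the requests arrived:

```python
from collections import Counter
from itertools import cycle, islice


def assign_round_robin(avfs, requests):
    """Hand out AVFs to ARP requests in strict rotation and count the result."""
    return Counter(islice(cycle(avfs), requests))


# Before the redirect timer expires, AVF#1 is still in the rotation.
before_expiry = assign_round_robin([1, 2, 3, 4], 20)
# After it expires, AVF#1 is skipped and only 3 AVFs are handed out.
after_expiry = assign_round_robin([2, 3, 4], 20)

print(dict(before_expiry))
print(dict(after_expiry))
```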

After 660 total seconds, the timeout timer expires and AVF#1 is removed:

glbp-24

If instead we had 5 routers, when R1 failed R5 would take over AVF#1.  When the timeout timer for AVF#1 expires the AVF is removed, but instead of the group continuing to function with 3 AVFs, a new AVF is created after a few seconds with R5’s MAC as the owner ID:

R5:
02:35:17.098: %GLBP-6-FWDSTATECHANGE: FastEthernet0/0 Grp 1 Fwd 1 state Active -> Disabled
02:35:28.898: %GLBP-6-FWDSTATECHANGE: FastEthernet0/0 Grp 1 Fwd 1 state Listen -> Active

glbp-25


HSRP/VRRP/GLBP Timers and Preemption

Posted by Andy on February 9, 2009

This post will take a look at some of the less common issues related to timers, preemption, and preemption delays in the first hop redundancy protocols.  The topology is shown below:

hsrp-vrrp-glbp1-topology

 

First we will configure HSRP, VRRP, and GLBP on R1 and R2 without changing any of the default settings:

R1:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 vrrp 1 ip 10.1.1.102
 glbp 1 ip 10.1.1.103

R2:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 vrrp 1 ip 10.1.1.102
 glbp 1 ip 10.1.1.103

As expected, R2 becomes the master router for VRRP group 1 and active router for HSRP and GLBP group 1 (as long as it is configured before R1 can transition to active) since R1 and R2 have equal priority and R2 has the higher IP address:

hsrp-vrrp-glbp-13

Next let’s enable preemption for all 3 protocols on R3 and then bring each of them up.  Preemption is enabled by default for VRRP so we only need to enable it for HSRP and GLBP:

R3:
interface FastEthernet1/0
 standby 1 preempt
 standby 1 ip 10.1.1.101
 vrrp 1 ip 10.1.1.102
 glbp 1 preempt
 glbp 1 ip 10.1.1.103

R3 now has an equal priority to the other routers in all 3 groups, the highest IP address in all 3 groups, and is allowed to preempt the master/active router of all 3 groups.  Take a look at the results after applying this configuration:

hsrp-vrrp-glbp-2

R3 has become the master for the VRRP group.  R3 has also overthrown R1 to become the standby router for both HSRP and GLBP; however, it does not transition to active.  Of these 3 protocols, only VRRP allows a higher IP address to preempt a master/active router; HSRP and GLBP require a higher priority value in order for preemption to occur.
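
The preemption rule observed here can be summarized as a small decision function. This is a sketch of the behavior seen in this lab, not of the protocols' full state machines; it assumes preemption is enabled on the challenger, and the interface addresses 10.1.1.2 and 10.1.1.3 are placeholders since the topology image is not reproduced here:

```python
from ipaddress import IPv4Address


def can_preempt(protocol, my_priority, my_ip, active_priority, active_ip):
    """Return True if a preempt-enabled router would take over the active/master role."""
    if my_priority > active_priority:
        return True  # a higher priority preempts in all three protocols
    if protocol == "vrrp" and my_priority == active_priority:
        # Only VRRP lets a higher IP address break a priority tie by preempting.
        return IPv4Address(my_ip) > IPv4Address(active_ip)
    return False


for proto in ("hsrp", "vrrp", "glbp"):
    # R3 (hypothetical 10.1.1.3) challenging R2 (hypothetical 10.1.1.2), equal priority
    print(proto, can_preempt(proto, 100, "10.1.1.3", 100, "10.1.1.2"))
```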

 

Next we will look at the hello and hold timers for each of the protocols.  First we will increase the HSRP and GLBP priorities on R3 so that it is the master/active router for each protocol:

R3:
interface FastEthernet1/0
 standby 1 priority 101
 glbp 1 priority 101

hsrp-vrrp-glbp-3

We will change the hello and hold timers for HSRP and GLBP to different values on each of the three routers:

R1:
interface FastEthernet1/0
 standby 1 timers 6 18
 glbp 1 timers 6 18

R2:
interface FastEthernet1/0
 standby 1 timers 5 15
 glbp 1 timers 5 15

R3:
interface FastEthernet1/0
 standby 1 timers 4 12
 glbp 1 timers 4 12

This does not cause a problem for HSRP or GLBP since the hello and hold timers are communicated by the active router in hello messages, and timers learned from the active router override manually configured timers.  Since R3 is active for both protocols, it is using a hello timer of 4 seconds and hold timer of 12 seconds:

hsrp-vrrp-glbp-4

hsrp-vrrp-glbp-5

R1 and R2 show that they have learned and are using these timer values, and note in parentheses what their actual configured values are:

hsrp-vrrp-glbp-6

hsrp-vrrp-glbp-7

If R3 fails or is preempted by a different router, the new active router will use its manually configured timers and the other routers will relearn the timers from that router.  In this case, if R3 fails, R2 will become active for both HSRP and GLBP since it was already standby because it had a higher IP address than R1.  R1 will then learn the new timers from R2:

R3:
interface FastEthernet1/0
 shutdown

hsrp-vrrp-glbp-8

hsrp-vrrp-glbp-9

Next we will change the hello and hold timers for HSRP and GLBP to use sub-second values.  We will re-enable R3’s interface so that it becomes active for all groups and make the changes on R3:

R3:
interface FastEthernet1/0
 no shutdown
 glbp 1 timers msec 200 msec 800
 standby 1 timers msec 200 msec 800


R1 and R2 learn the new GLBP timers without any problems:

hsrp-vrrp-glbp-10

hsrp-vrrp-glbp-111

HSRP, however, behaves differently:

hsrp-vrrp-glbp-121

hsrp-vrrp-glbp-131

hsrp-vrrp-glbp-14

R1 and R2 still see R3 as the active router, but neither has learned R3’s hello or hold timers – instead R1 and R2 are using their manually configured timers.  Looking at the HSRP packets in Wireshark reveals what the problem is:

hsrp-vrrp-glbp-wireshark-1 

hsrp-vrrp-glbp-wireshark-2

The first picture shows an HSRP hello sent by R2 and the second an HSRP hello sent by R3.  The hellotime and holdtime fields are each 1 byte long and contain the manually configured timers on that router in seconds.  We can see that R3 has set both of them to 0 since there is no way of encoding a sub-second value, so R1 and R2 continue to use their manually configured timers.  The hold time for the active router never falls below 17.8 seconds on R1 and 14.8 seconds on R2 since R3 is sending hellos every 200 ms.  If R3 fails, failover will not occur for 15 seconds because R2 is not able to take advantage of the sub-second hold time.  Since R3 is using a hold timer of 800 ms and R2 (the standby router) is using its manually configured timer to send hellos every 5 seconds, R3 will continually cycle through seeing R2 as the standby router for 800 ms and thinking that there is no standby router for the next 4200 ms:

hsrp-vrrp-glbp-15
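
The numbers above follow from simple arithmetic, sketched below in Python. The function names are made up for illustration, and treating the 1 byte field as "whole seconds, truncated" is a modelling assumption; the capture only shows that sub-second timers are sent as 0:

```python
# All times are in milliseconds to keep the arithmetic exact.

def hsrp_v1_timer_field(timer_ms: int) -> int:
    """Whole seconds carried in the 1 byte HSRPv1 field (assumed truncation,
    so any sub-second timer is sent as 0, as seen in the capture)."""
    return timer_ms // 1000


def effective_timers(configured, advertised_fields):
    """Timers a listening router uses: learned values when the advertised
    fields are usable, otherwise its own configured (hello, hold) in ms."""
    hello_s, hold_s = advertised_fields
    if hello_s == 0 or hold_s == 0:
        return configured
    return (hello_s * 1000, hold_s * 1000)


def lowest_hold_remaining(hold_ms, sender_hello_ms):
    """Lowest value the hold countdown reaches while hellos keep arriving."""
    return hold_ms - sender_hello_ms


def standby_visibility(active_hold_ms, standby_hello_ms):
    """(ms the active router sees a standby, ms it sees none) per standby hello."""
    seen = min(active_hold_ms, standby_hello_ms)
    return seen, standby_hello_ms - seen


r3_fields = (hsrp_v1_timer_field(200), hsrp_v1_timer_field(800))
r1 = effective_timers((6000, 18000), r3_fields)  # R1 configured 6 / 18
r2 = effective_timers((5000, 15000), r3_fields)  # R2 configured 5 / 15

print(r3_fields)
print(lowest_hold_remaining(r1[1], 200), lowest_hold_remaining(r2[1], 200))
print(standby_visibility(800, r2[0]))
```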

HSRP version 2 can be used to overcome this problem.  We will configure HSRP version 2 on each router:

R1:
interface FastEthernet1/0
 standby version 2

R2:
interface FastEthernet1/0
 standby version 2

R3:
interface FastEthernet1/0
 standby version 2

HSRP version 2 sends the hello and hold timers in milliseconds and uses a 4 byte field for each of them:

hsrp-vrrp-glbp-wireshark-3

We can see that R1 and R2 now correctly learn R3’s timers and prefer them over their manually configured ones:

hsrp-vrrp-glbp-16

hsrp-vrrp-glbp-17

Unlike HSRP and GLBP, VRRP does not automatically learn timers from the master router.  Also unlike HSRP and GLBP, VRRP requires that the hello timer of all routers in the group match.  Let’s try configuring a different hello timer on each router:

R1:
interface FastEthernet1/0
 vrrp 1 timers advertise 2

R2:
interface FastEthernet1/0
 vrrp 1 timers advertise 3

R3:
interface FastEthernet1/0
 vrrp 1 timers advertise 4

hsrp-vrrp-glbp-18

hsrp-vrrp-glbp-19

hsrp-vrrp-glbp-20

After applying the configuration, all 3 routers think that they are the master router for the group.  VRRP can be configured to learn timers from the master router so that it behaves similarly to HSRP and GLBP:

R1:
interface FastEthernet1/0
 vrrp 1 timers learn

R2:
interface FastEthernet1/0
 vrrp 1 timers learn

R3:
interface FastEthernet1/0
 vrrp 1 timers learn

If configured with both vrrp timers advertise and vrrp timers learn, timers learned from the master router will override manually configured timers just like with HSRP and GLBP.  R1 and R2 once again see R3 as the master and transition into the backup state.  R1 and R2 also continue to display their manually configured hello timer as well as the hello timer that they have learned from the master:

hsrp-vrrp-glbp-21

hsrp-vrrp-glbp-22

hsrp-vrrp-glbp-23

As we saw with HSRP and GLBP, if R3 fails, R2 will become the master router and begin advertising its manually configured hello timer of 3 seconds.  R1 will learn the new timer from R2 and accept R2 as the master router.  VRRP has a similar problem to HSRP version 1 when attempting to get other routers in the group to learn sub-second timers.  VRRP even gives a warning if timer learning is enabled on the router that we try to configure sub-second timers on:

R3:
interface FastEthernet1/0
 vrrp 1 timers advertise msec 200

% cannot configure millisecond timers when timer learning enabled

Since R3 is the master router, it won’t be learning timers anyway so the warning is probably there because it assumes our other routers are using timer learning as well.  Let’s see what happens when we disable timer learning on R3 and configure it to advertise millisecond timers to the other routers:

R3:
interface FastEthernet1/0
 no vrrp 1 timers learn
 vrrp 1 timers advertise msec 200

hsrp-vrrp-glbp-24

hsrp-vrrp-glbp-25

hsrp-vrrp-glbp-26

R1 and R2 both still see R3 as the master, but they think that R3’s advertisement interval is 1 second.  Look at what R3’s VRRP packet looks like in Wireshark:

hsrp-vrrp-glbp-wireshark-4

VRRP also uses a 1 byte field for the hello timer in seconds.  In VRRP, a sub-second hello timer results in a hello timer of 1 second being sent.  How the other routers react depends on their configuration.  If they are configured for timer learning, they will learn the received value of 1 second.  If they are not configured for timer learning and their hello timer has been left as the default of 1 second, they will think that the master is using the same hello timer value and accept it.  If they are not configured for timer learning and their hello timer has been manually configured to a nondefault value, they will not accept the advertisement from the master configured with a sub-second hello timer, and one or more of them will transition to the master state.

One other difference between VRRP and the other 2 protocols with regard to timers is that the master down interval (hold timer) is not configurable.  Instead, VRRP uses 3 times the hello timer plus the skew time as the master down interval.  The skew time is calculated as ((256 - VRRP priority) / 256) seconds, so higher priority routers have a shorter skew time and a shorter master down interval.  First, we will change R3’s hello timer to 1 second so that all 3 routers are using the same hello timer again:

R3:
interface FastEthernet1/0
 vrrp 1 timers advertise 1

Using a hello time of 1 second and default priority of 100, each router calculates the master down interval as:

(3 * 1) + ((256 - 100) / 256) = 3.609 seconds

hsrp-vrrp-glbp-27
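
The same calculation as a small Python sketch, using the formula given above (the function names are just for illustration):

```python
def skew_time(priority: int) -> float:
    """Skew time in seconds: (256 - priority) / 256."""
    return (256 - priority) / 256


def master_down_interval(hello_seconds: float, priority: int) -> float:
    """Master down interval in seconds: 3 * hello + skew time."""
    return 3 * hello_seconds + skew_time(priority)


# Hello timer of 1 second and the default priority of 100
print(round(master_down_interval(1, 100), 3))  # 3.609
```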

If R3 fails, R1 and R2’s master down intervals both expire at the same time and both of them try to seize the role of master router simultaneously:

R3:
interface FastEthernet1/0
 shutdown

hsrp-vrrp-glbp-wireshark-51

Both routers briefly transition to the master state.  Once R2’s advertisement reaches R1 a few milliseconds later, R1 sees that it is superior, accepts R2 as the master, and goes back to the role of backup:

hsrp-vrrp-glbp-28

If R1 is configured with a lower priority (or R2 and R3 with a higher priority), its master down interval will be larger than R2’s:

R1:
interface FastEthernet1/0
 vrrp 1 priority 10

hsrp-vrrp-glbp-291

R2 now has about 350 ms after its master down interval expires for its advertisement to reach R1 before R1 tries to claim the role of master.
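
The 350 ms figure comes from the difference in skew times.  When both routers use the same hello timer, the 3 * hello term cancels out and the head start depends only on the priority difference, as this short sketch shows (the function name is made up for illustration):

```python
def head_start_ms(higher_priority: int, lower_priority: int) -> float:
    """Time in ms between the two master down intervals expiring,
    assuming both routers use the same hello timer.

    (256 - low) / 256 - (256 - high) / 256 = (high - low) / 256 seconds
    """
    return (higher_priority - lower_priority) / 256 * 1000


# R2 at the default priority of 100, R1 lowered to 10
print(round(head_start_ms(100, 10)))  # 352
```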

 

Next we will look at how initialization delay and preemption delay can be used.  HSRP is the only one of the 3 protocols that allows an initialization delay to be configured.  We will use the following HSRP configuration on each router:

R1:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 standby 1 timers 1 3

R2:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 standby 1 timers 1 3

R3:
interface FastEthernet1/0
 shutdown
 standby 1 ip 10.1.1.101
 standby 1 timers 1 3
 standby 1 priority 101
 standby 1 preempt
 standby delay minimum 30

R3 has a higher priority and is configured to preempt, but is shut down to begin with.  Now we will enable R3’s interface to simulate it coming back online:

R3:
interface FastEthernet1/0
 no shutdown

The configured initialization delay prevents HSRP from progressing beyond the Init state until it expires.  On R3, we can see that the state is Init, along with a timer counting down the number of seconds left until HSRP can initialize:

hsrp-vrrp-glbp-30

Approximately 30 seconds after the interface comes back up, HSRP transitions to the Listen and then Speak states, and then sends a coup to R2 and transitions to Active:

hsrp-vrrp-glbp-31

 

Preemption delay is supported by all three protocols.  Unlike the HSRP initialization delay, which kept HSRP in the Init state, the preemption delay only prevents the router from transitioning to active/master.  We will take R3 offline and configure a preemption delay of 30 seconds for each protocol on R3.  We will also configure an HSRP initialization delay of 30 seconds to see how it operates with both delays configured.  The configuration of each router is:

R1:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 standby 1 timers 1 3
 vrrp 1 ip 10.1.1.102
 glbp 1 ip 10.1.1.103
 glbp 1 timers 1 3

R2:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 standby 1 timers 1 3
 vrrp 1 ip 10.1.1.102
 glbp 1 ip 10.1.1.103
 glbp 1 timers 1 3

R3:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 standby 1 timers 1 3
 standby 1 priority 101
 standby 1 preempt delay minimum 30
 standby delay minimum 30
 vrrp 1 ip 10.1.1.102
 vrrp 1 preempt delay minimum 30
 glbp 1 ip 10.1.1.103
 glbp 1 timers 1 3
 glbp 1 priority 101
 glbp 1 preempt delay minimum 30
 shutdown

Approximately 7 seconds after enabling the interface, we can see that VRRP is in the backup state and has 23 seconds remaining until it can preempt the current master:

hsrp-vrrp-glbp-321

Approximately 9 seconds after enabling the interface, HSRP remains in the Init state.  The HSRP initialization delay takes effect before the preemption delay, so HSRP will remain in this state for 30 seconds.  We can see that the initialization delay timer has 21 seconds remaining:

hsrp-vrrp-glbp-34

Approximately 10 seconds after enabling the interface, we can see that GLBP has taken over the standby role.  There are 20 seconds remaining until it can preempt the current AVG:

hsrp-vrrp-glbp-331

After 30 seconds, R3 becomes the VRRP master router and GLBP AVG.  The HSRP initialization delay has expired and preempt delay begins.  Approximately 37 seconds after enabling the interface, we can see that HSRP is now in the standby state and has 23 seconds left on the preemption delay before it can become active:

hsrp-vrrp-glbp-35

After a total of 60 seconds, R3 becomes the active HSRP router.  The messages logged to the console also show a timeline of how the events take place:

 

*Mar 1 05:55:40.574: %VRRP-6-STATECHANGE: Fa1/0 Grp 1 state Init -> Backup
*Mar 1 05:55:42.562: %LINK-3-UPDOWN: Interface FastEthernet1/0, changed state to up
*Mar 1 05:55:43.562: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet1/0, changed state to up
*Mar 1 05:56:11.270: %VRRP-6-STATECHANGE: Fa1/0 Grp 1 state Backup -> Master
*Mar 1 05:56:11.614: %GLBP-6-STATECHANGE: FastEthernet1/0 Grp 1 state Standby -> Active
*Mar 1 05:56:11.618: %GLBP-6-FWDSTATECHANGE: FastEthernet1/0 Grp 1 Fwd 1 state Listen -> Active
*Mar 1 05:56:14.094: %HSRP-5-STATECHANGE: FastEthernet1/0 Grp 1 state Speak -> Standby
*Mar 1 05:56:42.078: %HSRP-5-STATECHANGE: FastEthernet1/0 Grp 1 state Standby -> Active
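
To make the timeline easier to read, the log timestamps can be converted into seconds elapsed since the interface came up.  The sketch below simply does arithmetic on the messages shown above; the year 2000 is an arbitrary placeholder added only so that the timestamps parse cleanly:

```python
from datetime import datetime

LOG = """\
*Mar 1 05:55:40.574: %VRRP-6-STATECHANGE: Fa1/0 Grp 1 state Init -> Backup
*Mar 1 05:55:42.562: %LINK-3-UPDOWN: Interface FastEthernet1/0, changed state to up
*Mar 1 05:55:43.562: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet1/0, changed state to up
*Mar 1 05:56:11.270: %VRRP-6-STATECHANGE: Fa1/0 Grp 1 state Backup -> Master
*Mar 1 05:56:11.614: %GLBP-6-STATECHANGE: FastEthernet1/0 Grp 1 state Standby -> Active
*Mar 1 05:56:11.618: %GLBP-6-FWDSTATECHANGE: FastEthernet1/0 Grp 1 Fwd 1 state Listen -> Active
*Mar 1 05:56:14.094: %HSRP-5-STATECHANGE: FastEthernet1/0 Grp 1 state Speak -> Standby
*Mar 1 05:56:42.078: %HSRP-5-STATECHANGE: FastEthernet1/0 Grp 1 state Standby -> Active
"""


def parse(line):
    """Split a log line into (timestamp, message)."""
    stamp, message = line.lstrip("*").split(": ", 1)
    # The log has no year; prepend a placeholder so strptime is unambiguous.
    when = datetime.strptime("2000 " + stamp, "%Y %b %d %H:%M:%S.%f")
    return when, message


events = [parse(line) for line in LOG.splitlines()]
link_up = next(t for t, m in events if m.startswith("%LINK-3-UPDOWN"))
elapsed = {m: (t - link_up).total_seconds() for t, m in events}

for message, seconds in elapsed.items():
    print(f"{seconds:+8.3f} s  {message}")
```

VRRP and GLBP take over roughly 29 seconds after the link comes up (the 30 second preemption delay), while HSRP becomes active after roughly 60 seconds (30 seconds of initialization delay followed by 30 seconds of preemption delay).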


GLBP also allows preemption and preemption delay to be configured for the Active Virtual Forwarders (AVFs).  By default, AVF preemption is enabled with a delay of 30 seconds.  Looking back at the logging messages, we can see that forwarder 1 changed state to active on R3 at almost the exact same time as the AVG changed state to active, since we also configured the AVG with a preempt delay of 30 seconds.  Let’s see what happens if we disable AVF preemption on R3 and then shut down and re-enable the interface:

R3:
interface FastEthernet1/0
 no glbp 1 forwarder preempt
 shutdown

A few seconds later…

no shutdown

R3 becomes the AVG again after the 30 second AVG preemption delay expires; however, it does not become active for any of the forwarders.  Instead R2, which took over forwarder #1 when R3 went offline, remains active for both forwarder #1 and #2:

hsrp-vrrp-glbp-36

hsrp-vrrp-glbp-37
