Cisconinja’s Blog

Archive for the ‘HSRP VRRP and GLBP’ Category

HSRP/VRRP/GLBP Misconfigurations

Posted by Andy on February 24, 2009

In this post, we will look at what happens when HSRP, VRRP, and GLBP are misconfigured (or attacked) in different ways.  The three different misconfigurations we will look at for each protocol are mismatched virtual IP address, mismatched group numbers, and mismatched authentication.  The topology is shown below:

hsrp-misconfiguration-topology1

SW3 is a router with NM-16ESW acting as a layer-2 switch.  Host4 and Host5 are both routers acting as hosts on the network.  Each device will be configured with IP 10.1.1.X and MAC address 0000.0000.000X where X is the device number for the sake of clarity:

R1:
interface FastEthernet0/0
 mac-address 0000.0000.0001
 ip address 10.1.1.1 255.255.255.0

R2:
interface FastEthernet0/0
 mac-address 0000.0000.0002
 ip address 10.1.1.2 255.255.255.0

R4:
interface FastEthernet0/0
 mac-address 0000.0000.0004
 ip address 10.1.1.4 255.255.255.0
!
no ip routing
ip default-gateway 10.1.1.101

R5:
interface FastEthernet0/0
 mac-address 0000.0000.0005
 ip address 10.1.1.5 255.255.255.0
!
no ip routing
ip default-gateway 10.1.1.101

The MAC address table after generating some traffic from each device and prior to enabling any first hop redundancy protocols is shown below:

1-mac-table

Now we will look at each of the different scenarios, starting with HSRP.

 

 

HSRP IP Mismatch

 

      Virtual IP    Group #   Virtual MAC      Authentication
R1    10.1.1.101    1         0000.0c07.ac01   -
R2    10.1.1.102    1         0000.0c07.ac01   -

 

R1:
interface FastEthernet0/0
 standby 1 ip 10.1.1.101

R2:
interface FastEthernet0/0
 standby 1 ip 10.1.1.102

If both routers are configured at approximately the same time, R2 becomes active because of its higher IP address and R1 becomes standby:

1-hsrp-brief-r1

1-hsrp-brief-r2

R1 generates the following log message, but still continues to accept R2 as the active router:

R1:
*Mar 1 03:31:16.919: %HSRP-4-DIFFVIP1: FastEthernet0/0 Grp 1 active routers virtual IP address 10.1.1.102 is different to the locally configured address 10.1.1.101

SW3 learns the HSRP virtual MAC address on F1/2 since only R2 sources traffic from that address:

1-mac-table2

10.1.1.102 is reachable from any host on the network, but 10.1.1.101 is not, because R1 does not respond to it while in standby.

1-hsrp-ping-r2

1-hsrp-ping-r11

If 10.1.1.101 had been the correct gateway address for hosts to use, it would no longer be reachable.
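The election result above follows from HSRP's tie-break: with both routers at the default priority of 100, the router with the higher interface IP address wins.  A minimal Python sketch of that comparison, assuming both routers start at about the same time and preemption is not involved; the function and field names are made up for illustration and are not anything IOS exposes:

```python
from ipaddress import IPv4Address

def hsrp_active(routers):
    """Pick the active router: highest priority first, then highest interface IP."""
    return max(routers, key=lambda r: (r["priority"], IPv4Address(r["ip"])))

# R1 and R2 from this scenario, both left at the default HSRP priority of 100.
routers = [
    {"name": "R1", "priority": 100, "ip": "10.1.1.1"},
    {"name": "R2", "priority": 100, "ip": "10.1.1.2"},
]
print(hsrp_active(routers)["name"])  # R2
```

Note that the virtual IP plays no part in the comparison, which is consistent with R1 accepting R2 as active despite the mismatch.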

 

 

HSRP Group Number Mismatch:

 

      Virtual IP    Group #   Virtual MAC      Authentication
R1    10.1.1.101    1         0000.0c07.ac01   -
R2    10.1.1.101    2         0000.0c07.ac02   -

 

R1:
interface FastEthernet0/0
 standby 1 ip 10.1.1.101

R2:
interface FastEthernet0/0
 standby 2 ip 10.1.1.101

This time R1 and R2 ignore each other’s hellos and both routers become active:

2-hsrp-brief-r1

2-hsrp-brief-r2

When transitioning to active, HSRP broadcasts gratuitous ARP messages:

R2#debug arp
*Mar 1 03:53:30.955: %HSRP-5-STATECHANGE: FastEthernet0/0 Grp 2 state Standby -> Active
*Mar 1 03:53:30.955: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 03:53:30.959: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0

After hearing each other’s gratuitous ARPs, each router generates a log message reporting a duplicate IP address and broadcasts another gratuitous ARP in an attempt to fix the ARP cache of other devices on the network which may have been changed by the other router’s gratuitous ARP:

R1#debug arp
*Mar 1 03:53:31.035: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac02, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:53:31.039: %IP-4-DUPADDR: Duplicate address 10.1.1.101 on FastEthernet0/0, sourced by 0000.0c07.ac02
*Mar 1 03:53:31.039: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac01,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 03:53:31.043: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac01,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0

This results in R1 and R2 continually broadcasting gratuitous ARPs in response to each other’s gratuitous ARPs.  Based on the ARP debug messages shown below, IOS appears to limit gratuitous ARPs to 1 per second.  R2 receives a gratuitous ARP from R1 at 47.631.  Immediately after it, messages show that a gratuitous ARP was throttled and 10.1.1.101 was added to arp_defense_Q.  At 48.207, another debug message shows that 10.1.1.101 was removed from arp_defense_Q, and the gratuitous ARP is broadcast using R2’s virtual MAC as the source.  Another gratuitous ARP is received from R1 shortly after that, and R2 again throttles its response and waits to send its own gratuitous ARP until 49.207, exactly 1 second after it sent the previous one.  This pattern continues on each router:

R2:
*Mar 1 03:53:47.631: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac01, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:53:47.631: IP ARP: Gratuitous ARP throttled.
*Mar 1 03:53:47.635: IP ARP: 10.1.1.101 added to arp_defense_Q
*Mar 1 03:53:48.207: IP ARP: 10.1.1.101 removed from arp_defense_Q
*Mar 1 03:53:48.207: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 03:53:48.211: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0
*Mar 1 03:53:48.371: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac01, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:53:48.375: IP ARP: Gratuitous ARP throttled.
*Mar 1 03:53:48.375: IP ARP: 10.1.1.101 added to arp_defense_Q
*Mar 1 03:53:49.207: IP ARP: 10.1.1.101 removed from arp_defense_Q
*Mar 1 03:53:49.207: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 03:53:49.211: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0
*Mar 1 03:53:49.739: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac01, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:53:49.739: IP ARP: Gratuitous ARP throttled.
*Mar 1 03:53:49.739: IP ARP: 10.1.1.101 added to arp_defense_Q
*Mar 1 03:53:50.207: IP ARP: 10.1.1.101 removed from arp_defense_Q
*Mar 1 03:53:50.207: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 03:53:50.211: IP ARP: sent rep src 10.1.1.101 0000.0c07.ac02,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0
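The 1-second spacing can be checked by differencing the times of R2's broadcast gratuitous ARPs.  A small Python sketch, with the three timestamps copied by hand from the debug output above (the helper name is made up for this example):

```python
from datetime import datetime

# Times at which R2 broadcast a gratuitous ARP to ffff.ffff.ffff, from the debug above.
sent = ["03:53:48.207", "03:53:49.207", "03:53:50.207"]

def to_seconds(ts: str) -> float:
    """Convert an HH:MM:SS.mmm debug timestamp to seconds since midnight."""
    t = datetime.strptime(ts, "%H:%M:%S.%f")
    return t.hour * 3600 + t.minute * 60 + t.second + t.microsecond / 1e6

times = [to_seconds(ts) for ts in sent]
intervals = [round(b - a, 3) for a, b in zip(times, times[1:])]
print(intervals)  # [1.0, 1.0]
```

This only confirms the spacing seen in this capture; it says nothing about how IOS implements the throttle internally.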

As a result, hosts on the network continually update their ARP cache twice per second:

Host4#debug arp
*Mar 1 03:58:35.426: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac01, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:58:36.338: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac02, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:58:36.402: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac01, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:58:37.314: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac02, dst 10.1.1.101 FastEthernet0/0
*Mar 1 03:58:37.462: IP ARP: rcvd rep src 10.1.1.101 0000.0c07.ac01, dst 10.1.1.101 FastEthernet0/0

2-hsrp-host4-sharp1

SW3 learns each virtual MAC on the interface where it is received, and the table stays the same over time since R1 and R2 use separate virtual MACs:

2-mac-table

Any traffic that hosts send to a remote network with the virtual IP configured as their gateway will end up being split between the 2 routers (whichever router’s virtual MAC is in the host’s ARP cache at that moment will receive the frame).
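Which router gets a given frame therefore depends only on the most recent gratuitous ARP that the host heard.  As a rough illustration, the Python sketch below replays the five updates from Host4's debug output (expressed as seconds past 03:58) and looks up the cached MAC at an arbitrary moment; the helper name is made up for this example:

```python
import bisect

# (seconds past 03:58, source MAC) for each gratuitous ARP Host4 received above.
updates = [
    (35.426, "0000.0c07.ac01"),  # from R1
    (36.338, "0000.0c07.ac02"),  # from R2
    (36.402, "0000.0c07.ac01"),
    (37.314, "0000.0c07.ac02"),
    (37.462, "0000.0c07.ac01"),
]

def cached_mac(t):
    """Return the MAC cached for 10.1.1.101 at time t (last update wins)."""
    times = [u[0] for u in updates]
    i = bisect.bisect_right(times, t) - 1
    return updates[i][1] if i >= 0 else None

print(cached_mac(36.35))  # 0000.0c07.ac02 -> frame goes to R2
print(cached_mac(37.0))   # 0000.0c07.ac01 -> frame goes to R1
```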

 

 

HSRP Authentication Mismatch:

 

      Virtual IP    Group #   Virtual MAC      Authentication
R1    10.1.1.101    1         0000.0c07.ac01   cisco1
R2    10.1.1.101    1         0000.0c07.ac01   cisco2

 

R1:
interface FastEthernet0/0
 standby 1 ip 10.1.1.101
 standby 1 authentication cisco1

R2:
interface FastEthernet0/0
 standby 1 ip 10.1.1.101
 standby 1 authentication cisco2

Again R1 and R2 both become active and do not recognize a standby router for the group:

3-hsrp-brief-r1

3-hsrp-brief-r2

Also the following log message shows up on each router:

R1:
*Mar 1 04:07:03.226: %HSRP-4-BADAUTH: Bad authentication from 10.1.1.2, group 1, remote state Active

Since both routers use the same virtual IP and virtual MAC, they do not generate gratuitous ARPs in response to receiving a gratuitous ARP from the other router.  However, since SW3 receives HSRP hellos sourced from the virtual MAC by both routers, the MAC address flaps back and forth between interfaces as often as R1 and R2 send HSRP hellos (or more if they send other traffic):

3-mac-table

Again, any traffic that hosts send to a remote network with the virtual IP configured as their gateway ends up being split between the 2 routers, depending on which interface SW3 most recently learned the virtual MAC address on at the moment a frame is received from a host.
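The flapping can be modeled with a toy MAC learning table: the switch records the ingress port for every source MAC it sees and forwards on whatever port it last recorded for the destination.  This Python sketch is only an approximation of a real switch; Host4's port (Fa1/4) is an assumption, while R1 on F1/1 and R2 on F1/2 match the MAC tables in this post.  0100.5e00.0002 is the multicast MAC that HSRP version 1 hellos are sent to:

```python
def replay(frames):
    """frames: (ingress_port, src_mac, dst_mac) in arrival order.
    Returns the egress decision for each frame ('flood' if dst was never learned)."""
    table = {}
    egress = []
    for port, src, dst in frames:
        table[src] = port                       # learn/overwrite the source MAC
        egress.append(table.get(dst, "flood"))  # forward on last learned port
    return egress

VMAC, HOST4, HELLO = "0000.0c07.ac01", "0000.0000.0004", "0100.5e00.0002"
frames = [
    ("Fa1/1", VMAC, HELLO),   # hello from R1
    ("Fa1/4", HOST4, VMAC),   # Host4 sends to its gateway
    ("Fa1/2", VMAC, HELLO),   # hello from R2 moves the entry
    ("Fa1/4", HOST4, VMAC),   # same host, same destination MAC
]
print(replay(frames))  # ['flood', 'Fa1/1', 'flood', 'Fa1/2']
```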

 

 

VRRP IP Mismatch

 

      Virtual IP    Group #   Virtual MAC      Authentication
R1    10.1.1.101    1         0000.5e00.0101   -
R2    10.1.1.102    1         0000.5e00.0101   -

 

R1:
interface FastEthernet0/0
 vrrp 1 ip 10.1.1.101

R2:
interface FastEthernet0/0
 vrrp 1 ip 10.1.1.102

Unlike when HSRP was configured with different virtual IPs and only 1 became active, both VRRP routers think that they are the master:

4-vrrp-brief-r1

4-vrrp-brief-r2

Since both use the same group number and therefore the same virtual MAC, the MAC flaps back and forth between interfaces on SW3:

4-mac-table

If a host attempts to send traffic to a remote network, all traffic makes it to the destination and back (as long as both routers have a route to it) even though the host is unknowingly sending half of the traffic to a router that is using a different IP than its configured gateway address:

4-vrrp-ping-remote

However if the host attempts to send traffic to the gateway address itself, some of the traffic will be dropped:

4-vrrp-ping-gateway

Some of the traffic will be sent to R2 because of the virtual MAC address flapping between interfaces on SW3.  R2 knows that the destination address (10.1.1.101) is on the same subnet where it was received, so R2 sends an ICMP redirect to Host4.  It also thinks that the MAC address being used for 10.1.1.101 by Host4 is incorrect since it is R2’s virtual MAC address, so it sends an ARP request for the destination address:


R2#debug ip icmp
R2#debug arp

*Mar 1 05:58:47.966: ICMP: redirect sent to 10.1.1.4 for dest 10.1.1.101, use gw 10.1.1.101
*Mar 1 05:58:47.966: IP ARP: sent req src 10.1.1.2 0000.0000.0002,
dst 10.1.1.101 0000.0000.0000 FastEthernet0/0

The ICMP redirect tells Host4 to use gateway 10.1.1.101 to reach 10.1.1.101 – not very helpful to Host4, but R2 doesn’t realize that Host4 already thought it was sending directly to 10.1.1.101.  After receiving the ICMP redirect from R2, Host4 also sends out an ARP request for 10.1.1.101.  R1 replies to both ARP requests:


R1#debug arp
*Mar 1 05:58:48.002: IP ARP: rcvd req src 10.1.1.2 0000.0000.0002, dst 10.1.1.101 FastEthernet0/0
*Mar 1 05:58:48.002: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
dst 10.1.1.2 0000.0000.0002 FastEthernet0/0
*Mar 1 05:58:49.934: IP ARP: rcvd req src 10.1.1.4 0000.0000.0004, dst 10.1.1.101 FastEthernet0/0
*Mar 1 05:58:49.938: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
dst 10.1.1.4 0000.0000.0004 FastEthernet0/0

Neither of the ARP replies is helpful.  R2 uses the same MAC for its own virtual MAC, so it does not add the entry to its ARP cache.  Host4 receives the same MAC address that it was already using for 10.1.1.101.  Host4 continues trying to use the same MAC to reach 10.1.1.101, and the cycle continues, with only some of the traffic making it to R1 because the rest is switched incorrectly at SW3.

 

 

VRRP Group Number Mismatch

 

      Virtual IP    Group #   Virtual MAC      Authentication
R1    10.1.1.101    1         0000.5e00.0101   -
R2    10.1.1.101    2         0000.5e00.0102   -

 

R1:
interface FastEthernet0/0
 vrrp 1 ip 10.1.1.101

R2:
interface FastEthernet0/0
 vrrp 2 ip 10.1.1.101

A group number mismatch in VRRP behaves identically to a group number mismatch in HSRP.  R1 and R2 ignore each other’s hellos and both routers become master:

 5-vrrp-brief-r1

5-vrrp-brief-r2

When transitioning to master, VRRP broadcasts gratuitous ARP messages:

R2#debug arp
*Mar 1 06:28:37.346: %VRRP-6-STATECHANGE: Fa0/0 Grp 2 state Backup -> Master
*Mar 1 06:28:37.346: IP ARP: sent rep src 10.1.1.101 0000.5e00.0102,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 06:28:37.350: IP ARP: sent rep src 10.1.1.101 0000.5e00.0102,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0

After hearing each other’s gratuitous ARPs, each router generates a log message reporting a duplicate IP address and broadcasts another gratuitous ARP in an attempt to fix the ARP cache of other devices on the network which may have been changed by the other router’s gratuitous ARP:

R1#debug arp
*Mar 1 06:28:37.366: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102, dst 10.1.1.101 FastEthernet0/0
*Mar 1 06:28:37.366: %IP-4-DUPADDR: Duplicate address 10.1.1.101 on FastEthernet0/0, sourced by 0000.5e00.0102
*Mar 1 06:28:37.370: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar 1 06:28:37.370: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0

This results in R1 and R2 continually broadcasting gratuitous ARPs in response to each other’s gratuitous ARPs.  Based on the ARP debug messages shown below, IOS appears to limit gratuitous ARPs to 1 per second.  R1 receives a gratuitous ARP from R2 at 39.290.  Immediately after it, a message shows that a gratuitous ARP was throttled and 10.1.1.101 was added to arp_defense_Q.  At 39.350, another debug message shows that 10.1.1.101 was removed from arp_defense_Q, and the gratuitous ARP is broadcast using R1’s virtual MAC as the source.  Another gratuitous ARP is received from R2 shortly after that, and R1 again throttles its response and waits to send its own gratuitous ARP until 40.350, exactly 1 second after it sent the previous one.  This pattern continues on each router:

 
R1:
*Mar  1 06:28:39.290: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102, dst 10.1.1.101 FastEthernet0/0
*Mar  1 06:28:39.290: IP ARP: Gratuitous ARP throttled.
*Mar  1 06:28:39.350: IP ARP: 10.1.1.101 removed from arp_defense_Q
*Mar  1 06:28:39.350: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
                 dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar  1 06:28:39.354: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
                 dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0
*Mar  1 06:28:39.450: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102,
dst 10.1.1.101 FastEthernet0/0
*Mar  1 06:28:39.454: IP ARP: Gratuitous ARP throttled.
*Mar  1 06:28:39.454: IP ARP: 10.1.1.101 added to arp_defense_Q
*Mar  1 06:28:40.278: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102, dst 10.1.1.101 FastEthernet0/0
*Mar  1 06:28:40.278: IP ARP: Gratuitous ARP throttled.
*Mar  1 06:28:40.350: IP ARP: 10.1.1.101 removed from arp_defense_Q
*Mar  1 06:28:40.350: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
                 dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar  1 06:28:40.354: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
                 dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0
*Mar  1 06:28:41.210: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102, dst 10.1.1.101 FastEthernet0/0
*Mar  1 06:28:41.210: IP ARP: Gratuitous ARP throttled.
*Mar  1 06:28:41.210: IP ARP: 10.1.1.101 added to arp_defense_Q
*Mar  1 06:28:41.350: IP ARP: 10.1.1.101 removed from arp_defense_Q
*Mar  1 06:28:41.350: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
                 dst 10.1.1.101 ffff.ffff.ffff FastEthernet0/0
*Mar  1 06:28:41.354: IP ARP: sent rep src 10.1.1.101 0000.5e00.0101,
                 dst 10.1.1.101 0100.0ccd.cdcd FastEthernet0/0

As a result, hosts on the network continually update their ARP cache twice per second:


Host4#debug arp
*Mar 1 07:01:47.518: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0101, dst 10.1.1.101 FastEthernet0/0
*Mar 1 07:01:48.262: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102, dst 10.1.1.101 FastEthernet0/0
*Mar 1 07:01:48.482: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0101, dst 10.1.1.101 FastEthernet0/0
*Mar 1 07:01:49.258: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0102, dst 10.1.1.101 FastEthernet0/0
*Mar 1 07:01:49.402: IP ARP: rcvd rep src 10.1.1.101 0000.5e00.0101, dst 10.1.1.101 FastEthernet0/0

5-hsrp-host4-sharp

SW3 learns each virtual MAC on the interface where it is received, and the table stays the same over time since R1 and R2 use separate virtual MACs:

 5-mac-table

Any traffic that hosts send to a remote network with the virtual IP configured as their gateway will end up being split between the 2 routers (whichever router’s virtual MAC is in the host’s ARP cache at that moment will receive the frame).

 

 

VRRP Authentication Mismatch:

 

      Virtual IP    Group #   Virtual MAC      Authentication
R1    10.1.1.101    1         0000.5e00.0101   cisco1
R2    10.1.1.101    1         0000.5e00.0101   cisco2

 

R1:
interface FastEthernet0/0
 vrrp 1 ip 10.1.1.101
 vrrp 1 authentication cisco1

R2:
interface FastEthernet0/0
 vrrp 1 ip 10.1.1.101
 vrrp 1 authentication cisco2

VRRP authentication mismatch also behaves like HSRP authentication mismatch.  R1 and R2 both think that they are the master for the group:

6-vrrp-brief-r14

6-vrrp-brief-r2

Debugs show that the authentication has failed:


R1#debug vrrp errors
*Mar 1 07:25:34.542: VRRP: Grp 1 Advertisement from 10.1.1.2 has FAILED TEXT authentication

Since both routers use the same virtual IP and virtual MAC, they do not generate gratuitous ARPs in response to receiving a gratuitous ARP from the other router.  However, since SW3 receives VRRP hellos sourced from the virtual MAC by both routers, the MAC address flaps back and forth between interfaces as often as R1 and R2 send VRRP hellos (or more if they send other traffic):

6-mac-table

Any traffic that hosts send to a remote network with the virtual IP configured as their gateway ends up being split between the 2 routers, depending on which interface SW3 most recently learned the virtual MAC address on at the moment a frame is received from a host.

 

 

GLBP IP Mismatch

 

      Virtual IP    Group #   Virtual MAC      Authentication
R1    10.1.1.101    1         0007.b400.0101   -
R2    10.1.1.102    1         0007.b400.0102   -

 

R1:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.101

R2:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.102

Like HSRP, GLBP accepts a router as active even though the virtual IP differs.  R1 was configured before R2, so it becomes active for the AVG and AVF #1.  R2 becomes standby for the AVG and active for AVF #2:

7-glbp-brief-r1

7-glbp-brief-r21

R2 generates the following log message, but still continues to accept R1 as the AVG:

R2:
*Mar 1 07:57:22.906: %GLBP-4-DIFFVIP1: FastEthernet0/0 Grp 1 active routers virtual IP address 10.1.1.101 is different to the locally configured address 10.1.1.102

SW3 learns the AVF #1 virtual MAC address on F1/1 and AVF #2 virtual MAC on F1/2 since R1 and R2 both source hellos from those MAC addresses:

7-mac-table1

Like HSRP, the configured virtual IP of the router that is in standby is not reachable because the standby router does not respond to it.  If this were supposed to be the correct gateway address, hosts on the subnet would no longer be able to reach it or remote destinations:

7-glbp-ping-r2

7-glbp-host5-ping-r2

Additionally, because of the way that GLBP load balances ARP replies, even if the router with the correct virtual IP becomes the AVG, hosts may still receive the virtual MAC of the router configured with the wrong virtual IP.  In this case, Host4 performs an ARP request first and receives the virtual MAC of R1.  Host5 performs an ARP request next and receives the virtual MAC of R2:

7-glbp-host4-sharp

7-glbp-host5-sharp

As we saw when VRRP was configured with an IP mismatch, this did not cause a problem for reaching remote networks because the host still uses the correct MAC address to reach one of the two routers.  Both Host4 and Host5 are able to reach an address on a different subnet:

7-glbp-host4-ping-remote1

7-glbp-host5-ping-remote

With mismatched IPs in VRRP, when hosts attempted to reach the gateway address itself, some traffic would make it and some would not due to the address flapping between interfaces, with all hosts being affected equally.  With mismatched IPs in GLBP, only some of the hosts are affected when trying to reach the gateway address.  Those that obtain the virtual MAC of a router that is configured with the same virtual IP as the AVG will always be able to reach the gateway, and those that obtain the virtual MAC of a router that is configured with a different virtual IP than the AVG will not be able to reach the gateway address at all:

7-glbp-host4-ping-gateway

7-glbp-host5-ping-gateway

 

 

GLBP Group Number Mismatch

 

      Virtual IP    Group #   Virtual MAC      Authentication
R1    10.1.1.101    1         0007.b400.0101   -
R2    10.1.1.101    2         0007.b400.0201   -

 

R1:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.101

R2:
interface FastEthernet0/0
 glbp 2 ip 10.1.1.101

Like HSRP and VRRP, when 2 GLBP routers are configured with mismatched group numbers both become active:

8-glbp-brief-r1

8-glbp-brief-r2

Each router also uses a different virtual MAC.  GLBP virtual MACs by default are in the form of 0007.b40X.XXYY, where XXX is the group number in hexadecimal and YY is the forwarder number in hexadecimal.  The group numbers for R1 and R2 were set to 1 and 2 respectively, and since each router thinks that it is the only router in the group, they have both assigned themselves AVF #1.  This results in a virtual MAC of 0007.b400.0101 for R1 and 0007.b400.0201 for R2.  Unlike HSRP and VRRP, GLBP does not send gratuitous ARPs when becoming active, so gratuitous ARPs are not sent back and forth constantly.  When a host ARPs for its gateway, both R1 and R2 reply with their virtual MAC.  The first reply creates an entry in the ARP cache, and the second overwrites it, so the host will use the last ARP reply to arrive as its gateway.  In this case, the reply from R1 comes last so Host4 uses R1’s virtual MAC:


Host4#debug arp
Host4#ping 10.1.1.101

*Mar 1 08:12:54.489: IP ARP: creating incomplete entry for IP address: 10.1.1.101 interface FastEthernet0/0
*Mar 1 08:12:54.489: IP ARP: sent req src 10.1.1.4 0000.0000.0004,
dst 10.1.1.101 0000.0000.0000 FastEthernet0/0
*Mar 1 08:12:54.573: IP ARP: rcvd rep src 10.1.1.101 0007.b400.0201, dst 10.1.1.4 FastEthernet0/0
*Mar 1 08:12:54.621: IP ARP: rcvd rep src 10.1.1.101 0007.b400.0101, dst 10.1.1.4 FastEthernet0/0

8-glbp-host4-sharp

Hosts will use either one gateway or the other, depending on which ARP reply they receive last.  As long as both routers have full connectivity to other networks, hosts should still be able to reach any address both local and remote.
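The default virtual MAC formats seen throughout this post (0000.0c07.acXX for HSRP version 1, 0000.5e00.01XX for VRRP, and 0007.b40X.XXYY for GLBP) can be reproduced with a few lines of Python.  This is just a sketch for checking values by hand; the function names are made up, and the range checks are assumptions based on the usual group number limits:

```python
def dotted(mac: int) -> str:
    """Format a 48-bit integer in Cisco dotted notation (xxxx.xxxx.xxxx)."""
    h = f"{mac:012x}"
    return ".".join(h[i:i + 4] for i in range(0, 12, 4))

def hsrp_v1_vmac(group: int) -> str:
    if not 0 <= group <= 255:
        raise ValueError("HSRP version 1 group must be 0-255")
    return dotted(0x00000C07AC00 | group)

def vrrp_vmac(group: int) -> str:
    if not 1 <= group <= 255:
        raise ValueError("VRRP group must be 1-255")
    return dotted(0x00005E000100 | group)

def glbp_vmac(group: int, forwarder: int) -> str:
    if not 0 <= group <= 1023 or not 0 <= forwarder <= 255:
        raise ValueError("group must be 0-1023 and forwarder 0-255")
    return dotted(0x0007B4000000 | (group << 8) | forwarder)

print(hsrp_v1_vmac(2))   # 0000.0c07.ac02
print(vrrp_vmac(1))      # 0000.5e00.0101
print(glbp_vmac(2, 1))   # 0007.b400.0201
```

With the group numbers used in this section, glbp_vmac(1, 1) and glbp_vmac(2, 1) give the two MACs that Host4 received in its ARP replies.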

 

 

GLBP Authentication Mismatch

 

      Virtual IP    Group #   Virtual MAC      Authentication
R1    10.1.1.101    1         0007.b400.0101   cisco1
R2    10.1.1.101    1         0007.b400.0101   cisco2

 

R1:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.101
 glbp 1 authentication text cisco1

R2:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.101
 glbp 1 authentication text cisco2

A GLBP authentication mismatch behaves just like HSRP and VRRP with an authentication mismatch.  R1 and R2 both think they are the AVG for group 1:

9-glbp-brief-r1

9-glbp-brief-r2

The following log message shows up on each router:

R1:
*Mar 1 08:29:17.833: %GLBP-4-BADAUTH: Bad authentication received from 10.1.1.2, group 1

SW3 receives GLBP hellos from the same virtual MAC by both routers, so the MAC address flaps back and forth between interfaces on SW3 as often as R1 and R2 send hellos (or more if they send other traffic):

9-mac-table

Any traffic that hosts send to a remote network with the virtual IP configured as their gateway ends up being split between the 2 routers, depending on which interface SW3 most recently learned the virtual MAC address on at the moment a frame is received from a host.

 

 

Summary:

A brief summary of the differences in each misconfiguration:

 

HSRP, IP Mismatch:
 Different vIP, same vMAC
 Only 1 becomes active
 Active vIP is reachable, non-active vIP is not reachable
 If correct vIP becomes active, other networks are reachable
 If incorrect vIP becomes active, hosts cannot reach other networks

HSRP, Group Mismatch:
 Same vIP, different vMAC
 Both become active
 Gratuitous ARPs constantly sent
 Hosts constantly update ARP cache
 Traffic from hosts split between both routers, based on current host ARP cache
 Active vIP and other networks are reachable

HSRP, Auth. Mismatch:
 Same vIP and vMAC
 Both become active
 vMAC flaps on SW3
 Traffic from hosts split between both routers, based on current SW3 MAC table
 Active vIP and other networks are reachable

VRRP, IP Mismatch:
 Different vIP, same vMAC
 Both become active
 vMAC flaps on SW3
 Both active vIPs are intermittently reachable, based on SW3 MAC table
 Other networks are reachable regardless of SW3 MAC table

VRRP, Group Mismatch:
 Same as HSRP

VRRP, Auth. Mismatch:
 Same as HSRP

GLBP, IP Mismatch:
 Different vIP and vMAC
 Only 1 becomes active
 Non-active vIP is not reachable
 If incorrect vIP becomes active, hosts cannot reach other networks
 Active vIP is reachable to some hosts but not others, based on vMAC received from GLBP load balancing
 If correct vIP is active, other networks are reachable regardless of vMAC that the host receives

GLBP, Group Mismatch:
 Same vIP, different vMAC
 Both become active
 Gratuitous ARPs not sent
 Hosts that ARP for gateway receive replies from both AVGs
 Traffic from hosts split between both routers, based on which ARP reply was received last
 Active vIP and other networks are reachable

GLBP, Auth. Mismatch:
 Same as HSRP and VRRP


GLBP Weights, Load Balancing, and Redirection

Posted by Andy on February 11, 2009

This post will take a look at how weighting, load balancing, the redirect timer, and the timeout timer work in GLBP.  The topology for these tests is shown below:

glbp-topology2

All routers will be configured to preempt and each router will be configured with GLBP priority 100 + X where X is the router number so that the highest numbered router that is online will become the AVG for the group.  Each router’s interface MAC address will be changed to 0000.0000.000X where X is the router number for the sake of clarity.  We will configure each of the routers in order (the order is important for how forwarders are assigned) and will wait to configure GLBP on R5 for now.  The configuration is:

R1:
interface FastEthernet0/0
 ip address 10.1.1.1 255.255.255.0
 mac-address 0000.0000.0001
 glbp 1 preempt
 glbp 1 priority 101
 glbp 1 ip 10.1.1.100

R2:
interface FastEthernet0/0
 ip address 10.1.1.2 255.255.255.0
 mac-address 0000.0000.0002
 glbp 1 preempt
 glbp 1 priority 102
 glbp 1 ip 10.1.1.100

R3:
interface FastEthernet0/0
 ip address 10.1.1.3 255.255.255.0
 mac-address 0000.0000.0003
 glbp 1 preempt
 glbp 1 priority 103
 glbp 1 ip 10.1.1.100

R4:
interface FastEthernet0/0
 ip address 10.1.1.4 255.255.255.0
 mac-address 0000.0000.0004
 glbp 1 preempt
 glbp 1 priority 104
 glbp 1 ip 10.1.1.100

R5:
interface FastEthernet0/0
 ip address 10.1.1.5 255.255.255.0
 mac-address 0000.0000.0005

R4 has become the AVG due to its higher priority and preemption being enabled.  We can also see that R1 is active for AVF #1, R2 for AVF #2, R3 for AVF #3, and R4 for AVF #4 since they were configured in that order:

glbp-1

Now we will bring R5 online:

R5:
interface FastEthernet0/0
 glbp 1 preempt
 glbp 1 priority 105
 glbp 1 ip 10.1.1.100

R5 becomes the AVG but remains in a Listen state for all 4 AVFs:

glbp-2

Since GLBP has a limit of 4 AVFs per group, no new ones were created for R5.  Let’s try increasing R5’s weighting value from the default of 100:

R5:
interface FastEthernet0/0
 glbp 1 weighting 200

We can see that the weighting on R5 is now 200 and that R5 knows each of the other AVFs are using the default value of 100.  We can also see that R5 is (by default) configured to preempt forwarders after a 30 second delay.  Despite these facts, R5 still remains in the Listen state for all 4 AVFs:

glbp-3

Let’s create a loopback interface on R1 and configure GLBP to track it and decrement the weight value when it goes down:

R1:
interface Loopback0
 ip address 1.1.1.1 255.255.255.0
!
track 1 interface Loopback0 line-protocol
!
interface FastEthernet0/0
 glbp 1 weighting 100 lower 90 upper 95
 glbp 1 weighting track 1 decrement 20

This configuration on R1 essentially says “Start with a weight value of 100.  If Loopback0 goes down, decrement the weight by 20.  If the weight falls below 90, this router is no longer allowed to be an AVF.  Once the weight has fallen below 90, do not allow the router to become the AVF again until the weight is at least 95.”  Now we will shut down Loopback0 on R1:

R1:
interface Loopback0
 shutdown

After R5’s 30 second AVF preemption timer expires, R5 takes over the role of active for AVF #1 and R1 transitions to listening:

glbp-5

glbp-4

Now let’s bring R1’s loopback interface back up:

R1:
interface Loopback0
 no shutdown

After 30 seconds, R1 preempts R5 and reclaims AVF #1 even though its weight (100) is lower than R5’s (200):

glbp-6

glbp-7

As we can see, the owner ID for AVF #1 is set to R1’s MAC address since it was the first GLBP router to come online.  As soon as R1’s weighting value is greater than or equal to the upper threshold, it is allowed to reclaim its AVF role as long as forwarder preemption is enabled on it.
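The lower/upper threshold behavior seen so far is a simple hysteresis.  The Python sketch below models only the eligibility decision as this post describes it for R1 (maximum 100, lower 90, upper 95); it does not model preemption delays or which router takes over, and the class and method names are made up for illustration:

```python
class WeightGate:
    """Tracks whether a router is currently allowed to act as an AVF."""

    def __init__(self, maximum=100, lower=90, upper=95):
        self.maximum = maximum
        self.lower = lower
        self.upper = upper
        self.eligible = True

    def update(self, decrements):
        """decrements: decrement values of the tracked objects that are down."""
        weight = self.maximum - sum(decrements)
        if self.eligible and weight < self.lower:
            self.eligible = False   # fell below the lower threshold
        elif not self.eligible and weight >= self.upper:
            self.eligible = True    # recovered to at least the upper threshold
        return weight, self.eligible

gate = WeightGate()
print(gate.update([20]))  # Loopback0 down: (80, False)
print(gate.update([8]))   # hypothetical partial recovery: (92, False)
print(gate.update([]))    # Loopback0 back up: (100, True)
```

The middle call uses a hypothetical decrement of 8 to show the gap between the two thresholds: a weight of 92 is above the lower threshold but still not enough to become eligible again.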

Let’s look at another scenario where the AVF that we are trying to preempt is not the owner of that AVF.  We will shut down R5’s F0/0 interface and then shut down R1’s loopback interface.  This causes R1 to fall below the lower weighting threshold again, and since R5 is not available to take over AVF #1, one of the other routers must take over AVF #1 in addition to its own AVF:

R5:
interface FastEthernet0/0
 shutdown

R1:
interface Loopback0
 shutdown

After the 30 second preemption delay expires on R2, R3, and R4, each of them becomes active for AVF #1 for a few milliseconds before hearing each other’s hellos and deciding on one of them to remain active.  (I’m not sure exactly how this is determined; after shutting down and re-enabling R1’s loopback interface 5 different times, R3 took over AVF #1 two times and R4 took over AVF #1 three times.  Cisco does not seem to have much information available on GLBP, so it is hard to know for sure, but it is unimportant for this test as long as 1 of the 3 remaining routers takes over AVF #1.  If we needed it to be deterministic, we could simply configure a lower forwarder preemption delay on 1 of the 3 routers.)  We can now see that R4 has taken over AVF #1 in addition to its own:

glbp-8

Now we will bring R5 back online:

R5:
interface FastEthernet0/0
 no shutdown

Even though R5 has a higher weight than R4 and R4 is not the owner of AVF #1, R5 still does not preempt R4 and take over AVF #1:

glbp-9

glbp-10

For the last test related to weights, we will look at what happens when the weighting value of a router falls below the lower threshold and no other routers in the group are configured for forwarder preemption.  We will re-enable R1’s loopback and verify that it becomes active for AVF #1 again:

R1:
interface Loopback0
 no shutdown

glbp-111

Next we will disable forwarder preemption on all other routers and then shut down R1’s loopback, causing the weight on R1 to fall below the lower threshold:

R2:
interface FastEthernet0/0
 no glbp 1 forwarder preempt

R3:
interface FastEthernet0/0
 no glbp 1 forwarder preempt

R4:
interface FastEthernet0/0
 no glbp 1 forwarder preempt

R5:
interface FastEthernet0/0
 no glbp 1 forwarder preempt

R1:
interface Loopback0
 shutdown

R1 indicates that it has crossed the lower threshold, but it still remains active for AVF #1:

glbp-12

Without forwarder preemption configured on any other routers in the group, crossing the lower threshold does not cause that router to lose its active state.

 

Next we will look at the different types of load balancing that can be used for ARP replies to clients.  First we will re-enable R1’s loopback and re-enable forwarder preemption on the other routers:

R1:
interface Loopback0
 no shutdown

R2:
interface FastEthernet0/0
 glbp 1 forwarder preempt

R3:
interface FastEthernet0/0
 glbp 1 forwarder preempt

R4:
interface FastEthernet0/0
 glbp 1 forwarder preempt

R5:
interface FastEthernet0/0
 glbp 1 forwarder preempt

At this point R5 is the AVG for the group and R1-4 are each AVFs for the forwarder that they own:

glbp-13 

There are three ways GLBP can load balance ARP replies to clients: round-robin, weighted, and host-dependent.  We will look at round-robin (the default) first.  We will also enable the GLBP client-cache, which keeps track of the IP and MAC received from a client in an ARP request and the forwarder to which the client was assigned based on the virtual MAC sent in the ARP reply:

R5:
interface FastEthernet0/0
 glbp 1 client-cache maximum 30

Now we will generate ARP requests from various MAC and IP addresses on a PC attached to the network using ColaSoft Packet Builder.  The ARP requests will use MAC address 0000.0000.00XX and IP address 10.1.1.XX, where XX is a number written in decimal starting at 10 and incrementing to 29.  This will give us a total of 20 unique ARP requests being sent to the GLBP virtual IP address at a rate of 1 per second.  After sending the 20 ARP requests, we can see the results on R5:

glbp-14

show glbp detail gives a lot of useful information (show glbp client-cache can also be used to view just the client cache).  We can see that round-robin is the load balancing method and that 20 of 30 cache entries are being used.  We can also see exactly how the load balancing of ARP replies has taken place: AVF #1 received the first client (10.1.1.10) and every fourth client after that.  Age shows the amount of time since an ARP request was last received and updates shows the number of ARP requests received from the client.  Next let’s send 20 duplicate ARP requests of the first entry only (MAC 0000.0000.0010, IP 10.1.1.10).  The results on R5 look like this:

glbp-15

The ‘Client selection count’ keeps track of the number of replies sent with that vMAC in response to ARP requests, whether those ARP requests are unique or not.  We can see that 10 replies with each vMAC have been sent, for a total of 40, which equals the 20 unique ARP requests initially sent plus the 20 duplicates.  The client cache only contains unique client entries; if repeat ARP requests are received from a client, the age timer is reset and the update counter is incremented.  Since AVF#4 received the last client in the previous test, AVF#1 is next in line to have its vMAC used in the ARP reply.  After 20 requests, the round-robin process ends on AVF#4 again and 10.1.1.10 is now a client of AVF#4, resulting in an unequal number of clients per AVF.  Next we will change the load balancing method on R5 to use weighted load balancing.  First the GLBP configuration on R5 is removed to clear the counters of ARP replies issued, then the load balancing method is changed to weighted.  We will also change the weights on some of the other routers since they are all currently using the default of 100:
 
R5:
interface FastEthernet0/0
 no glbp 1 ip
 no glbp 1 priority
 no glbp 1 preempt
 no glbp 1 weighting
 no glbp 1 client-cache

A few seconds later…

R5:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.100
 glbp 1 priority 105
 glbp 1 preempt
 glbp 1 client-cache maximum 30
 glbp 1 load-balancing weighted

R1:
interface FastEthernet0/0
 glbp 1 weighting 200
 

R2:
interface FastEthernet0/0
 glbp 1 weighting 100

R3:
interface FastEthernet0/0
 glbp 1 weighting 80

R4:
interface FastEthernet0/0
 glbp 1 weighting 20

After sending the same 20 unique ARP requests again, the results on R5 are:

glbp-16

Each AVF has its vMAC used in a percentage of the ARP replies equal to its weight divided by the sum of all AVF weights.  In this case, that means each AVF receives:

R1:    200 / (200 + 100 + 80 + 20)    = 50%, or 10 clients

R2:    100 / (200 + 100 + 80 + 20)    = 25%, or 5 clients

R3:    80 / (200 + 100 + 80 + 20)    = 20%, or 4 clients

R4:    20 / (200 + 100 + 80 + 20)    = 5%, or 1 client
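The same arithmetic can be written as a short Python sketch.  This is only a restatement of the weight / sum-of-weights rule described above using the weights configured in this test; the function name `weighted_shares` is made up for illustration and is not anything GLBP exposes:

```python
from fractions import Fraction

def weighted_shares(weights, total_clients):
    """Return {router: (share, expected_clients)} using weight / sum(weights)."""
    total_weight = sum(weights.values())
    result = {}
    for router, weight in weights.items():
        share = Fraction(weight, total_weight)
        result[router] = (share, share * total_clients)
    return result

# Weights configured on the four AVFs in this test
weights = {"R1": 200, "R2": 100, "R3": 80, "R4": 20}
shares = weighted_shares(weights, total_clients=20)

for router, (share, clients) in shares.items():
    print(f"{router}: {float(share):.0%} of ARP replies, {clients} of 20 clients")
```

With 20 clients the shares happen to work out to whole numbers; with other client counts the expected values would be fractional and the real distribution could only approximate them.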

For the last GLBP load balancing method, we will change it to host-dependent after clearing the GLBP counters on R5:

R5:
interface FastEthernet0/0
 no glbp 1 ip
 no glbp 1 priority
 no glbp 1 preempt
 no glbp 1 weighting
 no glbp 1 client-cache
 default glbp 1 load-balancing


 
A few seconds later…

R5:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.100
 glbp 1 priority 105
 glbp 1 preempt
 glbp 1 client-cache maximum 30
 glbp 1 load-balancing host-dependent

After sending the same 20 unique ARP requests again, the results on R5 are:

glbp-171

AVF #1 and #2 have each received 6 clients, while #3 and #4 have each received 4 clients.  Now let’s send 20 duplicate ARP requests like we did earlier from MAC: 0000.0000.0010, IP: 10.1.1.10.  This client is currently assigned to AVF #1.  After sending the ARP requests, R5 looks like this:

glbp-18

The ‘Client selection count’ has increased from 6 to 26 for AVF#1, while the others remained the same.  We can also see in the client cache that this client still uses AVF#1.  Now let’s look at another interesting aspect of host-dependent load balancing.  Notice that the clients were not completely evenly distributed to begin with; instead AVF#1 and #2 received 6 each while #3 and #4 received 4 each.  Also, there appears to be a pattern in the client addresses assigned to each AVF: 

AVF#1 has all clients whose MAC/IP ends in 0, 4, or 8

AVF#2 has all clients whose MAC/IP ends in 1, 5, or 9

AVF#3 has all clients whose MAC/IP ends in 2 or 6

AVF#4 has all clients whose MAC/IP ends in 3 or 7

Let’s test this theory by resetting R5’s GLBP configuration to clear the counters and clients, and then sending ARP requests from only clients whose MAC/IP ends in a 0, 4, or 8.  Just to be certain that there is no way R5 is somehow remembering which AVF the clients were assigned to before, we will use 10 MAC/IP addresses that the router has never seen before starting at MAC: 0000.0000.0030, IP: 10.1.1.30

R5:
interface FastEthernet0/0
 no glbp 1 ip
 no glbp 1 priority
 no glbp 1 preempt
 no glbp 1 client-cache
 default glbp 1 load-balancing

A few seconds later…

R5:
interface FastEthernet0/0
 glbp 1 ip 10.1.1.100
 glbp 1 priority 105
 glbp 1 preempt
 glbp 1 client-cache maximum 30
 glbp 1 load-balancing host-dependent

Here are the results on R5:

glbp-19

All 10 new clients received the vMAC of AVF#1 in their ARP reply.  Therefore, GLBP does not need to have already received an ARP request from a client in order to know which AVF to assign it to – the results are deterministic even if it is a brand new client.  In fact, even if the GLBP client cache is disabled (tested but not shown), the results still come out the same every time when a new client generates an ARP request.  The assignment of clients to AVFs with host-dependent load balancing appears to be based on some function of the client’s MAC and/or IP address rather than a database of which AVF the client was previously assigned to.
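As a purely speculative illustration, here is one simple function that happens to reproduce everything observed in these tests: take the last byte of the client MAC (read as hex) modulo the number of AVFs.  This is a guess that fits the data, not a documented GLBP algorithm, and other functions could fit equally well.  The helper names `guess_avf` and `build_mac` are invented for this sketch, and the exact ten addresses used in the second test are assumed to be the ones ending in 0, 4, or 8 starting from 30:

```python
from collections import Counter

NUM_AVFS = 4  # forwarders in the test group

def guess_avf(mac: str) -> int:
    """Hypothetical mapping: last MAC byte (hex) modulo the AVF count, 1-based."""
    last_byte = int(mac.replace(".", "")[-2:], 16)
    return last_byte % NUM_AVFS + 1

def build_mac(xx: int) -> str:
    """Build 0000.0000.00XX with XX written in decimal digits, as in the tests."""
    return f"0000.0000.00{xx:02d}"

# First host-dependent test: clients 10 through 29
first_test = Counter(guess_avf(build_mac(xx)) for xx in range(10, 30))
print(sorted(first_test.items()))  # clients per AVF

# Second test (assumed addresses): only clients ending in 0, 4, or 8
second_clients = (30, 34, 38, 40, 44, 48, 50, 54, 58, 60)
second_test = {guess_avf(build_mac(xx)) for xx in second_clients}
print(second_test)  # set of AVFs chosen
```

Under this guess the first test yields 6, 6, 4, and 4 clients for AVF#1 through #4, and every client in the second test lands on AVF#1, which matches the output on R5.  A plain modulo of the last IP octet does not fit, since 10.1.1.10 and 10.1.1.20 would land on different AVFs.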

 

Next we will look at how the redirect and timeout timers function in GLBP.  The redirect timer controls when the AVG stops replying to client ARP requests with the virtual MAC address of an AVF that has been taken over by a router other than its original owner.  The timeout timer controls when the AVG removes the AVF entirely if the owner has not reclaimed it.  At that point, any clients that obtained that virtual MAC address through ARP (either before the takeover, or after the takeover but before the redirect timer expired) must obtain a different virtual MAC address to use for the gateway.  First we will look at what happens when there are 4 or fewer routers in the GLBP group.  We will take R5 offline so that R4 becomes the new AVG:

R5:
interface FastEthernet0/0
 shutdown

glbp-20

We will decrease the redirect and timeout timers to 60 and 660 seconds (the timeout must be at least 600 seconds more than the redirect) so that we can see the results more quickly:

R4:
interface FastEthernet0/0
 glbp 1 timers redirect 60 660

Now we will pretend R1 fails by shutting down its interface:

R1:
interface FastEthernet0/0
 shutdown

After the 10 second hold timer expires, R4 becomes active for AVF#1 in addition to its own.  The redirect timer for AVF#1 begins counting down from 60 seconds from the time R4 last heard a hello from R1:

glbp-21

If we generate 20 ARP requests from the PC before the redirect timer expires, we find that R4 includes AVF#1 in its round-robin cycle and each AVF handles 5 client requests:

glbp-22

If we generate another 20 ARP requests after the redirection timer has expired for AVF#1, AVF#1 is no longer included in the round-robin cycle of ARP replies and the new clients are split up between the 3 remaining AVFs:

glbp-23

After 660 total seconds, the timeout timer expires and AVF#1 is removed:

glbp-24
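The three phases seen in this test can be summarized in a small Python model.  It is a sketch of the observed behavior only, using the 60 and 660 second values configured on R4 and measuring time from the owner’s last hello; the names `avf_state` and `rotation` are hypothetical:

```python
def avf_state(elapsed_s: float, redirect_s: int = 60, timeout_s: int = 660) -> str:
    """Classify an orphaned AVF by the seconds since its owner's last hello."""
    if elapsed_s < redirect_s:
        return "redirecting"       # still handed out in ARP replies
    if elapsed_s < timeout_s:
        return "forwarding-only"   # existing clients served, no new clients
    return "removed"               # AVF deleted from the group

def rotation(avfs, orphan, elapsed_s):
    """AVFs the AVG would use for new ARP replies at the given moment."""
    if avf_state(elapsed_s) == "redirecting":
        return list(avfs)
    return [avf for avf in avfs if avf != orphan]

for t in (30, 120, 700):
    print(t, avf_state(t), rotation([1, 2, 3, 4], orphan=1, elapsed_s=t))
```

At 30 seconds all four AVFs are still in the rotation, at 120 seconds only AVF#2 through #4 are handed out, and at 700 seconds AVF#1 no longer exists.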

If instead we had 5 routers, when R1 failed R5 would take over AVF#1.  When the timeout timer for AVF#1 expires the AVF is removed, but instead of the group continuing to function with 3 AVFs, a new AVF is created after a few seconds with R5’s MAC as the owner ID:

R5:
02:35:17.098: %GLBP-6-FWDSTATECHANGE: FastEthernet0/0 Grp 1 Fwd 1 state Active -> Disabled
02:35:28.898: %GLBP-6-FWDSTATECHANGE: FastEthernet0/0 Grp 1 Fwd 1 state Listen -> Active

glbp-25


HSRP/VRRP/GLBP Timers and Preemption

Posted by Andy on February 9, 2009

This post will take a look at some of the less common issues related to timers, preemption, and preemption delays in the first hop redundancy protocols.  The topology is shown below:

hsrp-vrrp-glbp1-topology

 

First we will configure HSRP, VRRP, and GLBP on R1 and R2 without changing any of the default settings:

R1:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 vrrp 1 ip 10.1.1.102
 glbp 1 ip 10.1.1.103

R2:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 vrrp 1 ip 10.1.1.102
 glbp 1 ip 10.1.1.103

As expected, R2 becomes the master router for VRRP group 1 and active router for HSRP and GLBP group 1 (as long as it is configured before R1 can transition to active) since R1 and R2 have equal priority and R2 has the higher IP address:

hsrp-vrrp-glbp-13

Next let’s enable preemption for all 3 protocols on R3 and then bring each of them up.  Preemption is enabled by default for VRRP so we only need to enable it for HSRP and GLBP:

R3:
interface FastEthernet1/0
 standby 1 preempt
 standby 1 ip 10.1.1.101
 vrrp 1 ip 10.1.1.102
 glbp 1 preempt
 glbp 1 ip 10.1.1.103

R3 now has an equal priority to the other routers in all 3 groups, the highest IP address in all 3 groups, and is allowed to preempt the master/active router of all 3 groups.  Take a look at the results after applying this configuration:

hsrp-vrrp-glbp-2

R3 has become the master for the VRRP group.  R3 has also overthrown R1 to become the standby router for both HSRP and GLBP; however, it does not transition to active.  Of these 3 protocols, only VRRP allows a higher IP address to preempt a master/active router; HSRP and GLBP require a higher priority value in order for preemption to occur.
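The difference can be expressed as a small Python sketch.  It models only the behavior observed in this test, not the full election logic of any of the protocols, and the function name `can_preempt` and the 10.1.1.X addresses used in the example are assumptions made for illustration:

```python
from ipaddress import ip_address

def can_preempt(protocol, my_priority, my_ip, active_priority, active_ip, preempt=True):
    """Model of the preemption result seen in this test (not a full state machine)."""
    if not preempt:
        return False
    if my_priority > active_priority:
        return True
    if protocol == "vrrp" and my_priority == active_priority:
        # Only VRRP let the higher IP address win the tie in this test
        return ip_address(my_ip) > ip_address(active_ip)
    return False

# R3 (assumed 10.1.1.3) against the current master/active R2 (assumed 10.1.1.2)
for proto in ("hsrp", "vrrp", "glbp"):
    print(proto, can_preempt(proto, 100, "10.1.1.3", 100, "10.1.1.2"))
```

With equal priorities only the VRRP case returns True; raising the priority to 101 makes all three return True.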

 

Next we will look at the hello and hold timers for each of the protocols.  First we will increase the HSRP and GLBP priorities on R3 so that it is the master/active router for each protocol:

R3:
interface FastEthernet1/0
 standby 1 priority 101
 glbp 1 priority 101

hsrp-vrrp-glbp-3

We will change the hello and hold timers for HSRP and GLBP to different values on each of the three routers:

R1:
interface FastEthernet1/0
 standby 1 timers 6 18
 glbp 1 timers 6 18

R2:
interface FastEthernet1/0
 standby 1 timers 5 15
 glbp 1 timers 5 15

R3:
interface FastEthernet1/0
 standby 1 timers 4 12
 glbp 1 timers 4 12

This does not cause a problem for HSRP or GLBP since the hello and hold timers are communicated by the active router in hello messages, and timers learned from the active router override manually configured timers.  Since R3 is active for both protocols, it is using a hello timer of 4 seconds and hold timer of 12 seconds:

hsrp-vrrp-glbp-4

hsrp-vrrp-glbp-5

R1 and R2 show that they have learned and are using these timer values, and make a note in parentheses of what their actual configured values are:

hsrp-vrrp-glbp-6

hsrp-vrrp-glbp-7

If R3 fails or is preempted by a different router, the new active router will use its manually configured timers and the other routers will relearn the timers from that router.  In this case, if R3 fails R2 will become active for both HSRP and GLBP since it was already standby because it had a higher IP address than R1.  R1 will then learn the new timers from R2:

R3:
interface FastEthernet1/0
 shutdown

hsrp-vrrp-glbp-8

hsrp-vrrp-glbp-9

Next we will change the hello and hold timers for HSRP and GLBP to use sub-second values.  We will re-enable R3’s interface so that it becomes active for all groups and make the changes on R3:

R3:
interface FastEthernet1/0
 no shutdown
 glbp 1 timers msec 200 msec 800
 standby 1 timers msec 200 msec 800


R1 and R2 learn the new GLBP timers without any problems:

hsrp-vrrp-glbp-10

hsrp-vrrp-glbp-111

HSRP, however, behaves differently:

hsrp-vrrp-glbp-121

hsrp-vrrp-glbp-131

hsrp-vrrp-glbp-14

R1 and R2 still see R3 as the active router, but neither has learned R3’s hello or hold timers – instead R1 and R2 are using their manually configured timers.  Looking at the HSRP packets in Wireshark reveals what the problem is:

hsrp-vrrp-glbp-wireshark-1 

hsrp-vrrp-glbp-wireshark-2

The first picture shows an HSRP hello sent by R2 and the second an HSRP hello sent by R3.  The hellotime and holdtime fields are each 1 byte long and contain the manually configured timers on that router in seconds.  We can see that R3 has set both of them to 0 since there is no way of expressing a sub-second value in these fields, so R1 and R2 continue to use their manually configured timers.  The hold time for the active router never falls below 17.8 seconds on R1 and 14.8 seconds on R2 since R3 is sending hellos every 200 ms.  If R3 fails, failover will not occur for 15 seconds because R2 is not able to take advantage of the sub-second hold time.  Since R3 is using a hold timer of 800 ms and R2 (the standby router) is using its manually configured timer to send hellos every 5 seconds, R3 will continually cycle through seeing R2 as the standby router for 800 ms and thinking that there is no standby router for the next 4200 ms:

hsrp-vrrp-glbp-15
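The numbers above can be reproduced with a short Python sketch.  It assumes that any timer that is not a whole number of seconds is sent as 0 in the 1 byte fields (the capture only confirms this for 200 ms and 800 ms) and that a receiver seeing 0 falls back to its configured timers; the function names are hypothetical:

```python
def hsrpv1_fields(hello_ms: int, hold_ms: int):
    """Values placed in the 1-byte hellotime/holdtime fields (whole seconds only)."""
    def encode(ms):
        # Assumption: anything that is not a whole number of seconds is sent as 0
        return ms // 1000 if ms % 1000 == 0 else 0
    return encode(hello_ms), encode(hold_ms)

def timers_in_use(received_fields, configured_ms):
    """Timers a non-active router ends up using, in milliseconds."""
    hello_s, hold_s = received_fields
    if hello_s == 0 or hold_s == 0:
        return configured_ms              # nothing usable to learn
    return hello_s * 1000, hold_s * 1000  # learned from the active router

r3_fields = hsrpv1_fields(200, 800)                 # R3: msec 200 msec 800
r2_timers = timers_in_use(r3_fields, (5000, 15000)) # R2: configured 5 15
print(r3_fields, r2_timers)

# R3 only believes in a standby router for its own 800 ms hold time after each R2 hello
r3_hold_ms, r2_hello_ms = 800, r2_timers[0]
print(r3_hold_ms, r2_hello_ms - r3_hold_ms)  # ms with standby, ms without
```

For whole-second timers such as R3’s earlier 4 and 12 seconds, the same functions return values that the other routers can learn normally.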

HSRP version 2 can be used to overcome this problem.  We will configure HSRP version 2 on each router:

R1:
interface FastEthernet1/0
 standby version 2

R2:
interface FastEthernet1/0
 standby version 2

R3:
interface FastEthernet1/0
 standby version 2

HSRP version 2 sends the hello and hold timers in milliseconds and uses a 4 byte field for each of them:

hsrp-vrrp-glbp-wireshark-3

We can see that R1 and R2 now correctly learn R3’s timers and prefer them over their manually configured ones:

hsrp-vrrp-glbp-16

hsrp-vrrp-glbp-17

Unlike HSRP and GLBP, VRRP does not automatically learn timers from the master router.  Also unlike HSRP and GLBP, VRRP requires that the hello timer of all routers in the group match.  Let’s try configuring a different hello timer on each router:

R1:
interface FastEthernet1/0
 vrrp 1 timers advertise 2

R2:
interface FastEthernet1/0
 vrrp 1 timers advertise 3

R3:
interface FastEthernet1/0
 vrrp 1 timers advertise 4

hsrp-vrrp-glbp-18

hsrp-vrrp-glbp-19

hsrp-vrrp-glbp-20

After applying the configuration, all 3 routers think that they are the master router for the group.  VRRP can be configured to learn timers from the master router so that it behaves similarly to HSRP and GLBP:

R1:
interface FastEthernet1/0
 vrrp 1 timers learn

R2:
interface FastEthernet1/0
 vrrp 1 timers learn

R3:
interface FastEthernet1/0
 vrrp 1 timers learn

If configured with both vrrp timers advertise and vrrp timers learn, timers learned from the master router will override manually configured timers just like with HSRP and GLBP.  R1 and R2 once again see R3 as the master and transition into the backup state.  R1 and R2 also continue to display their manually configured hello timer as well as the hello timer that they have learned from the master:

hsrp-vrrp-glbp-21

hsrp-vrrp-glbp-22

hsrp-vrrp-glbp-23

As we saw with HSRP and GLBP, if R3 fails, R2 will become the master router and begin advertising its manually configured hello timer of 3 seconds.  R1 will learn the new timer from R2 and accept R2 as the master router.  VRRP has a similar problem to HSRP version 1 when attempting to get other routers in the group to learn sub-second timers.  VRRP even gives a warning if timer learning is enabled on the router that we try to configure sub-second timers on:

R3:
interface FastEthernet1/0
 vrrp 1 timers advertise msec 200

% cannot configure millisecond timers when timer learning enabled

Since R3 is the master router, it won’t be learning timers anyway so the warning is probably there because it assumes our other routers are using timer learning as well.  Let’s see what happens when we disable timer learning on R3 and configure it to advertise millisecond timers to the other routers:

R3:
interface FastEthernet1/0
 no vrrp 1 timers learn
 vrrp 1 timers advertise msec 200

hsrp-vrrp-glbp-24

hsrp-vrrp-glbp-25

hsrp-vrrp-glbp-26

R1 and R2 both see R3 as the master still, but they think that R3’s advertisement interval is 1 second.  Look at what R3’s VRRP packet looks like in Wireshark:

hsrp-vrrp-glbp-wireshark-4

VRRP also uses a 1 byte field for the hello timer in seconds.  In VRRP, a sub-second hello timer results in a hello timer of 1 second being sent.  If the other routers are configured for timer learning, they will learn the received value of 1 second.  If the other routers are not configured for timer learning and their hello timer has been left as the default of 1 second, they will think that the master is using the same hello timer value and accept it.  If the other routers are not configured for timer learning and their hello timer has been manually configured to a nondefault value, the routers will not accept the advertisement from the master configured with a sub-second hello timer and one or more of them will transition to the master state.

One other difference between VRRP and the other 2 protocols with regard to timers is that the master down interval (hold timer) is not configurable.  Instead, VRRP uses a value of 3 times the hello timer + the skew time as the master down interval.  The skew time is calculated as ((256 – VRRP priority) / 256) seconds, which results in higher priority routers having a shorter skew time and master down interval.  First, we will change R3’s hello timer to 1 second so that all 3 routers are using the same hello timer again:

R3:
interface FastEthernet1/0
 vrrp 1 timers advertise 1

Using a hello time of 1 second and default priority of 100, each router calculates the master down interval as:

(3 * 1) + ((256 – 100)/256)  = 3.609 seconds

hsrp-vrrp-glbp-27

If R3 fails, R1 and R2’s master down intervals both expire at the same time and both of them try to seize the role of master router simultaneously:

R3:
interface FastEthernet1/0
 shutdown

hsrp-vrrp-glbp-wireshark-51

Once R1 sees R2’s superior advertisement, it accepts R2 as the master and goes back to the role of backup.  As a result, R1 transitions to the master state and then back to backup a few milliseconds later, once R2’s advertisement reaches it:

hsrp-vrrp-glbp-28

If R1 is configured with a lower priority (or R2 and R3 with a higher priority), its master down interval will be longer than R2’s:

R1:
interface FastEthernet1/0
 vrrp 1 priority 10

hsrp-vrrp-glbp-291

R2 now has about 350 ms after its master down interval expires for its advertisement to reach R1 before R1 tries to claim the role of master.
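The roughly 350 ms margin follows directly from the skew time formula given earlier.  A minimal Python check, with function names chosen only for this sketch:

```python
def skew_time(priority: int) -> float:
    """Skew time in seconds: (256 - priority) / 256."""
    return (256 - priority) / 256

def master_down_interval(advertise_s: float, priority: int) -> float:
    """Master down interval: 3 * hello timer + skew time."""
    return 3 * advertise_s + skew_time(priority)

r2 = master_down_interval(1, 100)  # R2 keeps the default priority
r1 = master_down_interval(1, 10)   # R1 lowered to priority 10
gap_ms = (r1 - r2) * 1000

print(f"R2: {r2:.3f} s, R1: {r1:.3f} s, gap: {gap_ms:.0f} ms")
```

R2 times out after about 3.609 seconds and R1 after about 3.961 seconds, so R2’s first advertisement has roughly 352 ms to arrive before R1 would also claim the master role.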

 

Next we will look at how initialization delay and preemption delay can be used.  HSRP is the only one of the 3 protocols that allows an initialization delay to be configured.  We will use the following HSRP configurations on each router. 

R1:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 standby 1 timers 1 3

R2:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 standby 1 timers 1 3

R3:
interface FastEthernet1/0
 shutdown
 standby 1 ip 10.1.1.101
 standby 1 timers 1 3
 standby 1 priority 101
 standby 1 preempt
 standby delay minimum 30

R3 has a higher priority and is configured to preempt, but is shut down to begin with.  Now we will enable R3’s interface to simulate it coming back online:

R3:
interface FastEthernet1/0
 no shutdown

The configured initialization delay prevents HSRP from progressing beyond the Init state until it expires.  On R3, we can see that the state is Init and that a timer is counting down the number of seconds left until HSRP can initialize:

hsrp-vrrp-glbp-30

Approximately 30 seconds after the interface comes back up, HSRP transitions to the Listen and then Speak states, and then sends a coup to R2 and transitions to Active:

hsrp-vrrp-glbp-31

 

Preemption delay is supported by all three protocols.  Unlike the HSRP initialization delay, which kept HSRP in the Init state, the preemption delay only prevents the router from transitioning to active/master.  We will take R3 offline and configure a preemption delay of 30 seconds for each protocol on R3.  We will also configure an HSRP initialization delay of 30 seconds to see how it operates with both delays configured.  The configuration of each router is:

R1:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 standby 1 timers 1 3
 vrrp 1 ip 10.1.1.102
 glbp 1 ip 10.1.1.103
 glbp 1 timers 1 3

R2:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 standby 1 timers 1 3
 vrrp 1 ip 10.1.1.102
 glbp 1 ip 10.1.1.103
 glbp 1 timers 1 3

R3:
interface FastEthernet1/0
 standby 1 ip 10.1.1.101
 standby 1 timers 1 3
 standby 1 priority 101
 standby 1 preempt delay minimum 30
 standby delay minimum 30
 vrrp 1 ip 10.1.1.102
 vrrp 1 preempt delay minimum 30
 glbp 1 ip 10.1.1.103
 glbp 1 timers 1 3
 glbp 1 priority 101
 glbp 1 preempt delay minimum 30
 shutdown

Approximately 7 seconds after enabling the interface, we can see that VRRP is in the backup state and has 23 seconds remaining until it can preempt the current master:

hsrp-vrrp-glbp-321

Approximately 9 seconds after enabling the interface, HSRP remains in the Init state.  The HSRP initialization delay takes effect before the preemption delay, so HSRP will remain in this state for 30 seconds.  We can see that the initialization delay timer has 21 seconds remaining:

hsrp-vrrp-glbp-34

Approximately 10 seconds after enabling the interface, we can see that GLBP has taken over the standby role.  There are 20 seconds remaining until it can preempt the current AVG:

hsrp-vrrp-glbp-331

After 30 seconds, R3 becomes the VRRP master router and GLBP AVG.  The HSRP initialization delay has expired and preempt delay begins.  Approximately 37 seconds after enabling the interface, we can see that HSRP is now in the standby state and has 23 seconds left on the preemption delay before it can become active:

hsrp-vrrp-glbp-35

After a total of 60 seconds, R3 becomes the active HSRP router.  The messages logged to the console also show a timeline of how the events take place:

 

*Mar 1 05:55:40.574: %VRRP-6-STATECHANGE: Fa1/0 Grp 1 state Init -> Backup
*Mar 1 05:55:42.562: %LINK-3-UPDOWN: Interface FastEthernet1/0, changed state to up
*Mar 1 05:55:43.562: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet1/0, changed state to up
*Mar 1 05:56:11.270: %VRRP-6-STATECHANGE: Fa1/0 Grp 1 state Backup -> Master
*Mar 1 05:56:11.614: %GLBP-6-STATECHANGE: FastEthernet1/0 Grp 1 state Standby -> Active
*Mar 1 05:56:11.618: %GLBP-6-FWDSTATECHANGE: FastEthernet1/0 Grp 1 Fwd 1 state Listen -> Active
*Mar 1 05:56:14.094: %HSRP-5-STATECHANGE: FastEthernet1/0 Grp 1 state Speak -> Standby
*Mar 1 05:56:42.078: %HSRP-5-STATECHANGE: FastEthernet1/0 Grp 1 state Standby -> Active
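To make the timeline easier to read, the following Python sketch converts the key log lines into offsets from the moment the link came up.  The log text is copied from the output above; the regular expression and helper name are just one way to parse it and are not tied to any Cisco tool:

```python
import re
from datetime import datetime

LOG = """\
*Mar 1 05:55:42.562: %LINK-3-UPDOWN: Interface FastEthernet1/0, changed state to up
*Mar 1 05:56:11.270: %VRRP-6-STATECHANGE: Fa1/0 Grp 1 state Backup -> Master
*Mar 1 05:56:11.614: %GLBP-6-STATECHANGE: FastEthernet1/0 Grp 1 state Standby -> Active
*Mar 1 05:56:14.094: %HSRP-5-STATECHANGE: FastEthernet1/0 Grp 1 state Speak -> Standby
*Mar 1 05:56:42.078: %HSRP-5-STATECHANGE: FastEthernet1/0 Grp 1 state Standby -> Active
"""

PATTERN = re.compile(r"^\*\w+ +\d+ (\d{2}:\d{2}:\d{2}\.\d{3}): %(\S+): (.*)$")

def parse(log: str):
    """Return a list of (timestamp, mnemonic, message) tuples."""
    events = []
    for line in log.splitlines():
        match = PATTERN.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
            events.append((stamp, match.group(2), match.group(3)))
    return events

events = parse(LOG)
link_up = events[0][0]  # the LINK-3-UPDOWN line is the reference point
offsets = [round((stamp - link_up).total_seconds(), 3) for stamp, _, _ in events]

for offset, (_, mnemonic, message) in zip(offsets, events):
    print(f"+{offset:6.3f} s  {mnemonic}: {message}")
```

Measured this way, VRRP and GLBP take over about 29 seconds after the link comes up, HSRP reaches Standby at about 31.5 seconds (the initialization delay plus the time spent in Speak), and HSRP becomes Active at about 59.5 seconds.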


GLBP also allows preemption and preemption delay to be configured for the Active Virtual Forwarders (AVFs).  By default, preemption is enabled with a delay of 30 seconds.  Looking back at the logging messages, we can see that forwarder 1 changed state to active on R3 at almost the exact same time as the AVG changed state to active, since we configured the AVG with a preempt delay of 30 seconds also.  Let’s see what happens if we disable AVF preemption on R3 and then shut down and re-enable the interface:

R3:
interface FastEthernet1/0
 no glbp 1 forwarder preempt
 shutdown

A few seconds later…

no shutdown

R3 becomes the AVG again after the 30 second AVG preemption timer expires; however, it does not become active for any of the forwarders.  Instead R2, which took over forwarder #1 when R3 went offline, remains active for both forwarder #1 and #2:

hsrp-vrrp-glbp-36

hsrp-vrrp-glbp-37
