Cisconinja’s Blog

CBWFQ, Routing Protocols, and max-reserved-bandwidth

Posted by Andy on February 18, 2009

Numerous sources, including Cisco documentation, often say that the percentage of bandwidth excluded from max-reserved-bandwidth (25% by default) is reserved for either link queues (routing updates, keepalives, etc.), unclassified best-effort traffic (matched by class-default), or both.  The 12.4 Mainline command reference for max-reserved-bandwidth says:

The sum of all bandwidth allocation on an interface should not exceed 75 percent of the available bandwidth on an interface. The remaining 25 percent of bandwidth is used for overhead, including Layer 2 overhead, control traffic, and best-effort traffic.

As the previous CBWFQ tests showed, this definitely does not hold true for best-effort traffic that is put into dynamic conversations.  What about routing updates and other important traffic that ends up in the link queues?  To test this, we will use the same simple topology shown below, running IOS 12.4(18):

[Image: topology diagram]

R1:
interface FastEthernet0/0
 ip address 10.1.1.1 255.255.255.0
 load-interval 30
 speed 100
 full-duplex
 no keepalive
 no mop enabled
!
interface Serial0/0
 ip address 10.1.12.1 255.255.255.0
 load-interval 30
 no keepalive
!
no cdp run

R2:
interface Serial0/0
 ip address 10.1.12.2 255.255.255.0
 load-interval 30
 no keepalive
!
no cdp run

Next we will configure EIGRP on R1 and R2 and decrease the hello and hold timers to give us a little bit more traffic:

R1:
router eigrp 1
 network 10.1.12.1 0.0.0.0
 no auto-summary
!
interface Serial0/0
 ip hello-interval eigrp 1 1
 ip hold-time eigrp 1 3

R2:
router eigrp 1
 network 10.1.12.2 0.0.0.0
 no auto-summary
!
interface Serial0/0
 ip hello-interval eigrp 1 1
 ip hold-time eigrp 1 3

Next we will configure R2 to measure incoming traffic.  A TFTP flow will be used to create congestion, so we will create 3 different classes to measure TFTP, EIGRP hellos, and EIGRP updates (we will see later on why it was a good idea to measure EIGRP hellos and updates separately):

R2:
ip access-list extended EIGRP-Hello
 permit eigrp any host 224.0.0.10
ip access-list extended EIGRP-Update
 permit eigrp any host 10.1.12.2
ip access-list extended TFTP
 permit udp any any eq tftp
!
class-map match-all EIGRP-Hello
 match access-group name EIGRP-Hello
class-map match-all EIGRP-Update
 match access-group name EIGRP-Update
class-map match-all TFTP
 match access-group name TFTP
!
policy-map Traffic-Meter
 class EIGRP-Hello
 class EIGRP-Update
 class TFTP
!
interface Serial0/0
 service-policy input Traffic-Meter

Now let’s shut down and re-enable R2’s S0/0 interface and examine how the EIGRP adjacency forms in the absence of congestion:

R2:
interface Serial0/0
 shutdown

A few seconds later…

R2:
interface Serial0/0
 no shutdown

[Image: Wireshark capture of the EIGRP adjacency forming]

[Image: show policy-map interface output on R2]

We can see that hello packets are being sent every 1 second in each direction.  When the adjacency forms, 3 update packets are exchanged in each direction, followed by an acknowledgement from R1 to R2.  Each hello packet is 64 bytes, which is confirmed in both Wireshark and the policy-map counters on R2.  Each update and acknowledgement packet is 44 bytes.  Therefore we can expect EIGRP to use about 512 bps for hellos (64 bytes * 8 bits) plus a small amount of additional bandwidth from updates and acknowledgements at the start.  Next we will configure CBWFQ on R1 and shape traffic to 32 kbps:
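As a quick sanity check, the steady-state hello load works out as follows (a sketch using the packet size and interval observed in the capture above):

```python
# Expected EIGRP hello bandwidth, per direction.
HELLO_SIZE_BYTES = 64   # per-hello size, from the Wireshark capture
HELLO_INTERVAL_S = 1    # 'ip hello-interval eigrp 1 1'

hello_bps = HELLO_SIZE_BYTES * 8 / HELLO_INTERVAL_S
print(hello_bps)  # 512.0 bps of steady-state hello traffic
```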

R1:
ip access-list extended TFTP
 permit udp any any eq tftp
!
class-map match-all TFTP
 match access-group name TFTP
!
policy-map CBWFQ
 class TFTP
  bandwidth percent 75
 class class-default
  fair-queue 4096
policy-map Shaper
 class class-default
  shape average 32000
  service-policy CBWFQ
!
interface Serial0/0
 service-policy output Shaper

TFTP has been given 75% of the bandwidth, and max-reserved-bandwidth has been left at its default of 75%.  If the remaining 25% (8,000 bps) is actually reserved for link queues, EIGRP should have far more bandwidth than it needs.  Now we will generate 64 kbps of TFTP traffic, more than enough to saturate the link and cause CBWFQ to kick in:

flood.pl --port=69 --size=996 --delay=125 10.1.12.2

Within a few seconds, the following log messages repeatedly show up:

R1:
*Mar 1 04:25:31.998: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is down: Interface Goodbye received
*Mar 1 04:25:32.930: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is up: new adjacency
*Mar 1 04:25:40.302: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is down: Interface Goodbye received
*Mar 1 04:25:41.242: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is up: new adjacency
*Mar 1 04:25:48.278: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is down: Interface Goodbye received
*Mar 1 04:25:49.222: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is up: new adjacency
*Mar 1 04:25:56.538: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is down: Interface Goodbye received
*Mar 1 04:25:57.530: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is up: new adjacency
*Mar 1 04:26:04.782: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is down: Interface Goodbye received
*Mar 1 04:26:05.730: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.2 (Serial0/0) is up: new adjacency

R2:
*Mar 1 04:26:34.506: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 04:26:37.506: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: holding time expired
*Mar 1 04:26:42.774: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 04:26:45.770: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: holding time expired
*Mar 1 04:26:51.242: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 04:26:54.238: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: holding time expired
*Mar 1 04:26:59.298: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 04:27:02.298: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: holding time expired
*Mar 1 04:27:07.522: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 04:27:10.518: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: holding time expired

On R2, we can see that the hold timer keeps expiring, and the adjacency reforms approximately every 8 to 8.5 seconds.  Next, take a look at the queues on R1 and the traffic received on R2:

[Image: show traffic-shape queue output on R1]

[Image: show policy-map interface output on R2]

As we saw in previous CBWFQ tests, CBWFQ uses a constant based on the number of WFQ dynamic queues when calculating the weights for user-defined conversations, as shown in the following table:

Number of flows    Constant
16                 64
32                 64
64                 57
128                30
256                16
512                8
1024               4
2048               2
4096               1
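The table can be captured as a small lookup, together with the weight formula this post uses for user-defined classes (constant * (100 / bandwidth percent)). This is just a sketch of the arithmetic, not IOS's internal implementation:

```python
# WFQ constant as a function of the number of dynamic queues (table above).
WFQ_CONSTANT = {16: 64, 32: 64, 64: 57, 128: 30, 256: 16,
                512: 8, 1024: 4, 2048: 2, 4096: 1}

def user_class_weight(num_dynamic_queues, bandwidth_percent):
    # Raw (unrounded) weight for a user-defined CBWFQ class.
    # Note: the post observes IOS rounding 1.33 up to 2 but 21.33 down
    # to 21, so rounding is left out here.
    return WFQ_CONSTANT[num_dynamic_queues] * 100 / bandwidth_percent

print(user_class_weight(4096, 75))  # 1.33... -> IOS uses 2
print(user_class_weight(256, 75))   # 21.33... -> IOS uses 21
```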

We configured WFQ to use 4096 dynamic conversations, which results in the smallest possible constant.  Using the formula to calculate weights for user-defined conversations, TFTP is assigned a weight of:

1  * (100 / 75)   = 1.33

When rounded up this becomes 2, as shown in the show traffic-shape queue output.  We also see 2 other conversations with packets enqueued.  Both are IP protocol 88, so we know they are both EIGRP.  One has destination address 224.0.0.10 and size 64 (EIGRP hellos), and the other has destination address 10.1.12.2 and size 44 (EIGRP updates).  Interestingly, they have been given very different weight values.

The conversation number for EIGRP hellos (4103) falls within the range of the link queues (N through N+7, where N is the number of WFQ dynamic queues), and its weight of 1024 is the same as the weight for link queues.  The conversation number for EIGRP updates (137), however, falls within the range of the dynamic queues (0 through N-1), and its weight of 4626 is consistent with a dynamic conversation carrying IP Precedence 6 (32384 / (6 + 1)).

Because the link queue’s weight of 1024 is 512 times larger than TFTP’s weight of 2, TFTP is able to send 512 times as many bytes.  The byte counts of received traffic on R2 match this (1,149,000 / 2,240 = 512.9).  This also explains why the EIGRP adjacency reformed every 8 to 8.5 seconds: for every 1 hello that EIGRP is allowed to send on R1 (512 bits), TFTP is allowed to send 262,144 bits (512 * 512), and the total time required is (512 + 262,144) / 32,000, or about 8.2 seconds.

This is somewhat of an extreme example, since we configured the maximum number of WFQ dynamic queues in order to minimize TFTP’s weight and also shaped to a very low rate, but the main point here is that the 25% of unallocated bandwidth is not reserved for anything.  Any non-priority queue can consume as much of that bandwidth as its weight relative to the other queues allows.
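The arithmetic above can be checked directly (a quick sanity check using the weights and shaper rate from this test):

```python
# Weights observed in 'show traffic-shape queue' on R1.
TFTP_WEIGHT = 2            # user-defined class with 4096 dynamic queues
LINK_QUEUE_WEIGHT = 1024   # EIGRP hellos in a link queue
SHAPE_RATE = 32000         # bps, from 'shape average 32000'
HELLO_BITS = 64 * 8        # one 64-byte EIGRP hello

# Byte share between two queues is the inverse of their weight ratio.
ratio = LINK_QUEUE_WEIGHT // TFTP_WEIGHT
print(ratio)               # 512 -> TFTP sends 512x as many bytes

# Observed byte counters on R2 agree with the weight ratio.
print(1_149_000 / 2_240)   # ~512.9

# One hello plus TFTP's share per cycle, drained at the shaped rate:
cycle_seconds = (HELLO_BITS + ratio * HELLO_BITS) / SHAPE_RATE
print(round(cycle_seconds, 1))  # ~8.2 s between adjacency resets
```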

 

For one final test, let’s decrease the number of WFQ dynamic queues to a value such that the link queue can send at least 512 bps for EIGRP hellos:

R1:
policy-map CBWFQ
 class class-default
  fair-queue 256

Using 256 dynamic queues, the weight of TFTP will be:

16 * (100 / 75)   = 21.33

IOS rounds this to 21:

[Image: show traffic-shape queue output on R1]

The share that each queue receives is inversely proportional to its weight.  One way of finding the share that EIGRP hellos will receive is:

1 / ((1024 / 21) + (1024 / 1024) + (1024 / 4626))   =  2%, or about 640 bps

This should be a little more than enough to allow a hello packet every second, and looking at the output above we can see that only 1 packet is in the queue.  However, now there is a new problem:
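The same share calculation can be sketched in a few lines (weights taken from the show traffic-shape queue output after switching to 256 dynamic queues):

```python
# Per-queue share under WFQ is proportional to 1/weight.
weights = {"TFTP": 21, "EIGRP-hello": 1024, "EIGRP-update": 4626}
SHAPE_RATE = 32000  # bps

inv_total = sum(1 / w for w in weights.values())
hello_share = (1 / weights["EIGRP-hello"]) / inv_total

print(round(hello_share * 100, 1))      # ~2.0 percent of the link
print(round(hello_share * SHAPE_RATE))  # ~640 bps for EIGRP hellos
```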

R2:
*Mar 1 05:45:01.462: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: retry limit exceeded
*Mar 1 05:45:01.478: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 05:46:20.998: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: retry limit exceeded
*Mar 1 05:46:21.226: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 05:47:40.746: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: retry limit exceeded
*Mar 1 05:47:41.242: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 05:49:00.758: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: retry limit exceeded
*Mar 1 05:49:01.502: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency
*Mar 1 05:50:21.014: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: retry limit exceeded
*Mar 1 05:50:21.226: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency

Approximately every 1 minute and 20 seconds, we see ‘retry limit exceeded’ followed by the adjacency coming back up less than 1 second later.  A debug eigrp packets update shows what is happening:

R2:
*Mar  1 06:08:22.550: EIGRP: Sending UPDATE on Serial0/0 nbr 10.1.12.1, retry 14, RTO 5000
*Mar  1 06:08:22.550:   AS 1, Flags 0x1, Seq 1186/0 idbQ 0/0 iidbQ un/rely 0/0 peerQ un/rely 0/2
*Mar  1 06:08:27.554: EIGRP: Sending UPDATE on Serial0/0 nbr 10.1.12.1, retry 15, RTO 5000
*Mar  1 06:08:27.554:   AS 1, Flags 0x1, Seq 1186/0 idbQ 0/0 iidbQ un/rely 0/0 peerQ un/rely 0/2
*Mar  1 06:08:32.558: EIGRP: Sending UPDATE on Serial0/0 nbr 10.1.12.1, retry 16, RTO 5000
*Mar  1 06:08:32.558:   AS 1, Flags 0x1, Seq 1186/0 idbQ 0/0 iidbQ un/rely 0/0 peerQ un/rely 0/2
*Mar  1 06:08:37.566: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is down: retry limit exceeded
*Mar  1 06:08:37.758: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 1: Neighbor 10.1.12.1 (Serial0/0) is up: new adjacency

R2 does not receive an acknowledgement from R1 for its update, so it retries a total of 16 times at 5-second intervals.  Remember that EIGRP updates and acknowledgements were assigned to a dynamic conversation and given a weight of 4,626 based on their IP Precedence of 6; as a result, they cannot be scheduled in time, and after the last retry R2 takes the adjacency down.  Since we manipulated the TFTP queue weight so that the link queue has just barely enough bandwidth to send an EIGRP hello every second, the adjacency comes back up less than a second later, resulting in very strange overall behavior.
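The reset cadence follows directly from the retry parameters visible in the debug output:

```python
# EIGRP reliable-transport parameters from the debug output.
RETRY_LIMIT = 16   # retries before 'retry limit exceeded'
RTO_S = 5          # seconds between retries ('RTO 5000' ms)

downtime_cycle_s = RETRY_LIMIT * RTO_S
print(downtime_cycle_s)  # 80 s, matching the ~1 min 20 s reset interval
```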


2 Responses to “CBWFQ, Routing Protocols, and max-reserved-bandwidth”

  1. Renz said

    Andy,

    Nice explanation of CBWFQ.  So, in conclusion, are you saying that even with the excess 25%, keepalive and routing update traffic is not guaranteed by the router, and can still be crowded out by the user-defined classes?

    Renz

  2. Andy said

    Renz,

    Yes, the 25% ‘unallocated’ bandwidth is not guaranteed to anything. It is possible for keepalives or routing protocol traffic to be dropped, as shown here; however, it would be pretty unlikely in a more realistic scenario since:

    1. The available bandwidth in a realistic case would typically be much higher
    2. The user-defined conversation weight would also be much higher if the number of dynamic queues had not been increased with ‘fair-queue 4096’ under the class-default
    3. Keepalives and routing protocol traffic typically don’t need very much bandwidth and their weights (1024 if put into a link queue or 4626 if put into a dynamic conversation queue) are fairly low

    However, a much more likely real-life case is a somewhat bandwidth-intensive type of traffic without a bandwidth guarantee and with an IP precedence of 0, since it will receive a weight of 32,384. During congestion it will receive much less than the supposedly reserved 25% if it has to compete with the much lower weights of user-defined classes. See these links for more information and examples:

    http://blog.internetworkexpert.com/2008/08/17/insights-on-cbwfq/

    https://cisconinja.wordpress.com/2009/01/22/class-based-weighted-fair-queueing-and-low-latency-queueing-tests/
