Cisconinja’s Blog

Weighted Fair Queueing

Posted by Andy on January 21, 2009

In my last post, I talked about using dynamips for testing queueing tools.  This one will take a look at weighted fair queueing (WFQ).  The network topology and initial configurations are shown below:

wfq-topology

R1:
interface FastEthernet0/0
 ip address 10.1.1.1 255.255.255.0
 load-interval 30
 speed 100
 full-duplex
 no keepalive
 no mop enabled
!
interface Serial0/0
 ip address 10.1.12.1 255.255.255.0
 load-interval 30
 no keepalive
!
no cdp run

R2:
interface Serial0/0
 ip address 10.1.12.2 255.255.255.0
 load-interval 30
 no keepalive
!
no cdp run
 

R1 and R2 are dynamips routers, and PC is a loopback interface connected to R1 in dynamips.  PC will generate traffic destined for R2’s S0/0 interface, which allows queueing to be tested outbound on R1’s S0/0 interface.  R2 will drop the traffic because it does not have a route back to PC (this is intentional, so that return traffic does not unnecessarily consume CPU).

In order to perform the tests, I wanted something simple that could generate a reliable and predictable amount of traffic through a dynamips network, and I decided to use this UDP flood script created by Ivan Pepelnjak.  While performing my initial tests to see if the script would work, I found that it had a very strange behavior: packets could only be sent at intervals roughly in increments of 1/64 (0.015625) of a second.  The UDP flood script allows the interpacket delay to be specified; however, packet captures taken at each interface in the path showed that this delay was being rounded up to the next 64th of a second.  The following examples show this.  First, I’ll send traffic from the PC to R2 with a destination port of 1000, a packet size of 500, and an interpacket delay of 50 ms:

flood.pl --port=1000 --size=500 --delay=50 10.1.12.2

Wireshark capture taken from PC (loopback):

wfq-pc-50ms

 

Wireshark capture taken from R2 S0/0:

wfq-r2-50ms

The time display here has been changed from the default of seconds since the beginning of the capture to seconds between packets.  50 ms rounded up to the next 64th of a second results in 4/64, or 0.0625.  As you can see, the interpacket arrival times are extremely close to this.  Now let’s see what happens if we bump the interpacket delay up to 60 ms:

flood.pl --port=1000 --size=500 --delay=60 10.1.12.2

Wireshark capture taken from PC (loopback):

wfq-pc-60ms

 

Wireshark capture taken from R2 S0/0:

wfq-r2-60ms

Even though we increased the interpacket delay in the script by 10 ms, the actual interpacket delays have not changed because 60 ms still falls below 62.5 ms.  Next let’s change the interpacket delay in the script to 63 ms – just over the actual interpacket delay value that we saw in the last 2 tests:

flood.pl --port=1000 --size=500 --delay=63 10.1.12.2

Wireshark capture taken from PC (loopback):

wfq-pc-63ms

 

Wireshark capture taken from R2 S0/0:

wfq-r2-63ms

63 ms rounded up to the next 64th of a second results in 5/64, or 0.078125.  Once again, the actual interpacket delays shown by Wireshark are extremely close to this.  I’m not sure what exactly causes this strange behavior, whether it is a limitation of the loopback interface driver or some other issue; however, the results are very consistent and predictable now that we know how to figure out the actual interpacket delay being used.
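A minimal Python sketch of that rounding behavior, assuming the requested delay is simply rounded up to the next 1/64-second tick (the function and constant names are mine, not part of flood.pl):

import math

TICK = 1.0 / 64  # observed granularity: 1/64 s = 15.625 ms

def effective_delay_ms(requested_ms):
    # Round the requested interpacket delay up to the next 1/64-second tick
    ticks = math.ceil((requested_ms / 1000.0) / TICK)
    return ticks * TICK * 1000.0

for requested in (50, 60, 63, 125):
    print(requested, '->', effective_delay_ms(requested), 'ms')
# 50 -> 62.5 ms, 60 -> 62.5 ms, 63 -> 78.125 ms, 125 -> 125.0 ms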

For my tests, I chose to set the interpacket delay in the script to 125 ms, since it is exactly equal to 8/64 of a second.  That way the delay set in the script and the actual interpacket delay are the same, which makes things less confusing.  I also decided to use 1500-byte packets, which results in 1514-byte frames on the Ethernet link and 1504-byte frames on the serial (HDLC) link.  The total bandwidth used by such a flow on each link should be

Ethernet: 1514 bytes/packet * 8 bits/byte * 8 packets/second = 96,896 bps

HDLC: 1504 bytes/packet * 8 bits/byte * 8 packets/second = 96,256 bps

or roughly 96 kbps each.  Before getting into any WFQ tests, let’s try sending one flow with these parameters and verify the results:

flood.pl --port=1000 --size=1500 --delay=125 10.1.12.2

Wireshark capture taken from PC (loopback):

wfq-pc-125ms

 

Wireshark capture taken from R2 S0/0:

wfq-r2-125ms

 

Output of show interfaces on R1 F0/0, R1 S0/0, and R2 S0/0:

wfq-shint-r1f0

wfq-shint-r1s0

wfq-shint-r2s0

The results are amazingly accurate.  Wireshark shows packets being sent almost exactly every 125 ms, and show interfaces shows that the input rate on R1 F0/0, the output rate on R1 S0/0, and the input rate on R2 S0/0 are exactly 96k.  I checked the output of show interfaces several times over the span of several minutes, and the rate on each interface was never more than 1000 bits above or below 96k.  Now that we have a good way of generating predictable traffic, let’s move on to the WFQ tests.
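As a quick sanity check of the expected rates, here is the same arithmetic in Python, assuming 14 bytes of Ethernet overhead and 4 bytes of HDLC overhead per frame (the function name is mine):

def rate_bps(payload_bytes, l2_overhead_bytes, packets_per_second):
    # Expected line rate for one flow: frame size in bits times packet rate
    return (payload_bytes + l2_overhead_bytes) * 8 * packets_per_second

pps = 1 / 0.125                 # 125 ms interpacket delay -> 8 packets/second
print(rate_bps(1500, 14, pps))  # Ethernet: 96896.0 bps
print(rate_bps(1500, 4, pps))   # HDLC:     96256.0 bps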

 

WFQ Test #1

For the first test, we will generate 3 separate UDP flows to ports 67 (DHCP), 69 (TFTP), and 514 (Syslog).  Each flow will use the same parameters as the last test performed above to generate 96k worth of traffic each.  We will simulate a clock rate of 96k on the interface by shaping to 96k.  This will force the 3 flows to compete for 96k of total bandwidth and allow us to see how the WFQ scheduler allocates bandwidth to each flow.  The IP Precedence of each flow will be left at the default of 0.  WFQ assigns a weight to each flow using the formula 32384 / (IPP + 1).  Since the IP Precedence of each flow is 0, they should each be given the same weight, and since the length of packets in each flow is the same, an equal number of packets from each flow should be sent.  Since generic traffic shaping (GTS) supports WFQ on shaping queues, we will use that.  On R2, we will create a policy-map to match each type of traffic so that we can measure the amount of each that makes it to R2:

R1:
interface Serial0/0
 traffic-shape rate 96000

R2:
ip access-list extended DHCP
 permit udp any any eq bootps
ip access-list extended SYSLOG
 permit udp any any eq syslog
ip access-list extended TFTP
 permit udp any any eq tftp
!
class-map match-all TFTP
 match access-group name TFTP
class-map match-all SYSLOG
 match access-group name SYSLOG
class-map match-all DHCP
 match access-group name DHCP
!
policy-map Traffic-Meter
 class TFTP
 class DHCP
 class SYSLOG
!
interface Serial0/0
 service-policy input Traffic-Meter


Now we’re ready to start each of the traffic streams:

 

flood.pl --port=67 --size=1500 --delay=125 10.1.12.2

flood.pl --port=69 --size=1500 --delay=125 10.1.12.2

flood.pl --port=514 --size=1500 --delay=125 10.1.12.2

 

Output of show interfaces on R1 F0/0, R1 S0/0, and R2 S0/0:

wfq-test1-shint-r1f0

wfq-test1-shint-r1s0

wfq-test1-shint-r2s0

As expected, R1 F0/0 shows 288k worth of input traffic (96k * 3).  R1 S0/0 shows 96k of output traffic and R2 S0/0 shows 96k of input traffic, which matches the shaped rate.

Next, look at the currently active flows in the shaping queues on R1:

wfq-test1-r1queue2

This verifies that WFQ is assigning each flow a weight of 32384 using the formula 32384 / (IPP + 1), and therefore each of them receives an equal scheduling weight.

Now let’s look at the traffic coming in on R2:

wfq-test1-r2meter

WFQ has distributed an exactly equal share to each of the 3 flows (1499 packets each and roughly 32 kbps each over the last 30 seconds).  This is pretty common knowledge, but what’s most impressive is that we were able to test WFQ using dynamips and it was accurate right down to the very packet.
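The per-flow numbers can also be worked out directly; this is just arithmetic, assuming the 96k shaped rate is split evenly across the three equally weighted flows:

shaped_rate = 96000                       # bps (simulated clock rate via GTS)
flows = 3                                 # equal weights, equal packet sizes
per_flow_bps = shaped_rate / flows        # 32000.0 bps, matching the ~32 kbps per class
per_flow_pps = per_flow_bps / (1504 * 8)  # ~2.66 packets/second per flow on the HDLC link
print(per_flow_bps, round(per_flow_pps, 2))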

 

WFQ Test #2

Let’s make things a little more interesting and change the IP Precedence and offered rates of our 3 flows.  For Syslog we will change the packet size to 1000 and keep the 125 ms delay.  For TFTP we will keep the packet size of 1500 and change the delay to 60 ms (the actual delay will be 62.5 ms, or 16 packets/second).  For DHCP we will keep both parameters the same.  The offered rate for each flow on the Ethernet link will now be (a quick check of the arithmetic follows these numbers):

Syslog: 1014 bytes/packet * 8 bits/byte * 8 packets/second = 64,896 bps

DHCP: 1514 bytes/packet * 8 bits/byte * 8 packets/second = 96,896 bps

TFTP: 1514 bytes/packet * 8 bits/byte * 16 packets/second = 193,792 bps
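Reusing the same helper as before (14 bytes of Ethernet overhead per frame assumed), these numbers check out:

def rate_bps(payload_bytes, l2_overhead_bytes, packets_per_second):
    return (payload_bytes + l2_overhead_bytes) * 8 * packets_per_second

print(rate_bps(1000, 14, 8))   # Syslog:  64896 bps
print(rate_bps(1500, 14, 8))   # DHCP:    96896 bps
print(rate_bps(1500, 14, 16))  # TFTP:   193792 bps (62.5 ms delay -> 16 pps)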

We’ll pretend that our Syslog flow is a high-priority type of traffic such as video, DHCP is best-effort traffic such as web, and TFTP is less than best effort, such as peer-to-peer file sharing.  Syslog will be marked with IPP 4, DHCP with IPP 1, and TFTP with IPP 0.  The marking will be done inbound on R1 F0/0:

R1:
ip access-list extended DHCP
 permit udp any any eq bootps
ip access-list extended SYSLOG
 permit udp any any eq syslog
ip access-list extended TFTP
 permit udp any any eq tftp
!
class-map match-all TFTP
 match access-group name TFTP
class-map match-all SYSLOG
 match access-group name SYSLOG
class-map match-all DHCP
 match access-group name DHCP
!
policy-map Traffic-Marker
 class TFTP
  set ip precedence 0
 class DHCP
  set ip precedence 1
 class SYSLOG
  set ip precedence 4
!
interface FastEthernet0/0
 service-policy input Traffic-Marker

 

Now we can begin sending each of the traffic flows:

 

flood.pl --port=69 --size=1500 --delay=60 10.1.12.2

flood.pl --port=67 --size=1500 --delay=125 10.1.12.2

flood.pl --port=514 --size=1000 --delay=125 10.1.12.2

 

The marking policy-map on R1 verifies the amount of each type of traffic being generated and that each is being marked correctly:

wfq-test2-r1marking1

 

Next let’s look at the shaping queues on R1:

wfq-test2-r1queue

We can see that each of the weights matches what we expected:

Syslog = 32384 / (4 + 1) = 6476

TFTP = 32384 / (0 + 1) = 32384

DHCP = 32384 / (1 + 1) = 16192

Since the weight formula always uses the same numerator, the proportion of bandwidth given to a certain flow is approximately equal to the denominator (IPP + 1) of that flow divided by the sum of the denominators of all flows.  With a shaped rate of 96k (simulated clock rate of 96k), the bandwidth given to each flow, assuming each flow uses its full share, should be (the arithmetic is sketched after the list):

Syslog = 5/8 * 96,000 = 60,000

TFTP = 1/8 * 96,000 = 12,000

DHCP = 2/8 * 96,000 = 24,000
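A small Python sketch of that calculation, assuming every flow always has packets waiting (the dictionary and variable names are mine):

ipp = {'SYSLOG': 4, 'DHCP': 1, 'TFTP': 0}    # IP Precedence assigned to each flow

shaped_rate = 96000                          # bps, the GTS shaped rate
weights = {f: 32384 // (p + 1) for f, p in ipp.items()}   # 6476, 16192, 32384, matching the weights listed above

# Bandwidth share is proportional to (IPP + 1), i.e. inversely proportional to the weight
total = sum(p + 1 for p in ipp.values())     # 5 + 2 + 1 = 8
for flow, p in ipp.items():
    print(flow, weights[flow], (p + 1) / total * shaped_rate)
# SYSLOG 6476 60000.0, DHCP 16192 24000.0, TFTP 32384 12000.0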

The policy map that we created to meter traffic on R2 confirms this:

wfq-test2-r2meter

We can even verify it down to a per-packet granularity like in the first test.  DHCP sends the same size packets as TFTP and is given half the weight (its packets are scheduled twice as fast), and we can see that DHCP has sent exactly twice as many packets as TFTP (1327 * 2 = 2654).  Syslog’s packets are 1004 bytes on the serial link compared to TFTP’s 1504-byte packets.  Syslog is also given 1/5 of TFTP’s weight when assigning sequence numbers.  Therefore Syslog should be able to send (1504 / 1004) * 5, or approximately 7.49 times as many packets as TFTP sends.  If we calculate the number that Syslog should have been able to send using the number that TFTP sent, it works out to:

(1504 / 1004) * 5 * 1327 = 9939.28

This almost exactly matches the value of 9937 that is listed.  (This very small discrepancy is probably due to the fact that the counters in the policy map simply show the results at a single point in time; at any given point Syslog could have transmitted up to about 7.49 packets more or fewer than it should have relative to TFTP, since the ratio is 7.49 packets to 1 and only whole packets are transmitted.)
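Written out as a quick check, using the 1327 TFTP packets from the policy-map counters as the baseline:

tftp_packets = 1327                                   # from the policy-map counters on R2

# DHCP: same packet size, half the weight -> twice as many packets
expected_dhcp = 2 * tftp_packets                      # 2654

# Syslog: 1004-byte frames vs 1504-byte frames, and 1/5 of TFTP's weight
expected_syslog = (1504 / 1004) * 5 * tftp_packets    # ~9939.28, vs 9937 observed
print(expected_dhcp, round(expected_syslog, 2))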
