How to configure Quality of Service (QoS) on Cisco Catalyst SD-WAN using policy groups.

Overview

In this article, I’ll walk you through configuring QoS on Cisco Catalyst SD-WAN. I will be using configuration groups and policy groups for this lab, as there is already plenty of documentation covering the feature template method. If you’re using device templates and centralised/localised policies, unfortunately this method won’t work for you.

Prerequisites

Your router is online and onboarded to the control components.
You’ve configured at least one service VPN.
You’ve identified applications of interest that you’d like to apply QoS to.
You know the capacity of your WAN links.
Basic knowledge of QoS theory.

Note: Cisco SD-WAN has gone through a rebranding and is now called Cisco Catalyst SD-WAN. The control components have also been renamed to: the Validator (vBond), the Manager (vManage) and the Controller (vSmart).

Topology

The lab topology consists of a single router per site, each with one TLOC connected to biz-internet. I’ve kept it simple so I can demonstrate the results, but everything discussed scales to fully resilient dual-homed deployments.

To illustrate the impact of QoS I will use 4 hosts.

H1 and H4 will be used for iPerf3 to flood the links.
H2 will ping 8.8.8.8 and be prioritised by QoS.
H3 will also ping 8.8.8.8 but will not be prioritised.

Device versions

SD-WAN Controllers – 20.15.1
SD-WAN C8000V routers – 17.5.1a

Configuring Direct Internet Access (DIA)

This is a fresh lab without any policies, meaning that by default my hosts (inside VPN20) don’t have internet access. Let’s create a policy to configure Direct Internet Access (DIA).

1. Navigate to Configuration > Policy Groups and click ‘Add Policy Group’. Name it and pick ‘sdwan’ as the solution.

2. Click ‘+Add’ next to ‘Associated’. Select the devices you wish to attach to this policy in the workflow. If asked to provision devices say ‘I will do it later’.

3. In the ‘Application Priority & SLA’ tab, click ‘Add Application Priority & SLA Policy’. Name it and click ‘Create’. This policy is the equivalent of ‘Topology’, ‘AAR’ and ‘Traffic Data’ policies from the traditional centralised policy.

4. Click ‘…’ and ‘Edit’. Click ‘Advanced Layout’ in the top right. Click ‘+ Add Traffic Policy’, name it, select the VPNs you want to apply this policy to (if you see none you need to deploy a service VPN using a configuration group), pick the ‘service’ direction (traffic coming from LAN) and press ‘Add’.

5. Click ‘+ Add Rules’, name it, leave the match conditions empty (match everything). Change ‘Base Action’ to ‘Accept’, click ‘+ Add Action’ and select ‘NAT VPN’. Once done press ‘Save Match and Actions’. This rule will send all traffic to VPN0 for NAT.

6. Create another rule that matches destinations in the RFC1918 ranges and set ‘Base Action’ to ‘Accept’. Place it above the NAT rule, since rules are evaluated top down and the first match wins. This rule will stop your private traffic going to VPN0. Once done hit ‘Save’.

7. In the ‘Policy Group’ tab, expand the policy group, select the policy we just created for ‘Application Priority’, click ‘Save’ and ‘Deploy’. Follow the deployment workflow.

Once complete you should be able to see the policy on your router using the following command.

show sdwan policy from-vsmart
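
To make the intent of the two DIA rules concrete, here is a minimal Python sketch of the decision they describe: destinations inside the RFC1918 ranges are accepted as they are, and everything else is sent to VPN0 for NAT. The function name and the returned labels are my own illustration, not anything the Manager generates.

```python
import ipaddress

# The three private ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def dia_decision(destination: str) -> str:
    """Return a hypothetical label for what the two DIA rules do with a packet."""
    address = ipaddress.ip_address(destination)
    if any(address in network for network in RFC1918_NETWORKS):
        return "accept"    # private destination: leave it to normal routing
    return "nat-vpn0"      # anything else: break out locally via NAT


if __name__ == "__main__":
    for dst in ("8.8.8.8", "10.20.0.4", "172.32.0.1"):
        print(dst, "->", dia_decision(dst))
```

Note that 172.32.0.1 falls outside 172.16.0.0/12, so it is treated as an internet destination.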

Shape the WAN interface

This step isn’t required, but shaping traffic towards your ISP is recommended, especially when using QoS. It smooths out how data is transmitted onto the wire, providing more predictable results.

1. Navigate to Configuration > Configuration Groups > Transport & Management Profile and edit your transport profile.

2. Edit your WAN interface sub feature and configure the ‘Shaping Rate (Kbps)’ in the ‘ACL/QoS’ tab. This will be your WAN link speed in Kbps. I used a variable as these are often different between sites.

3. Deploy your configuration group.
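
Since the field expects Kbps and link speeds are usually quoted in Mbps, a small helper can produce the per-site variable values. This is only a sketch: the site names and speeds below are made up (apart from the 1Mbps I use in this lab), and I’m assuming decimal units, so 1Mbps is 1000Kbps.

```python
def mbps_to_kbps(mbps: float) -> int:
    """Convert a link speed in Mbps to Kbps, assuming 1 Mbps = 1000 Kbps."""
    return round(mbps * 1000)


# Hypothetical per-site link speeds in Mbps; only the 1 Mbps value matches this lab.
site_link_mbps = {"site-a": 1, "site-b": 50, "site-c": 200}

# Values you could feed into the shaping rate variable for each site.
shaping_rate_kbps = {site: mbps_to_kbps(speed) for site, speed in site_link_mbps.items()}

if __name__ == "__main__":
    for site, rate in shaping_rate_kbps.items():
        print(f"{site}: {rate} Kbps")
```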

Starting conditions

Now that we have site-to-site and site-to-internet routing sorted, let’s take some measurements which we can compare after QoS has been enabled.

I set my links to 1Mbps which is confirmed by iPerf3 speeds from H1 to H4:

Normal ping latency to 8.8.8.8 from H2 (left) and H3 (right) is around 6ms.

During an active iPerf3 transfer simulating link contention, ping latency jumps to between 300ms and 1000ms, with 10% packet loss.
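
Some back-of-the-envelope arithmetic shows why latency climbs so far on a saturated 1Mbps link. The sketch below assumes 1500-byte packets and a simple first-in, first-out queue in front of the link, neither of which I’ve verified for this lab, and estimates how much queued data the observed delays correspond to.

```python
LINK_BPS = 1_000_000   # 1 Mbps link, as configured in this lab
PACKET_BYTES = 1500    # assumed full-size packet


def serialization_delay_ms(packet_bytes: int, link_bps: int) -> float:
    """Time to clock one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 * 1000 / link_bps


def queued_packets_for_delay(delay_ms: float, packet_bytes: int, link_bps: int) -> float:
    """How many packets must be queued ahead of ours to add the given delay."""
    return delay_ms / serialization_delay_ms(packet_bytes, link_bps)


if __name__ == "__main__":
    per_packet = serialization_delay_ms(PACKET_BYTES, LINK_BPS)
    print(f"one packet: {per_packet:.0f} ms")
    for delay in (300, 1000):
        count = queued_packets_for_delay(delay, PACKET_BYTES, LINK_BPS)
        print(f"{delay} ms of delay is roughly {count:.0f} packets queued")
```

Under those assumptions each full-size packet takes 12ms to send, so it only takes about 25 packets waiting ahead of a ping to add 300ms.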

Configuring QoS

This QoS policy will aim to prioritise H2 by putting it into the Low Latency Queue (LLQ) which is queue 0.

1. Navigate to Configuration > Policy Groups > Application Priority & SLA and click ‘Groups of Interest’ in the top right. Under ‘Data Prefix’ I created a list matching the IP of H2.

2. ‘Forwarding Class’ is where we create the objects that represent our queues. For reasons explained in step 4, I created three.

3. Click ‘x’ in the top right. In the ‘Application Priority & SLA’ tab, click ‘…’ next to the policy and then ‘Edit’. On the right, click the ‘QoS Queue’ tab and then ‘+ Add QoS Policy’.

4. I like seeing the default queue (queue 2), so I’ll select 3 queues (this is why I created 3 forwarding classes). Name the policy, define which interface it should be configured on (use a variable if your interface differs per router), select a forwarding class for each queue and define how much bandwidth you’d like to allocate to it. Click ‘Save’ when done.

5. Click ‘+ Add Rules’ and add the following rule. It matches all traffic coming from H2, places it in the priority queue and breaks it out to the internet via DIA. Make sure to position it before the RFC1918 and DIA rules. I also like to add a counter to the rule to help me identify whether it’s being hit.

6. Click ‘Save’ on the bottom right, go back to the ‘Policy Group’ tab, expand the policy group and click ‘Deploy’.

7. Go through the workflow and wait for the deployment to finish. Preview the config in CLI to get an idea of what’s actually being deployed.
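
As a sanity check on the bandwidth split from step 4, the sketch below turns per-queue percentages into Kbps at a given shaping rate and rejects splits that don’t total 100%. The percentages are placeholders rather than the values from my lab, and the requirement that they add up to exactly 100% is an assumption of this sketch.

```python
def queue_bandwidth_kbps(shaping_rate_kbps: int, percent_by_queue: dict[int, int]) -> dict[int, float]:
    """Translate per-queue bandwidth percentages into Kbps at the given shaping rate."""
    total = sum(percent_by_queue.values())
    if total != 100:
        raise ValueError(f"queue percentages add up to {total}, expected 100")
    return {
        queue: shaping_rate_kbps * percent / 100
        for queue, percent in percent_by_queue.items()
    }


# Placeholder split across the three queues (queue 0 is the LLQ, queue 2 the default).
example_split = {0: 20, 1: 30, 2: 50}

if __name__ == "__main__":
    # 1000 Kbps matches the 1Mbps link used in this lab.
    for queue, kbps in queue_bandwidth_kbps(1000, example_split).items():
        print(f"queue {queue}: {kbps:.0f} Kbps")
```

Keep in mind that the figure for queue 0 is a cap rather than a guaranteed minimum, so it’s worth sizing it against the traffic you actually expect to prioritise.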

You should now see the updated policy on the device.

show sdwan policy from-vsmart

This command will show the counter associated with the rule. H2 is already matching it as it performs some background tasks.

show sdwan policy data-policy-filter
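
Why rule order matters, and what the counter is telling you, can be shown with a small first-match evaluator. This is a simplified model rather than how the router implements it: the rule names, the counter handling and H2’s address (10.20.0.2) are all hypothetical.

```python
import ipaddress
from collections import Counter

H2_ADDRESS = ipaddress.ip_address("10.20.0.2")  # hypothetical address for H2
RFC1918 = [ipaddress.ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def from_h2(src, dst):
    return src == H2_ADDRESS


def to_private(src, dst):
    return any(dst in net for net in RFC1918)


def anything(src, dst):
    return True


# H2 rule first (as recommended) versus H2 rule last.
GOOD_ORDER = [("h2-priority", from_h2), ("rfc1918", to_private), ("dia", anything)]
BAD_ORDER = [("rfc1918", to_private), ("dia", anything), ("h2-priority", from_h2)]


def evaluate(rules, src, dst, counters):
    """Return the name of the first matching rule and bump its counter."""
    src_ip, dst_ip = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for name, matches in rules:
        if matches(src_ip, dst_ip):
            counters[name] += 1
            return name
    return None


if __name__ == "__main__":
    for label, rules in (("good order", GOOD_ORDER), ("bad order", BAD_ORDER)):
        counters = Counter()
        hit = evaluate(rules, "10.20.0.2", "8.8.8.8", counters)
        print(f"{label}: matched {hit}, counters {dict(counters)}")
```

With the H2 rule last, the catch-all DIA rule matches first and the H2 counter never moves, which is exactly the symptom the counter helps you spot.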

Using this command I can check the policy statistics for my WAN interface. We can see that queue 0 has already passed some packets, while queue 1 has passed none, which is correct as we’ve not allocated any traffic to it.

show policy-map interface gi1

At the end of that command’s output is ‘class-default’, which catches all remaining traffic, even traffic that hasn’t been marked by the BestEffort forwarding class.

The main thing to watch out for now is drops. Run this command to see if any drop counters are increasing. Take special care with queue 0, because it caps bandwidth instead of guaranteeing a minimum like the other queues.

show policy-map interface gi1 | i drop
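
Because what matters is whether the drop counters are increasing, it helps to compare two readings taken some time apart. The sketch below does that for counters you’ve already noted down per queue. The numbers are invented for illustration, and it deliberately doesn’t try to parse the command output.

```python
def increasing_drops(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return the queues whose drop counter grew between two readings, with the increase."""
    return {
        queue: after[queue] - before.get(queue, 0)
        for queue in after
        if after[queue] > before.get(queue, 0)
    }


# Invented readings, taken a few minutes apart.
first_reading = {"queue0": 0, "queue1": 0, "queue2": 120}
second_reading = {"queue0": 4, "queue1": 0, "queue2": 950}

if __name__ == "__main__":
    for queue, delta in increasing_drops(first_reading, second_reading).items():
        print(f"{queue}: +{delta} drops")
```

In this made-up example queue 0 picked up a handful of drops, which would be the one to investigate first given that it carries the prioritised traffic.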

Outcome

Below is a screencap of ping results from H2 and H3 whilst iPerf3 is flooding the link. H2 isn’t immune but it’s far healthier.

The latency still rises, but much more slowly. Because iPerf3 doesn’t act like real-life network contention (it’s constant instead of spiky), it never eases off the queues, meaning H2 would fare even better in real-life conditions. On top of that, this is only a 1Mbps link and I’m not entirely sure how Cisco Modelling Labs throttles the link. There is an argument to be made that downloads and backups would cause similar behaviour to iPerf3, but at that point the discussion should be about scheduling them overnight or policing them to a set throughput instead of relying on QoS.

I hope this article has been helpful. As always, please reach out to me if you spot any mistakes.