SD-WAN technology is the new big thing in networking. Promising more powerful performance and scalable deployment, SD-WAN is quickly replacing older network technology, like MPLS.
Businesses often turn to SD-WAN to enhance their network (Internet, cloud, UC) performance.
But, although SD-WAN is more powerful and flexible than its predecessor, it can still face challenges that affect user experience.
It’s crucial to be aware of these potential issues so you can solve them before they lead to a network failure.
What Problems Can Happen?
Before delving into the 3 common SD-WAN problems, it’s crucial to recognize that there are different types of network problems that can impact your SD-WAN network, such as:
- DNS issues
- Defective cables or connectors
- Network congestion
- High resource usage
- Physical/hardware issues
- Equipment software errors
- Incorrect device configurations
To successfully identify and solve any issues affecting your SD-WAN network, you need comprehensive visibility of your network. An SD-WAN monitoring tool can provide this visibility.
SD-WAN monitoring is a component of a Network Performance Monitoring tool. Such a tool continuously monitors end-to-end SD-WAN network performance from all network locations so that problems can be detected and resolved.
Where Problems Can Happen
Typically, SD-WAN problems arise from network bandwidth congestion or high resource usage on network devices, often happening on the Local Loop or the customer Edge Router.
Moreover, many of the ISP backbone problems that affect SD-WAN stem from congestion on the ISP's peering and transit connections with other networks or service providers. Although ISP backbones are more dependable and robust than other network infrastructure, performance problems can still occur there.
The diagram below illustrates an SD-WAN network site communicating with a data center, head office, or infrastructure-as-a-service (IaaS). In an SD-WAN network, problems can come from:
- The Underlay
- The Overlay
- The LAN
The most fragile component in a network is the last mile. This is the final segment of the network which typically has lower speeds, minimal route diversity, and many single points of failure.
Therefore, most SD-WAN problems occur on the last mile of the network. To mitigate this, most SD-WAN networks use multiple links.
So if a problem arises, it should not affect all links at the same time, and the SD-WAN Edge Router should be able to balance network sessions across the best available link.
However, relying only on link diversity is not enough to prevent all potential issues in SD-WAN networks.
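The link selection described above can be sketched as a small scoring function. This is only an illustration: the `LinkStats` fields, the weights in `link_score`, and the 2% loss threshold are assumptions made for the example, not the algorithm of any particular SD-WAN vendor.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinkStats:
    """Hypothetical per-link measurements collected by a monitoring probe."""
    name: str
    latency_ms: float
    jitter_ms: float
    loss_pct: float
    up: bool = True


def link_score(link: LinkStats) -> float:
    """Lower is better. The weights are illustrative assumptions."""
    return link.latency_ms + 2.0 * link.jitter_ms + 50.0 * link.loss_pct


def pick_best_link(links: List[LinkStats], max_loss_pct: float = 2.0) -> Optional[LinkStats]:
    """Return the healthiest usable link, or None if every link is down."""
    usable = [link for link in links if link.up]
    if not usable:
        return None
    # Prefer links under the loss threshold; otherwise fall back to any link that is up.
    healthy = [link for link in usable if link.loss_pct <= max_loss_pct]
    return min(healthy or usable, key=link_score)


# Invented sample values: ISP 2 shows high jitter and packet loss.
links = [
    LinkStats("ISP 1", latency_ms=20.0, jitter_ms=1.5, loss_pct=0.0),
    LinkStats("ISP 2", latency_ms=18.0, jitter_ms=12.0, loss_pct=6.0),
]
best = pick_best_link(links)
print(best.name if best else "no link available")
```

When every link is degraded at once, the function still returns the least bad one, which is why link diversity alone cannot hide a problem on a segment that all links share.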
1. Problem on the Local Loop
The first problem is an ISP Local Loop issue, located on the underlay.
Here’s an example:
- On the first graph we see the Internet SD-WAN user experience.
- On the two bottom graphs we see the experience of the Internet connections (ISP 1 & ISP 2).
We can see:
- ISP #1 is not facing any performance issues. There is low jitter and no packet loss.
- However, ISP #2 is encountering a clear performance issue.
In this situation, the SD-WAN problem is located on the Local Loop, between the ISP Edge and the SD-WAN Edge Equipment. So, the problem is related to the ISP, and it is their responsibility to fix it.
Some resolutions can be:
- Use a Visual Traceroute tool to gather more information about the issue.
- Submit a support ticket to your ISP with screenshots of the dashboards and traceroutes as evidence.
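When reading traceroute results for the ticket, the useful detail is the first hop from which packet loss persists all the way to the destination. The sketch below shows one way to find it. The `Hop` tuple layout, the sample addresses, and the 2% threshold are assumptions for illustration and do not reflect the output format of any specific traceroute tool.

```python
from typing import List, Optional, Tuple

# Each hop is (hop_number, address, loss_pct). The layout is hypothetical.
Hop = Tuple[int, str, float]


def first_persistent_loss_hop(hops: List[Hop], threshold_pct: float = 2.0) -> Optional[Hop]:
    """Return the first hop from which loss stays above the threshold up to the last hop."""
    candidate: Optional[Hop] = None
    for hop in hops:
        if hop[2] > threshold_pct:
            if candidate is None:
                candidate = hop
        else:
            # Loss that does not carry through to later hops is often a router
            # rate-limiting its replies, not a real forwarding problem.
            candidate = None
    return candidate


# Invented data: hop 1 is the SD-WAN edge, hop 2 is the ISP edge.
trace = [
    (1, "192.0.2.1", 0.0),
    (2, "198.51.100.1", 8.0),
    (3, "198.51.100.9", 7.5),
    (4, "203.0.113.20", 8.2),
]
print(first_persistent_loss_hop(trace))
```

In this invented trace, loss begins at the ISP edge and continues to the destination, which is the pattern to expect from a Local Loop problem.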
2. High CPU
The second SD-WAN issue is high resource usage on an SD-WAN device. This typically occurs when a network device does not have enough resources to manage the traffic volume.
Here’s an example:
- On the first graph we see the user experience of the Internet SD-WAN connection.
- On the two bottom graphs we see the experience of the Internet connections (ISP 1 & ISP 2)
We can see:
- High packet loss is causing poor performance for all traffic through the SD-WAN network.
- Both ISP #1 and ISP #2 are being affected.
For both ISPs to be impacted, the network problem must be taking place on a segment shared by both ISPs.
When ISP #1 and ISP #2 are encountering performance issues, the CPU usage is at 100%.
So this is not a local loop issue and there is no need to contact the ISP. This is a local problem where a substantial amount of traffic is being directed to that port, possibly from a different source.
The issue could be within the LAN or on the SD-WAN Edge Router itself. Edge Routers are a common source of problems because they have many features that can be resource-intensive and use a lot of CPU.
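The reasoning in this example, where loss on both links coincides with saturated CPU, can be expressed as a simple classification over monitoring samples. This is a rough sketch under stated assumptions: the dictionary keys, the thresholds, and the sample values are hypothetical and are not taken from any monitoring product.

```python
from typing import Dict, List


def classify_interval(
    sample: Dict[str, float],
    loss_threshold_pct: float = 2.0,
    cpu_threshold_pct: float = 95.0,
) -> str:
    """Suggest where to look first, based on one polling interval."""
    isp1_bad = sample["isp1_loss_pct"] > loss_threshold_pct
    isp2_bad = sample["isp2_loss_pct"] > loss_threshold_pct
    if isp1_bad and isp2_bad:
        # Both links degrade together, so the cause sits on a shared segment.
        if sample["cpu_pct"] >= cpu_threshold_pct:
            return "shared segment: edge device CPU saturated"
        return "shared segment: check LAN and edge device"
    if isp1_bad or isp2_bad:
        return "single link: check that link's local loop and bandwidth usage"
    return "healthy"


# Invented samples, one per polling interval.
samples: List[Dict[str, float]] = [
    {"cpu_pct": 35.0, "isp1_loss_pct": 0.0, "isp2_loss_pct": 0.1},
    {"cpu_pct": 100.0, "isp1_loss_pct": 12.0, "isp2_loss_pct": 15.0},
    {"cpu_pct": 40.0, "isp1_loss_pct": 0.0, "isp2_loss_pct": 9.0},
]
for sample in samples:
    print(classify_interval(sample))
```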
Some resolutions can be:
- Upgrading to a larger network device or updating the device firmware.
- Examining firewall logs to determine if the traffic is legitimate or not.
- Prioritizing certain traffic in your firewall.
3. High Bandwidth
The final SD-WAN issue is on the underlay of ISP #2 and is caused by high bandwidth usage.
Here’s an example:
- On the first graph we see the Internet SD-WAN user experience.
- On the two bottom graphs we see the experience of the Internet connections (ISP 1 & ISP 2).
We can see:
- ISP #1 is not facing any performance issues and has minimal jitter. The latency is stable and the packet loss is consistently below 2%.
- However, ISP #2 is encountering a clear performance problem caused by high packet loss.
When ISP #2 is facing high packet loss, we can see that the bandwidth usage exceeds the available 500 Mbps service. This allows us to conclude that the high bandwidth usage is causing the packet loss.
This is not an issue with the local loop and there is no need to contact the ISP. This is a local problem where a large amount of traffic is being sent to that port, possibly from a different source.
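The same conclusion can be checked numerically by lining up throughput and packet loss per interval. The sketch below assumes the service in the example is 500 Mbps; the sample tuples, the function names, and the 2% loss threshold are invented for illustration.

```python
from typing import List, Tuple

CAPACITY_MBPS = 500.0  # assumed size of the ISP #2 service

# Each sample is (time_label, throughput_mbps, loss_pct). Values are made up.
Sample = Tuple[str, float, float]


def utilization_pct(throughput_mbps: float, capacity_mbps: float = CAPACITY_MBPS) -> float:
    """Offered traffic as a percentage of the link capacity."""
    return 100.0 * throughput_mbps / capacity_mbps


def saturated_loss_intervals(
    samples: List[Sample],
    capacity_mbps: float = CAPACITY_MBPS,
    loss_threshold_pct: float = 2.0,
) -> List[str]:
    """Return the time labels where high loss coincides with traffic at or above capacity."""
    return [
        label
        for label, throughput, loss in samples
        if throughput >= capacity_mbps and loss > loss_threshold_pct
    ]


isp2_samples: List[Sample] = [
    ("09:00", 210.0, 0.0),
    ("09:05", 515.0, 9.0),
    ("09:10", 530.0, 11.5),
    ("09:15", 180.0, 0.0),
]
print(saturated_loss_intervals(isp2_samples))
print(utilization_pct(515.0))
```

If every lossy interval is also a saturated one, bandwidth usage is the likely cause. Loss during intervals with spare capacity would point somewhere else, such as the Local Loop.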
Some resolutions can be:
- Prioritizing certain traffic in your firewall.
- Changing backup schedules.
- Rate limiting the flow of traffic in the network.
- Upgrading your Internet connection bandwidth.
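As an illustration of the rate limiting option, a token bucket is one common way to cap a traffic flow while still allowing short bursts. The class below is a minimal sketch, not the implementation used by any firewall or router. The class name, its parameters, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are all assumptions.

```python
class TokenBucket:
    """Minimal token bucket: tokens are bytes, refilled at a fixed rate."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float) -> None:
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0            # time of the previous call, in seconds

    def allow(self, packet_bytes: int, now: float) -> bool:
        """Return True if the packet may be sent at time `now`."""
        elapsed = max(0.0, now - self.last)
        # Refill for the elapsed time, never beyond the burst size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


bucket = TokenBucket(rate_bytes_per_s=1000.0, burst_bytes=1500.0)
print(bucket.allow(1500, now=0.0))  # bucket is full, so the packet passes
print(bucket.allow(1500, now=0.5))  # only 500 bytes refilled, so it is held back
print(bucket.allow(1500, now=1.5))  # refilled to 1500 bytes, so it passes
```

A lower rate on bulk flows such as backups leaves more of the link for interactive traffic, at the cost of those transfers taking longer.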
The Troubleshooting Process
Although SD-WAN vendors promise many things about the performance of SD-WAN, SD-WAN networks, like any network, are going to face problems.
Whether the problem is high bandwidth usage, high CPU usage, your network equipment, or the local loop, you need to be ready.
An SD-WAN Monitoring Tool is your best friend here: it helps you proactively detect SD-WAN problems so you can begin the troubleshooting process before your end users even feel the impact on their user experience.