In this post we will examine the role of the CNI for data-intensive CNFs and what can be done to optimize their performance in a Kubernetes (K8s) or cloud-native environment.
Let's briefly dive into the background of CNFs. Cloud-native network functions (CNFs) are the evolution of how network functions are built, using a microservices architecture. The following figure shows the evolutionary journey of network functions thus far:
Figure courtesy of Red Hat
The CNI plays a very important role in the deployment of data-intensive CNFs, since the CNI decides how network communication happens between CNFs within the same node or across multiple nodes of the same K8s cluster. The CNI intercepts all packets between CNFs in-line and applies security and networking policies to them. There are many CNIs to choose from depending on the use case, but when a bandwidth of 10 Gbps or more is needed per CNF, the CNI must be selected carefully to ensure optimal performance.
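To make the CNI's position concrete: kubelet picks the plugin chain from a JSON config file under /etc/cni/net.d. The sketch below parses a hypothetical bridge-plugin conflist (the network name, bridge name, subnet, and rate values are all illustrative assumptions, not a recommendation):

```python
import json

# Hypothetical CNI network config list (conflist) of the kind kubelet
# reads from /etc/cni/net.d; every pod's traffic goes through this chain.
CNI_CONF = """
{
  "cniVersion": "0.4.0",
  "name": "cnf-net",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni0",
      "ipam": {"type": "host-local", "subnet": "10.244.0.0/16"}
    },
    {
      "type": "bandwidth",
      "ingressRate": 10000000000,
      "egressRate": 10000000000
    }
  ]
}
"""

conf = json.loads(CNI_CONF)
for plugin in conf["plugins"]:
    # Each entry is invoked in order when a pod's network is set up.
    print(plugin["type"])
```

The chained "bandwidth" plugin is one place where a per-CNF rate (such as the 10 Gbps figure above) would be enforced; the main plugin's choice is what decides the data path itself.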
Many people who have worked with VM-based VNFs will be familiar with OVS-DPDK-based solutions. OVS-DPDK uses a QEMU construct known as vhost-user and relies on DPDK to move packets between the VNFs and SR-IOV functions on physical ports. Although the performance of such solutions was not phenomenal, there was no alternative that performed better. Naturally, as CNFs quickly took over from VNFs and became the de facto standard, many have tried to refactor OVS-DPDK into a CNI.
Let's analyze the pros and cons of this approach from a purely theoretical standpoint.
First, the pros:
Performance is acceptable at lower bandwidth requirements
People are familiar with OVS and its utilities
The cons:
The cloud-native paradigm is all about efficient architecture. Assigning dedicated cores to packet processing defeats that very purpose (for driving just 2 CNFs, one might need to dedicate at least 4 CPU cores)
Performance and latency can degrade at any time due to the noisy-neighbor effect
The next question is: what can be done, and are there alternatives available? Enter the DPU, which completely turns the tables. With DPUs, we can now design a highly efficient system architecture for hosting CNFs.
At NetLOX, we analyzed and measured how our DPU-based software (Loxilight) fared against traditional OVS-DPDK.
1. For testing OVS-DPDK, we used the following topology:
2. For testing the Loxilight DPU, we used the following topology:
3. The DUT server spec:
Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz (40-core)
Ubuntu 20.04.2 LTS
We measured several aspects, namely:
Raw performance (bandwidth) with a traffic generator
Core utilization in the server
Simulated end-to-end application performance using iperf
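For the iperf runs, a hedged sketch of how end-to-end numbers can be pulled out of iperf3's JSON output (the sample payload below is made up for illustration, not our actual results):

```python
import json

# Toy iperf3 --json result; the real tool is run as, e.g.,
#   iperf3 -c <server> -t 30 --json
# and the structure below mirrors the summary section of its output.
SAMPLE = """
{
  "end": {
    "sum_received": {"bytes": 35250000000, "bits_per_second": 9.4e9}
  }
}
"""

result = json.loads(SAMPLE)
recv = result["end"]["sum_received"]
print(f"goodput: {recv['bits_per_second'] / 1e9:.1f} Gbps")
```

Reading the receiver-side summary (rather than the sender's) reports goodput actually delivered to the application, which is the e2e figure that matters for CNFs.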
The results are summarized below :
The performance numbers clearly favor the Loxilight-based DPU solution, and by a very large margin. To learn more about DPUs and Loxilight, please get in touch with us to design and engineer together the next-gen cloud-native platform for your business.