All parts of the kernel are shaped by the need for performance and scalability, but some subsystems feel those pressures more than others. The networking subsystem, which can be asked to handle steady-state workloads involving millions of packets per second, has had to look at numerous technologies in its search for improved performance. One result of that work has been the "express data path" (or XDP) mechanism. Now, however, XDP is seeing some pushback from developers who see it as "pointless," a possible long-term maintenance problem, and a distraction from the work that networking developers need to be doing.
The core idea behind XDP is optimizing cases where relatively simple decisions can be made about incoming packets. It allows the loading of a BPF program into the kernel; that program gets an opportunity to inspect packets before they enter the networking stack itself. The initial use case for XDP was to enable the quick dropping of unwanted packets, but it has since expanded to cover simple routing decisions and packet modifications; see this in-progress documentation for some more information on how it works.
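The early-drop idea can be sketched in ordinary C. This is an illustrative model only: real XDP programs are written in restricted C, compiled to BPF bytecode with clang, and loaded via the bpf() system call; the struct, function name, and protocol check below are invented for the sketch, though the verdict values mirror the kernel's XDP action codes. The bounds check before any packet access reflects a real requirement imposed by the BPF verifier.

```c
#include <stdint.h>
#include <stddef.h>

/* Verdicts modeled on the kernel's enum xdp_action. */
enum xdp_action_sketch {
    XDP_DROP = 1,   /* discard the packet in the driver, before the stack */
    XDP_PASS = 2,   /* hand the packet to the normal networking stack */
};

/* Minimal Ethernet header, as an XDP program would see it.  The
 * EtherType is kept as two bytes to sidestep byte-order concerns. */
struct ethhdr_sketch {
    uint8_t h_dest[6];
    uint8_t h_source[6];
    uint8_t h_proto[2];   /* EtherType, network byte order */
};

/* Early-drop policy for the sketch: pass IPv4 (EtherType 0x0800),
 * drop everything else, including runt frames. */
static int xdp_early_drop(const void *data, const void *data_end)
{
    const struct ethhdr_sketch *eth = data;

    /* Bounds check: the BPF verifier rejects programs that touch
     * packet data without proving it is in range first. */
    if ((const char *)data + sizeof(*eth) > (const char *)data_end)
        return XDP_DROP;

    if (eth->h_proto[0] != 0x08 || eth->h_proto[1] != 0x00)
        return XDP_DROP;

    return XDP_PASS;
}
```

Because the program runs before any socket buffer is allocated, a verdict of XDP_DROP costs only this handful of instructions per packet, which is where the performance argument for early dropping comes from.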
The core benefit of XDP is that the system can make quick decisions about packets without the need to involve the rest of the networking code. Performance could possibly be further improved in some settings by loading XDP programs directly into the network interface, perhaps after a translation step.
Thus far, most of the public discussion about XDP has been focused on the details of its implementation rather than on whether XDP is a good idea in the first place. That came to an end at the beginning of December, though, when Florian Westphal, in a posting written with help from Hannes Frederic Sowa, let it be known that he disagrees: "Lots of XDP related patches started to appear on netdev. I'd prefer if it would stop..." He would rather that developers turned away from a "well meaning but pointless" approach toward something that, in his view, is better suited to the problems faced by the networking subsystem.
That something, in short, is any of the mechanisms out there (such as the data plane development kit) that allow the networking stack to be bypassed by user-space code. These mechanisms can indeed yield improved performance in settings where a strictly defined set of functionality is needed and the benefits that come from a general-purpose network stack can be done without. Additionally, he said, some problems are best solved by utilizing the packet-filtering features implemented in the hardware.
XDP, Westphal said, is an inferior solution because it provides a poorer programming environment. Networking code done in user space can be written in any of a range of languages, has full debugging support available, and so on. BPF programs, instead, are harder to develop and much more limited in their potential functionality. Looking at a number of use cases for XDP, including routing, load balancing, and early packet filtering, he claims that there are better solutions for each.
Thomas Graf responded that he has a fundamental problem with user-space networking: as soon as a packet leaves the kernel, anything can happen to it and it is no longer possible to make security decisions about it. User-space code could be compromised and there is no way for the kernel to know. BPF code in the kernel, instead, should be more difficult to compromise since its freedom of action is much more restricted. He also said that load balancing often needs to be done within applications as well as across machines, and he would not want to see that done in user space.
Sowa, instead, questioned the early-drop use case, asking whether the focus was on additional types of protection or improved performance. Like Westphal, he suggested that this problem could be solved primarily with hardware-based packet dropping. The answer from Tom Herbert made it clear that he sees both flexibility and performance as being needed.
Networking maintainer David Miller also said that he sees XDP as being a good solution for packet dropping use cases. Hardware-based filters, he said, are not up to the job, and XDP looks to him to be the right approach.
Sowa's other concern was not so easily addressed, though. As XDP programs gain functionality, they will need access to increasingly sophisticated information from the rest of the networking stack. That information can be provided by way of functions callable from BPF, but those functions will likely become part of the kernel's user-space ABI. That, in turn, will limit the changes that the networking developers can make in the future. These concerns mirror the worries about tracepoints that have limited their use in parts of the kernel. Nobody in the discussion addressed the ABI problem; in the end, it will have to be handled like any other user-space interface, where new features are, hopefully, added with care and a lot of review.
In the end, the discussion probably changed few minds about the value of XDP — or the lack thereof. Stephen Hemminger probably summarized things best when he said that there is room for a number of different approaches. XDP is better for "high speed packet mangling", while user-space approaches are going to be better for the implementation of large chunks of networking infrastructure. The networking world is complex, and getting more so; a variety of approaches will be needed to solve all the problems that the kernel is facing in this area.
Debating the value of XDP
Posted Dec 8, 2016 16:32 UTC (Thu) by Tara_Li (subscriber, #26706) [Link]
"As XDP programs gain functionality, they will need access to increasingly sophisticated information from the rest of the networking stack."

This looks like the root of the problem here: the fear of creeping featuritis. XDP looks to be intended for a very limited set of issues, where the slowdown imposed on processing the remaining packets is made up for by getting rid of some fraction of them quickly, hopefully leaving more time for the rest to be handled more thoroughly. But if XDP gains more and more functionality, and handles more and more of the packets, at some point it is no longer the "fast path"; it becomes the default "slow" path, as against the old default "even slower" path (which is only "even slower" because XDP has taken on so much that it takes as long as the old default did). And yet each project thinks that its particular class of problem packets should be taken care of first, so small patches get submitted to add some new tiny bit of functionality, gradually slowing down ALL of the packets taking the so-called "fast path". It sounds like a good idea, but can any developer actually say that, if something like this is implemented, it won't develop creeping featuritis?
Debating the value of XDP
Posted Dec 9, 2016 18:19 UTC (Fri) by ksandstr (guest, #60862) [Link]
It's looking like mid-to-late nineties Direct3D right now. Predicting from that, it seems unlikely that current XDP programs will ever be translated into bytecoded hardware-supported mysteriously zomgfast[0] acceleration primitives on networking hardware. As such, they're strictly inferior to models like "a compiled C program executing in kernel mode" or, in the absence of a magical in-kernel compiler infrastructure, any fixed-function acceleration pipeline at all.
[0] and not just "0.7 GHz ARM SOC w/ OpenCL, on a PCIe board": just another channel processor with a proprietary interface
Copyright © 2016, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds