Difference between revisions of "EJFAT UDP General Performance Considerations"

Revision as of 16:09, 21 December 2023

NIC queues on multi-cpu nodes

Contemporary NICs support multiple receive and transmit descriptor queues. On reception, a NIC applies a filter to each packet that assigns it to one of a number of logical flows. Packets for each flow are steered to a separate receive queue, which in turn can be processed by a separate CPU. The goal is to increase performance by spreading receive processing across CPUs.

The filter is typically a hash function over the network and/or transport layer headers; for ejfat nodes, a 4-tuple hash over the source and destination IP addresses and ports of a packet. The most common implementation uses an indirection table in which each entry stores a queue number. To pick the receive queue for a packet, the NIC computes a hash over the packet (usually a Toeplitz hash), keeps only the low-order seven bits, and uses that value as an index into the indirection table (seven bits address 128 entries). The entry found there is the queue number.
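
The lookup described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: the table is taken to have 128 entries filled round-robin across the queues (a common default, but not something this page confirms for ejfat nodes), the hash values are made up, and the function names `table_entry` and `rx_queue_for_hash` are hypothetical.

```shell
#!/bin/sh
# Sketch of RSS queue selection through an indirection table.
# Assumptions (not from this page): a 128-entry table filled round-robin
# across the queues, and made-up values standing in for the Toeplitz hash.

TABLE_SIZE=128   # 2^7 entries, matching the seven low-order hash bits

# Print the queue number stored at a given table index,
# assuming a round-robin fill over num_queues queues.
table_entry() {
    index=$1
    num_queues=$2
    echo $(( index % num_queues ))
}

# Print the receive queue chosen for a given hash value.
rx_queue_for_hash() {
    hash_value=$1
    num_queues=$2
    # Keep only the low-order seven bits of the hash
    index=$(( $hash_value & (TABLE_SIZE - 1) ))
    table_entry "$index" "$num_queues"
}
```

For example, `rx_queue_for_hash 0x12345678 63` keeps the low seven bits (120) and, under the round-robin assumption, lands on queue 57. With 63 queues and 128 entries, such a fill gives queues 0 and 1 three table entries each and every other queue two.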

Some advanced NICs allow steering packets to queues based on programmable filters.

To find out which filter is being used:

ethtool -n eth2 rx-flow-hash udp4
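
The same query can be paired with a dump of the indirection table itself. The sketch below wraps both commands in a small helper; the function name `show_rss_config` is hypothetical, `eth2` is only the example interface used above, and whether the table and hash key are reported depends on the driver.

```shell
#!/bin/sh
# Sketch: show the RSS settings for one interface.
# show_rss_config is a hypothetical helper name; pass your own interface.
# Some systems require root for these queries.
show_rss_config() {
    iface=$1
    # Header fields that feed the hash for UDP over IPv4
    ethtool -n "$iface" rx-flow-hash udp4
    # Indirection table (and hash key, where the driver reports it)
    ethtool -x "$iface"
}
```

Call it as `show_rss_config eth2`, substituting the interface name on your node.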

To find out how many NIC queues there are on your node, look at the combined property in the output of:

# See how many queues there are
sudo ethtool -l enp193s0f1np1
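
To pull just the combined counts out of that output, a small filter along these lines may help. It is a sketch that assumes the usual `ethtool -l` layout, with a "Pre-set maximums:" block followed by a "Current hardware settings:" block; the function name `combined_queues` is hypothetical, and the layout should be checked against your ethtool version.

```shell
#!/bin/sh
# Sketch: extract the "Combined" queue counts from `ethtool -l` output.
# Assumes a "Pre-set maximums:" block followed by a
# "Current hardware settings:" block; verify on your system.

# Reads `ethtool -l` output on stdin, prints "max=<n> current=<n>".
combined_queues() {
    awk '
        /^Pre-set maximums:/          { section = "max" }
        /^Current hardware settings:/ { section = "current" }
        /^Combined:/                  { count[section] = $2 }
        END { printf "max=%s current=%s\n", count["max"], count["current"] }
    '
}
```

Usage: `sudo ethtool -l enp193s0f1np1 | combined_queues`.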

Effect of NIC queues on UDP transmission

On ejfat nodes there is a maximum of 63 queues even though there are 128 cores. It seems odd to me that there isn't one queue per CPU. The limit does not appear to be changeable, so most likely it is fixed when the kernel is built.

Reference: Scaling in the Linux Networking Stack (https://www.kernel.org/doc/html/latest/networking/scaling.html)