Testing Load Balancer Bandwidth

 
Latest revision as of 16:22, 13 September 2022

Transmission from 4 senders on various ejfat nodes to the Load Balancer on ejfat-1, and from there to 4 receivers on ejfat-1 (Sep 2022)

Here we measure the data rate between 4 single-threaded UDP senders and 4 corresponding single-threaded receivers. In one test, all senders ran on ejfat-2 and sent data to the LB on ejfat-1 (172.19.22.241). In another test, there was one sender on each of ejfat-2, 3, 4, and 5. Both tests showed the same behavior. The following was used to send:

    ./packetBlaster -p 19522 -host 172.19.22.241 -mtu 9000 -s 25000000  -b 100000 -byterate 2940000000 -cores 80 (81,82,83) -e 0 (1,2,3) -id 0 (1,2,3)

in which the UDP send buffer was 50 MB and the app sent buffers of 100 kB. Values in parentheses replace the preceding value in the other invocations, so each of the 4 invocations uses a different value. The receivers were all run on ejfat-1 as:

    ./packetBlastee  -p 17750 (1,2,3) -b 400000 -r 25000000 -cores 80 (81,82,83)
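The parenthesized shorthand in the sender and receiver commands can be expanded mechanically. The following Python sketch builds the 4 command lines for each side. It rests on two assumptions that the page does not state outright: that `-p 17750 (1,2,3)` means ports 17750 through 17753, and that the i-th alternative of every option belongs to the same invocation (core 80 with `-e 0 -id 0`, core 81 with `-e 1 -id 1`, and so on). The helper names `sender_commands` and `receiver_commands` are made up for illustration.

```python
def sender_commands(n=4):
    """Expand the packetBlaster shorthand into n command lines.

    Assumption: the i-th values of -cores, -e and -id go together.
    """
    cmds = []
    for i in range(n):
        cmds.append(
            "./packetBlaster -p 19522 -host 172.19.22.241 -mtu 9000 "
            "-s 25000000 -b 100000 -byterate 2940000000 "
            f"-cores {80 + i} -e {i} -id {i}"
        )
    return cmds


def receiver_commands(n=4):
    """Expand the packetBlastee shorthand into n command lines.

    Assumption: "-p 17750 (1,2,3)" means ports 17750, 17751, 17752, 17753.
    """
    return [
        f"./packetBlastee -p {17750 + i} -b 400000 -r 25000000 -cores {80 + i}"
        for i in range(n)
    ]


# Print all 8 command lines, senders first.
for cmd in sender_commands() + receiver_commands():
    print(cmd)
```

Under these assumptions, the second receiver, for example, would be started with `-p 17751 -cores 81`.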

The sending and receiving threads were pinned to cores in the 80-87 range, since these cores are on the same NUMA node as the NIC and therefore give the best performance. The test produced the following average total byte rates for the 4 receivers:

    2933, 2929, 2929, 2929 MB/sec

Adding all rates and multiplying by 8 bits/byte gives:

    93.8 Gb/sec
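The arithmetic behind this figure can be checked with a few lines of Python. The sketch assumes decimal units (1 MB = 10^6 bytes, 1 Gb = 10^9 bits). It also assumes that `-byterate 2940000000` is a per-sender target in bytes per second, which is an interpretation of the flag rather than something the page states; the comparison with the requested rate rests on that reading.

```python
# Average per-receiver byte rates reported above, in MB/sec.
receiver_rates_mb_per_sec = [2933, 2929, 2929, 2929]


def aggregate_gbit_per_sec(rates_mb_per_sec):
    """Sum per-receiver rates (MB/sec) and convert to Gb/sec.

    Assumes decimal units: 1 MB = 10**6 bytes, 1 Gb = 10**9 bits.
    """
    total_mb_per_sec = sum(rates_mb_per_sec)
    return total_mb_per_sec * 8 / 1000


measured_gbps = aggregate_gbit_per_sec(receiver_rates_mb_per_sec)

# Assumption: -byterate is a per-sender target in bytes/sec, 4 senders.
requested_gbps = 4 * 2940000000 * 8 / 1e9

print(f"measured:  {measured_gbps:.1f} Gb/sec")
print(f"requested: {requested_gbps:.1f} Gb/sec")
```

If the `-byterate` reading is right, the receivers collectively kept up with nearly all of the rate the senders were asked to produce.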

This rate was sustained without packet drops for only about 1.5 minutes. Eliminating packet drops entirely requires a lower throughput.