Latest revision as of 20:52, 19 December 2024

Welcome to the EJFAT Wiki

(ESnet / JLab FPGA Accelerated Transport)



System Overview:

EJFAT is a collaboration between the Energy Sciences Network (ESnet) and Thomas Jefferson National Accelerator Facility (JLab) on proof-of-concept engineering of an accelerated load balancer (LB) that uses dynamic IPv4/IPv6 address forwarding. It is dynamic because the forwarding address is chosen from a collection of destination endpoints based on near-real-time destination workload conditions. It is accelerated because forwarding is accomplished with low, fixed latency at line rates of up to 200 Gbps per FPGA; in general, a functioning LB may consist of up to four FPGAs acting as one logical DP, for a total bandwidth capacity of over 1 Tbps. The low, fixed latency is achieved by using an appropriately programmed Field Programmable Gate Array (FPGA) to implement the Data Plane (DP) functions of the LB.
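
The idea of choosing a forwarding address from workload conditions can be illustrated with a minimal Python sketch. It is not the EJFAT implementation: the Endpoint class, the fill_fraction workload signal, and the spare-capacity weighting rule are all hypothetical names and choices made for illustration, and the addresses come from the reserved documentation range.

```python
import random
from dataclasses import dataclass


@dataclass
class Endpoint:
    address: str          # destination address (documentation range in the examples)
    fill_fraction: float  # hypothetical workload signal: 0.0 = idle, 1.0 = saturated


def selection_weights(endpoints):
    """Weight each endpoint by its spare capacity; a saturated endpoint gets 0."""
    return [max(0.0, 1.0 - ep.fill_fraction) for ep in endpoints]


def choose_destination(endpoints, rng=random):
    """Pick one endpoint at random, favouring those with more spare capacity."""
    weights = selection_weights(endpoints)
    if not any(weights):
        raise RuntimeError("no endpoint has spare capacity")
    return rng.choices(endpoints, weights=weights, k=1)[0]


if __name__ == "__main__":
    farm = [
        Endpoint("192.0.2.1", 0.25),
        Endpoint("192.0.2.2", 1.0),   # saturated, never selected
        Endpoint("192.0.2.3", 0.75),
    ]
    print(choose_destination(farm).address)
```

In this sketch a busier endpoint simply receives proportionally less traffic. A real LB would refresh the workload signal continuously and perform the per-packet forwarding in the FPGA data plane rather than in software.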

EJFAT System Status

ejfat-1

  1. 100Gbps NIC: ejfat-1-daq 129.57.177.8
  2. 10Gbps NIC: ejfat-1 129.57.177.131
  3. U280 FPGA: ejfat-1-dp 129.57.177.{9-16} - LAG'd for 200Gbps
  4. LB CP: ejfat-1 129.57.177.131, latest Stable branch
  5. LB: DP latest Stable FW
  6. CP Web UI port 8081

ejfat-2

  1. 100Gbps NIC: ejfat-2-daq 129.57.177.2
  2. 10Gbps NIC: ejfat-2 129.57.177.132
  3. 100Gbps U280 FPGA: ejfat-2-dp 129.57.177.{17-24}
  4. LB CP: ejfat-2 129.57.177.132, latest Stable branch
  5. LB: DP latest Stable FW
  6. CP Web UI port 8082

ejfat-3

  1. 200Gbps NIC: ejfat-3-daq 129.57.177.3
  2. 10Gbps NIC: ejfat-3 129.57.177.133
  3. Two U280s installed - LAG'd for 400Gbps
  4. FW Containers built by Stacey

ejfat-4

  1. 100Gbps NIC: ejfat-4-daq 129.57.177.4
  2. 10Gbps NIC: ejfat-4 129.57.177.134
  3. XDP experiments
  4. 100Gbps U280 FPGA: ejfat-4-dp 129.57.177.{41-48}
  5. LB CP: ejfat-4 129.57.177.134, latest Stable branch
  6. LB: DP latest Stable FW

ejfat-5

  1. 200Gbps NIC: ejfat-5-daq 129.57.177.5
  2. 10Gbps NIC: ejfat-5 129.57.177.135
  3. LB CP: ejfat-5 129.57.177.135, latest Stable branch
  4. 100Gbps U280 FPGA: ejfat-5-dp 129.57.177.{49-56}
  5. LB: DP latest Stable FW
  6. Optical Taps Installed

ejfat-6

  1. 200Gbps NIC: ejfat-6-daq 129.57.177.6
  2. 10Gbps NIC: ejfat-6 129.57.177.136
  3. DAOS experiments
  4. Using Ubuntu 24.04 LTS
  5. FW containers built
  6. Waiting for podman compose installation

ejfat-fs

  1. 100Gbps NIC: ejfat-fs-daq 129.57.177.7
  2. 10Gbps NIC: ejfat-fs 129.57.177.130
  3. Hosts NVMe memory/disk
  4. 100Gbps U280 FPGA: ejfat-fs-dp 129.57.177.{65-72}
  5. LB CP: ejfat-fs 129.57.177.130, latest Stable branch
  6. LB: DP latest Stable FW
  7. CP Web UI port 8080
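
The FPGA entries above use a brace shorthand such as 129.57.177.{9-16}. Assuming this denotes an inclusive range of values for the last octet (an interpretation, not something the listing states), a short Python sketch can expand it into individual addresses. The function name expand_range is hypothetical.

```python
import ipaddress
import re

# Matches "a.b.c.{first-last}"; assumed to mean an inclusive last-octet range.
_RANGE = re.compile(r"^(?P<prefix>(?:\d{1,3}\.){3})\{(?P<first>\d+)-(?P<last>\d+)\}$")


def expand_range(spec):
    """Expand 'a.b.c.{m-n}' into IPv4Address objects; plain addresses pass through."""
    match = _RANGE.match(spec)
    if match is None:
        return [ipaddress.IPv4Address(spec)]
    first, last = int(match["first"]), int(match["last"])
    if first > last:
        raise ValueError(f"empty range in {spec!r}")
    prefix = match["prefix"]
    # IPv4Address rejects octets above 255, so invalid ranges fail loudly.
    return [ipaddress.IPv4Address(f"{prefix}{octet}") for octet in range(first, last + 1)]


if __name__ == "__main__":
    for addr in expand_range("129.57.177.{9-16}"):
        print(addr)
```

Under that assumption each brace range in the listings covers eight addresses.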

Presentations/Papers

Date | Presenter | Event | Links
2021-03-01 | G. Heyes | EJFAT Proposal | Word
2021-10-21 | M. S. Goodrich | Div Brief | PDF
2021-11-05 | M. S. Goodrich | Canisius College | PDF
2021-12-03 | S. Sheldon | ESnet LB Tutorial | MP4
2021-12-10 | Y. Kumar | SRO iX Presentation | PPTX
2022-08-05 | M. S. Goodrich | RT-2022 Presentation | PPTX
2022-08-05 | M. S. Goodrich, et al. | RT-2022 Proceedings | PDF
2022-10-20 | S. Sheldon, et al. | INDIS-2022 | PDF
2022-10-24 | M. S. Goodrich | ACAT-2022 Presentation | PPTX
2023-03-17 | M. S. Goodrich, et al. | ACAT-2022 Proceedings | PDF
2023-05-11 | M. S. Goodrich, et al. | CHEP-2023 Presentation | PPTX
2023-10-12 | D. Howard, et al. | CHEP-2023 Conference Publication | PDF
2024-03-11 | M. S. Goodrich, et al. | ACAT-2024 Presentation | PPTX
2024-04-10 | M. S. Goodrich, et al. | RT-2024 Presentation | PPTX
2024-07-31 | M. S. Goodrich, et al. | ACAT-2024 Proceedings | PDF
2024-10-02 | S. Veseli, APS/SDM | APS/ALS - EJFAT | PPTX

EJFAT Weekly EPSCI Meetings

EJFAT Weekly EPSCI Meetings

EJFAT Weekly Collaboration Meetings

EJFAT Weekly Meetings

Technical Design Overview

EJFAT Technical Design Overview

UDP Packet Header Formats

IRIAD/EJFAT Testbed

UDP Transmission Performance

EJFAT UDP General Information

EJFAT UDP General Performance Considerations

EJFAT UDP Packet Receiving and Core Switching

EJFAT UDP Packet Sending and NUMA Nodes

EJFAT UDP Single Thread Packet Sending and Receiving

Testing Load Balancer Bandwidth

HOW-TOs

How to use Control Plane Web UI

How to Monitor Prometheus

Install a Load Balancer

Test a Load Balancer

How to setup ejfat nodes

How to install, build and use gRPC

How to install, build and use XDP related packages

How to Compute Schedule Density from PID Signals

Enable Jumbo Frames

Network Path MTU Discovery support in the Linux Kernel:

file: /proc/sys/net/ipv4/tcp_mtu_probing
variable: net.ipv4.tcp_mtu_probing (integer; default: 0; since Linux 2.6.17):

tcp_mtu_probing - INTEGER
	Controls TCP Packetization-Layer Path MTU Discovery.  Takes three values:
	  0 - Disabled
	  1 - Disabled by default, enabled when an ICMP black hole detected
	  2 - Always enabled, use initial MSS of tcp_base_mss.
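
The setting described above can be inspected from Python. The following is a minimal sketch: the /proc path and the three values are taken from the text above, while the function names are hypothetical. It only reads the value and does not change it.

```python
from pathlib import Path

# Path taken from the description above; present only on Linux hosts.
MTU_PROBING_PATH = Path("/proc/sys/net/ipv4/tcp_mtu_probing")

MODES = {
    0: "disabled",
    1: "disabled by default, enabled when an ICMP black hole is detected",
    2: "always enabled, using an initial MSS of tcp_base_mss",
}


def describe_mtu_probing(value):
    """Return a human-readable description of a tcp_mtu_probing value."""
    try:
        return MODES[value]
    except KeyError:
        raise ValueError(f"unexpected tcp_mtu_probing value: {value}") from None


def read_mtu_probing(path=MTU_PROBING_PATH):
    """Read the current setting as an integer from the given /proc file."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    if MTU_PROBING_PATH.exists():
        current = read_mtu_probing()
        print(f"tcp_mtu_probing = {current}: {describe_mtu_probing(current)}")
    else:
        print("tcp_mtu_probing is not available on this system")
```

Changing the value requires root privileges, for example with sysctl -w net.ipv4.tcp_mtu_probing=1.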

REFERENCEs

EJFAT Config Planning

JLab EJFAT News Release

EJFAT on FABRIC

EJFAT API

LB Pipeline

Getting Started with EJFAT

IRIAD Workplan

SRO Grand Challenge

ESnet Logical Map

IP Neighbor

Robot Framework

IRI Vision

A horizontally scalable online processing system for trigger-less data acquisition

The triggerless data acquisition system of the XENONnT experiment

DUNE triggerless DAQ

Streaming Mode DAQ at JLab

Real-time data analysis in particle physics

Intro to Triggering

SRO Test Plan

Edge to Core Test Equipment:

  1. Price Estimate Spreadsheet
  2. Networking Diagram, Updated (PDF) (from Brent 2024-02-09)
  3. PR408549 : Requisition 1 of 2 :
    1. Statement of Work for Servers
    2. 1/13/2022: EJFAT team decided to solicit two bid responses, one with MLX NIC and one without. Response from Procurement is "I can ask for the two separate quotes. If you are going to purchase both option (with & without add-in cards), once I receive the quotes back, you will have submit a new PR to cover the option (without add-in cards)."
    3. 1/18/2022: Question from KOI Computers: "please clarify what the part number for the NVIDIA Dual Port ConnectX-6". Replied with part # MCX623106AN-CDAT.
    4. 1/24/2022: Requisition currently open for bid responses from vendors. Due date is COB 1/24/2022.
    5. 1/27/2022: PO awarded to Atipa for 6 servers and 1 file-server with FPGA and MLX SmartNIC. Expected delivery date from vendor is 5/31/2022.
  4. PR408870 PR408938 Requisition 2 of 2: Statement of Work for Switches & Cables
    1. 1/14/2022: PRs for the switches, transceivers and fiber have been submitted. I added (4) 2km 100G transceivers to support dual 100G connections between the switches. We can always upgrade to 400G in the future, if needed.
  5. PR409850 NVIDIA ARM HPC Developer Kit
    1. Hardware Specifications for dev kit
      Model: GIGABYTE G242-P32, 2U server
      CPU: 1x Ampere Altra Q80-30 (Arm processor)
      Memory: 512 GB DDR4
      Storage: 6 TB SAS/SATA 3.5″
      GPU: 2x NVIDIA A100
      Network: 2x NVIDIA® BlueField®-2 E-Series DPU, 200GbE/HDR single-port QSFP56, PCIe Gen4 x16, secure boot enabled, crypto disabled, 16 GB on-board DDR, 1GbE OOB management

Resources