EJFAT EPSCI Meeting Jun. 15, 2022
Latest revision as of 19:03, 15 June 2022
The meeting time is 2:00pm.
Connection Info:
You can connect using ZoomGov video conferencing (ID: 161 203 8101):
Meeting URL: https://jlab-org.zoomgov.com/j/1617413961?pwd=QWpXalc0SXFrSUNBNmFrbVZycisrUT09&from=addon
Meeting ID: 161 741 3961
Passcode: 124964
Phone dial-in (US): +1 669 254 5252, +1 646 828 7666, +1 551 285 1373, +1 669 216 1590, or 833 568 8864 (toll free). Enter the meeting ID and passcode followed by #.
Room system: dial bjn.vc or 199.48.152.152 and enter the meeting ID and passcode.
Agenda:
- Previous meeting
- Situation:
- Rec'd new f/w build 28 April
- Specs
- Restores Jumbo Frames
- arp, ping - working
- Port entropy field - passed test for data_id stream horizontal reassembly with 10 streams
- Using script based LB Control Plane
- ERSAP feed end bottleneck needs investigation; Timmer's blaster may provide relief
- New machines (6) rec'd, installed, going on EJFAT subnet (VLAN 937 172.19.22.0/24)
- Spare EJFAT equip loaners:
- (4) DAQ dev machines indra-s[1-3] 129.57.29/109.23[0-2]
- alkaid: 24 Xeon Gold 3.4 GHz cores, 100Gbs
- indra-s1: 24 Xeon Gold 3.0 GHz cores, 100Gbs
- indra-s2: 32 Xeon Gold 3.2 GHz cores, 100Gbs
- indra-s3: 32 Xeon Gold 2.3 GHz cores, 100Gbs, 750GB ram disk
- (4) DAQ Farm machines dafarm6[1-4] currently on 129.57.29.17[1-4] - each 32 Xeon 2.0 GHz cores - 1 Gbs NIC + (4) 10Gbs Spare NICs
- (4) Unbuilt DAQ Farm machines - each 32 Xeon 2.0 GHz cores - 1 Gbs NIC + (4) 10Gbs Spare NICs
- (4) Spare 10Gbs NICs
- (17) Hall-D machines - gluon120-36 129.57.52.9[2-36] - each 2 Xeon 2.6 GHz cores - 10Gbs NIC
- On Order:
- Look at iperf2 for network testing
- Look at RoCE (https://support.mellanox.com/s/article/roce-v2-considerations) / NIC
- Pending:
- Support C libraries for LB Host Control Plane - in legal review (previously: unit test, code review)
- ESnet smartnic open-source GitHub repo (May)
- ESnet private, forkable Jlab P4 and simulations GitHub repo (May)
- To Do:
- Near Term:
- EJFAT VLAN Checkout
- Network Performance Measurements:
- FPGA LB Throughput
- Host NICs
- Host S/W Reassembly
- UDP Packet Loss
- Need new parameters
- SLURM env for EJFAT VLAN (Hess)
- DAQ/VTP Data Generation Test Harness
- Burn Rate on EJFAT grant
- Vivado Licenses for new machines (?) (Singh)
- ACAT 2022 Abstract
- Hall-B FT calorimeter and hodoscope streaming readout test - OBE
- May be able to use Abbott's indra-s1 setup
- May be able to use new VTP f/w with Hall-B VTPs - (Ben Raydo)
- CODA 3.10 + ERSAP for new VTP f/w
- CODA 2.0 (non-streaming) for old VTP f/w
- Diagram
- Hall-B to start taking data June 8
- Hall B VTPs on .167. subnet
- Downstream (June/July):
- HOSS - June
- parallelize writing of raw data files
- distribute raw data across multiple compute nodes for calibration skims
- 1 Gbs at hi-luminosity
- Control Plane
- Will interact with SLURM / Kubernetes
- Python based (?)
- Control Plane daemon for compute host (?)
- Demonstrate CP based flexibility/elasticity
- Hall-D comms with DAQ 109 subnet require network customization; (EJFAT subnet)
- Hall-D EJFAT use case
- Hall-D EJFAT Network Diagram
- Configuration:
- ejfat-sw 100Gbs switch
- (6) PR408549 New Computers w/ X CPUs + U280 FPGA
- (?) Retired Data Center Farm Nodes
- EJFAT subnet VLAN 937 172.19.22.0/24 - 100Gbs, Jumbo frames
- DPDK - ESnet reports it can stream 100 Gbps using DPDK.
- IPv6 testing
- RT2022 - August 01-05 Conference (https://indico.cern.ch/event/1109460/)
- ACAT 2022 Paper
- RT 2022 Paper
- AOT