Difference between revisions of "EJFAT Group Meeting Jun. 16, 2022"
(Created page with " The meeting time is 11:00am. === Connection Info: === <div class="toccolours mw-collapsible mw-collapsed"> You can connect using [https://jlab-org.zoomgov.com/j/1610125238?p...") |
|||
Line 32: | Line 32: | ||
<!--------------------------------------------------------------------------------------------------> | <!--------------------------------------------------------------------------------------------------> | ||
=== Agenda: === | === Agenda: === | ||
− | * [[EJFAT Group Meeting | + | * [[EJFAT Group Meeting Jun. 2, 2022 | Previous meeting]] |
*: | *: | ||
− | * | + | * Status: |
− | ** | + | ** Using ESnet FPGA f/w build 28 April |
*** [https://docs.google.com/document/d/1ssw8sye7jExtPCJVejloe8hNkyWOcxEQzVmm45xs5-w/edit#heading=h.mqilsqsmmpek Specs] | *** [https://docs.google.com/document/d/1ssw8sye7jExtPCJVejloe8hNkyWOcxEQzVmm45xs5-w/edit#heading=h.mqilsqsmmpek Specs] | ||
− | *** | + | *** Jumbo Frames |
− | *** arp, ping | + | *** arp, ping, ICMP filtering |
− | *** Port entropy | + | *** Port entropy |
− | ** | + | ** Script based LB Control Plane |
+ | ** Support C libraries for LB Host Control Plane - in <s>unit test</s> <s>code review</s> legal review | ||
+ | ** ESnet smartnic open-source GitHub repo - in legal review | ||
+ | ** ESnet private, forkable Jlab P4 and simulations GitHub repo - in legal review | ||
** ERSAP feed end bottleneck needs investigation; Timmer's blaster may provide relief | ** ERSAP feed end bottleneck needs investigation; Timmer's blaster may provide relief | ||
− | ** ( | + | ** New machines (6) rec'd, installed w/ Ubuntu 20.04 on EJFAT subnet (VLAN 937 172.19.22.0/24) |
− | |||
− | |||
** Spare EJFAT equip loaners: | ** Spare EJFAT equip loaners: | ||
*** (4) DAQ dev machines ''indra-s[1-3]'' 129.57.29/109.23[0-2] | *** (4) DAQ dev machines ''indra-s[1-3]'' 129.57.29/109.23[0-2] | ||
Line 53: | Line 54: | ||
*** (4) DAQ Farm machines ''dafarm6[1-4]'' currently on 129.57.29.17[1-4] - each 32 Xeon 2.0Ghz cores - 1 Gbs NIC + (4) 10Gbs Spare NICs | *** (4) DAQ Farm machines ''dafarm6[1-4]'' currently on 129.57.29.17[1-4] - each 32 Xeon 2.0Ghz cores - 1 Gbs NIC + (4) 10Gbs Spare NICs | ||
*** (4) Unbuilt DAQ Farm machines - each 32 Xeon 2.0Ghz cores - 1 Gbs NIC + (4) 10Gbs Spare NICs | *** (4) Unbuilt DAQ Farm machines - each 32 Xeon 2.0Ghz cores - 1 Gbs NIC + (4) 10Gbs Spare NICs | ||
− | + | *** [https://misportal.jlab.org/reqs/pr/viewPr.do?prNum=408549 PR408549] (6) 100Gbs NICs - ETA 1 July | |
− | + | *** [https://misportal.jlab.org/reqs/pr/viewPr.do?prNum=408870 PR408870] [https://misportal.jlab.org/reqs/pr/viewPr.do?prNum=408938 PR408938] (2) 100Gbs Arista switches, <s>transceivers, cables</s>, etc - ETA <s>1 July</s> 5 October | |
− | + | * Next Steps: | |
− | + | ** EJFAT VLAN Checkout | |
− | + | ** Network Performance: | |
+ | *** FPGA LB Throughput - max sustained 90Gbs | ||
+ | *** Host NICs | ||
+ | *** Host S/W Reassembly - better algorithms, buffering, asynchronicity, etc. | ||
+ | *** [[EJFAT UDP Transmission Performance]] | ||
+ | *** Need better parameters for event reassembly/reconstruction | ||
+ | ** Control Plane | ||
+ | *** Will interact with SLURM / Kubernetes | ||
+ | *** Python based (?) | ||
+ | *** Control Plane daemon for compute host (?) | ||
+ | *** Demonstrate CP based flexibility/elasticity | ||
** Look at iperf2 for network testing | ** Look at iperf2 for network testing | ||
** Look at [https://support.mellanox.com/s/article/roce-v2-considerations ROCE] / NIC | ** Look at [https://support.mellanox.com/s/article/roce-v2-considerations ROCE] / NIC | ||
− | * | + | ** SLURM env for EJFAT VLAN (Hess) |
− | ** | + | ** DAQ/VTP Data Generation Test Harness |
− | ** | + | ** Vivado Licenses for new machines (?) (Singh) |
− | + | ** ACAT 2022 Abstract | |
− | + | ** Back Burner / Downstream: | |
− | ** | ||
− | ** | ||
− | |||
− | |||
− | |||
− | |||
− | |||
*** <s>Hall-B FT calorimeter and hodoscope streaming readout test</s> - OBE | *** <s>Hall-B FT calorimeter and hodoscope streaming readout test</s> - OBE | ||
**** <s>May be able to use Abbott's indra-s1 setup</s> | **** <s>May be able to use Abbott's indra-s1 setup</s> | ||
Line 80: | Line 84: | ||
**** <s>Hall-B to start taking data June 8</s> | **** <s>Hall-B to start taking data June 8</s> | ||
**** <s>Hall B VTPs on .167. subnet</s> | **** <s>Hall B VTPs on .167. subnet</s> | ||
− | |||
*** [https://www.epj-conferences.org/articles/epjconf/abs/2021/05/epjconf_chep2021_04005/epjconf_chep2021_04005.html HOSS] - June | *** [https://www.epj-conferences.org/articles/epjconf/abs/2021/05/epjconf_chep2021_04005/epjconf_chep2021_04005.html HOSS] - June | ||
**** parallelize writing of raw data files | **** parallelize writing of raw data files | ||
**** distribute raw data across multiple compute nodes for calibration skims | **** distribute raw data across multiple compute nodes for calibration skims | ||
**** 1 Gbs at hi-luminosity | **** 1 Gbs at hi-luminosity | ||
− | |||
− | |||
− | |||
− | |||
− | |||
**** Hall-D comms with DAQ 109 subnet require network customization; (EJFAT subnet) | **** Hall-D comms with DAQ 109 subnet require network customization; (EJFAT subnet) | ||
**** [https://docs.google.com/presentation/d/1m3rFm-1GymYv8zGimlAjL1NmWtXVfyIQdGzhx_j_BKE/edit?usp=sharing Hall-D EJFAT use case] | **** [https://docs.google.com/presentation/d/1m3rFm-1GymYv8zGimlAjL1NmWtXVfyIQdGzhx_j_BKE/edit?usp=sharing Hall-D EJFAT use case] | ||
**** [https://jeffersonlab-my.sharepoint.com/personal/bmorris_jlab_org/Documents/Microsoft%20Teams%20Chat%20Files/JLab%20Network%20-%20HallD-to-EJFAT.png Hall-D EJFAT Network Diagram] | **** [https://jeffersonlab-my.sharepoint.com/personal/bmorris_jlab_org/Documents/Microsoft%20Teams%20Chat%20Files/JLab%20Network%20-%20HallD-to-EJFAT.png Hall-D EJFAT Network Diagram] | ||
− | |||
− | |||
− | |||
− | |||
− | |||
*** [http://www.dpdk.org DPDK] - ESnet reports can stream 100 Gbps using DPDK. | *** [http://www.dpdk.org DPDK] - ESnet reports can stream 100 Gbps using DPDK. | ||
*** IPV6 testing | *** IPV6 testing | ||
*** [https://indico.cern.ch/event/1109460/ RT2022 - August 01-05 Conference] | *** [https://indico.cern.ch/event/1109460/ RT2022 - August 01-05 Conference] | ||
− | *** | + | *** ACAT 2022 Paper |
+ | *** RT 2022 Paper | ||
* AOT | * AOT | ||
<hr> | <hr> |
Revision as of 14:42, 16 June 2022
The meeting time is 11:00am.
Connection Info:
You can connect using ZoomGov Video conferencing (ID: 161 012 5238):
- Meeting URL: https://jlab-org.zoomgov.com/j/1610125238?pwd=QnEvcjV6VFFndWZsQW15SmJKU0RJZz09&from=addon
- Meeting ID: 161 012 5238
- Passcode: 503371
- Phone dial-in (US): +1 669 254 5252, +1 646 828 7666, +1 551 285 1373, +1 669 216 1590, or 833 568 8864 (Toll Free); enter the meeting ID and passcode followed by #
- Room system: dial bjn.vc or 199.48.152.152 and enter your meeting ID & passcode
Agenda:
- Previous meeting
- Status:
- Using ESnet FPGA f/w build 28 April
- Specs
- Jumbo Frames
- arp, ping, ICMP filtering
- Port entropy
- Script based LB Control Plane
- Support C libraries for LB Host Control Plane - in <s>unit test</s> <s>code review</s> legal review
- ESnet smartnic open-source GitHub repo - in legal review
- ESnet private, forkable Jlab P4 and simulations GitHub repo - in legal review
- ERSAP feed end bottleneck needs investigation; Timmer's blaster may provide relief
- New machines (6) rec'd, installed w/ Ubuntu 20.04 on EJFAT subnet (VLAN 937 172.19.22.0/24)
- Spare EJFAT equip loaners:
- (4) DAQ dev machines indra-s[1-3] 129.57.29/109.23[0-2]
- alkaid: 24 Xeon Gold 3.4 GHz cores, 100Gbs
- indra-s1: 24 Xeon Gold 3.0 GHz cores, 100Gbs
- indra-s2: 32 Xeon Gold 3.2 GHz cores, 100Gbs
- indra-s3: 32 Xeon Gold 2.3 GHz cores, 100Gbs, 750GB ram disk
- (4) DAQ Farm machines dafarm6[1-4] currently on 129.57.29.17[1-4] - each 32 Xeon 2.0 GHz cores - 1 Gbs NIC + (4) 10Gbs Spare NICs
- (4) Unbuilt DAQ Farm machines - each 32 Xeon 2.0 GHz cores - 1 Gbs NIC + (4) 10Gbs Spare NICs
- PR408549 (6) 100Gbs NICs - ETA 1 July
- PR408870 PR408938 (2) 100Gbs Arista switches, <s>transceivers, cables</s>, etc - ETA <s>1 July</s> 5 October
- Next Steps:
- EJFAT VLAN Checkout
- Network Performance:
- FPGA LB Throughput - max sustained 90Gbs
- Host NICs
- Host S/W Reassembly - better algorithms, buffering, asynchronicity, etc.
- EJFAT UDP Transmission Performance
- Need better parameters for event reassembly/reconstruction
- Control Plane
- Will interact with SLURM / Kubernetes
- Python based (?)
- Control Plane daemon for compute host (?)
- Demonstrate CP based flexibility/elasticity
- Look at iperf2 for network testing
- Look at ROCE / NIC
- SLURM env for EJFAT VLAN (Hess)
- DAQ/VTP Data Generation Test Harness
- Vivado Licenses for new machines (?) (Singh)
- ACAT 2022 Abstract
- Back Burner / Downstream:
- <s>Hall-B FT calorimeter and hodoscope streaming readout test</s> - OBE
- <s>May be able to use Abbott's indra-s1 setup</s>
- <s>May be able to use new VTP f/w with Hall-B VTP's - (Ben Raydo)</s>
- <s>CODA 3.10 + ERSAP for new VTP f/w</s>
- <s>CODA 2.0 (non-streaming) for old VTP f/w</s>
- <s>Diagram</s>
- <s>Hall-B to start taking data June 8</s>
- <s>Hall B VTPs on .167. subnet</s>
- HOSS - June
- parallelize writing of raw data files
- distribute raw data across multiple compute nodes for calibration skims
- 1 Gbs at hi-luminosity
- Hall-D comms with DAQ 109 subnet require network customization; (EJFAT subnet)
- Hall-D EJFAT use case
- Hall-D EJFAT Network Diagram
- DPDK - ESnet reports can stream 100 Gbps using DPDK.
- IPV6 testing
- RT2022 - August 01-05 Conference
- ACAT 2022 Paper
- RT 2022 Paper
- AOT
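
For the "EJFAT VLAN Checkout" item, one small piece of the check can be sketched in Python: confirming that a list of host addresses actually falls inside the EJFAT subnet quoted above (172.19.22.0/24). This is only a sketch under assumptions. The function names and the sample addresses are hypothetical, the VLAN tag (937) is not inspected, and nothing here reflects how the group actually performs its checkout.

```python
import ipaddress

# Subnet quoted in the Status section; VLAN ID 937 itself is not checked here.
EJFAT_SUBNET = ipaddress.ip_network("172.19.22.0/24")


def on_ejfat_subnet(addr: str) -> bool:
    """Return True if addr is a usable host address inside EJFAT_SUBNET."""
    ip = ipaddress.ip_address(addr)
    if ip not in EJFAT_SUBNET:
        return False
    # Exclude the network and broadcast addresses, which are not host addresses.
    return ip not in (EJFAT_SUBNET.network_address, EJFAT_SUBNET.broadcast_address)


def partition_hosts(addrs):
    """Split addresses into (inside, outside) relative to the EJFAT subnet."""
    inside = [a for a in addrs if on_ejfat_subnet(a)]
    outside = [a for a in addrs if not on_ejfat_subnet(a)]
    return inside, outside


if __name__ == "__main__":
    # Hypothetical sample: one address on the EJFAT subnet, one on the DAQ farm range.
    inside, outside = partition_hosts(["172.19.22.10", "129.57.29.171"])
    print("on EJFAT subnet:", inside)
    print("elsewhere:", outside)
```

A real checkout would also need reachability and MTU tests (for example with ping or iperf2, as listed under Next Steps); the sketch covers only the addressing step.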