Difference between revisions of "EJFAT EPSCI Meeting Apr. 27, 2022"

(Created page with "The meeting time is 2:00pm. === Connection Info: === <div class="toccolours mw-collapsible mw-collapsed"> You can connect using [https://jlab-org.zoomgov.com/j/1612038101?pwd...")
 
 
(2 intermediate revisions by the same user not shown)
Line 31: Line 31:
 
<!-------------------------------------------------------------------------------------------------->
 
<!-------------------------------------------------------------------------------------------------->
 
=== Agenda: ===
 
=== Agenda: ===
* [[EJFAT EPSCI Meeting Mar. 30, 2022 | Previous meeting]]
+
* [[EJFAT EPSCI Meeting Apr. 20, 2022 | Previous meeting]]
 
*:
 
*:
 
* Situation:
 
* Situation:
** Testing with End-to-end EJFAT ERSAP solution on FPGA LB
+
** <s>Testing with End-to-end EJFAT ERSAP solution on FPGA LB</s>
** Jumbo Frames - indra-s2,s3, alkaid, fpga
+
** <s>Jumbo Frames - indra-s2,s3, alkaid, fpga</s>
 
** Linux IP stack buffer size increased
 
** Linux IP stack buffer size increased
 
** Using script based LB Control Plane
 
** Using script based LB Control Plane
Line 48: Line 48:
 
*** MTU size restricted to max 1500, update pending
 
*** MTU size restricted to max 1500, update pending
 
*** Possibly incorrect control message responses being investigated
 
*** Possibly incorrect control message responses being investigated
** Peter Nugent/LBNL: LDRD LBNL+JLAB combines JLAB streaming + LBNL compute
 
 
* Pending:
 
* Pending:
** Compute Equip.- ETA 1 June - currently on track
+
** Compute Equip.- ETA 1 June - currently on track sans 100Gbs NICs
** Networking Equip. - ETA 1 July
+
** Networking Equip. - ETA <s>1 July</s> 5 October
 
** Support C libraries for LB Host Control Plane
 
** Support C libraries for LB Host Control Plane
 
** ESnet smartnic open-source GitHub repo (April)
 
** ESnet smartnic open-source GitHub repo (April)
Line 68: Line 67:
 
***** Use EJFAT LB firmware that supports load balancing of multiple source/data_IDs.
 
***** Use EJFAT LB firmware that supports load balancing of multiple source/data_IDs.
 
***** EJFAT LB + RA engine performance as an event builder (i.e. source ID aggregator), a capability to send all available data_IDs with the same tick (time frame) to a single destination. NB Hall-B FT calorimeter and hodoscope setup is capable of sending up to 12 data streams.
 
***** EJFAT LB + RA engine performance as an event builder (i.e. source ID aggregator), a capability to send all available data_IDs with the same tick (time frame) to a single destination. NB Hall-B FT calorimeter and hodoscope setup is capable of sending up to 12 data streams.
 +
***** May be able to use Abbott's indra-s1 setup
 +
***** May be able to use new VTP f/w with Hall-B VTPs - Abbott's return
 +
***** CODA 3.10 + ERSAP
 
*** Control Plane
 
*** Control Plane
 
**** Feedback from Compute hosts
 
**** Feedback from Compute hosts
 
**** Control Plane daemon for compute host
 
**** Control Plane daemon for compute host
 +
**** [https://www.epj-conferences.org/articles/epjconf/abs/2021/05/epjconf_chep2021_04005/epjconf_chep2021_04005.html HOSS Hall-D EJFAT  use case]
 +
***** parallelize writing of raw data files
 +
***** distribute raw data across multiple compute nodes for calibration skims
 +
***** 1 Gbs at hi-luminosity
 +
***** Demonstrate CP based flexibility/elasticity
 +
***** Hall-D comms with indra-s2 (DAQ 109 subnet) require network customization
 +
***** [https://docs.google.com/presentation/d/1m3rFm-1GymYv8zGimlAjL1NmWtXVfyIQdGzhx_j_BKE/edit?usp=sharing Slides]
 +
***** [https://jeffersonlab-my.sharepoint.com/personal/bmorris_jlab_org/Documents/Microsoft%20Teams%20Chat%20Files/JLab%20Network%20-%20HallD-to-EJFAT.png Network Diagram]
 
** Downstream:
 
** Downstream:
 
*** [http://www.dpdk.org DPDK]
 
*** [http://www.dpdk.org DPDK]
 
*** IPV6 testing
 
*** IPV6 testing
*** [https://www.epj-conferences.org/articles/epjconf/abs/2021/05/epjconf_chep2021_04005/epjconf_chep2021_04005.html HOSS Hall-D EJFAT  use case]
 
**** [https://docs.google.com/presentation/d/1m3rFm-1GymYv8zGimlAjL1NmWtXVfyIQdGzhx_j_BKE/edit?usp=sharing Slides]
 
**** parallelize writing of raw data files
 
**** distribute raw data across multiple compute nodes for calibration skims
 
 
*** [https://indico.cern.ch/event/1109460/ RT2022 - August 01-05 Conference]
 
*** [https://indico.cern.ch/event/1109460/ RT2022 - August 01-05 Conference]
 
* AOT
 
* AOT
 
<hr>
 
<hr>

Latest revision as of 17:57, 27 April 2022

The meeting time is 2:00pm.

Connection Info:

You can connect using ZoomGov Video conferencing (ID: 161 203 8101):

Meeting URL
https://jlab-org.zoomgov.com/j/1612038101?pwd=Yk96QUcyT1NDVTRRUGNtOFVSSTdaUT09&from=addon

Meeting ID
161 203 8101

Passcode
378382

Want to dial in from a phone?

Dial one of the following numbers:
US: +1 669 254 5252 or +1 646 828 7666 or +1 551 285 1373 or +1 669 216 1590 or 833 568 8864 (Toll Free)

Enter the meeting ID and passcode followed by #

Connecting from a room system?
Dial: bjn.vc or 199.48.152.152 and enter your meeting ID & passcode

Agenda:

  • Previous meeting
  • Situation:
    • Testing with End-to-end EJFAT ERSAP solution on FPGA LB (struck through)
    • Jumbo Frames - indra-s2,s3, alkaid, fpga (struck through)
    • Linux IP stack buffer size increased
    • Using script based LB Control Plane
    • ERSAP feed end bottleneck needs investigation; Timmer's blaster may provide relief
    • EJFAT subnet
      • VLAN 937 172.19.22.0/24
      • 10Gbs NIC, but cables needed (copper twinax will do)
    • DAQ Farm machines dafarm61 dafarm62 dafarm63 dafarm64 currently on 129.57.29.0/24 subnet w/ additional i/f available for EJFAT subnet
    • Testing newly received FPGA load
      • arp, ping - working
      • Port entropy field - currently broken, update pending
      • MTU size restricted to max 1500, update pending
      • Possibly incorrect control message responses being investigated
  • Pending:
    • Compute Equip.- ETA 1 June - currently on track sans 100Gbs NICs
    • Networking Equip. - ETA 5 October (was 1 July)
    • Support C libraries for LB Host Control Plane
    • ESnet smartnic open-source GitHub repo (April)
    • ESnet private, forkable Jlab P4 and simulations GitHub repo (April)
  • To Do:
    • Near Term:
      • Hall-B FT calorimeter and hodoscope streaming readout test (scheduled for week of 04/10 or 04/17)
        • 2 crates for FT calorimeter, furnished with fADC250s. Total 332 channels.
        • 1 crate for the hodoscope that has 12 fADC (total 192 channels)
        • Update firmware on all 3 VTPs to send fADC aggregated data frames in the EVIO format, through UDP.
        • Design ERSAP FE application containing an actor that receives VTP frames through UDP, extracts information about the source/data ID and frame number (tick), and sends it to the EJFAT packetizing engine.
        • Stage 1 testing: single stream/data-ID
          • EJFAT LB + reassembly (RA) engine overall performance dependence on data rate.
          • EJFAT LB + RA engine performance dependence on the EJFAT packetizing engine MTU settings
        • Stage 2 testing: multi-stream/data_ID
          • Use EJFAT LB firmware that supports load balancing of multiple source/data_IDs.
          • EJFAT LB + RA engine performance as an event builder (i.e. source ID aggregator), a capability to send all available data_IDs with the same tick (time frame) to a single destination. NB Hall-B FT calorimeter and hodoscope setup is capable of sending up to 12 data streams.
          • May be able to use Abbott's indra-s1 setup
          • May be able to use new VTP f/w with Hall-B VTPs - Abbott's return
          • CODA 3.10 + ERSAP
      • Control Plane
        • Feedback from Compute hosts
        • Control Plane daemon for compute host
        • HOSS Hall-D EJFAT use case
          • parallelize writing of raw data files
          • distribute raw data across multiple compute nodes for calibration skims
          • 1 Gbs at hi-luminosity
          • Demonstrate CP based flexibility/elasticity
          • Hall-D comms with indra-s2 (DAQ 109 subnet) require network customization
          • Slides
          • Network Diagram
    • Downstream:
      • DPDK
      • IPV6 testing
      • RT2022 - August 01-05 Conference
  • AOT
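
The Stage 2 event-builder item above (sending all available data_IDs with the same tick to a single destination) can be illustrated with a minimal Python sketch. This is not the EJFAT LB firmware logic. The modulo mapping in pick_destination, the TickAggregator class, and the node names are hypothetical stand-ins chosen only to show the idea: routing depends on the tick alone, and a time frame is released once every expected data_ID has arrived.

```python
from collections import defaultdict


def pick_destination(tick, nodes):
    """Choose a node from the tick alone (hypothetical modulo mapping).

    Because data_id plays no part in the choice, every stream that
    carries the same tick is sent to the same node.
    """
    return nodes[tick % len(nodes)]


class TickAggregator:
    """Collect frames per tick until all expected data_IDs are present."""

    def __init__(self, expected_ids):
        self.expected = frozenset(expected_ids)
        self.pending = defaultdict(dict)  # tick -> {data_id: payload}

    def add(self, tick, data_id, payload):
        """Store one frame; return the full {data_id: payload} set when
        the tick is complete, otherwise None."""
        if data_id not in self.expected:
            raise ValueError(f"unexpected data_id {data_id}")
        frames = self.pending[tick]
        frames[data_id] = payload
        if set(frames) == self.expected:
            return self.pending.pop(tick)
        return None


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical compute hosts
    agg = TickAggregator(expected_ids=[1, 2])
    for data_id in (1, 2):
        dest = pick_destination(7, nodes)
        built = agg.add(7, data_id, b"frame")
        print(data_id, dest, built is not None)
```

A real deployment would also need a timeout or eviction policy for ticks that never complete (for example when a stream drops a frame), which this sketch omits.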