EJFAT EPSCI Meeting Oct. 23, 2024
Latest revision as of 18:42, 23 October 2024
The meeting time is 2:30pm.
Connection Info:
You can connect using the Teams link.
Agenda:
- Previous meeting
- Announcements:
- SuperComputing24 Atlanta, GA from Nov 17-22, 2024
- Status (https://wiki.jlab.org/epsciwiki/index.php/EJFAT)
- Topics
- Docker containers on reboot
- IRI Test Development:
- Last Test Thursday Oct 10
- LB version = ESnet ??? version
- Data Source:
- JLab, CLAS12, pre-triggered events - 1 channel
- Data Sink:
- Perlmutter - 40 nodes
- ORNL/ESnet/JLab IRI Testbed / Defiant - 4 nodes allocated
- JLab - 7 nodes available
- FABRIC - nodes available
- ERSAP
- Test Plans - JLab, ESnet, NERSC:
- Prometheus Dashboards
- The Prometheus dashboard is served on port 1717 of the ejfat-fs node. The test data is under "100g-nersc-ornl / ejfat-nersc-ornl". The test interval is roughly 17:05 to 18:20 UTC on August 29, 2024. To log in to Grafana, use "ejfat" as both username and password.
- JLab FEG/SRO
- will use interim UDP solution for event sync
- Special Events Issue - Completely Out-of-band
- Cloud Based message queue
- LB isolation from any non-LB processing
- E2SAR 0.1.2
- ejfat-5 reserved for E2SAR
- segmentation/reassembly complete
- .deb packages for Ubuntu 20, 22 and 24 are now available (they contain E2SAR library, headers, executables as well as appropriate versions of gRPC and Boost dependencies, all installed under /usr/local), as well as the latest Docker image
- Special Events: Site Issue: Cloud solutions include Kafka, RabbitMQ, ...
- ALS: Next is Tech Talk
- IB
- ejfat-3 - two FPGA DP built, running - with 4-port LAG at switch, needs CP installation
- ejfat-6 - Ubuntu 24.04 installed - esnet-smartnic-fw build succeeds with podman, issues with podman compose
- SC poster submitted - demo in works
- Storage
- SSD drives on ejfat-fs - 20TB used of 28TB - mounted for EJFAT farm
- Ram Disks: 1TB Total Mem on ejfat-fs, 0.5 TB others
- Repurposing /dev/sdb to be used for user storage
- Storage Areas NOT to be backed up could be marked as scratch
- Opportunity to consolidate wares on SSD for a consistent SC backup procedure?
- Experiment Halls - beam returns late January/February 2025
- Ubuntu 20.04 LTS - support ends in 2025 - next ESnet target 22.04
- CP: Control Web UI (127.0.0.1:8081) needs SSH tunnel
- EJFAT II: New CP-DP APIs for config available for current FW (?)
- ESnet interested in partnering for beachhead in FPGA/GPU AI space
- Separate Project
- FAST program coming up
- May get free help from Xilinx
- Might target Versal release
- Resources:
- U280s are discontinued.
- New LB purchases U55C
- U55C bitfiles available 1 year out
- U280 Supported indefinitely
- ACAT 2024 Paper (https://www.overleaf.com/project/667d9fa6b50f340b46026ba3) - In Review
- AOT
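
On the CP item above (Control Web UI on 127.0.0.1:8081 needs SSH tunnel): a minimal sketch of the port forward, written as a Python helper that builds the ssh command line. The host string "user@cp-host" is a placeholder, not a name from these notes, and reusing 8081 as the local port is an assumption. Only the standard OpenSSH flags -N and -L are used.

```python
import shlex


def build_tunnel_command(ssh_host, ui_port=8081, local_port=8081):
    """Return the argv for an SSH local port forward to the CP web UI.

    ssh_host is whatever login string reaches the CP node (placeholder here).
    """
    # -L local_port:127.0.0.1:ui_port forwards a port on this machine to the
    # UI, which listens only on the loopback interface of the remote node.
    forward = f"{local_port}:127.0.0.1:{ui_port}"
    # -N opens the forward without running a remote command.
    return ["ssh", "-N", "-L", forward, ssh_host]


if __name__ == "__main__":
    # "user@cp-host" is hypothetical; substitute the real CP login.
    print(shlex.join(build_tunnel_command("user@cp-host")))
```

With the tunnel running, the UI should be reachable in a local browser at http://127.0.0.1:8081, assuming the default local port was kept.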
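
On the Prometheus Dashboards item above: Grafana accepts an absolute time range as `from` and `to` URL parameters in epoch milliseconds, so the quoted test window (17:05 to 18:20 UTC, August 29, 2024) can be turned into a query string as sketched below. The dashboard URL itself is not given in these notes, so the sketch only produces the query part. The helper names are illustrative.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to integer epoch milliseconds."""
    return int(moment.timestamp()) * 1000


def grafana_time_query(start_utc, end_utc):
    """Build the from/to query string for an absolute Grafana time range."""
    return urlencode({"from": to_epoch_ms(start_utc), "to": to_epoch_ms(end_utc)})


# Test window taken from the agenda item; both times are UTC.
TEST_START = datetime(2024, 8, 29, 17, 5, tzinfo=timezone.utc)
TEST_END = datetime(2024, 8, 29, 18, 20, tzinfo=timezone.utc)

if __name__ == "__main__":
    # Append the printed string after "?" on the dashboard URL
    # (served on port 1717 of ejfat-fs; the exact path is not in the notes).
    print(grafana_time_query(TEST_START, TEST_END))
```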