EJFAT EPSCI Meeting Mar. 9, 2022
Latest revision as of 18:41, 9 March 2022
The meeting time is 2:00pm.
Connection Info:
You can connect using ZoomGov video conferencing (ID: 161 203 8101):
- Meeting URL: https://jlab-org.zoomgov.com/j/1612038101?pwd=Yk96QUcyT1NDVTRRUGNtOFVSSTdaUT09&from=addon
- Meeting ID: 161 203 8101
- Passcode: 378382
- To dial in from a phone, dial one of the following numbers, then enter the meeting ID and passcode followed by #:
  - US: +1 669 254 5252, +1 646 828 7666, +1 551 285 1373, or +1 669 216 1590
  - 833 568 8864 (toll free)
- To connect from a room system, dial bjn.vc or 199.48.152.152 and enter your meeting ID and passcode.
Agenda:
- Previous meeting
- Situation:
  - Testing with end-to-end EJFAT ERSAP solution on FPGA LB
  - Jumbo Frames - indra-s2, fpga
  - Using script-based LB Control Plane
  - Awaiting Compute Equip. - ETA 1 June
  - Awaiting Networking Equip. - ETA 1 July
  - Building Interim Test Environments:
    - 129.57.109.0/24 subnet (100 Gb/s)
      - indra-s[1,2,3], alkaid
      - Benchmarks for RT2022 (April 1)
    - 129.57.172.0/22 subnet (1 Gb/s?) - old/idle Hall-D machines
- Pending:
  - Minor firmware change for 'garbage' packets (struck through)
  - Support C libraries for LB Host Control Plane
  - ESnet smartnic open-source GitHub repo (April)
  - ESnet private, forkable JLab P4 and simulations GitHub repo (April)
- To Do:
  - Near Term:
    - Test Plan (struck through)
    - Performance Measures (RT2022 - April 01 submission):
    - Interim Test Environments:
      - 129.57.109.0/24 subnet (100 Gb/s)
        - connect FPGA port #2 to switch
        - connect Mellanox 100 Gb/s NIC port #2 to switch
        - Jumbo Frames - indra-s[1,3], alkaid, 109 subnet switch?
      - 129.57.172.0/22 subnet (10 Gb/s)
        - ERSAP / EJFAT RE
    - Control Plane ARP poisoning (spoofing? proxy?)
      - arp -s ip_addr hw_addr netmask nm pub
      - sudo arp -i enp134s0 -s 129.57.109.254 00:aa:bb:cc:dd:ee netmask 255.255.255.0 pub
        - SIOCSARP: Invalid argument
      - sudo arp -i enp134s0 -s 129.57.109.254 00:aa:bb:cc:dd:ee pub
        - Address HWtype HWaddress Flags Mask Iface
        - 129.57.109.254 (incomplete) enp175s0f1
  - Downstream:
    - C-based control plane
      - Feedback from Compute hosts design
      - Control Plane ARP cache / network good citizen - P4 may do
    - Control Plane daemon for compute host
    - IPv6 testing
    - EJFAT Subnet
    - Hall-D EJFAT + SLURM use case
- Issues:
  - Abbott spare 8 nodes - OBE?
  - Hall-D spare 10 Gb/s NICs - OBE?
  - CentOS 7 install on interim boxes - OBE?
  - Use (2) spare/borrowed switches - OBE?
  - Diagram (https://jeffersonlab-my.sharepoint.com/:b:/r/personal/goodrich_jlab_org/Documents/EJFAT/EJFAT%20Network%20Setup.pdf?csf=1&web=1&e=hkUo8k) - OBE?
  - ejfat-sw-2022.jlab.org (129.57.29.83) - what is this guy?
  - Mellanox 40 Gb/s NIC in indra-s2
- AOT
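
As a quick sanity check on the addressing quoted in the agenda, the following is a minimal sketch, using only Python's standard `ipaddress` module, that prints the range and netmask of the two interim subnets and reports which of them (if any) contains the address used in the `arp -s` tests and the ejfat-sw-2022.jlab.org address. The constant names and the `describe` and `owning_subnet` helpers are hypothetical names introduced here for illustration; they are not part of any EJFAT or ERSAP code.

```python
import ipaddress

# Subnets and addresses quoted in the agenda.
FAST_NET = ipaddress.ip_network("129.57.109.0/24")   # 100 Gb/s interim subnet
HALLD_NET = ipaddress.ip_network("129.57.172.0/22")  # old/idle Hall-D machines
ARP_TARGET = ipaddress.ip_address("129.57.109.254")  # address used in the arp -s tests
SW_HOST = ipaddress.ip_address("129.57.29.83")       # ejfat-sw-2022.jlab.org


def describe(net):
    """Return (first address, last address, netmask, address count) for a subnet."""
    return (
        str(net.network_address),
        str(net.broadcast_address),
        str(net.netmask),
        net.num_addresses,
    )


def owning_subnet(addr, nets):
    """Return the first subnet in nets that contains addr, or None (hypothetical helper)."""
    for net in nets:
        if addr in net:
            return net
    return None


if __name__ == "__main__":
    nets = (FAST_NET, HALLD_NET)
    for net in nets:
        print(net, describe(net))
    for addr in (ARP_TARGET, SW_HOST):
        print(addr, "->", owning_subnet(addr, nets))
```

Under these assumptions, 129.57.109.254 falls inside the /24 (whose netmask matches the 255.255.255.0 passed to `arp`), while 129.57.29.83 lies in neither interim subnet. Separately, the net-tools arp(8) manual notes that since kernel 2.2.0 it is no longer possible to set an ARP entry for an entire subnet, which may be related to the `SIOCSARP: Invalid argument` result when `netmask` is given; this has not been confirmed for the hosts listed here.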