Install an EJFAT Load Balancer

From epsciwiki
Revision as of 16:25, 13 November 2024 by Goodrich

New Installation Preparations

Check for stale docker images:

docker image ls

Delete images with tags: esnet-smartnic-fw, smartnic-dpdk-docker, xilinx-labtools-docker, udplbd
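The cleanup above can be scripted. The following is a minimal sketch, not an official procedure: `select_stale_ids` is a hypothetical helper name, and it assumes the names listed above appear as the last path component of the image repository.

```shell
# Repository names this page lists as stale.
STALE_REPOS="esnet-smartnic-fw smartnic-dpdk-docker xilinx-labtools-docker udplbd"

# Read "REPOSITORY:TAG IMAGE_ID" lines on stdin and print the IDs of images
# whose repository basename matches one of the stale names.
select_stale_ids() {
    local ref id repo name
    while read -r ref id; do
        repo=${ref%:*}      # strip the trailing :TAG
        repo=${repo##*/}    # strip any registry/namespace prefix
        for name in $STALE_REPOS; do
            if [ "$repo" = "$name" ]; then
                echo "$id"
            fi
        done
    done
}
```

A possible invocation, after reviewing the list by eye: `docker image ls --format '{{.Repository}}:{{.Tag}} {{.ID}}' | select_stale_ids | sort -u | xargs -r docker image rm`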

Initial setup:

mkdir ~/esnet 

cd ~/esnet 

git clone --recursive https://github.com/esnet/xilinx-labtools-docker 

git clone --recursive https://github.com/esnet/smartnic-dpdk-docker 

git clone --recursive https://github.com/esnet/esnet-smartnic-fw 

git clone https://github.com/JeffersonLab/ersap-grpc.git

git clone https://github.com/esnet/udplbd.git 

Set proper revisions:

Stable Lineup


HW    57684  udplb @ c6956b46
FW    58131  esnet-smartnic-fw @ a07943f0
SW    -      udplbd @ 5712d10 (v0.3.2)
Lab   57755  xilinx-labtools-docker @ 977a5678
DPDK  57593  smartnic-dpdk-docker @ xxxxxxxx
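After the checkouts in the later steps, it can be worth confirming that each clone sits on the pinned revision. The following is a sketch under stated assumptions: `lineup`, `expected_rev` and `rev_matches` are hypothetical helper names, and only repositories whose revision is given in the Stable Lineup are listed.

```shell
# Pinned short revisions copied from the Stable Lineup.
# smartnic-dpdk-docker is omitted because its revision is not given on this page.
lineup() {
    cat <<'EOF'
esnet-smartnic-fw a07943f0
udplbd 5712d10
xilinx-labtools-docker 977a5678
EOF
}

# Print the pinned revision for a repository name (empty if unknown).
expected_rev() {
    lineup | awk -v repo="$1" '$1 == repo { print $2 }'
}

# Succeed if the full commit hash in $1 starts with the short revision in $2.
# An empty $2 never matches.
rev_matches() {
    case "$1" in
        "$2"*) [ -n "$2" ] ;;
        *) return 1 ;;
    esac
}
```

For example: `rev_matches "$(git -C ~/esnet/udplbd rev-parse HEAD)" "$(expected_rev udplbd)" && echo "udplbd is on the pinned revision"`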


2. Xilinx support tools (order is important here):

cp /daqfs/ejfat/Downloads/xilinx/Vivado_Lab_Lin_2023.2_1013_2256.tar.gz ~/esnet/xilinx-labtools-docker/vivado-installer/

cp /daqfs/ejfat/Downloads/xilinx/loadsc_v2.3.zip ~/esnet/xilinx-labtools-docker/sc-fw-downloads

cp /daqfs/ejfat/Downloads/esnet/SC_U280_4_3_31.zip ~/esnet/xilinx-labtools-docker/sc-fw-downloads

3. Docker build for Xilinx Labtools:

cd ~/esnet/xilinx-labtools-docker

git checkout 977a5678

Remark out (comment out) the following lines in Dockerfile:

# Download and extract a few versions of the Satellite Controller firmware packages
# https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/alveo.html
ARG SC_FW_BASE_URL="https://www.xilinx.com/bin/public/openDownload?filename="
ARG SC_FW_U280_PKGS="xilinx-u280-gen3x16-xdma_2023.1_2023_0507_2220-all.deb.tar.gz xilinx-u280-gen3x16-xdma_2022.1_2022_0804_1110-all.deb.tar.gz"
ARG SC_FW_U55C_PKGS="xilinx-u55c-gen3x16-xdma_2023.1_2023_0507_2220-all.deb.tar.gz xilinx-u55c-gen3x16-xdma_2022.1_2022_0415_2123-all.deb.tar.gz"
RUN \
    cd /sc-fw-downloads && \
    for f in $SC_FW_U280_PKGS $SC_FW_U55C_PKGS ; do \
        echo "Fetching: $SC_FW_BASE_URL$f" ; \
        wget -qO- "$SC_FW_BASE_URL$f" | tar xz --wildcards 'xilinx-sc-fw*.deb' ; \
    done ; \
    mkdir -p /sc-fw && \
    for sc in /sc-fw-downloads/xilinx-sc-fw*.deb ; do \
        dpkg-deb --fsys-tarfile "$sc" | tar x -C /sc-fw --strip-components 6 --wildcards './opt/xilinx/firmware/sc-fw/*/sc-fw-*.txt' ; \
    done
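Remarking out this block by hand is error-prone, so it can also be done with sed. The following is a minimal sketch: `comment_out_sc_fw` is a hypothetical helper name, and it assumes the block starts at the `ARG SC_FW_BASE_URL=` line and ends at the first line containing only `done`, as in the listing above. Check the result with `git diff` before building.

```shell
# Read a Dockerfile on stdin and write it to stdout with every line from
# "ARG SC_FW_BASE_URL=" through the closing bare "done" prefixed by "# ".
# The intermediate "done ; \" line does not end the range because it is
# followed by more text.
comment_out_sc_fw() {
    sed '/^ARG SC_FW_BASE_URL=/,/^[[:space:]]*done[[:space:]]*$/ s/^/# /'
}
```

For example: `comment_out_sc_fw < Dockerfile > Dockerfile.new && mv Dockerfile.new Dockerfile`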


Follow the instructions in README.md.


4. Docker build for DPDK:

cd ~/esnet/smartnic-dpdk-docker

git checkout xxxxxxxx

Follow the instructions in README.md.


5. Docker build for smartnic:

cd ~/esnet/esnet-smartnic-fw

git checkout a07943f0

The EJFAT firmware is engineered by ESnet and obtained from them as an artifacts file:

SN_HW_VER=57684

SN_HW_APP_NAME=udplb

cp /daqfs/ejfat/Downloads/esnet/artifacts.au280.$SN_HW_APP_NAME.$SN_HW_VER.zip ~/esnet/esnet-smartnic-fw/sn-hw

Follow the instructions in README.md up to and including (if necessary) the following lines:

$ mkdir -p ~/.docker/cli-plugins/
$ curl -SL https://github.com/docker/compose/releases/download/v2.27.1/docker-compose-linux-x86_64 -o ~/.docker/cli-plugins/docker-compose
$ chmod +x ~/.docker/cli-plugins/docker-compose

git submodule init

git submodule update

cp example.env .env


5.0 Modify the .env file:
The following env var lines must be populated:

SMARTNIC_DPDK_IMAGE_URI=<REPOSITORY:TAG>

Here <REPOSITORY:TAG> is that of the smartnic-dpdk-docker Docker image built above. example.env implies retrieval from a remote es.net repository, but here the image is retrieved from the local Docker repository instead.
Similarly,

LABTOOLS_IMAGE_URI=<REPOSITORY:TAG>

Here <REPOSITORY:TAG> is that of the xilinx-labtools-docker Docker image built above.
Un-remark (uncomment) and set the following lines:

SN_HW_APP_NAME=udplb

SN_HW_BOARD=au280

Here <version> = 57684:

SN_HW_VER=<artifacts version number from above, e.g., 57684>

SN_FW_VER=44124

Note: the SN_FW_VER value is useful but not critical; it can be set to zero.
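The "un-remark and set" edits in this and the following steps can be scripted. The following is a minimal sketch, assuming the .env file uses plain KEY=value lines and that remarked-out entries are prefixed with "#" (the exact layout of example.env is an assumption). `set_env_var` is a hypothetical helper, not part of esnet-smartnic-fw.

```shell
# set_env_var FILE KEY VALUE
# Replace the first "KEY=..." line (remarked out or not) with "KEY=VALUE",
# or append the line if the key is not present.
# Caveat: awk -v interprets backslash escapes, so VALUE must not contain "\".
set_env_var() {
    local file=$1 key=$2 value=$3 tmp
    tmp=$(mktemp) || return 1
    awk -v key="$key" -v value="$value" '
        # Match KEY= at line start, allowing leading "#", spaces or tabs.
        !seen && ($0 ~ ("^[# \t]*" key "=")) { print key "=" value; seen = 1; next }
        { print }
        END { if (!seen) print key "=" value }
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}
```

For example: `set_env_var .env SN_HW_APP_NAME udplb`, `set_env_var .env SN_HW_BOARD au280` and `set_env_var .env SN_HW_VER 57684`.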


5.1 Build the firmware:

./build.sh


5.2 Modify the sn-stack/.env file:
5.2.1 Un-remark and set the following lines:

COMPOSE_PROFILES=smartnic-mgr-vfio-unlock


5.2.2 Un-remark and set the JTAG serial code:
Execute the bash cmd:

sudo lsusb -v -d 0403:6011 | grep iSerial

e.g., 21770323600G

HW_TARGET_SERIAL=21770323600GA

Note the appended 'A' character.
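The serial lookup and the appended 'A' can be combined in one step. The following is a sketch, assuming `lsusb -v` prints the serial as the third field of its `iSerial` line; `jtag_target_serial` is a hypothetical helper name.

```shell
# Read `lsusb -v` output on stdin and print the first iSerial string
# with the 'A' suffix that HW_TARGET_SERIAL expects.
jtag_target_serial() {
    awk '$1 == "iSerial" && NF >= 3 { print $3 "A"; exit }'
}
```

For example: `sudo lsusb -v -d 0403:6011 | jtag_target_serial`. If more than one adapter is attached, only the first serial is printed, so check the raw output as well.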


5.2.3 Un-remark and set the FPGA PCI device code:
Execute the bash cmd, either

lspci -Dd 10ee:

or

lspci | grep -i xilinx

e.g., 0000:a1:00

FPGA_PCIE_DEV=0000:a1:00
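The value is the PCI address without the trailing function number. The following is a sketch of extracting it from `lspci -Dd 10ee:` output; `fpga_pcie_dev` is a hypothetical helper name, and the device lines shown in the usage note are illustrative rather than captured from a real host.

```shell
# Read `lspci -D` output on stdin and print each distinct
# domain:bus:device address with the ".function" suffix removed.
fpga_pcie_dev() {
    awk '{ sub(/\.[0-7]$/, "", $1); print $1 }' | sort -u
}
```

For example, `lspci -Dd 10ee: | fpga_pcie_dev` would reduce lines beginning `0000:a1:00.0` and `0000:a1:00.1` to the single value `0000:a1:00`. A host with several Xilinx cards yields one line per card, and the right one still has to be picked by hand.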


5.2.4 Un-remark and set the following lines:

SN_HOST=ejfat-?-dp.jlab.org

Note: this is the data plane's (FPGA) well-known IPv4 address or network name.


5.2.5 Un-remark and set the RPC auth token:
Execute the bash cmd:

openssl rand -base64 24

e.g., 1CEpuDN0z39AFndEvcP3EmsuT8zu+3lt

SN_CFG_AUTH_TOKEN=1CEpuDN0z39AFndEvcP3EmsuT8zu+3lt


5.3 Modify the sn-stack/docker-compose.yml file:
5.3.1 In the smartnic-hw/command: section, uncomment the FORCE argument line in the /scripts/program_card.sh invocation.


5.3.2 Expose the p4 agent port (older configurations only):
In older configurations, TCP port 50051 (smartnic-p4) must be exposed outside of the *firmware* docker stack so that the external control plane can reach the p4 agent. This is needed when retrofitting older firmware with the newer FW / control-plane split. Newer firmware doesn't need this port fixup.
Expose the p4 agent TCP port by adding this stanza to the "smartnic-p4" section:

ports:
  - "50051:50051"


5.3.3 Add the following lines to the end of the smartnic-p4: section:

logging:
  options:
    max-file: 5
    max-size: 100m

Verify the sn-stack/docker-compose.yml:

cd sn-stack

docker compose config --quiet && echo "All good!"


5.4 If applicable, follow the instructions in esnet-smartnic-fw/sn-stack/README.INSTALL.md for one-time setup:
5.4.1 Converting from factory flash image to ESnet Smartnic flash image
5.4.2 Perform a cold boot (power cycle) of the server hosting the FPGA card
It is essential that this is a proper power cycle and not simply a warm reboot. Specifically, do not use:
$ shutdown -r now
Remotely (smokenmirrors):

ipmitool -I lanplus -U ejfat -L Operator -H ejfat-4-bmc.jlab.org chassis power status

ipmitool -I lanplus -U ejfat -L Operator -H ejfat-4-bmc.jlab.org chassis power off

ipmitool -I lanplus -U ejfat -L Operator -H ejfat-4-bmc.jlab.org chassis power on

Failure to perform a cold boot here will result in an unusable card.


5.4.3 Normal operation of the runtime environment:

docker compose up -d

Verify that the following command

docker compose -f ~/esnet/esnet-smartnic-fw/sn-stack/docker-compose.yml exec smartnic-fw sn-cli dev version

Returns something like:
Device Version Info
    DNA: 0x40020000013b83c12c108485
    USR_ACCESS: 0x0000ac1b (<version>)
    BUILD_STATUS: 0x12211043


docker compose -f ~/esnet/esnet-smartnic-fw/sn-stack/docker-compose.yml logs smartnic-fw

Returns something like:
smartnic-fw-1 | + sleep infinity
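When scripting the version check, the USR_ACCESS register can be pulled out of the `sn-cli dev version` output and converted to decimal. The following is a sketch only: `usr_access_decimal` is a hypothetical helper name, and whether the decimal value should equal SN_HW_VER on a given card is an assumption to confirm against your own build.

```shell
# Read `sn-cli dev version` output on stdin and print the USR_ACCESS
# register value in decimal. Returns non-zero if the field is missing.
usr_access_decimal() {
    local hex
    hex=$(awk '$1 == "USR_ACCESS:" { print $2; exit }')
    [ -n "$hex" ] && printf '%d\n' "$hex"
}
```

For example: `docker compose -f ~/esnet/esnet-smartnic-fw/sn-stack/docker-compose.yml exec smartnic-fw sn-cli dev version | usr_access_decimal`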


6. Library build for ersap-grpc:

cd ~/esnet/ersap-grpc/

git switch esnet3

git checkout a3b85c3868554380e12759f23335eaf3fead2441

export GRPC_INSTALL_DIR=/daqfs/ersap/installation3

Follow the instructions in README.md.
Note: it is typically not necessary to install or build gRPC; the GRPC_INSTALL_DIR line above points to an existing installation.

7. Docker build for Control Plane:

cd ~/esnet/udplbd/

git checkout 5712d10

cp /daqfs/ejfat/Downloads/JLab/JLabCA.crt ~/esnet/udplbd/

7.1 Modify docker-compose.yml
a. Mount host filespace for /data

services:

 volumes:
   - ./data:/data

b. Mount host TLS cert location for /certs

services:

 volumes:
   - /etc/letsencrypt/archive/<machine>.jlab.org:/certs

7.2 Remove the leftover udplbd database file:

rm ~/esnet/udplbd/data/udplbd.db

7.3 Follow the instructions in README.md
7.3a Modify /etc/config.yml
a. Specify the FPGA DP IPv4/IPv6 addresses (up to 8)
b. Specify the FPGA DP MAC unicast/broadcast addresses
c. Put the host IPv4 address for CP event numbers/host (sync)
d. Specify an event number/port for each address in 7.3a.a
e. Put the host IPv4 address for CP server/host (grpc)
f. Specify an auth token for CP grpc comms
g. Optionally enable server/TLS
h. Optionally specify the container path to server/tls/certFile and server/tls/keyFile
i. Optionally perform steps 7.3a.g-h for smartnic/tls

docker compose build

docker compose up -d

7.4 Check the udplbd logs:

docker compose -f ~/esnet/udplbd/docker-compose.yml logs udplbd | less


8. Execute the FPGA CMAC setup procedure:

scp <somewhere>/u280_cmac_setup.sh ~/esnet/esnet-smartnic-fw/sn-stack/scratch

chmod +x ~/esnet/esnet-smartnic-fw/sn-stack/scratch/u280_cmac_setup.sh

docker compose -f ~/esnet/esnet-smartnic-fw/sn-stack/docker-compose.yml exec smartnic-fw /scratch/u280_cmac_setup.sh > ~/esnet/esnet-smartnic-fw/sn-stack/scratch/u280_cmac_setup.out