Install an EJFAT Load Balancer
Latest revision as of 14:56, 15 November 2024
New Installation Preparations
Check for stale docker images:
docker image ls
Delete images with tags: esnet-smartnic-fw, smartnic-dpdk-docker, xilinx-labtools-docker, udplbd
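The cleanup can be scripted. The following is a minimal sketch, not part of the original procedure: the helper name `stale_image_ids` is hypothetical, and it assumes the four names above show up in the repository column of `docker image ls`.

```shell
# Hypothetical helper: read "REPOSITORY:TAG ID" lines on stdin and print the
# unique IDs of images whose name contains one of the stale repository names.
stale_image_ids() {
  grep -E '(esnet-smartnic-fw|smartnic-dpdk-docker|xilinx-labtools-docker|udplbd)' \
    | awk '{print $2}' | sort -u
}

# Usage (review the first listing before running the second command):
#   docker image ls --format '{{.Repository}}:{{.Tag}} {{.ID}}' | stale_image_ids
#   docker image ls --format '{{.Repository}}:{{.Tag}} {{.ID}}' | stale_image_ids | xargs -r docker image rm
```

Listing the IDs first and deleting in a second step keeps an unrelated image with a similar name from being removed by accident.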
Initial setup:
mkdir ~/esnet
cd ~/esnet
git clone --recursive https://github.com/esnet/xilinx-labtools-docker
git clone --recursive https://github.com/esnet/smartnic-dpdk-docker
git clone --recursive https://github.com/esnet/esnet-smartnic-fw
git clone https://github.com/JeffersonLab/ersap-grpc.git
git clone https://github.com/esnet/udplbd.git
Proper revisions:
| Purpose | Version | Container | Revision |
|---|---|---|---|
| HW | 57684 | udplb | c6956b46 |
| FW | 58131 | esnet-smartnic-fw | a07943f0 |
| SW | 0.3.2 | udplbd | 5712d10 |
| Lab | 57755 | xilinx-labtools-docker | 977a5678 |
| DPDK | 57593 | smartnic-dpdk-docker | fd1ea53 |
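Once the checkouts in the sections below are done, the pinned revisions can be double-checked with a small helper. This is a sketch only: the function name `at_revision` is hypothetical, and it simply compares the start of `git rev-parse HEAD` with the abbreviated revision from the table.

```shell
# Hypothetical helper: succeed if the checked-out commit of a repository starts
# with the pinned (possibly abbreviated) revision from the table.
at_revision() {
  repo_dir=$1
  pinned=$2
  head_rev=$(git -C "$repo_dir" rev-parse HEAD) || return 2
  case "$head_rev" in
    "$pinned"*) return 0 ;;
    *) echo "$repo_dir: HEAD is $head_rev, expected $pinned" >&2; return 1 ;;
  esac
}

# Usage:
#   at_revision ~/esnet/xilinx-labtools-docker 977a5678
#   at_revision ~/esnet/smartnic-dpdk-docker   fd1ea53
#   at_revision ~/esnet/esnet-smartnic-fw      a07943f0
#   at_revision ~/esnet/udplbd                 5712d10
```

The HW row refers to the firmware artifacts file rather than a cloned repository, so it is not covered by this check.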
Xilinx Support Tools:
Required Binaries
cp /daqfs/ejfat/Downloads/xilinx/Vivado_Lab_Lin_2023.2_1013_2256.tar.gz ~/esnet/xilinx-labtools-docker/vivado-installer/
cp /daqfs/ejfat/Downloads/xilinx/loadsc_v2.3.zip ~/esnet/xilinx-labtools-docker/sc-fw-downloads
cp /daqfs/ejfat/Downloads/esnet/SC_U280_4_3_31.zip ~/esnet/xilinx-labtools-docker/sc-fw-downloads
Docker build for Xilinx Labtools:
cd ~/esnet/xilinx-labtools-docker
git checkout 977a5678
Comment out the following lines in the Dockerfile:
# Download and extract a few versions of the Satellite Controller firmware packages
# https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/alveo.html
# ARG SC_FW_BASE_URL="https://www.xilinx.com/bin/public/openDownload?filename="
# ARG SC_FW_U280_PKGS="xilinx-u280-gen3x16-xdma_2023.1_2023_0507_2220-all.deb.tar.gz xilinx-u280-gen3x16-xdma_2022.1_2022_0804_1110-all.deb.tar.gz"
# ARG SC_FW_U55C_PKGS="xilinx-u55c-gen3x16-xdma_2023.1_2023_0507_2220-all.deb.tar.gz xilinx-u55c-gen3x16-xdma_2022.1_2022_0415_2123-all.deb.tar.gz"
# RUN \
#   cd /sc-fw-downloads && \
#   for f in $SC_FW_U280_PKGS $SC_FW_U55C_PKGS ; do \
#     echo "Fetching: $SC_FW_BASE_URL$f" ; \
#     wget -qO- "$SC_FW_BASE_URL$f" | tar xz --wildcards 'xilinx-sc-fw*.deb' ; \
#   done ; \
#   mkdir -p /sc-fw && \
#   for sc in /sc-fw-downloads/xilinx-sc-fw*.deb ; do \
#     dpkg-deb --fsys-tarfile "$sc" | tar x -C /sc-fw --strip-components 6 --wildcards './opt/xilinx/firmware/sc-fw/*/sc-fw-*.txt' ; \
#   done
Follow instructions in README.md
Docker build for DPDK:
cd ~/esnet/smartnic-dpdk-docker
git checkout fd1ea53
Follow instructions in README.md
Docker build for smartnic:
cd ~/esnet/esnet-smartnic-fw
git checkout a07943f0
The EJFAT firmware is developed by ESnet and obtained from them as an artifacts file:
SN_HW_VER=57684
SN_HW_APP_NAME=udplb
cp /daqfs/ejfat/Downloads/esnet/artifacts.au280.$SN_HW_APP_NAME.$SN_HW_VER.zip ~/esnet/esnet-smartnic-fw/sn-hw
Follow instructions in README.md up to and including (if necessary) the following lines:
mkdir -p ~/.docker/cli-plugins/
curl -SL https://github.com/docker/compose/releases/download/v2.27.1/docker-compose-linux-x86_64 -o ~/.docker/cli-plugins/docker-compose
chmod +x ~/.docker/cli-plugins/docker-compose
Update the cloned repo:
git submodule init
git submodule update
Modify the .env file:
cp example.env .env
The following .env variables must be populated:
Note that the Docker images can be pulled from a remote es.net repository; otherwise they are taken from a local Docker repository.
SMARTNIC_DPDK_IMAGE_URI=<REPOSITORY:TAG>
Similarly,
LABTOOLS_IMAGE_URI=<REPOSITORY:TAG>
Uncomment and set the following lines:
SN_HW_APP_NAME=udplb
SN_HW_BOARD=au280
SN_HW_VER=57684
SN_FW_VER=44124 #Note this value is useful but not critical; can be set to zero
Build the firmware:
./build.sh
Modify the sn-stack/.env file:
Uncomment and set the following lines:
COMPOSE_PROFILES=smartnic-mgr-vfio-unlock
Uncomment and set the JTAG serial code:
Execute the bash command:
SER=`sudo lsusb -v -d 0403:6011 | grep iSerial|tr -s ' '|cut -f4 -d' '`
e.g., 21770323600G
HW_TARGET_SERIAL=${SER}A #Note the appended 'A' char
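To see what the field extraction does, the same pipeline can be run on a sample descriptor line. This is an illustrative sketch: the function name `jtag_serial` is hypothetical and the sample line (including its spacing) is an assumption modelled on the example serial above, not captured output.

```shell
# Same field extraction as the command above, reading lsusb output on stdin.
jtag_serial() {
  grep iSerial | tr -s ' ' | cut -f4 -d' '
}

# Assumed sample of an `lsusb -v` descriptor line (spacing is illustrative).
sample='  iSerial                 3 21770323600G'

# tr squeezes the runs of spaces, so field 4 (after the empty leading field,
# "iSerial" and the string index) is the serial itself.
SER=$(printf '%s\n' "$sample" | jtag_serial)
echo "HW_TARGET_SERIAL=${SER}A"
```

If more than one device with ID 0403:6011 is attached, the pipeline yields one serial per line, so the value should be checked before it is copied into the .env file.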
Uncomment and set the FPGA PCI device code:
Execute the bash command:
FPDC=`lspci -Dd 10ee:|head -1|tr -s ' '|cut -f1 -d' '|cut -f1 -d'.'`
e.g., 0000:a1:00
FPGA_PCIE_DEV=$FPDC
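The extraction above keeps the domain, bus and device and drops the function number. A minimal sketch on assumed sample input (the function name `fpga_pcie_dev` is hypothetical, and the device descriptions are illustrative rather than captured from a real card):

```shell
# Same pipeline as above, reading `lspci -Dd 10ee:` output on stdin.
fpga_pcie_dev() {
  head -1 | tr -s ' ' | cut -f1 -d' ' | cut -f1 -d'.'
}

# Assumed sample: one card exposing two PCIe functions (.0 and .1).
FPDC=$(printf '%s\n' \
  '0000:a1:00.0 Processing accelerators: Xilinx Corporation Device 903f' \
  '0000:a1:00.1 Processing accelerators: Xilinx Corporation Device 913f' \
  | fpga_pcie_dev)
echo "FPGA_PCIE_DEV=$FPDC"
```

Because of `head -1`, only the first Xilinx device is considered; on a host with several cards the desired one has to be selected by hand.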
Uncomment and set the following lines:
SN_HOST=${HOSTNAME}-dp.jlab.org #Note this is the data plane's (FPGA) well-known IPv4 address or network name
Uncomment and set the RPC auth token:
Execute the bash command:
AUTH=`openssl rand -base64 24`
e.g., 1CEpuDN0z39AFndEvcP3EmsuT8zu+3lt
SN_CFG_AUTH_TOKEN=$AUTH
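Before moving on, it can help to confirm that every variable discussed in this section is actually set in sn-stack/.env. The following is a sketch under the assumption that each setting appears as an uncommented `KEY=value` line; the helper name `missing_env_keys` is hypothetical.

```shell
# Hypothetical helper: print each key that has no uncommented, non-empty
# KEY=value assignment in the given env file.
missing_env_keys() {
  env_file=$1
  shift
  for key in "$@"; do
    grep -Eq "^${key}=.+" "$env_file" || echo "$key"
  done
}

# Usage (no output means all five keys are set):
#   missing_env_keys ~/esnet/esnet-smartnic-fw/sn-stack/.env \
#     COMPOSE_PROFILES HW_TARGET_SERIAL FPGA_PCIE_DEV SN_HOST SN_CFG_AUTH_TOKEN
```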
Modify the sn-stack/docker-compose.yml file:
In the smartnic-hw/command: section, uncomment the FORCE argument line in the /scripts/program_card.sh invocation:
command:
  - /bin/bash
  - -c
  - -e
  - -o
  - pipefail
  - -x
  - |
    if [ ! -e /bitfiles/ok ] ; then
      exit 1
    fi
    /scripts/program_card.sh \
      xilinx-hwserver:3121 \
      "${HW_TARGET_SERIAL:-*}" \
      /bitfiles/esnet-smartnic.bit \
      $FPGA_PCIE_DEV \
      FORCE #### <- ###################
    if [ $$? ] ; then
      touch /status/ok
      sleep infinity
    fi
In older configurations it is required to expose TCP port 50051 (smartnic-p4) outside of the *firmware* docker stack so that the external control plane can reach the p4 agent. This is needed for retro-fitting older firmware with the newer FW / control-plane split. Newer firmware doesn't need this port fixup.
Exposing the p4 agent TCP port is done by adding this stanza to the "smartnic-p4" section:
ports:
  - "50051:50051"
Add the following lines to the end of the smartnic-p4: section:
logging:
  options:
    max-file: 5
    max-size: 100m
Verify the sn-stack/docker-compose.yml:
cd sn-stack
docker compose config --quiet && echo "All good!"
If applicable, follow instructions in esnet-smartnic-fw/sn-stack/README.INSTALL.md for: One-Time setup:
Converting from factory flash image to ESnet Smartnic flash image
Perform a cold-boot (power cycle) of the server hosting the FPGA card
It is essential that this is a proper power cycle and not simply a warm reboot. Specifically, do not use:
shutdown -r now
Instead, use:
shutdown -P
Then, remotely (smokenmirrors):
ipmitool -I lanplus -U ejfat -L Operator -H $HOSTNAME-bmc.jlab.org chassis power status
ipmitool -I lanplus -U ejfat -L Operator -H $HOSTNAME-bmc.jlab.org chassis power on
Failure to perform a cold-boot here will result in an unusable card.
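Since the card is only reset if the chassis really powers down, the power-on step can be gated on the reported status. This is a sketch, assuming `ipmitool ... chassis power status` prints a line of the form "Chassis Power is off" or "Chassis Power is on"; the helper name `chassis_is_off` is hypothetical.

```shell
# Hypothetical helper: succeed only if the given ipmitool status text reports
# that the chassis is powered off.
chassis_is_off() {
  case "$1" in
    *"Chassis Power is off"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (same ipmitool options as above; power on only once the host is off):
#   status=$(ipmitool -I lanplus -U ejfat -L Operator -H $HOSTNAME-bmc.jlab.org chassis power status)
#   chassis_is_off "$status" && \
#     ipmitool -I lanplus -U ejfat -L Operator -H $HOSTNAME-bmc.jlab.org chassis power on
```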
Normal Operation of the Runtime Environment:
docker compose up -d
Verify that
docker compose -f ~/esnet/esnet-smartnic-fw/sn-stack/docker-compose.yml exec smartnic-fw sn-cli dev version
Returns something like:
Device Version Info
  DNA:          0x40020000013b83c12c108485
  USR_ACCESS:   0x0000ac1b (57684)
  BUILD_STATUS: 0x12211043
docker compose -f ~/esnet/esnet-smartnic-fw/sn-stack/docker-compose.yml logs smartnic-fw
Returns something like:
smartnic-fw-1 | + sleep infinity
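The version check can be automated by pulling the decimal value out of the USR_ACCESS line. This is a sketch only: the helper name `usr_access_version` is hypothetical, and comparing the value against SN_HW_VER is an assumption based on the sample output above showing 57684.

```shell
# Hypothetical helper: print the decimal number shown in parentheses on the
# USR_ACCESS line of `sn-cli dev version` output read from stdin.
usr_access_version() {
  sed -n 's/.*USR_ACCESS:[^(]*(\([0-9][0-9]*\)).*/\1/p'
}

# Usage:
#   ver=$(docker compose -f ~/esnet/esnet-smartnic-fw/sn-stack/docker-compose.yml \
#           exec smartnic-fw sn-cli dev version | usr_access_version)
#   [ "$ver" = "57684" ] && echo "hardware version matches SN_HW_VER"
```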
Library build for ersap-grpc
cd ~/esnet/ersap-grpc/
git switch esnet3
git checkout a3b85c3868554380e12759f23335eaf3fead2441
export GRPC_INSTALL_DIR=/daqfs/ersap/installation3
Follow instructions in README.md
Note: It is typically not necessary to install or build gRPC yourself; the GRPC_INSTALL_DIR line above points to an existing installation.
Docker build for Control Plane:
cd ~/esnet/udplbd/
git checkout 5712d10
cp /daqfs/ejfat/Downloads/JLab/JLabCA.crt ~/esnet/udplbd/
Modify docker-compose.yml
Mount host filespace for /data
services:
  volumes:
    - ./data:/data
Mount host TLS cert location for /certs
services:
  volumes:
    - /etc/letsencrypt/archive/<machine>.jlab.org:/certs
Remove the leftover udplbd database file:
rm ~/esnet/udplbd/data/udplbd.db
Follow instructions in README.md
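Modify /etc/config.yml
a. Specify the FPGA DP IPv4/IPv6 addresses (up to 8)
b. Specify the FPGA DP MAC unicast/broadcast addresses
c. Put the host IPv4 address for CP event numbers/host (sync)
d. Specify an event number/port for each address in step a
e. Put the host IPv4 address for CP server/host (grpc)
f. Specify an auth token for CP grpc comms
g. Optionally enable server/TLS
h. Optionally specify the container path to server/tls/certFile and server/tls/keyFile
i. Optionally perform steps g-h for smartnic/tls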
docker compose build
docker compose up -d
docker compose -f ~/esnet/udplbd/docker-compose.yml logs udplbd | less
Execute the FPGA CMAC setup procedure
cp /daqfs/ejfat/Downloads/esnet/u280_cmac_setup.sh ~/esnet/esnet-smartnic-fw/sn-stack/scratch
chmod +x ~/esnet/esnet-smartnic-fw/sn-stack/scratch/u280_cmac_setup.sh
docker compose -f ~/esnet/esnet-smartnic-fw/sn-stack/docker-compose.yml exec smartnic-fw /scratch/u280_cmac_setup.sh > ~/esnet/esnet-smartnic-fw/sn-stack/scratch/u280_cmac_setup.out