Use JANA2 + GPU in a Singularity container
Here are some instructions for building a Singularity container that can access NVIDIA GPU hardware.
These instructions for building the image were developed on a JLab CUE desktop that has /apps mounted. In principle, they can also be used on the ifarm machines, but some of the image files can be multiple GB in size, which tends to be more of an issue there. There are also some system packages that may need to be installed, so having sudo privilege is useful. Once the image file is created, it can be moved to other computers and run there without issue.
First, make sure the squashfs tools are installed, since Singularity needs them to create the image files:
sudo yum install squashfs-tools
Next, set up to use Singularity from the CUE via the /apps network-mounted directory:
module use /apps/modulefiles
module load singularity/3.9.5
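If the module loads correctly, the singularity command should now be on your PATH. A quick sanity check (it should report a version consistent with the module you loaded, e.g. 3.9.5) is:

singularity --version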
Create a Singularity image from the official nvidia/cuda Docker images. Here, I chose 11.4.2 because it is closest to the CUDA version on gluon200, which I will be targeting for the test. Note that the devel variant is fairly large (~3 GB) but includes the gcc 9.3.0 compiler.
singularity pull docker://nvidia/cuda:11.4.2-devel-ubuntu20.04
This should leave you with a file named something like cuda_11.4.2-devel-ubuntu20.04.sif.
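Before copying the image anywhere, you can verify that it contains a working CUDA toolkit. This does not require a GPU; it simply runs nvcc inside the container (the exact versions reported will depend on the image you pulled):

singularity exec cuda_11.4.2-devel-ubuntu20.04.sif nvcc --version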
Copy the file to a computer with available GPUs and CUDA drivers installed, and test it like this:
singularity run -c --nv cuda_11.4.2-devel-ubuntu20.04.sif
Singularity> nvidia-smi
Sat Apr 16 15:09:09 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:01:00.0 Off |                    0 |
| N/A   41C    P0    25W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla T4            Off  | 00000000:81:00.0 Off |                    0 |
| N/A   40C    P0    25W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla T4            Off  | 00000000:A1:00.0 Off |                    0 |
| N/A   43C    P0    26W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla T4            Off  | 00000000:C1:00.0 Off |                    0 |
| N/A   42C    P0    25W /  70W |      0MiB / 15109MiB |      5%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Singularity>
Note that the nvidia-smi executable above comes from the host (the --nv option bind-mounts it into the container), but it is being run from within the container.
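As a further check that code compiled inside the container can actually use the GPUs, you can build and run a small CUDA program with the image's nvcc. The file name saxpy_test.cu and the program below are only an illustrative sketch, not part of JANA2 or of the image:

// saxpy_test.cu - minimal CUDA sanity check (illustrative only)
#include <cstdio>
#include <cuda_runtime.h>

// Simple kernel: y[i] = a*x[i] + y[i]
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    // Report how many GPUs are visible inside the container
    int ndev = 0;
    cudaGetDeviceCount(&ndev);
    printf("CUDA devices visible: %d\n", ndev);

    // Allocate managed memory so host and device share the same pointers
    const int N = 1 << 20;
    float *x = nullptr, *y = nullptr;
    cudaMallocManaged(&x, N * sizeof(float));
    cudaMallocManaged(&y, N * sizeof(float));
    for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    // Launch the kernel and wait for it to finish before reading results
    saxpy<<<(N + 255) / 256, 256>>>(N, 3.0f, x, y);
    cudaDeviceSynchronize();

    printf("y[0] = %f (expect 5.0)\n", y[0]);
    cudaFree(x);
    cudaFree(y);
    return 0;
}

Compile and run it through the container (only the run step actually needs --nv, but including it for the compile step is harmless):

singularity exec --nv cuda_11.4.2-devel-ubuntu20.04.sif nvcc -o saxpy_test saxpy_test.cu
singularity exec --nv cuda_11.4.2-devel-ubuntu20.04.sif ./saxpy_test

If the GPUs are accessible, it should report the number of visible devices and print y[0] = 5.0.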