General Particle Tracer
Latest revision as of 11:35, 18 September 2024
In this context, GPT always refers to General Particle Tracer, never to the language processing AI.
GPT is a 3D particle pusher with (optional) space charge as well as spin tracking. It is easy to parametrize and supports all types of field maps we need for accelerator studies as long as our focus is not on things like wake fields, beam/matter interactions, radiation, etc.
GPT on Linux
For the longest time, we had been using the somewhat dated GPT 3.38 release. The transition of farm, CUE, and ACE systems to RHEL 9 / AlmaLinux 9 led to incompatibilities with GPT 3.38 (mostly compiler/ABI issues when changing the custom elements / gdfa progs configuration), and the /apps folder has been retired, so running GPT 3.38 is no longer straightforward. To reduce version chaos, we decided to upgrade to GPT 3.55 and drop support for older GPT versions as well as legacy platforms (RHEL < 9 and CPUs without AVX2, which includes some old CUE machines).
We are currently testing the following binary distributions (all for x86_64):
- gpt355-RHEL9-gcc-11.4-4-sep-2024-avx2 --- GPT 3.55, RHEL 9 build w/ GCC 11.4, requires AVX2-capable CPU
- gpt355-RHEL9-gcc-11.4-4-sep-2024-avx2-MPI4 --- same, but with OpenMPI dependency to run parallel jobs on the cluster
The MPI version will run on a single node without a performance penalty and with no difference in usage. However, because of the library dependencies, it will not run on most machines outside of the farm.
All distributions are deployed with the following additions to the base version:
- custom elements:
- IONATOR 08/22/2022: ionator
- Alicia's elements v3.52_2023_03_17: addxyrot ccsmove indexed_sectormagnet map1D_Brectmag map1D_B_scope map1D_TM_scope map25D_TM_scope map2D_B_scope map3D_B_scope map3D_Ecomplex_scope map3D_Hcomplex_scope map3D_TM_scope map3D_TMparmela map3D_TMparmela_scope map3D_wien mmap3D_B mmap3D_B_scope quad_fringe radialslitmask removedirection ringslit sectormagnet_fix setgdfvars setplateaut setsigma setsupergt TE011gauss_scope TE110gauss_scope TM010gauss_scope TM110gauss_scope TMrectcavity_scope quadrupole_scope wien wienfindB wienfindE writecGBx writeGB writeGBall writeGBallscreen writeGBcyl writeGBcylscreen writeGBscreen writeucGBx writep writepscreen writep_eVperc writep_eVpercscreen
- gdfa progs:
- Alicia's progs v3.52_2023_03_17: Q1 Q2 tutacc avgEk avgEk_eV avgEtotal avgEtotal_eV avgGBphi avgGBphi_on_rxy avgGBrxy avgGBx avgGBx_lhcs avgGBy avgGBz avgL avgx_lhcs avgxprime avgxprime_lhcs avgyprime avgptotal avgp_eVperc avgpx_eVperc avgpx_eVperc_lhcs avgpy_eVperc avgpz_eVperc avg_y_on_x cannemixrms cnemixrms dEk_100_eV EnergyDiff maxEk maxEk_eV maxEtotal maxEtotal_eV maxr maxt maxx maxx_lhcs maxxprime maxxprime_lhcs maxy maxyprime minEk minEk_eV minEtotal minEtotal_eV minG mint minx minx_lhcs minxprime minxprime_lhcs miny minyprime nemiprrms nemipxrms stddGonG stdEk stdEk_eV stdEtotal stdEtotal_eV stdGBphi_on_rxy stdGBx stdGBy stdGBz stduncorEk stduniform stdzs transmission ucnemixrms xyaspectratio z_x_slope z_y_slope
These additions are not audited, so their use can result in unexpected behavior.
If any further elements or gdfa progs need to be included, reach out to Max Bruker.
Running on CUE machines
Not actively supported.
Running on farm nodes
The preferred way to run GPT is on the farm, which, unlike the common CUE systems, is intended for these sorts of jobs. Running code on the farm does not imply parallel computing, although that is easy to do when needed. The interactive farm nodes can be used for testing and to run short-duration jobs that only need a couple of threads, and their OS environment is the same as on the compute nodes.
You need an account: Farm and ifarm Access/Accounts
SSH to interactive farm nodes is only allowed directly from MFA gateways; see the instructions for how to simplify the process: Connecting to Farm and QCD Interactive Nodes
After you SSH into an interactive farm node, the GPT environment variables (PATH, GPTLICENSE) are configured by an Environment Module, which is loaded as follows:
module use /scigroup/inj_group/sw/el9/modulefiles
module load gpt/3.55-mpi4
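Put together, a minimal first session on an interactive node could look like the following sketch (the thread count and input/output file names are placeholders, not from this page):

```shell
# Load the GPT 3.55 environment as described above.
module use /scigroup/inj_group/sw/el9/modulefiles
module load gpt/3.55-mpi4
# Sanity checks: the module should put gpt on PATH and set the license variable.
command -v gpt
echo "$GPTLICENSE"
# Run a small test with a limited number of threads (placeholder file names).
gpt -j 4 -o out.gdf myrun.in
```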
Interactive job
On an interactive node (or an allocated interactive session), you can run small GPT jobs like you would on any machine. The interactive nodes have a lot (128?) of CPU cores, so consider limiting the number of concurrent threads to avoid slowing down the machine for other users:
gpt -j 16 ...
or:
mr -j 16 ... gpt -j 1 ...
Single-node scheduled job
A job of the size you would consider running interactively (single task, limited number of threads) can also be allocated resources to run on a non-interactive farm node. Jobs are scheduled with slurm, like so:
sbatch test.sbatch
In this case, the content of test.sbatch can be something like:
#!/bin/bash
#SBATCH --partition=production
#SBATCH --job-name=gpt_singlenode_test
#SBATCH -t 0:30:0
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=1000M
FOLDER=/group/inj_group/Max/TestFolder
mr -v -o $FOLDER/test.gdf $FOLDER/test.mr gpt -j 16 $FOLDER/test.in
In this example, the mr file only contains static parameters, so it will only run one gpt instance, but this may take a long time because, e.g., I'm tracking a million particles.
The output file will show up in the specified folder, and the stdout and stderr of the tasks in slurm-xxx.out (xxx being the job id assigned by sbatch). Progress can be monitored with squeue -j xxx or, more conveniently, squeue -u bruker -i 10 (lists all jobs I own, refreshed every 10 seconds).
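Submission and monitoring can be combined in a small wrapper; this is a sketch, assuming the test.sbatch script from above and slurm's sbatch --parsable option (which prints only the job id):

```shell
# Sketch: submit the batch script and watch it until it leaves the queue.
jobid=$(sbatch --parsable test.sbatch)   # --parsable prints just the job id
echo "submitted job $jobid; stdout/stderr will go to slurm-$jobid.out"
squeue -j "$jobid" -i 10                 # refresh the status every 10 seconds
```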
MPI job
To leverage the parallel computing capability of the cluster, one can run GPT in an MPI job. In principle, this does not require much modification to the batch script, other than preceding the final command with mpirun:
#!/bin/bash
#SBATCH --partition=production
#SBATCH --job-name=gpt_mpi_test
#SBATCH -t 0:30:0
#SBATCH -N 64
#SBATCH --mem-per-cpu=1000M
FOLDER=/group/inj_group/Max/TestFolder
mpirun -v mr -v -o $FOLDER/test.gdf $FOLDER/test.mr gpt -j 1 $FOLDER/test.in
This allocates a certain number (N) of CPUs, which may or may not be distributed across the cluster, and runs a GPT process on each. The processes compute independently but are centrally managed. mr takes care of assigning jobs to processes and merging the output. The same thing works for gdfmgo (according to the manual; I have not tested it myself).
For jobs that naturally lend themselves to being parallelized (e.g., mr with a number of independent runs that is large compared to the number of parallel processes), the wall-clock time scales very well, unless merging large output files creates a bottleneck.
Note that one process is dedicated to merging/managing only, so if you want the size of the parameter scan to be divisible by the number of processes actually computing, that number is one fewer than the total.
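As a rough illustration of that bookkeeping (the scan sizes below are made-up values, not from this page), one can size the allocation so that the worker count, i.e. the total process count minus the one managing rank, divides the number of runs:

```shell
# Sketch: choosing the MPI process count for an mr parameter scan.
# NX/NY/NZ are hypothetical numbers of scan points per parameter.
NX=5; NY=5; NZ=4
NRUNS=$((NX * NY * NZ))   # 100 independent gpt runs in the scan
WORKERS=25                # pick a divisor of NRUNS
NPROCS=$((WORKERS + 1))   # one extra process that only manages/merges
echo "runs=$NRUNS processes=$NPROCS"   # prints: runs=100 processes=26
```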
Things to note about the farm
- The environment in which you submit the batch script is carried over to the job, i.e., you only have to load the environment module once per SSH session and the nodes will know where to find GPT.
- GPT jobs do not tend to be I/O heavy, so we can use /group directly, which makes it convenient to get to your files via sftp etc. But for jobs that require storage with a lot of space or bandwidth, the dedicated node-local and cluster-wide storage systems would be better.
JupyterHub
For those who like Jupyter notebooks, the lab has a JupyterHub server that will spawn a platform-virtualized container for you with a Jupyter instance running inside it. The containers are hosted by the farm but internally run Ubuntu 24.04, not AlmaLinux 9. To be able to invoke GPT from inside a Jupyter notebook, there are two main options:
- Run it "locally" in the same instance; for this, we need an Ubuntu version, which is being prepared. TO DO
- Submit it as another farm job (many more threads + RAM available); this is slightly less convenient because pipes are not available for data exchange and one has to deal with slurm for managing the extra job. But it should work. If there's time, I'll look into how this could be added transparently to the GPT/Python interface.
GPT/Python interface
- The GPT environment module adds /scigroup/inj_group/sw/gpt_python to PYTHONPATH.
- Dependencies need to be installed on a per-user basis. Note that pip maintains separate folders for each Python version. If packages are needed in Jupyter, you can invoke a terminal in JupyterLab and run pip there.
python3 -m pip install polars
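Because pip keeps separate per-version folders, it is worth making sure the install targets the same interpreter the notebook kernel uses; a sketch, with polars as the example package:

```shell
# Check which interpreter/pip you are about to use, then install per-user
# and verify the package is importable from that same interpreter.
python3 --version
python3 -m pip install --user polars
python3 -c "import polars; print(polars.__version__)"
```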
TO DO: more description