Difference between revisions of "Login to SciComp GPUs"
Revision as of 21:38, 26 August 2020
The following is how to use one of the ML scicomp machines that has 4 Titan RTX GPU cards installed.
Steps:
- Setting up the software environment is most easily done with conda. First, log into the JLab common environment with the ssh login.jlab.org command shown below.
- Next, log into ifarm with the ssh ifarm190X command shown below.
- Setting up the Python environment
  - The software must be set up on a computer other than sciml190X, since the setup needs a level of outside network access that is not available there.
  - We recommend using Conda to manage your Python packages and environments.
  - The installation is too large to fit easily in your home directory. Conda tends to install things in ~/.conda, so ~/.conda must be a link to a larger disk.
  - If ~/.conda already exists, please delete it, since we are going to create a symbolic link named ~/.conda.
  - Create a folder in your work directory that ~/.conda can link to. For example, I created a folder named condaenv in /work/halld2/home/kishan/. You can do this by running the following commands:
        mkdir /work/<your hall>/home/<your name>/condaenv
        ln -s /work/<your hall>/home/<your name>/condaenv ~/.conda
  - You can check that the symbolic link is set up by running
        ls -la ~
    One of the entries should read .conda -> /work/<your hall>/home/<your name>/condaenv
  - Now run the following commands to load Anaconda3 and create a virtual environment named tf-gpu with tensorflow-gpu, cudatoolkit, keras and numpy installed:
        bash
        source /etc/profile.d/modules.sh
        module use /apps/modulefiles
        module load anaconda3/4.5.12
        conda create -n tf-gpu tensorflow-gpu cudatoolkit keras numpy
  - Activate the tf-gpu virtual environment:
        conda activate tf-gpu
- Reserving the GPUs
  - To reserve 2 GPU cards:
        salloc --gres gpu:TitanRTX:2 --partition gpu --nodes 1
        srun --pty bash
    If you wish to reserve n GPU cards, change gpu:TitanRTX:2 in the salloc command above to gpu:TitanRTX:n.
  - Now activate your tf-gpu virtual environment by running the commands below:
        source /etc/profile.d/modules.sh
        module use /apps/modulefiles
        module load anaconda3/4.5.12
        conda activate tf-gpu
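The reservation step changes only the card count in the --gres option. As a rough sketch, that substitution could be wrapped in a small shell function. The name gpu_salloc_cmd is hypothetical (it is not a Slurm or JLab tool), and the upper limit of 4 is an assumption based on the machine having 4 Titan RTX cards. The function only prints the command so it can be reviewed before it is run.

```shell
# Hypothetical helper: print the salloc command for reserving N Titan RTX cards.
# The 1..4 range is an assumption taken from "4 Titan RTX GPU cards installed".
gpu_salloc_cmd() {
    local n="${1:-2}"   # default to 2 cards, as in the step above

    # Reject anything that is not a plain non-negative integer.
    case "$n" in
        ''|*[!0-9]*)
            echo "GPU count must be a positive integer" >&2
            return 1
            ;;
    esac

    if [ "$n" -lt 1 ] || [ "$n" -gt 4 ]; then
        echo "GPU count must be between 1 and 4" >&2
        return 1
    fi

    # Same options as the command in the reservation step; only the count varies.
    echo "salloc --gres gpu:TitanRTX:${n} --partition gpu --nodes 1"
}
```

For example, gpu_salloc_cmd 3 prints the salloc line for three cards; after running that line, srun --pty bash opens the shell on the reserved node as described in the step above.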
Login commands for the first two steps:
    ssh login.jlab.org
You'll be prompted to enter your JLab account password.
    ssh ifarm190X
In 190X, X can be either 1 or 2.
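The ~/.conda relocation described in the Python environment step can also be sketched as a shell function that refuses to overwrite an existing real directory. The name link_conda_dir and its two arguments are hypothetical and used only for illustration. The function assumes the target path (for example /work/<your hall>/home/<your name>/condaenv) is passed in by the caller, and it relies on the -n option of ln as found in GNU coreutils.

```shell
# Hypothetical helper: make ~/.conda (or another link path) point at a
# directory on a larger disk. Not a conda or JLab tool.
link_conda_dir() {
    local target="$1"                 # directory on the larger disk
    local link="${2:-$HOME/.conda}"   # where conda looks for its directory

    if [ -z "$target" ]; then
        echo "usage: link_conda_dir <target-dir> [link-path]" >&2
        return 2
    fi

    # An existing real file or directory is left alone; the page asks for it
    # to be deleted first, which is safer to do by hand.
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "$link exists and is not a symlink; remove it first" >&2
        return 1
    fi

    mkdir -p "$target" || return 1

    # -s: symbolic link, -f: replace an existing link,
    # -n: do not follow an existing link that points to a directory.
    ln -sfn "$target" "$link"
}
```

A call such as link_conda_dir "/work/<your hall>/home/<your name>/condaenv" would then replace the separate mkdir and ln -s commands from that step.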