Login to SciComp GPUs

From epsciwiki
Revision as of 21:23, 26 August 2020 by Kishan

This page describes how to use one of the SciComp ML machines that has 4 Titan RTX GPU cards installed.
Steps:
1. The software environment seems to be most easily set up using Conda. First, log into the JLab common environment with the following ssh command:


 ssh login.jlab.org


You'll be prompted to enter your JLab account password.

2. From there, log into an ifarm node with the following ssh command:

 ssh ifarm190X

In 190X, X can be either 1 or 2, i.e. ifarm1901 or ifarm1902.
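Steps 1 and 2 can, in principle, be combined into one command with OpenSSH's -J (ProxyJump) option. The sketch below only builds and prints that command. It assumes an OpenSSH client of version 7.3 or newer and that login.jlab.org permits being used as a jump host, neither of which is confirmed here. The variable names JLAB_USER and IFARM_X are hypothetical, chosen for illustration.

```shell
# Sketch only: JLAB_USER and IFARM_X are illustrative names, not JLab conventions.
JLAB_USER="${JLAB_USER:-${USER:-your_username}}"   # assumed JLab account name
IFARM_X="${IFARM_X:-1}"                            # the X in ifarm190X: 1 or 2

# Accept only the two node numbers named in step 2.
case "$IFARM_X" in
  1|2) NODE="ifarm190${IFARM_X}" ;;
  *)   echo "IFARM_X must be 1 or 2" >&2; NODE="" ;;
esac

# -J connects to login.jlab.org first, then on to the ifarm node.
# Assumption: the gateway allows this kind of forwarding.
if [ -n "$NODE" ]; then
  SSH_CMD="ssh -J ${JLAB_USER}@login.jlab.org ${JLAB_USER}@${NODE}"
  echo "$SSH_CMD"
fi
```

Copy the printed line and run it, or set IFARM_X=2 beforehand to target ifarm1902. If the jump is refused, fall back to the two separate ssh commands above.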

3. Setting up the Python environment

  • The software must be set up using a computer other than sciml190X since it needs a level of outside network access not available there.
  • We recommend using Conda to manage your Python packages and environments.
  • Also, the installation is large enough that it won't fit easily in your home directory. Conda installs things in ~/.conda, so that path must be a link to a larger disk.
  • If ~/.conda already exists, please delete it, since we are going to create a symbolic link named ~/.conda in its place.
  • Create a folder in your work directory that "~/.conda" can link to. For example, I created a folder named condaenv in "/work/halld2/home/kishan/". You can do this by running the following commands:

 mkdir /work/<your hall>/home/<your name>/condaenv
 ln -s /work/<your hall>/home/<your name>/condaenv ~/.conda

  • You can check that the symbolic link is set up by running the command below. The -a flag is needed because .conda is a hidden entry.

 ls -la ~

    You will see one of the entries as .conda -> /work/<your hall>/home/<your name>/condaenv
  • Now run the following commands to load Anaconda3 and create a virtual environment named tf-gpu with tensorflow-gpu, cudatoolkit, keras and numpy installed.

 bash
 source /etc/profile.d/modules.sh
 module use /apps/modulefiles
 module load anaconda3/4.5.12
 conda create -n tf-gpu tensorflow-gpu cudatoolkit keras numpy

  • Activate the tf-gpu virtual environment.

 conda activate tf-gpu
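
The ~/.conda link steps above can also be sketched as a small shell function. This is a sketch under assumptions, not part of the original instructions: link_conda_dir is a hypothetical helper name. Unlike the manual steps, it refuses to touch an existing real ~/.conda directory rather than deleting it, which leaves that decision to you.

```shell
# link_conda_dir TARGET [LINK]
# Hypothetical helper: create TARGET and point LINK (default ~/.conda) at it.
link_conda_dir() {
  lcd_target="$1"
  lcd_link="${2:-$HOME/.conda}"

  if [ -z "$lcd_target" ]; then
    echo "usage: link_conda_dir TARGET [LINK]" >&2
    return 1
  fi

  # Create the folder on the larger disk if it is not there yet.
  mkdir -p "$lcd_target" || return 1

  if [ -L "$lcd_link" ]; then
    # An old symlink can be replaced without losing data.
    rm "$lcd_link" || return 1
  elif [ -e "$lcd_link" ]; then
    # A real file or directory: stop and let the user decide what to do.
    echo "$lcd_link exists and is not a symlink; move or delete it first" >&2
    return 1
  fi

  # Create the link and show it for a quick visual check.
  ln -s "$lcd_target" "$lcd_link" && ls -ld "$lcd_link"
}

# Example call, with the same placeholders as the steps above:
#   link_conda_dir "/work/<your hall>/home/<your name>/condaenv"
```

Running the function a second time is harmless, since an existing symlink is simply replaced with a fresh one.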