Difference between revisions of "Jupyter via VSCode remote-ssh with singularity on ifarm"

From epsciwiki

Revision as of 12:42, 19 June 2023


Here are instructions for configuring your local VSCode to connect to the ifarm via ssh and run an ipython kernel inside a singularity container.

Get a smartcard

If you don't already have one, it is well worth stopping by the helpdesk and getting a smartcard USB device. It is not strictly required, but can save a lot of hassle typing numbers you get from your MFA app on your phone.

Configure SSH on your local computer

Logging into an ifarm computer requires first logging into the scilogin.jlab.org gateway. Configuring your local computer to do a proxy jump will make this a lot easier. Instructions for this can be found in the JLab knowledge base article here:

https://jlab.servicenowservices.com/kb?id=kb_article_view&sysparm_article=KB0014918&sys_kb_id=862c54221bf0d510a552ed3ce54bcb1a&spa=1

Follow those instructions so that "ssh ifarm" works from your local computer.

Next, add the following to your ~/.ssh/config file. This configures it so that when you ssh to the special host "epsci-ubuntu-22.04~ifarm", it will not only tunnel you all the way into the ifarm but also run the specified singularity container and drop you into it.

 # https://github.com/microsoft/vscode-remote-release/issues/3066#issuecomment-1019500216
 #
 Host epsci-ubuntu-22.04~ifarm
   HostName ifarm.jlab.org
   ProxyJump scilogin.jlab.org
   RemoteCommand singularity shell --bind /w,/work /cvmfs/oasis.opensciencegrid.org/jlab/epsci/singularity/images/epsci-ubuntu-22.04.img
   RequestTTY yes

Test that this works by doing:

  ssh epsci-ubuntu-22.04~ifarm

This should result in a "Singularity>" prompt. You can confirm that the container is running the correct OS by looking at either the /etc/os-release file or the Dockerfile in the /container directory.

Note that the "--bind /w,/work" option mounts the work disks inside the container so you can use them. Add any other directories (e.g. /group) you may need. Your home directory is mounted automatically.
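The OS check described above can be sketched as a small shell function. This is only an illustration: the function name os_pretty_name is made up for this example, and it assumes the file follows the usual os-release KEY="value" layout with a PRETTY_NAME entry.

```shell
# Print the OS name recorded in an os-release style file.
# The path is an argument so the function can be tried on any file;
# it defaults to /etc/os-release, the file mentioned above.
os_pretty_name() {
    local file="${1:-/etc/os-release}"
    # Source the file in a subshell so its variables do not leak
    # into the calling shell.
    ( . "$file" && printf '%s\n' "$PRETTY_NAME" )
}

# Usage at the Singularity> prompt:
#   os_pretty_name
```

For the epsci-ubuntu-22.04 image you would expect this to report an Ubuntu 22.04 release.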

Using work directories for large software installations

It is easy to overfill the quota on your CUE home directory by having VScode extensions and python environments install lots of packages. Redirecting these to the work disk can save a lot of headache. There are actually a few ways to deal with this depending on which python kernel/environment you select.

Using a global environment

This is not necessarily t

It is best to deal with this now by creating symbolic links in your home directory that point to directories on the work disk where these packages can be installed.

There are a couple of directories to be concerned with. Here are some commands to execute on the ifarm that will set up the appropriate links. Replace "epsci" with the name of whatever work disk is appropriate. Note that if these directories already exist in your home directory, you will need to rename or remove them first, since ln will not replace an existing directory with a link.

 mkdir -p /work/epsci/${USER}/home_dot_local
 mkdir -p /work/epsci/${USER}/home_dot_cache
 mkdir -p /work/epsci/${USER}/home_dot_vscode-server
 ln -s /work/epsci/${USER}/home_dot_local ~/.local
 ln -s /work/epsci/${USER}/home_dot_cache ~/.cache
 ln -s /work/epsci/${USER}/home_dot_vscode-server ~/.vscode-server
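If one of these directories already exists in your home directory, the commands above will not replace it. A possible way to handle that case is sketched below. The function name relocate_dir is hypothetical (it is not part of any JLab tooling), and the sketch assumes you want to keep the existing contents by moving them to the work disk.

```shell
# Move an existing directory to the work disk and leave a symlink behind.
#   $1 = original path (e.g. ~/.cache)
#   $2 = destination on the work disk
relocate_dir() {
    local src="$1" dest="$2"
    # Already a symlink: assume it was set up earlier and do nothing.
    if [ -L "$src" ]; then
        return 0
    fi
    mkdir -p "$(dirname "$dest")"
    if [ -d "$src" ]; then
        # Refuse to clobber something already at the destination.
        if [ -e "$dest" ]; then
            echo "refusing to overwrite $dest" >&2
            return 1
        fi
        mv "$src" "$dest"
    else
        mkdir -p "$dest"
    fi
    ln -s "$dest" "$src"
}

# Usage on the ifarm (adjust the work disk to your own):
#   relocate_dir ~/.cache /work/epsci/${USER}/home_dot_cache
```

Run this while no VSCode session or Jupyter kernel is using the directory being moved.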

The above will install any VScode extensions on the remote system (i.e. ifarm) into the /work/epsci/${USER}/home_dot_vscode-server directory.

Python packages installed using pip while in a jupyter session (not necessarily when using VSCode) will be installed under ~/.local and ~/.cache, which now point to the /work/epsci/${USER}/home_dot_local and /work/epsci/${USER}/home_dot_cache directories.

VSCode lets you specify a python virtual environment on a workspace-by-workspace basis, so making sure those end up on the work disk takes a couple more steps (see the "Customized Python Virtual Environment" section below).

Configuring VScode

In VSCode

  1. Open the command palette using Cmd+shift+P or from the gear menu in the bottom left of the window
  2. Type "settings.json" and then select Preferences: Open User Settings (JSON)
  3. Add the following to your settings (if you have other settings already, you may need to add a comma to the line before this one!):
 "remote.SSH.enableRemoteCommand": true

Click on the "Remote Explorer" extension icon on the left side of the window (monitor with a circle in lower right) to open. You should see the "epsci-ubuntu-22.04~ifarm" item. Hover over it to see options to either connect the current window (arrow) or open a new window (box). Choose either option to get a connected window. Watch for a small entry box at the top of the window that is asking for your password. Enter your PIN+OTP to login. (OTP=One Time Password from either your smartcard or phone app).

It will automatically install some extensions on the remote host under ~/.vscode-server.

You can verify that it is working correctly by opening a new terminal in VSCode and seeing that it gives the "Singularity>" prompt.
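Besides looking at the prompt, you can check from a script whether the terminal is inside the container. The sketch below assumes the container runtime exports SINGULARITY_NAME (Singularity) or APPTAINER_NAME (Apptainer) inside the container; verify that on the ifarm before relying on it. The function name in_container is made up for this example.

```shell
# Succeed only when running inside a Singularity/Apptainer container.
# Assumption: the runtime sets SINGULARITY_NAME or APPTAINER_NAME.
in_container() {
    [ -n "${SINGULARITY_NAME:-}${APPTAINER_NAME:-}" ]
}

# Usage in a VSCode terminal:
#   in_container && echo "inside container" || echo "NOT in container"
```

If this reports that you are not in a container, check that "remote.SSH.enableRemoteCommand" is set in your settings.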

Customized Python Virtual Environment

VSCode will provide kernel options to use with Jupyter which correspond to different python environments. You can create new virtual environments through the VSCode interface. When you do this, it will create a directory called .venv in the current workspace directory the VSCode window is using. Installing many of the standard python packages (tensorflow, pytorch, pandas, matplotlib, etc.) will take up several GB of space. It is better to store this once and have all of your workspaces use it.

The easiest way to do this is to

  1. create a workspace with a new Jupyter notebook
  2. create a new python virtual environment (this will create a .venv directory in the workspace directory)
  3. install the python packages via the VSCode Jupyter interface
  4. move the .venv directory to a central location and make a symbolic link pointing to it from the workspace directory

For the third step above, you can run a cell in the Jupyter notebook with these contents:

  %pip install pandas numpy matplotlib tensorflow torch
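Step 4 of the list above (moving .venv to a central location and linking it back) could look like the sketch below. The function name centralize_venv and the example paths are illustrative assumptions. Pick a destination on your own work disk.

```shell
# Move a workspace's .venv to a central location and link it back.
#   $1 = workspace directory that contains .venv
#   $2 = new location for the environment on the work disk
centralize_venv() {
    local workspace="$1" dest="$2"
    # Only act on a real directory, not an existing symlink.
    if [ ! -d "$workspace/.venv" ] || [ -L "$workspace/.venv" ]; then
        echo "no real .venv directory in $workspace" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    mkdir -p "$(dirname "$dest")"
    mv "$workspace/.venv" "$dest"
    ln -s "$dest" "$workspace/.venv"
}

# Usage (hypothetical workspace path):
#   centralize_venv ~/my_workspace /work/epsci/${USER}/python_venvs/default
# Another workspace can then reuse the same environment:
#   ln -s /work/epsci/${USER}/python_venvs/default ~/other_workspace/.venv
```

Note that a virtual environment records absolute paths in its activate script and in the installed command-line scripts. The symlink left in the original workspace keeps those paths valid, so do not delete that workspace's link.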

Open Jupyter notebook and select kernel

You should now be able to navigate the remote system in VSCode to either open an existing notebook or create a new one. Once you do, the "Select Kernel" option will be available in the top right corner of the VSCode window. The first time you do this, it will have an option at the top of the window to "Install suggested extensions Python + Jupyter". Select this to install those extensions on the remote system.

Once the remote extensions are installed, click on "Select Kernel" again and a menu with different options will appear at the top of the window. Select the "Python Environments..." option.

Using a Julia Notebook

Jupyter supports many languages, including Julia. You can configure VSCode to use a Julia interpreter in a Jupyter notebook.

Step 1: Install the Julia extension in the remote VSCode server. Go to a VSCode window that is connected to the "SSH: epsci-ubuntu-22.04~ifarm" host (or whichever remote host you wish to use). Click on the extensions icon on the left of the window and type "Julia" at the top to find the Julia extension and install it. Note that the Jupyter extension should already be installed on the remote system via the instructions in previous sections.

Step 2: Configure VSCode to find the Julia interpreter. Do this by setting the "julia.executablePath" on your remote system:

  1. Open command palette (Cmd+shift+P)
  2. Type "settings" and select "Preferences: Open Remote Settings (JSON) (SSH: epsci-ubuntu-22.04~ifarm)"
  3. Add the julia.executablePath setting and set it to the full path to the julia executable. Here is an example. Note that this also includes the path to my python venvs used with Jupyter.
 {
     "python.venvPath":"/work/epsci/davidl/python_venvs",
     "julia.executablePath":"/group/epsci/apps/Julia/julia-1.9.1/bin/julia"
 }

Note that you can install your own Julia binaries and point to them instead.

Step 3: Make a symlink so Julia packages install to a work disk instead of your home directory. This saves filling your home directory quota as described above for python. Below is how I did it. Adjust to the work disk location of your preference.

 mkdir /work/epsci/${USER}/home_dot_julia
 ln -s /work/epsci/${USER}/home_dot_julia ~/.julia
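To confirm that a link such as ~/.julia really resolves to the work disk, a check along these lines may help. It is a sketch that assumes GNU readlink with the -f option (as on a typical Linux host), and the function name resolves_under is made up for this example.

```shell
# Succeed if path $1 resolves to a location underneath directory $2.
resolves_under() {
    local target prefix
    # Resolve both sides so symlinked parent directories compare equal.
    target="$(readlink -f "$1")" || return 1
    prefix="$(readlink -f "$2")" || return 1
    case "$target" in
        "$prefix"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   resolves_under ~/.julia /work/epsci/${USER} && echo "on work disk"
```

The same check works for the ~/.local, ~/.cache, and ~/.vscode-server links created earlier.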

Step 4: Create a Julia Jupyter notebook. At this point if you select "New File..." from the VSCode window you should see a new option to create a new Julia file. If you don't, then try disconnecting the VSCode remote-ssh session and reconnecting.

If you want to use Jupyter, then don't create a Julia file, but instead select "Jupyter Notebook .ipynb support". This will create a new Jupyter notebook in which you can select to use either the "Julia 1.9.1" kernel or one of your python kernels.

NOTE: It seems the ".ipynb" extension is the only one allowed by the VSCode Jupyter extension, even if the notebook is a Julia notebook!

Below is some example Julia code that you can paste into a cell and execute. The first time you run this, the package installs will take quite a while (it took more than 8 minutes for me). At first, it will appear to fail with download errors. It looks like when this happens, Pkg falls back to downloading the source and compiling it, which is probably why it takes so long. (We should ask the CC to whitelist pkg.julialang.org, which will probably give access to precompiled binaries.)

 import Pkg; Pkg.add("Plots")
 import Pkg; Pkg.add("GR")
 
 using Plots
 
 # plot some data
 plot([cumsum(rand(500) .- 0.5), cumsum(rand(500) .- 0.5)])
 
 # save the current figure
 savefig("plots.svg")
 # .eps, .pdf, & .png are also supported