Jupyter via VSCode remote-ssh with singularity on ifarm

From epsciwiki
Revision as of 13:36, 18 June 2023 by Davidl (talk | contribs)


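Here are instructions for configuring your local VSCode to connect to the ifarm via remote-ssh and run Jupyter inside a singularity container. Start with the ssh configuration below, which goes in the ssh config file on your local machine (typically ~/.ssh/config):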

 #-------------------------------------------------------
 # This comes from the JLab knowledge base article here:
 # https://jlab.servicenowservices.com/kb?id=kb_article_view&sysparm_article=KB0014918&sys_kb_id=862c54221bf0d510a552ed3ce54bcb1a&spa=1 
 #
 Host ifarm ifarm???? qcdi qcdi????
 Hostname %h.jlab.org.
 Host *.jlab.org
 Hostname %h.
 Match host scilogin.jlab.org.,scilogin?.jlab.org.,login.jlab.org.,login?.jlab.org.,acclogin.jlab.org.,acclogin?.jlab.org.,hallgw.jlab.org.,hallgw?.jlab.org.
 ProxyJump none
 Match host *.jlab.org.
 ProxyJump scilogin.jlab.org.
 # Replace davidl with your own JLab username
 User davidl
 
 Match host *.jlab.org.
 ControlMaster auto
 ControlPath ~/.ssh/cm/%C.sock
 ControlPersist 10m
 #-------------------------------------------------------
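
The ControlPath setting above places the connection-sharing sockets under ~/.ssh/cm. ssh does not create that directory itself, so it has to exist before the first connection. A minimal sketch, to be run on your local machine:

```shell
# Create the directory referenced by "ControlPath ~/.ssh/cm/%C.sock".
mkdir -p "${HOME}/.ssh/cm"
# Keep both directories private; ssh is strict about permissions under ~/.ssh.
chmod 700 "${HOME}/.ssh" "${HOME}/.ssh/cm"
```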

Test that this works by doing "ssh ifarm". See the knowledge base article for details and troubleshooting.
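
If the login fails, it can help to see how ssh resolves the configuration before any connection is attempted. The sketch below uses "ssh -G", which prints the effective settings for a host and exits. The function name show_ssh_settings is a hypothetical helper and not part of OpenSSH.

```shell
# show_ssh_settings is a hypothetical helper name used only for this sketch.
show_ssh_settings() {
    # "ssh -G <host>" prints the resolved configuration without connecting.
    # Keep only the lines that matter for the jump-host setup.
    ssh -G "$1" | grep -E -i '^(hostname|user|proxyjump|controlpath) '
}

# Usage: show_ssh_settings ifarm
```

With the configuration above in place, the hostname line should end in jlab.org. and the proxyjump line should name scilogin.jlab.org.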

Using work directories for large software installations

It is easy to exceed the quota on your CUE home directory when VSCode extensions and Python environments install lots of packages. Redirecting these to the work disk can save a lot of headaches. There are a few directories to be concerned with. Here is one way to handle it (do all of this on the ifarm):

 mkdir -p /work/epsci/${USER}/home_dot_local
 mkdir -p /work/epsci/${USER}/home_dot_cache
 mkdir -p /work/epsci/${USER}/home_dot_vscode-server
 ln -s /work/epsci/${USER}/home_dot_local ~/.local
 ln -s /work/epsci/${USER}/home_dot_cache ~/.cache
 ln -s /work/epsci/${USER}/home_dot_vscode-server ~/.vscode-server
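
The commands above assume that ~/.local, ~/.cache and ~/.vscode-server do not exist yet. If one of them is already a real directory, "ln -s" creates the link inside that directory instead of replacing it. The following is a sketch of one way to handle that case, assuming a bash shell on the ifarm. The function name relocate_to_work and the ".bak" backup suffix are hypothetical choices for illustration.

```shell
# relocate_to_work is a hypothetical helper name used only for this sketch.
# Usage: relocate_to_work <path-in-home> <target-dir-on-work-disk>
relocate_to_work() {
    local src="$1"
    local dst="$2"
    mkdir -p "$dst"
    if [ -L "$src" ]; then
        # Already a symbolic link: leave it alone.
        echo "skip: $src is already a link to $(readlink "$src")"
        return 0
    fi
    if [ -d "$src" ]; then
        if [ -e "${src}.bak" ]; then
            echo "refusing to overwrite existing backup ${src}.bak" >&2
            return 1
        fi
        # Copy existing contents (including dotfiles), then move the original aside.
        cp -a "$src/." "$dst/"
        mv "$src" "${src}.bak"
    fi
    ln -s "$dst" "$src"
}

# Example:
# relocate_to_work ~/.local /work/epsci/${USER}/home_dot_local
```

The backup directory still counts against the home quota, so remove it once the linked copy has been checked.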

With the links above, any VSCode extensions installed on the remote system (i.e. ifarm) are stored in the /work/epsci/${USER}/home_dot_vscode-server directory.

Python packages installed using pip while in a Jupyter session will end up in the /work/epsci/${USER}/home_dot_local (~/.local) and /work/epsci/${USER}/home_dot_cache (~/.cache) directories.
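
To confirm that the redirection is in effect, one can list where each of the three paths points. This is a small sketch; report_links is a hypothetical helper name.

```shell
# report_links is a hypothetical helper name used only for this sketch.
report_links() {
    for p in "$@"; do
        if [ -L "$p" ]; then
            # Show the link and the location it points to.
            echo "$p -> $(readlink "$p")"
        else
            echo "$p is not a symbolic link"
        fi
    done
}

# Usage: report_links ~/.local ~/.cache ~/.vscode-server
```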

Customized Python Virtual Environment

VSCode offers Jupyter kernel options that correspond to different Python environments. You can create new virtual environments through the VSCode interface. When I did this, it created a directory:

 ~/.vscode-server/data/Machine/.venv

If you want to use a virtual environment that was set up via the command line rather than through VSCode, the easiest thing to do is to make a symbolic link pointing to it in the same directory, e.g. ~/.vscode-server/data/Machine/.venv or, as in the example below, a name with a suffix such as .venv_epsci-centos-7.7.1908. Note that these commands were run from a VSCode terminal connected to a singularity container, which ensures the python executable is compatible with that system.

 # First, create python virtual environment and install some packages
 mkdir -p /work/epsci/${USER}/python_venvs
 python3 -m venv /work/epsci/${USER}/python_venvs/venv_epsci-centos-7.7.1908
 source /work/epsci/${USER}/python_venvs/venv_epsci-centos-7.7.1908/bin/activate
 pip install --upgrade pip
 pip install tensorflow pandas numpy ipykernel matplotlib
 
 # Second, create symbolic link
 ln -s /work/epsci/${USER}/python_venvs/venv_epsci-centos-7.7.1908 ~/.vscode-server/data/Machine/.venv_epsci-centos-7.7.1908
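
Before selecting the environment in VSCode, it can be worth checking that the link resolves to a working interpreter and that ipykernel, which Jupyter needs in order to use the environment as a kernel, can be imported. A sketch under those assumptions; check_venv is a hypothetical helper name.

```shell
# check_venv is a hypothetical helper name used only for this sketch.
# Usage: check_venv <path-to-venv-or-link>
check_venv() {
    local venv="$1"
    if [ ! -x "$venv/bin/python" ]; then
        echo "no python executable under $venv/bin" >&2
        return 1
    fi
    # sys.prefix reports the environment the interpreter belongs to.
    "$venv/bin/python" -c 'import sys; print(sys.prefix)'
    # Fails with an ImportError if ipykernel is not installed in the environment.
    "$venv/bin/python" -c 'import ipykernel; print("ipykernel", ipykernel.__version__)'
}

# Example:
# check_venv ~/.vscode-server/data/Machine/.venv_epsci-centos-7.7.1908
```

Run it from a terminal inside the same container that was used to create the environment.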

Configuring VSCode

In VSCode

  1. Open the command palette using Cmd+Shift+P (Ctrl+Shift+P on Windows and Linux) or from the gear menu in the bottom left of the window
  2. Type "settings.json" and then select Preferences: Open User Settings (JSON)
  3. Add the following to your settings (if you have other settings already, you may need to add a comma to the line before this one!):
 "remote.SSH.enableRemoteCommand": true


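Next, add a Host entry like the following to the ssh config file on your local machine. Its RemoteCommand starts a shell in the singularity container on login, which is why the setting above is needed:

 # https://github.com/microsoft/vscode-remote-release/issues/3066#issuecomment-1019500216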
 #
 Host epsci-ubuntu-22.04~ifarm
   HostName ifarm.jlab.org
   ProxyJump scilogin.jlab.org
   RemoteCommand singularity shell --bind /w,/work /cvmfs/oasis.opensciencegrid.org/jlab/epsci/singularity/images/epsci-ubuntu-22.04.img
   RequestTTY yes

Test by logging in from the command line first with the following. Note that you will need to enter your 2-factor PIN+OTP when prompted:

 ssh epsci-ubuntu-22.04~ifarm
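
Once the prompt appears, a quick check can confirm that the shell is running inside the container rather than directly on the ifarm host. The sketch below relies on the SINGULARITY_NAME variable, which Singularity normally sets inside a container; treat that as an assumption about the installed version. in_container is a hypothetical helper name.

```shell
# in_container is a hypothetical helper name used only for this sketch.
in_container() {
    # Singularity normally exports SINGULARITY_NAME inside a container.
    [ -n "${SINGULARITY_NAME:-}" ]
}

if in_container; then
    echo "inside container image: ${SINGULARITY_NAME}"
else
    echo "not inside a singularity container"
fi

# The OS release seen by the shell is another hint:
# grep PRETTY_NAME /etc/os-release
```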

Click on the "Remote Explorer" extension icon on the left side of the window (monitor with a circle in lower left) to open it. You should see the "epsci-ubuntu-22.04~ifarm" item. Hover over it to see options to either connect the current window (arrow) or open a new window (box). Either choice gives you a connected window. VSCode will automatically install some extensions on the remote host under ~/.vscode-server.

Open jupyter notebook and select kernel

You should now be able to navigate the remote system in VSCode to either open an existing notebook or create a new one. Once you do, the "Select Kernel" option will be available in the top right corner of the VSCode window. The first time you do this, there will be an option at the top of the window to "Install suggested extensions Python + Jupyter". Select this to install those extensions on the remote system.

Once the remote extensions are installed, click on "Select Kernel" again and a menu with different options will appear at the top of the window. Select the "Python Environments..." option.