How to transfer huge files

From Tritium Experiments Group
Revision as of 13:55, 16 March 2018 by Yez

Use scp

A quick and dirty way to copy files from the JLab work disks is something like the following:

 scp -rp yez@ftp.jlab.org:/work/halla/triton/yez/file .

A better command is rsync, although for remote copies like this it works for me only some of the time; most of the time it doesn't. (I do recommend rsync for copying between different locations on the same PC, or on ifarm.)

 rsync -av yez@ftp.jlab.org:/work/halla/triton/yez/file .
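Since the remote copy fails for me more often than not, one workaround is to wrap the command in a small retry loop. The sketch below is only an illustration: the retry function and the RETRY_DELAY variable are names I made up, not part of scp or rsync.

```shell
#!/bin/sh
# retry MAX CMD [ARGS...]: rerun CMD until it succeeds or MAX attempts are used up.
# "retry" and RETRY_DELAY are hypothetical helper names for this sketch.
retry() {
    max=$1
    shift
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$max" ]; then
            return 1                      # give up after MAX failed attempts
        fi
        n=$((n + 1))
        sleep "${RETRY_DELAY:-30}"        # wait before trying again (default 30 s)
    done
}
```

With the function loaded in your shell, the rsync command above could be run as follows. The --partial option tells rsync to keep a partially transferred file, so a retry does not have to start a huge file from scratch. Unless you use ssh keys, expect a password prompt on every attempt.

 retry 5 rsync -av --partial yez@ftp.jlab.org:/work/halla/triton/yez/file .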

Use Globus

Globus is a very nice tool for transferring a large number of big data files from JLab to any other site at a much higher transfer speed (the JLab server allows a maximum of 5GBps). It also has an easy web-based interface. However, the official instructions for using Globus are somewhat misleading, so I have simplified them here based on my painful experience of setting up the connections.

Here is the step-by-step How-to:

  • 1) Go to https://www.globus.org/ and sign in by choosing Jefferson Lab or your own institution's account, if it has one (I used my Argonne account). I haven't tried a personal account; please let me know how it works.
  • 2) If your institution already has an endpoint server ready, use it. Here, I chose to create my personal endpoint on an Argonne PC (it can also be on your own laptop, if you have enough space, or on an external hard drive).
    *Find the button "add Globus Connect Personal endpoint" on the Endpoint List page.
    *Type in the name of the endpoint (so you can manage it later).
    *Create the Setup Key and copy it.
  • 3) Download the scripts to your local computer (or whichever computer you want to save the files to):
    https://docs.globus.org/how-to/globus-connect-personal-linux/ 
  • 4) *Ignore* the instructions on that webpage unless you want to install the Globus graphical user interface, which is not really needed. Simply unpack the downloaded archive, which creates the folder globusconnectpersonal-x.x.x/
  • 5) Inside the folder, run the following command:
       ./globusconnect -setup <key>
   where <key> is the Setup Key you generated and copied from the Globus webpage in step 2)
  • 6) Go to the directory ~/.globusonline/lta/, and edit or create a text file named "config-paths". Inside this file, add one line for each folder that Globus may access (so you can save files to it or copy files from it), e.g.:
     /data/yez/Tritium,1,1
     /home/work/marathon,1,1
  where each line has the form path,flag,flag. According to the Globus documentation, the first flag controls sharing ("1" allows the folder to be shared, "0" disallows it), and the second flag controls write access ("1" lets Globus write to the folder, "0" makes it read-only). For more details, see: https://docs.globus.org/faq/globus-connect-endpoints/#how_do_i_configure_accessible_directories_on_globus_connect_personal_for_linux
  • 7) Now you can start the endpoint-server by running:
    ./globusconnect -start &
  If you want to make changes (e.g., add or remove paths, or change permissions), stop the server with ./globusconnect -stop, edit "config-paths", and then start the server again.
  • 8) Set up the transfer on the Globus webpage:
   a) Connect to the JLab endpoint by searching for "jlab#scifiles" (see https://scicomp.jlab.org/docs/node/11).
   b) Select the endpoint that you just created (under "Administrated by Me"); it will show the folders that you made accessible in config-paths.
   c) Choose the files or entire folders on the JLab endpoint (e.g., /cache/halla/triton/raw).
   d) Choose where you want to save these files on your personal endpoint PC.
   e) Click the big blue button between the two endpoints, then just wait for your files to be transferred.
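
The local part of the setup (steps 6 and 7) can be sketched as a few shell functions. This is only a rough sketch under my own assumptions: the function names write_config_paths, check_config_paths and restart_endpoint are made up for illustration, and the check only tests that each line looks like path,flag,flag with flags of 0 or 1; it does not test that the folders exist.

```shell
#!/bin/sh
# Hypothetical helpers for steps 6 and 7; the function names are not part of Globus.

# write_config_paths FILE: write the two example entries from step 6 to FILE.
write_config_paths() {
    mkdir -p "$(dirname "$1")" &&
    cat > "$1" <<'EOF'
/data/yez/Tritium,1,1
/home/work/marathon,1,1
EOF
}

# check_config_paths FILE: succeed only if FILE is readable and every
# non-empty line has the form PATH,FLAG,FLAG with each FLAG being 0 or 1.
check_config_paths() {
    [ -r "$1" ] && ! grep -v -E '^(/|~)[^,]*,[01],[01]$' "$1" | grep -q .
}

# restart_endpoint: run from inside globusconnectpersonal-x.x.x/ so that
# the changes in config-paths take effect (commands from step 7).
restart_endpoint() {
    ./globusconnect -stop
    ./globusconnect -start &
}
```

A typical use would be to edit the paths inside write_config_paths to match your own folders, and then run:

 write_config_paths ~/.globusonline/lta/config-paths
 check_config_paths ~/.globusonline/lta/config-paths && restart_endpoint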