JIRIAF Meeting Sep. 28 2023



Connection Info:

You can connect using the following link (Meeting ID: 161 690 3130):

One tap mobile: US: +16692545252,,1616903130# or +16468287666,,1616903130#
Meeting URL: https://jlab-org.zoomgov.com/j/1616903130?pwd=cjg3U0Y4SndXL05SeFBmQjVHZkhrQT09&from=addon
Meeting ID: 161 690 3130
Passcode: 018094

Join by Telephone
For higher quality, dial a number based on your current location.
Dial:
US: +1 669 254 5252 or +1 646 828 7666 or +1 551 285 1373 or +1 669 216 1590 or 833 568 8864 (Toll Free)
Meeting ID: 161 690 3130

International numbers
Join by SIP
1616903130@sip.zoomgov.com
Join by H.323
161.199.138.10 (US West)
161.199.136.10 (US East)
Meeting ID: 161 690 3130
Passcode: 018094


Agenda:

  1. Announcements
    1. The initial manuscript outlining our project has been submitted to the EPJ.
    2. The JIRIAF LDRD project has been chosen to receive funding for its second year of operation.
  2. Remote Access to jiriaf2301/02
    1. Verify whether the remote access issue on jiriaf2301/02 is resolved.
      1. xrdp installation.
    2. Discuss the need for acceptable remote access performance on jiriaf2301/03 for prototyping purposes.
  3. Running Remote JRM with Virtual Kubelet (vk-cmd)
    1. Discuss the setup of running a remote JRM using the k8s API server on jiriaf2301.
    2. Note that the JRM job is a resource allocation mechanism: the actual workflow runs on the allocated resource after JMS has made its decision.
      1. Can we run multiple pods (user workflows) on a reserved resource (i.e., a JRM/virtual kubelet)?
      2. Can we rely on the Kubernetes API server to orchestrate these pods?
    3. Share any recent developments or challenges in this area.
  4. Running JRM as a SLURM/SWIF Job
    1. Explore the process of running JRM as a SLURM job.
      1. Discuss the advantages and benefits of utilizing SWIF for running JRM.
      2. Address any questions or concerns regarding this choice.
  5. JCS and k8s API Server
    1. Discuss how the JCS and/or k8s API server will update the "available resources" table in the MariaDB.
    2. Discuss mechanisms for interacting with the k8s API server.
    3. Discuss how JRM job records are maintained in the database.
  6. Running CLAS12 Data-Stream Processing Pipeline
    1. Outline the steps to prepare a Docker image for the CLAS12 data-stream processing pipeline.
    2. Discuss running the pipeline container as a k8s pod inside a JRM on a Perlmutter node at NERSC.
  7. Status of the Front-End
    1. Provide an update on the current status of the front-end system.
    2. JIRIAF Entry Web Page
      1. Discuss whether a web server running on jiriaf2301 is required for the JIRIAF entry web page.
    3. Connection to MariaDB Back-End
      1. Confirm if there is a connection to the MariaDB back-end for populating the "user-job-request" table.
    4. Status of the JIRIAF REST API
      1. Provide an update on the current status and functionality of the JIRIAF REST API.
  8. Authentication of JIRIAF Users
    1. Explore the authentication mechanisms used by SWIF and OSG.
      1. Discuss the feasibility of adopting these mechanisms for JIRIAF.
  9. Metadata Characterizing User Resources and Processing Requests
    1. Discuss the metadata describing user resources and processing requests, including data source, core requirements, memory, disk space, wall time, preferred site, and other relevant factors.
  10. JIRIAF Workflow Request Matching Server (JMS)
    1. Describe the role and mechanisms of JMS in running Kubernetes pods.
    2. Share any updates or changes in this area.
  11. JMS Matching Algorithms
    1. Discuss the matching algorithms that JMS uses, such as exact matching and distributed resource allocation.
    2. Explore the implications and benefits of these algorithms.
  12. AOT

Useful References



Minutes: