JIRIAF Meeting Feb. 8 2024

From epsciwiki
Revision as of 18:22, 8 February 2024 by Gurjyan


Connection Info:

You can connect using the following link (Meeting ID: 160 126 6529):

One tap mobile: US: +16692545252,,1608518798# or +16468287666,,1608518798#
Meeting URL: https://jlab-org.zoomgov.com/j/1601266529?pwd=ZkZKL0tjeWFpbmxDeWZob0VmbzNOUT09&from=addon
Meeting ID: 160 126 6529
Passcode: 292304

Join by Telephone
For higher quality, dial a number based on your current location.
Dial:
US: +1 669 254 5252 or +1 646 828 7666 or +1 551 285 1373 or +1 669 216 1590 or 833 568 8864 (Toll Free)
Meeting ID: 160 126 6529

Join by SIP
1616903130@sip.zoomgov.com
Join by H.323
161.199.138.10 (US West)
161.199.136.10 (US East)
Meeting ID: 160 851 8798
Passcode: 292304


Agenda:

  • Announcements
    • Patrick's JLAB account.
    • ORNL accounts are ready.
    • Abstract accepted at the ACAT conference
    • Presentation at NERSC Data Day
    • Start preparing a paper (e.g., for NIM)
      • Objectives as a groundwork for IRI
      • Introduction to streaming workflows for HEP and NP
        • Advantages in terms of workflow deployment, migration, and orchestration
        • Mention EJFAT as a transport mechanism that may become a key component of the future IRI
        • Description of two JLAB data processing workflows: CLAS12 and GlueX
      • K8s as a workflow orchestration and monitoring tool
        • Novelty: building a dynamic, elastic K8s cluster without fixed computing resources.
        • Running K8s nodes in user space without additional configuration or setup requirements from resource providers.
        • Deploying pods through shell commands.
      • Concept validation experiment: JLAB-ESnet-NERSC data-stream processing
      • Conclusions
  • JFE
    • OIDC, the identity layer on top of the OAuth 2.0 protocol, is required for JIRIAF users to join the CILogon federated identity management ecosystem, which lets users access services with their existing institutional credentials instead of separate usernames and passwords. JIRIAF application OIDC registration.
    • CILogon token structure.
    • Workflow description metadata: processing type (batch, streaming, opportunistic-streaming, etc.), Docker image location, and resource requirements (core type, core count, memory, disk, time, data provisioning details).
    • pod.yaml, metric-server.yaml and VK/JRM startup scripts
    • Database tables and visualization.
      • Argo: workflow and visualization engine
        • Argo workflow: if the wall time of a JRM is about to run out, move its pending pod requests to another JRM to continue processing.
  • JCS and JMS
    • Time, CPU, and memory requests to steer deployment.
      • Allocate a node suitable for running the specified job.
        • What would be the time request for the JRM within the SLURM request?
      • Job request queue
      • List of pending and active JRMs
      • Check the list of not-yet-scheduled jobs and decide whether more JRMs need to be started.
      • Remove a JRM if no suitable jobs can run on it.
      • Bayesian network-based agent model for a site/workflow.
        • Queueing theory-based mathematical model for predicting how long a streaming event waits in a queue before processing.
  • JRM
    • Implement a function using ConfigMap configuration to write files in pods.
    • Define mechanisms to act on user workflows, such as reducing previously allocated resources to the user workflow/application.
  • Documentation and code
    • Centralize the code base in GitHub.
      • Repository for all scripts and K8s YAML configuration files
    • HowTo manual/instructions for setting up JIRIAF on jiriaf2301-02
  • AOT
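
As a starting point for the JCS and JMS discussion, the items on steering deployment by time, CPU, and memory requests could be sketched roughly as follows. This is only an illustration under assumptions: the Job and JRM classes, their field names, and the greedy first-fit policy are hypothetical and are not taken from the JIRIAF code base.

```python
from dataclasses import dataclass, replace


@dataclass
class Job:
    """Hypothetical job request with its time, CPU, and memory requests."""
    name: str
    cpu: int            # cores requested
    memory_gb: float    # memory requested, in GB
    minutes: int        # requested run time, in minutes


@dataclass
class JRM:
    """Hypothetical view of a running JRM and what it has left to offer."""
    name: str
    free_cpu: int
    free_memory_gb: float
    minutes_left: int   # remaining wall time of the underlying allocation


def fits(job: Job, jrm: JRM) -> bool:
    """A job fits if every one of its requests is within the JRM's remainder."""
    return (job.cpu <= jrm.free_cpu
            and job.memory_gb <= jrm.free_memory_gb
            and job.minutes <= jrm.minutes_left)


def plan(jobs, jrms):
    """Greedy first-fit pass over the job request queue (an assumed policy).

    Returns (assignments, unscheduled, removable):
      assignments: job name -> JRM name
      unscheduled: jobs that fit nowhere, i.e. a hint that more JRMs are needed
      removable:   JRMs on which no queued job could run at all
    """
    pool = [replace(j) for j in jrms]   # work on copies, leave inputs untouched
    assignments, unscheduled = {}, []
    for job in jobs:
        target = next((j for j in pool if fits(job, j)), None)
        if target is None:
            unscheduled.append(job.name)
            continue
        target.free_cpu -= job.cpu
        target.free_memory_gb -= job.memory_gb
        assignments[job.name] = target.name
    removable = [j.name for j in jrms
                 if not any(fits(job, j) for job in jobs)]
    return assignments, unscheduled, removable
```

A non-empty unscheduled list corresponds to the "decide whether more JRMs are needed" item, and the removable list to the "remove JRM" item. The open question of which time request to use for the JRM inside the SLURM request maps onto how minutes_left is chosen.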
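
For the queueing theory item under JCS and JMS, the agenda does not say which model is intended. Purely as an assumed baseline, the simplest choice would be an M/M/1 queue (Poisson arrivals, exponential service times, a single server), whose mean waiting time before service is W_q = rho / (mu - lambda) with rho = lambda / mu. The function name and the parameter names below are illustrative.

```python
def mm1_mean_wait(arrival_rate: float, service_rate: float) -> float:
    """Mean time an event waits in the queue before processing (M/M/1).

    arrival_rate: average events arriving per unit time (lambda)
    service_rate: average events processed per unit time (mu)
    Assumes an M/M/1 queue; a real streaming workflow may need a richer model.
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative, service_rate positive")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival_rate must be below service_rate")
    utilization = arrival_rate / service_rate          # rho
    return utilization / (service_rate - arrival_rate)  # W_q
```

For example, with 8 events per second arriving and a processing capacity of 10 events per second, the utilization is 0.8 and the mean wait is 0.8 / 2 = 0.4 seconds. The wait grows without bound as the arrival rate approaches the service rate, which is the regime where starting additional JRMs would matter.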

Useful References



Minutes: