JIRIAF Meeting Jan. 11 2024

From epsciwiki
Revision as of 14:43, 11 January 2024 by Gurjyan (talk | contribs)


Connection Info:

You can connect using the following link (Meeting ID: 160 126 6529):


Agenda:

  1. Announcements
    1. Welcome Patrick onboard.
    2. Problem getting an NSLS-II data-intensive workflow for migration.
    3. New sites for deployments: ORNL and ANL.
      1. Ticket (INC0114103) requesting Jiriaf nodes to access DOE computing facilities like NERSC and ORNL.
        1. The list of IP addresses and ports to present to the security and networking team.
    4. NERSC allocation
      1. Jiriaf's project request has been approved (as of 01.02).
      2. Nick Taylor assigned more time for the m3792 project (EJFAT-ESnet).
  2. Summary of the project's undertakings
    1. Define mechanisms to act on user workflows, such as reducing previously allocated resources to the user workflow/application.
    2. JCS
      1. Starting VKs (Jiriaf nodes) through the k8s API management system
      2. Jiriaf node naming convention and labeling
      3. Jiriaf k8s cluster autoscaling (with possible AI support)
      4. Defining workflows/pods in the cluster that are unschedulable
      5. JCS and Jiriaf database relationship. Tables such as available resources, user requests, and user workflow status.
      6. Examine the site resources database table (constantly updated by SWIF2) and submit SWIF2 requests to launch nodes and allocate/lease resources.
      7. Communicate with the k8s App server, ensuring submitted jobs are running, updating JIRIAF's available resource DB table.
      8. Develop a resource-request matching algorithm that compares user requests with the available resources.
      9. Define and suggest metadata structure for requests for accurate matching.
    3. JRM
      1. Implement a function using ConfigMap configuration to write files in pods.
      2. Anatomy of the node/vk launch script.
      3. VK hardware monitor server
    4. JFE
      1. Forms to submit user workflow requests
        1. Login and authentication; processing type (batch, streaming, opportunistic-streaming, etc.); Docker image location; resource requirements (core type, core count, memory, disk, time, data provisioning details).
          1. Research if k8s provides facilities for this (e.g., k8s dashboard)
      2. Visualize Jiriaf database tables.
        1. Dynamic updates.
  3. Preparation for the publication.
  4. AOT
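
Discussion aid for the JCS resource-request matching item: a minimal Python sketch of one possible approach, a best-fit match that picks the smallest available offer still satisfying a request. It is not the project's algorithm. All class, field, and function names here (`ResourceOffer`, `WorkflowRequest`, `fits`, `match_request`) are hypothetical, and the choice of cores, memory, and wall time as the matching criteria is an assumption drawn from the resource requirements listed under JFE.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResourceOffer:
    """One row of a hypothetical 'available resource' table."""
    node: str
    cores: int
    memory_gb: int
    walltime_h: float


@dataclass
class WorkflowRequest:
    """One row of a hypothetical 'user requests' table."""
    name: str
    cores: int
    memory_gb: int
    walltime_h: float


def fits(req: WorkflowRequest, offer: ResourceOffer) -> bool:
    """True if the offer covers every requirement of the request."""
    return (offer.cores >= req.cores
            and offer.memory_gb >= req.memory_gb
            and offer.walltime_h >= req.walltime_h)


def match_request(req: WorkflowRequest,
                  offers: List[ResourceOffer]) -> Optional[ResourceOffer]:
    """Best-fit: among offers that fit, return the one wasting the fewest
    cores, then the least memory. Returns None if nothing fits."""
    candidates = [o for o in offers if fits(req, o)]
    if not candidates:
        return None
    return min(candidates,
               key=lambda o: (o.cores - req.cores,
                              o.memory_gb - req.memory_gb))


if __name__ == "__main__":
    offers = [
        ResourceOffer("vk-big", cores=64, memory_gb=256, walltime_h=24.0),
        ResourceOffer("vk-small", cores=8, memory_gb=32, walltime_h=12.0),
    ]
    req = WorkflowRequest("demo", cores=4, memory_gb=16, walltime_h=6.0)
    print(match_request(req, offers))
```

Best-fit is used here only because it is easy to reason about; a request that no offer can satisfy returns `None`, which is one place where a decision to launch additional nodes could hook in.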

Useful References



Minutes: