JIRIAF Meeting Jan. 11 2024

From epsciwiki
Revision as of 15:02, 10 January 2024 by Gurjyan (talk | contribs)


Connection Info:

You can connect using the following link (Meeting ID: 160 126 6529):

One tap mobile: US: +16692545252,,1608518798# or +16468287666,,1608518798#
Meeting URL: https://jlab-org.zoomgov.com/j/1601266529?pwd=ZkZKL0tjeWFpbmxDeWZob0VmbzNOUT09&from=addon
Meeting ID: 160 126 6529
Passcode: 292304

Join by Telephone
For higher quality, dial a number based on your current location.
Dial:
US: +1 669 254 5252 or +1 646 828 7666 or +1 551 285 1373 or +1 669 216 1590 or 833 568 8864 (Toll Free)
Meeting ID: 160 126 6529

International numbers
Join by SIP
1616903130@sip.zoomgov.com
Join by H.323
161.199.138.10 (US West)
161.199.136.10 (US East)
Meeting ID: 160 851 8798
Passcode: 292304


Agenda:

  1. Announcements
    1. Welcome Patrick on board.
    2. Problem getting an NSLS-II data-intensive workflow for migration.
    3. New sites for deployments: ORNL and ANL.
      1. Ticket (INC0114103) requesting access for JIRIAF nodes to DOE computing facilities such as NERSC and ORNL.
        1. The list of IP addresses and ports to present to the security and networking team.
  2. Summary of the project's undertakings and key achievements
    1. M3
      1. Define mechanisms to act on user workflows, such as reducing previously allocated resources to the user workflow/application.
    2. M4
      1. JCS design and development
        1. Starting VKs (JIRIAF nodes) through the k8s API management system
          1. JIRIAF node naming convention and labeling
        2. JIRIAF k8s cluster autoscaling (with possible AI support)
          1. Defining workflows/pods in the cluster that are unschedulable
        3. JCS and JIRIAF database relationship. Tables include:
          1. Available resources, user requests, and user workflow status.
          2. Examine the site resources database table (constantly updated by SWIF2) and submit SWIF2 requests to launch nodes and allocate/lease resources.
          3. Communicate with the k8s App server to ensure submitted jobs are running, and update JIRIAF's available resource DB table.
          4. Develop a resource-request matching algorithm that compares user requests with the available resources.
          5. Define and suggest a metadata structure for requests to enable accurate matching.
    3. M5
      1. JIRIAF k8s node/vk_cmd: Implement a function that can use ConfigMap configuration to write files in pods.
        1. Anatomy of the node/vk launch script.
      2. VK hardware monitor server
    4. Future milestones
      1. Accepting opportunistic workflows designed primarily for streaming.
      2. Mathematical model for simulating the abstract processor/actor within the JIRIAF ecosystem.
      3. Definition of the parameters and functionalities of the distributed workflow agent model and initiation of its design.
  3. Slides for upcoming presentations in preparation for the publications
    1. Slide describing JIRIAF virtual k8s cluster creation, emphasizing its dynamic nature.
    2. Slide showing Prometheus integration to monitor JIRIAF k8s cluster and pods.
    3. Start working on a paper describing JIRIAF resource acquisition and workflow deployment within a dynamic k8s cluster.
  4. AOT

Useful References



Minutes: