JIRIAF Meeting Jan. 11 2024


Revision as of 14:47, 11 January 2024


Connection Info:

You can connect using the following link (Meeting ID: 160 126 6529):

One tap mobile: US: +16692545252,,1608518798# or +16468287666,,1608518798#
Meeting URL: https://jlab-org.zoomgov.com/j/1601266529?pwd=ZkZKL0tjeWFpbmxDeWZob0VmbzNOUT09&from=addon
Meeting ID: 160 126 6529
Passcode: 292304

Join by Telephone
For higher quality, dial a number based on your current location.
Dial:
US: +1 669 254 5252 or +1 646 828 7666 or +1 551 285 1373 or +1 669 216 1590 or 833 568 8864 (Toll Free)
Meeting ID: 160 126 6529

International numbers
Join by SIP
1616903130@sip.zoomgov.com
Join by H.323
161.199.138.10 (US West)
161.199.136.10 (US East)
Meeting ID: 160 851 8798
Passcode: 292304


Agenda:

  1. Announcements
    1. Welcome Patrick on board.
    2. Problem getting an NSLS-II data-intensive workflow for migration.
    3. New sites for deployments: ORNL and ANL.
      1. Ticket (INC0114103) requesting Jiriaf nodes to access DOE computing facilities like NERSC and ORNL.
        1. The list of IP addresses and ports to present to the security and networking team.
    4. NERSC allocation
      1. Jiriaf's project request has been approved (as of 01.02).
      2. Nick Taylor assigned more time for the m3792 project (EJFAT-EsNet).
  2. Summary of the project's undertakings
    1. JFE
      1. Forms to submit user workflow requests
        1. Login and authentication, Processing type (batch, streaming, opportunistic-streaming, etc.), Docker image location, Resource requirements (core type, core count, memory, disk, time, data provisioning details).
          1. Research if k8s provides facilities for this (e.g., k8s dashboard)
      2. Visualize Jiriaf database tables.
        1. Dynamic updates.
    2. JCS
      1. Starting VKs (Jiriaf nodes) through the k8s API management system
      2. Jiriaf node naming convention and labeling
      3. Jiriaf k8s cluster autoscaling (with possible AI support)
      4. Defining workflows/pods in the cluster that are unschedulable
      5. JCS and Jiriaf database relationship: tables such as available resources, user requests, and user workflow status.
      6. Examine the site resources database table (constantly updated by SWIF2) and submit SWIF2 requests to launch nodes and allocate/lease resources.
      7. Communicate with the k8s App server, ensuring submitted jobs are running and updating JIRIAF's available resource DB table.
      8. Develop a resource-request matching algorithm that compares user requests with the available resources.
      9. Define and suggest metadata structure for requests for accurate matching.
    3. JRM
      1. Implement a function using ConfigMap configuration to write files in pods.
      2. Define mechanisms to act on user workflows, such as reducing previously allocated resources to the user workflow/application.
      3. VK hardware monitor server
      4. Anatomy of the node/vk launch script.
  3. Preparation for the publication.
  4. AOT
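
As a starting point for discussing the resource-request matching item under JCS, here is a minimal sketch of one possible approach (best fit on core count). It is not the project's algorithm: the `Resources` and `Node` records, their field names, and the node names are hypothetical, loosely modeled on the resource requirements listed under JFE (core count, memory, disk, time).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resources:
    """Hypothetical resource record; fields mirror the JFE request form."""
    cores: int
    memory_gb: float
    disk_gb: float
    walltime_h: float


@dataclass
class Node:
    """Hypothetical view of one row in an 'available resources' table."""
    name: str
    free: Resources


def fits(request: Resources, free: Resources) -> bool:
    """True if every requested quantity is available on the node."""
    return (
        request.cores <= free.cores
        and request.memory_gb <= free.memory_gb
        and request.disk_gb <= free.disk_gb
        and request.walltime_h <= free.walltime_h
    )


def match(request: Resources, nodes: List[Node]) -> Optional[Node]:
    """Best fit: among nodes that satisfy the request, pick the one that
    would be left with the fewest spare cores (ties broken by spare memory).
    Returns None when no node can host the request."""
    candidates = [n for n in nodes if fits(request, n.free)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda n: (
            n.free.cores - request.cores,
            n.free.memory_gb - request.memory_gb,
        ),
    )


if __name__ == "__main__":
    nodes = [
        Node("vk-a", Resources(cores=64, memory_gb=256, disk_gb=1000, walltime_h=12)),
        Node("vk-b", Resources(cores=16, memory_gb=64, disk_gb=500, walltime_h=6)),
    ]
    chosen = match(Resources(cores=8, memory_gb=32, disk_gb=100, walltime_h=4), nodes)
    print(chosen.name if chosen else "unschedulable")
```

Best fit keeps larger nodes free for larger requests. A request that matches no node returns `None`, which would correspond to the "unschedulable" case listed under JCS. The metadata structure still to be defined would decide which fields actually take part in the comparison.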

Useful References



Minutes: