JIRIAF Meeting Oct. 12 2023

Latest revision as of 18:45, 12 October 2023


Connection Info:

You can connect using the following link (Meeting ID: 161 690 3130):

One tap mobile: US: +16692545252,,1608518798# or +16468287666,,1608518798#
Meeting URL: https://jlab-org.zoomgov.com/j/1616903130?pwd=cjg3U0Y4SndXL05SeFBmQjVHZkhrQT09&from=addon
Meeting ID: 161 690 3130
Passcode: 018094

Join by Telephone
For higher quality, dial a number based on your current location.
Dial:
US: +1 669 254 5252 or +1 646 828 7666 or +1 551 285 1373 or +1 669 216 1590 or 833 568 8864 (Toll Free)
Meeting ID: 161 690 3130

International numbers
Join by SIP
1616903130@sip.zoomgov.com
Join by H.323
161.199.138.10 (US West)
161.199.136.10 (US East)
Meeting ID: 160 851 8798
Passcode: 018094


Agenda:

  1. Announcements
  2. Current status of the JRM
    1. Running JRM as a SWIF Job
    2. SSH tunnel between the processing node (JRM) and the JIRIAF server (Kubernetes API server).
    3. "Hello World" workflow within the JRM at NERSC Perlmutter cluster.
    4. Digital twin for an abstract actor of the data processing system.
      1. Start with a simple digital agent for a processor node (JRM). Use queuing theory to define the state of the JRM in terms of
        1. traffic intensity, time data quantum spends in a data pipeline, etc.
        2. Use workflow-provided measures to define data arrival and servicing rates
        3. Use other indirect measures (e.g., reassembly engine FIFO levels in the streaming processing case) to estimate these rates if the workflow does not provide this information.
      2. Define physical actor control channels.
  3. JIRIAF Available Resources DB Table
    1. Suggested schema:
      1. computing_site
      2. resource_ID (unique identifier for each resource)
      3. resource_type (e.g., CPU, GPU, memory, etc.)
      4. number_of_cores
      5. memory
      6. available_walltime (decreasing over time)
        1. Available wall time will decrease over time.
        2. A drop in wall time below a certain threshold will trigger resource release.
        3. The wall time for opportunistic resources (e.g., commercial cloud resources) will be clearly indicated in the table.
  4. JIRIAF User Resource Requests DB Table
    1. Suggested schema:
      1. name (user or group)
      2. resource_type (e.g., CPU, GPU, memory, etc.)
      3. number_of_cores
      4. memory
      5. wall_time
      6. workflow_container_info (e.g., container ID, Docker image, etc.)
      7. data_type (streaming or batch files)
      8. can_it_be_horizontally_distributed/scaled (yes/no)
      9. data provisioning technology choice
    2. Important question: Who is responsible for data provisioning: JIRIAF or the user's workflow management system?
  5. JIRIAF Workflow Resource Matching Service (JMS)
    1. Choice of technology/language: Python?
    2. The algorithm must be able to fit multiple user requests into a bigger resource (workflow Tetris).
  6. Running CLAS12 Data-Stream Processing Pipeline
    1. Docker image for the CLAS12 data-stream processing pipeline.
  7. Status of the Front-End
    1. JIRIAF Entry Web Page
    2. Connection to MariaDB Back-End
    3. JIRIAF REST API
    4. Authentication of JIRIAF Users
  8. AOT
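
Agenda item 2.4 proposes describing the state of a JRM with queuing theory. As a starting point for discussion, here is a minimal sketch that assumes the simplest single-server M/M/1 model. The agenda does not name a model, so this choice, the class name `JrmQueueState`, and its field names are all illustrative assumptions. It derives traffic intensity and the mean time a data quantum spends in the pipeline from the arrival and servicing rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JrmQueueState:
    """Hypothetical M/M/1 summary of one JRM (all names are illustrative)."""

    arrival_rate: float  # lambda: data quanta arriving per second
    service_rate: float  # mu: data quanta processed per second

    def __post_init__(self) -> None:
        if self.arrival_rate < 0 or self.service_rate <= 0:
            raise ValueError("need arrival_rate >= 0 and service_rate > 0")

    @property
    def traffic_intensity(self) -> float:
        """rho = lambda / mu; the queue is stable only when rho < 1."""
        return self.arrival_rate / self.service_rate

    @property
    def is_stable(self) -> bool:
        return self.traffic_intensity < 1.0

    @property
    def mean_time_in_pipeline(self) -> float:
        """Mean sojourn time W = 1 / (mu - lambda); infinite if unstable."""
        if not self.is_stable:
            return float("inf")
        return 1.0 / (self.service_rate - self.arrival_rate)

    @property
    def mean_quanta_in_pipeline(self) -> float:
        """Little's law: L = lambda * W."""
        return self.arrival_rate * self.mean_time_in_pipeline


# Example with made-up rates: 80 quanta/s arriving, 100 quanta/s serviced.
state = JrmQueueState(arrival_rate=80.0, service_rate=100.0)
print(state.traffic_intensity, state.mean_time_in_pipeline)
```

The two rates could come from workflow-provided measures or, failing that, from indirect estimates as listed in the agenda. A more realistic model (multiple servers, non-exponential service times) would change the formulas but not the overall shape of the state object.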

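Agenda items 3 to 5 suggest two table schemas and a matching service that fits multiple user requests into a bigger resource ("workflow Tetris"). The sketch below is one possible reading, not a proposed design: it mirrors a subset of the suggested columns as Python dataclasses and packs requests with a first-fit-decreasing heuristic. The heuristic, the function name `match_requests`, the units (GB, seconds), and the assumption that request names are unique are all illustrative choices made here.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Subset of the suggested Available Resources columns (units assumed)."""
    computing_site: str
    resource_id: str
    resource_type: str
    number_of_cores: int
    memory_gb: float
    available_walltime_s: float


@dataclass
class Request:
    """Subset of the suggested User Resource Requests columns (units assumed)."""
    name: str
    resource_type: str
    number_of_cores: int
    memory_gb: float
    walltime_s: float


def match_requests(resources, requests):
    """Return ({request name: resource_id}, [names of unplaced requests])."""
    # Remaining (cores, memory) per resource as requests are packed into it.
    free = {r.resource_id: (r.number_of_cores, r.memory_gb) for r in resources}
    placed, unplaced = {}, []
    # First-fit decreasing: handle the largest requests first.
    ordered = sorted(
        requests,
        key=lambda q: (q.number_of_cores, q.memory_gb),
        reverse=True,
    )
    for req in ordered:
        for res in resources:
            cores, mem = free[res.resource_id]
            fits = (
                res.resource_type == req.resource_type
                and req.number_of_cores <= cores
                and req.memory_gb <= mem
                and req.walltime_s <= res.available_walltime_s
            )
            if fits:
                free[res.resource_id] = (
                    cores - req.number_of_cores,
                    mem - req.memory_gb,
                )
                placed[req.name] = res.resource_id
                break
        else:
            # No resource had enough free cores, memory, and wall time.
            unplaced.append(req.name)
    return placed, unplaced
```

First-fit decreasing is a common bin-packing heuristic and does not guarantee an optimal packing. The sketch also ignores the container, data type, and horizontal-scaling columns, which a real matching service would presumably need to consider.
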
Useful References



Minutes: