JIRIAF Meeting Oct. 12 2023

From epsciwiki
Revision as of 18:45, 12 October 2023 by Gurjyan (talk | contribs) (→‎Agenda:)


Connection Info:

You can connect using the following link (Meeting ID: 161 690 3130):

One tap mobile: US: +16692545252,,1616903130# or +16468287666,,1616903130#
Meeting URL: https://jlab-org.zoomgov.com/j/1616903130?pwd=cjg3U0Y4SndXL05SeFBmQjVHZkhrQT09&from=addon
Meeting ID: 161 690 3130
Passcode: 018094

Join by Telephone
For higher quality, dial a number based on your current location.
Dial:
US: +1 669 254 5252 or +1 646 828 7666 or +1 551 285 1373 or +1 669 216 1590 or 833 568 8864 (Toll Free)
Meeting ID: 161 690 3130

International numbers
Join by SIP
1616903130@sip.zoomgov.com
Join by H.323
161.199.138.10 (US West)
161.199.136.10 (US East)
Meeting ID: 161 690 3130
Passcode: 018094


Agenda:

  1. Announcements
  2. Current status of the JRM
    1. Running JRM as a SWIF Job
    2. SSH tunnel between the processing node (JRM) and the JIRIAF server (Kubernetes API server).
    3. "Hello World" workflow within the JRM on the NERSC Perlmutter cluster.
    4. Digital twin for an abstract actor of the data processing system.
      1. Start with a simple digital agent for a processor node (JRM). Use queuing theory to define the state of the JRM:
        1. Define the state in terms of traffic intensity, the time a data quantum spends in the data pipeline, etc.
        2. Use workflow-provided measures to define the data arrival and servicing rates.
        3. Use other indirect measures (e.g., reassembly-engine FIFO levels in the streaming-processing case) to estimate these rates if the workflow does not provide them.
      2. Define physical actor control channels.
  3. JIRIAF Available Resources DB Table
    1. Suggested schema:
      1. computing_site
      2. resource_ID (unique identifier for each resource)
      3. resource_type (e.g., CPU, GPU, memory, etc.)
      4. number_of_cores
      5. memory
      6. available_walltime
        1. The available wall time decreases over time.
        2. A drop below a certain threshold triggers release of the resource.
        3. The wall time of an opportunistic resource (e.g., a commercial cloud resource) will be clearly indicated in the table.
  4. JIRIAF User Resource Requests DB Table
    1. Suggested schema:
      1. name (user or group)
      2. resource_type (e.g., CPU, GPU, memory, etc.)
      3. number_of_cores
      4. memory
      5. walltime
      6. workflow_container_info (e.g., container ID, Docker image, etc.)
      7. data_type (streaming or batch files)
      8. can_it_be_horizontally_distributed/scaled (yes/no)
      9. data provisioning technology choice
    2. Important question: Who is responsible for data provisioning: JIRIAF or the user's workflow management system?
  5. JIRIAF Workflow Resource Matching Service (JMS)
    1. Choice of technology/language: Python?
    2. The algorithm must be able to fit multiple user requests into a bigger resource (workflow Tetris).
  6. Running CLAS12 Data-Stream Processing Pipeline
    1. Docker image for the CLAS12 data-stream processing pipeline.
  7. Status of the Front-End
    1. JIRIAF Entry Web Page
    2. Connection to MariaDB Back-End
    3. JIRIAF REST API
    4. Authentication of JIRIAF Users
  8. AOT
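
Item 2.4 proposes describing the JRM state with queuing-theory measures. As a starting point for discussion, here is a minimal sketch that assumes the simplest single-server M/M/1 model; the agenda does not fix a model, and the names `JrmState`, `jrm_state`, `arrival_rate`, and `service_rate` are hypothetical. It computes the traffic intensity and, when the queue is stable, the mean time a data quantum spends in the pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JrmState:
    """Hypothetical state record for one JRM (names are illustrative only)."""
    traffic_intensity: float                  # rho = arrival_rate / service_rate
    mean_time_in_pipeline: Optional[float]    # W, in seconds; None if unstable
    mean_items_in_pipeline: Optional[float]   # L = arrival_rate * W (Little's law)


def jrm_state(arrival_rate: float, service_rate: float) -> JrmState:
    """Estimate the JRM state under an assumed M/M/1 model.

    arrival_rate: data quanta arriving per second (lambda).
    service_rate: data quanta the node can process per second (mu).
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("arrival_rate must be >= 0 and service_rate > 0")

    rho = arrival_rate / service_rate
    if rho >= 1.0:
        # The queue grows without bound, so steady-state W and L do not exist.
        return JrmState(rho, None, None)

    # M/M/1 mean time in system (waiting plus service).
    w = 1.0 / (service_rate - arrival_rate)
    return JrmState(rho, w, arrival_rate * w)
```

The two rates would come from the workflow when it reports them, or from indirect measures such as FIFO levels otherwise; a traffic intensity at or above 1 signals a node that cannot keep up with its input.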

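Items 3 to 5 suggest two table schemas and a matching service that fits several user requests into one larger resource. The sketch below mirrors a subset of the suggested fields as Python records and packs requests with a first-fit-decreasing heuristic. This is one possible approach, not a decided design: the heuristic, the units (GB, hours), and the names `Resource`, `Request`, and `match_requests` are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Resource:
    """Subset of the suggested Available Resources schema (units assumed)."""
    resource_id: str
    resource_type: str           # e.g. "CPU" or "GPU"
    number_of_cores: int
    memory_gb: float
    available_walltime_h: float


@dataclass
class Request:
    """Subset of the suggested User Resource Requests schema (units assumed)."""
    name: str
    resource_type: str
    number_of_cores: int
    memory_gb: float
    walltime_h: float


def match_requests(
    resources: List[Resource], requests: List[Request]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Place requests on resources; return (placement, unplaced request names)."""
    # Remaining [cores, memory] per resource as requests are placed.
    free = {r.resource_id: [r.number_of_cores, r.memory_gb] for r in resources}
    placement: Dict[str, List[str]] = {r.resource_id: [] for r in resources}
    unplaced: List[str] = []

    # First-fit decreasing: handle the largest core counts first.
    for req in sorted(requests, key=lambda q: q.number_of_cores, reverse=True):
        for res in resources:
            cores, mem = free[res.resource_id]
            fits = (
                res.resource_type == req.resource_type
                and req.number_of_cores <= cores
                and req.memory_gb <= mem
                and req.walltime_h <= res.available_walltime_h
            )
            if fits:
                free[res.resource_id] = [cores - req.number_of_cores,
                                         mem - req.memory_gb]
                placement[res.resource_id].append(req.name)
                break
        else:
            # No resource could host this request.
            unplaced.append(req.name)
    return placement, unplaced
```

Requests run side by side on a resource here, so each one only needs its wall time to fit within the resource's remaining wall time; scheduling requests one after another on the same cores would need a different check.
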
Useful References



Minutes: