JIRIAF Meeting Feb. 16, 2023
Connection Info:
You can connect using https://jlab-org.zoomgov.com/j/1608518798 (Meeting ID: 160 851 8798).
One tap mobile: US: +16692545252,,1608518798# or +16468287666,,1608518798#
Meeting URL: https://jlab-org.zoomgov.com/j/1608518798?pwd=NnU3cW1ZZFhTTUQ2Y0hIRU5JTWg0UT09&from=addon
Meeting ID: 160 851 8798
Passcode: 205601
Join by Telephone
For higher quality, dial a number based on your current location.
Dial:
US: +1 669 254 5252 or +1 646 828 7666 or +1 551 285 1373 or +1 669 216 1590 or 833 568 8864 (Toll Free)
Meeting ID: 160 851 8798
Join by SIP
1608518798@sip.zoomgov.com
Join by H.323
161.199.138.10 (US West)
161.199.136.10 (US East)
Meeting ID: 160 851 8798
Passcode: 205601
Agenda:
- Previous Meeting
- Announcements
- Make the necessary preparations to move a time-critical workflow to NERSC (Vardan).
- Modernize the CLAS12 reconstruction application so that it can function under the ERSAP streaming framework (Vardan). ☺
- Modify the process for the CLAS12 reconstruction so that it may operate in a streaming manner (Vardan). ☺
- Test-run the CLAS12 reconstruction at NERSC; it is currently running on a Perlmutter node (Nick and Deby). ☺
- Update ERSAP to support ET source actors: a generic source component that reads events from the CODA Event Transfer (ET) FIFO (Vardan).
- Prototype JIRIAF Front End (JFE). (Horio)
- Describe the hardware architecture of the user-facing JIRIAF computing node while assuming the following (Horio, Amitoj):
- Docker images of workflows and maybe other kinds of containers might be staged by us.
- It's possible that we'll stage certain data sets relating to jobs.
- Hosting a data lake (an in-memory data grid with the option to persist data to disk) is recommended for time-critical or streaming operations.
- The web application front end is an essential component. We should also consider installing a web server in addition to the RUCIO storage element, CVMFS file system, Globus, and XRootD.
- Set up web server
- Design UI forms
- Form.io is one of the recommendations that we might take into account.
- User authentication and access permissions
- Adopt OSG mechanisms
- Note that the RUCIO design provides a layer for authentication and authorization.
- Install the JIRIAF job queue database (Horio, Chris)
- Create and populate the JIRIAF task queue database table (Horio, Chris)
- Define table structure
- Job ID
- Memory count
- Number of cores
- Disk
- Expected time of completion
- The kind of workflow
- Priority
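The field list above could be sketched as a database table along the following lines. This is a minimal illustration using SQLite from the Python standard library; the table name `job_queue`, the column names, the types, and the units (GB, ISO 8601 timestamp) are all assumptions made for the example and are not an agreed JIRIAF design.

```python
import sqlite3

# Hypothetical schema: names, types, and units are assumptions derived
# from the field list in the agenda, not a decided JIRIAF layout.
SCHEMA = """
CREATE TABLE IF NOT EXISTS job_queue (
    job_id              TEXT PRIMARY KEY,
    memory_gb           REAL NOT NULL,
    cores               INTEGER NOT NULL,
    disk_gb             REAL NOT NULL,
    expected_completion TEXT NOT NULL,   -- ISO 8601 timestamp (assumed)
    workflow_kind       TEXT NOT NULL,
    priority            INTEGER NOT NULL DEFAULT 0
)
"""


def create_queue(conn):
    """Create the job queue table if it does not exist yet."""
    conn.execute(SCHEMA)
    conn.commit()


def enqueue(conn, job):
    """Insert one job, given as a dict keyed by column name."""
    conn.execute(
        "INSERT INTO job_queue VALUES (:job_id, :memory_gb, :cores, :disk_gb, "
        ":expected_completion, :workflow_kind, :priority)",
        job,
    )
    conn.commit()


def next_jobs(conn, limit=10):
    """Return job IDs ordered by descending priority, then by job ID."""
    cur = conn.execute(
        "SELECT job_id FROM job_queue ORDER BY priority DESC, job_id LIMIT ?",
        (limit,),
    )
    return [row[0] for row in cur.fetchall()]
```

A production deployment would likely use a server-based database; SQLite is used here only so the sketch runs without any setup.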
- Prototype JIRIAF Central Service (JCS). (Vardan, Chris)
- When designing the JCR, consider the following pub/sub technologies:
- xMsg
- Kafka
- Specify the communication protocol that will be used between JCA and JRM
- Define JIRIAF resource-pool table structure
- Specify and develop JRM, a software agent that can:
- Accept and carry out local tasks
- Report job-specific metrics
- Cancel jobs and perform local cleanup
- Submit JRM tasks using the Superfacility API and SWIF2
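To make the JRM responsibilities listed above more concrete, here is a rough sketch of an agent that accepts local tasks, reports per-job metrics, and cancels jobs with local cleanup. The class name `ResourceManagerAgent`, its method names, and the metrics fields are hypothetical and chosen only for this example. Submission through the Superfacility API or SWIF2 and the communication protocol are deliberately left out.

```python
import shutil
import subprocess
import tempfile
import time


class ResourceManagerAgent:
    """Hypothetical JRM sketch: runs local commands and tracks them by job ID."""

    def __init__(self):
        # job_id -> (process handle, scratch directory, start time)
        self._jobs = {}

    def accept(self, job_id, command):
        """Start a local task (argument list) in its own scratch directory."""
        work_dir = tempfile.mkdtemp(prefix=f"jrm-{job_id}-")
        proc = subprocess.Popen(command, cwd=work_dir)
        self._jobs[job_id] = (proc, work_dir, time.monotonic())

    def metrics(self, job_id):
        """Report a small, assumed set of job-specific metrics."""
        proc, _, started = self._jobs[job_id]
        exit_code = proc.poll()  # None while the process is still running
        return {
            "job_id": job_id,
            "state": "running" if exit_code is None else "finished",
            "exit_code": exit_code,
            "seconds_since_accept": time.monotonic() - started,
        }

    def cancel(self, job_id):
        """Stop the job if it is still running, then remove its scratch directory."""
        proc, work_dir, _ = self._jobs.pop(job_id)
        if proc.poll() is None:
            proc.terminate()
            proc.wait()
        shutil.rmtree(work_dir, ignore_errors=True)
```

In a real agent the metrics would presumably be published over the chosen pub/sub layer rather than returned from a method call.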
- Prototype JIRIAF Workflow Resource matching Service (JWRMS) (Vardan)
- Develop a working model of the JWRMS.
- Examine the JIRIAF resource pool and job queue database tables to locate a workflow that is compatible with the available resources.
- Combine tasks to make better use of the available resources.
- Ensure that workflow priorities are supported.
- Remove finished tasks from the workflow queue.
- Update the workflow queue when a task is only partially completed.
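One possible shape for the matching step described above is a greedy first-fit pass in descending priority order, which also packs several jobs onto one resource when they fit. The sketch below is only an illustration under stated assumptions: the `Job` and `Resource` types, their fields, and the greedy policy itself are not taken from any JIRIAF design, and removal of finished tasks and handling of partially completed ones are not shown.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical subset of the job queue fields
    job_id: str
    cores: int
    memory_gb: float
    priority: int = 0


@dataclass
class Resource:
    # Hypothetical resource-pool entry
    name: str
    free_cores: int
    free_memory_gb: float


def match(jobs, resources):
    """Greedy first-fit matching, highest priority first.

    Returns {resource name: [job IDs]}. Jobs that fit nowhere are left
    unassigned. The free capacity on each Resource is reduced in place.
    """
    assignment = {res.name: [] for res in resources}
    # sorted() is stable, so jobs with equal priority keep their queue order.
    for job in sorted(jobs, key=lambda j: -j.priority):
        for res in resources:
            if job.cores <= res.free_cores and job.memory_gb <= res.free_memory_gb:
                res.free_cores -= job.cores
                res.free_memory_gb -= job.memory_gb
                assignment[res.name].append(job.job_id)
                break
    return assignment
```

Greedy first-fit is simple but not optimal; a bin-packing heuristic or a scheduler with backfilling could use the resources better at the cost of more complexity.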
- Prototype JIRIAF Facility Manager (JFM) (Vardan, Chris)
- Monitor the resources available at the remote computing facility.
- Superfacility API
- SWIF2
- Prometheus
- etc.
- Populate the JIRIAF resource-pool database table.
- Keep the JIRIAF resource-pool database table up to date with the latest information from the facility.
- AOT
Useful References