Difference between revisions of "JIRIAF"

From epsciwiki
Revision as of 14:26, 26 October 2022

JLAB Integrated Research Infrastructure Across Facilities

Project Description

The JIRIAF (JLAB Integrated Research Infrastructure Across Facilities) project aims to appraise capabilities to combine geographically diverse computing facilities into an integrated science infrastructure. This entails evaluating an infrastructure that dynamically integrates temporarily unallocated or idle compute resources from various providers. Since the participating facilities will have diverse resources and locally running workflows, it is essential to study the challenges of heterogeneous, distributed, and opportunistic compute resource provisioning from several participating data centers, presented to an end-user as a single, unified computing infrastructure. Policies and requirements for computational workflows that can effectively utilize volatile resources are critical for this integrated scientific environment. The primary objective of the JIRIAF project is to test two kinds of relocation: moving a computing workflow from resources close to the experiment to a geographically remote data center when near real-time data quality checks, online calibration and alignment, or similar tasks are needed; and moving a workflow between two geographically remote data centers when local resources are insufficient for data processing. We need to understand what types of solutions work, where future investment is required, the operational and sociological aspects of collaboration across sites, and which science workflows benefit most from distributed infrastructure.
This project is well-positioned to show the feasibility of workload rollovers across DOE computing facilities. This will in turn provide operational resilience and load balancing during peak times and bring science-oriented computing facilities together, mandating uniform data movement, unified data-processing APIs, and resource sharing. In the end, the rate of science will increase. Static resource provisioning, carving resources from the local farms and dedicating them to guest tasks, is straightforward. DOE also has dedicated resources (such as NERSC) that can be requested and allocated for specific tasks, not to mention OSG. Beyond such dedicated provisioning, the novelty of this project is satisfying occasional, unscheduled tasks that need timely processing: workflows slowed down or stopped by unforeseen computer-center maintenance periods, quick data quality assessments during data acquisition (including streaming and triggered DAQs), fast analysis trains to check physics, and so on. In other words, the goal is to integrate DOE compute facilities so that a user sees them as one facility (no resource-request proposals, approvals, special memberships, etc.).
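To illustrate the relocation policy described above, here is a minimal sketch in Python of choosing a destination facility for a workflow when local resources are insufficient. The facility names, the resource model (idle cores and latency), and the `pick_facility` function are all hypothetical simplifications for illustration, not part of JIRIAF's actual design.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Facility:
    name: str
    idle_cores: int            # temporarily unallocated cores offered opportunistically
    network_latency_ms: float  # latency from the experiment site

@dataclass
class Workflow:
    name: str
    cores_needed: int
    near_real_time: bool       # e.g. online data-quality checks, calibration/alignment

def pick_facility(workflow: Workflow, local: Facility,
                  remotes: List[Facility]) -> Optional[Facility]:
    """Run locally if possible; otherwise relocate to a remote facility with
    enough idle capacity, preferring low latency for near-real-time work and
    the largest idle pool for bulk processing."""
    if local.idle_cores >= workflow.cores_needed:
        return local
    candidates = [f for f in remotes if f.idle_cores >= workflow.cores_needed]
    if not candidates:
        return None  # no opportunistic capacity anywhere right now
    if workflow.near_real_time:
        return min(candidates, key=lambda f: f.network_latency_ms)
    return max(candidates, key=lambda f: f.idle_cores)

# Hypothetical example: local farm is saturated, so both workflows relocate.
local = Facility("JLAB", idle_cores=4, network_latency_ms=0.1)
remotes = [Facility("SiteA", idle_cores=64, network_latency_ms=40.0),
           Facility("SiteB", idle_cores=128, network_latency_ms=90.0)]
qa = Workflow("online-QA", cores_needed=32, near_real_time=True)
bulk = Workflow("bulk-reco", cores_needed=32, near_real_time=False)
```

Under this toy policy, the near-real-time QA job goes to the lower-latency site even though it has fewer idle cores, while the bulk job goes wherever the most opportunistic capacity is; a real scheduler would also weigh data-movement cost and resource volatility.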

General Resources

Meetings

Documents

Proposals

Presentations

Date | Event | Presenter | Slides

Publications

Date | Journal | Title


Notes

Useful Links