Deploy Workflows on NERSC, ORNL, and Local EJFAT nodes via Helm Charts
JIRIAF Workflow Setup and Deployment Guide
Quick Start
Run each step from the repository root; the paths below are relative to it.
Setting up EJFAT nodes
./main/local-ejfat/init-jrm/launch-nodes.sh
Deploying Prometheus
cd main/prom
ID=jlab-100g-nersc-ornl
helm install $ID-prom prom/ --set Deployment.name=$ID
Deploying EJFAT workflows
cd main/local-ejfat
./launch_job.sh
Deploying SLURM NERSC-ORNL workflows
cd main/slurm-nersc-ornl
./batch-job-submission.sh
Detailed Usage
EJFAT Node Initialization
- Run the launch-nodes script:
./main/local-ejfat/init-jrm/launch-nodes.sh
- To customize the node range, edit the loop bounds in the script:
for i in $(seq <start> <end>)
Local EJFAT Workflow Deployment
- Set project ID:
ID=your-project-id
- Deploy workflow:
helm install $ID-job-ejfat-<INDEX> local-ejfat/job/ --set Deployment.name=$ID-job-ejfat-<INDEX> --set Deployment.serviceMonitorLabel=$ID
- For quick deployment, use launch_job.sh:
./main/local-ejfat/launch_job.sh
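To deploy several workflow instances, the helm install command above can be wrapped in a loop. The following is a minimal sketch, assuming it is run from the main/ directory. The helper names ejfat_release_name and ejfat_install_cmd and the index range 0 to 2 are illustrative and not part of the repository. The sketch only prints the commands, so they can be reviewed before anything is installed.

```shell
# Sketch: print one helm install command per workflow index.
# ejfat_release_name and ejfat_install_cmd are hypothetical helpers.

ejfat_release_name() {
  # $1 = project ID, $2 = workflow index
  printf '%s-job-ejfat-%s\n' "$1" "$2"
}

ejfat_install_cmd() {
  # $1 = project ID, $2 = workflow index
  # The release name doubles as Deployment.name, as in the command above.
  name="$(ejfat_release_name "$1" "$2")"
  printf 'helm install %s local-ejfat/job/ --set Deployment.name=%s --set Deployment.serviceMonitorLabel=%s\n' \
    "$name" "$name" "$1"
}

ID="${ID:-your-project-id}"
# Illustrative index range; adjust to the number of workflows needed.
for index in $(seq 0 2); do
  ejfat_install_cmd "$ID" "$index"
done
```

Once the printed commands look right, the output can be piped to sh to run them.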
SLURM NERSC-ORNL Workflow Deployment
- Launch a single job (from main/slurm-nersc-ornl):
./launch_job.sh <ID> <INDEX> <SITE> <ersap-exporter-port> <jrm-exporter-port>
- Example:
./launch_job.sh jlab-100g-nersc-ornl 0 perlmutter 20000 10000
- For batch job submission:
./batch-job-submission.sh
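Several single-job launches can also be generated by hand. The sketch below prints one launch_job.sh invocation per index. It assumes that each job should get its own exporter ports, offset by the index from the base ports in the example above (20000 and 10000). That port scheme and the helper name slurm_launch_cmd are assumptions for illustration and do not necessarily match what batch-job-submission.sh does.

```shell
# Sketch: print one launch_job.sh invocation per job index.
# Assumption: ports are base port + index; slurm_launch_cmd is a
# hypothetical helper, not part of the repository.

slurm_launch_cmd() {
  # $1 = project ID, $2 = job index, $3 = site
  printf './launch_job.sh %s %s %s %s %s\n' \
    "$1" "$2" "$3" "$((20000 + $2))" "$((10000 + $2))"
}

ID="${ID:-jlab-100g-nersc-ornl}"
SITE="${SITE:-perlmutter}"
# Illustrative index range; adjust to the number of jobs needed.
for index in $(seq 0 2); do
  slurm_launch_cmd "$ID" "$index" "$SITE"
done
```

For index 0 this reproduces the example invocation shown above.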
Prometheus Deployment
- Deploy Prometheus:
cd main/prom
ID=jlab-100g-nersc-ornl
helm install $ID-prom prom/ --set Deployment.name=$ID
Customization
Local EJFAT
Edit main/local-ejfat/job/values.yaml to customize deployment:
Deployment:
  name: this-name-is-changing
  namespace: default
  replicas: 1
  serviceMonitorLabel: ersap-test4
  cpuUsage: "128"
  ejfatNode: "2"
ersapSettings:
  image: gurjyan/ersap:v0.1
  cmd: /ersap/run-pipeline.sh
  file: /x.ersap
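Instead of editing values.yaml, individual settings can be overridden on the helm command line. The sketch below assumes that cpuUsage and ejfatNode are nested under Deployment, and that the command is run from the main/ directory. It uses --set-string because both values are quoted strings in the file. The helper name ejfat_override_cmd and the values 64 and 3 are illustrative only.

```shell
# Sketch: print a helm install command that overrides values.yaml
# entries on the command line instead of editing the file.
# Assumption: cpuUsage and ejfatNode sit under the Deployment key.
# ejfat_override_cmd is a hypothetical helper.

ejfat_override_cmd() {
  # $1 = release name, $2 = cpuUsage, $3 = ejfatNode
  printf 'helm install %s local-ejfat/job/ --set Deployment.name=%s --set-string Deployment.cpuUsage=%s --set-string Deployment.ejfatNode=%s\n' \
    "$1" "$1" "$2" "$3"
}

# Example values only; pick ones that match the target node.
ejfat_override_cmd "${ID:-your-project-id}-job-ejfat-0" 64 3
```

Command-line overrides apply to a single release, which keeps values.yaml unchanged as the shared default.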
SLURM NERSC-ORNL
Edit main/slurm-nersc-ornl/job/values.yaml to customize deployment:
Deployment:
  name: this-name-is-changing
  namespace: default
  replicas: 1
  serviceMonitorLabel: ersap-test4
  site: perlmutter
Cleanup
To delete a deployed job:
helm uninstall <release-name> -n <namespace>
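Because the release names in this guide all start with the project ID, every release of a project can be cleaned up in one pass. The sketch below reads release names on standard input and prints an uninstall command for each one that starts with the given ID. The helper name uninstall_cmds is hypothetical.

```shell
# Sketch: print a helm uninstall command for every release whose name
# starts with "<project ID>-". Release names are read from stdin, one
# per line. uninstall_cmds is a hypothetical helper.

uninstall_cmds() {
  # $1 = project ID, $2 = namespace
  while IFS= read -r release; do
    case "$release" in
      "$1"-*) printf 'helm uninstall %s -n %s\n' "$release" "$2" ;;
    esac
  done
}
```

A possible use is to feed it the short release listing, for example: helm list -n default --short | uninstall_cmds jlab-100g-nersc-ornl default. Review the printed commands before piping them to sh.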
Troubleshooting
- Check pod status:
kubectl get pods -n <namespace>
- View pod logs:
kubectl logs <pod-name> -n <namespace>
- Describe a pod:
kubectl describe pod <pod-name> -n <namespace>
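When many pods are deployed, it can help to list only those that need attention. The sketch below filters the default kubectl get pods listing, whose third column is the pod status, and prints the names of pods that are neither Running nor Completed. The helper name not_running_pods is hypothetical.

```shell
# Sketch: print the names of pods whose STATUS column is neither
# Running nor Completed. Expects the default "kubectl get pods"
# column layout (NAME READY STATUS RESTARTS AGE) without the header
# line on stdin. not_running_pods is a hypothetical helper.

not_running_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}
```

A possible use: kubectl get pods -n <namespace> --no-headers | not_running_pods. Each printed name can then be passed to the kubectl logs and kubectl describe commands above.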