Difference between revisions of "Job-scripts-jiriaf"
Jump to navigation
Jump to search
(One intermediate revision by the same user not shown) | |||
Line 54: | Line 54: | ||
operator: In | operator: In | ||
values: | values: | ||
− | - " | + | - "jiriaf" |
# Below should be commented out if the JIRIAF_WALLTIME is set to 0 | # Below should be commented out if the JIRIAF_WALLTIME is set to 0 | ||
### | ### | ||
Line 153: | Line 153: | ||
operator: In | operator: In | ||
values: | values: | ||
− | - " | + | - "jiriaf" |
# Below should be commented out if the JIRIAF_WALLTIME is set to 0 | # Below should be commented out if the JIRIAF_WALLTIME is set to 0 | ||
### | ### |
Latest revision as of 20:51, 10 April 2024
Job Scripts
Job scripts include the definitions for both the configMap
and the pod
associated with a particular job.
Computing Sites Running Containers in User Space (Common in HPC Environments Using Singularity or Shifter)
kind: ConfigMap
apiVersion: v1
metadata:
name: shifter-stress
data:
stress.sh: |
#!/bin/bash
export NUMBER=$2
export TIME=$1
shifter --image="jlabtsai/stress:latest" --entrypoint
---
apiVersion: v1
kind: Pod
metadata:
name: some-name # Job Name Here
spec:
containers:
- name: c1
image: shifter-stress
command: ["bash"]
args: ["300", "2"] # Time and cpu for stress
volumeMounts:
- name: shifter-stress
mountPath: shifter-stress
resources:
limits:
cpu: "2"
memory: 1Gi
requests:
cpu: "1" # Number of CPUs Here as well
memory: 1Gi # Memory Here
volumes:
- name: shifter-stress
configMap:
name: shifter-stress
nodeSelector:
kubernetes.io/role: agent
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
# Below are the labels for the node, corresponding to the jiriaf labels
- key: jiriaf.nodetype
operator: In
values:
- "cpu"
- key: jiriaf.site
operator: In
values:
- "jiriaf"
# Below should be commented out if the JIRIAF_WALLTIME is set to 0
###
- key: jiriaf.alivetime
operator: Gt
values:
- "30"
###
tolerations:
- key: "virtual-kubelet.io/provider"
value: "mock"
effect: "NoSchedule"
restartPolicy: Never
Compute Sites Utilizing Docker or Other Container Runtimes Operating in Root Space
Two containers are instantiated in this process. The first container is dedicated to the user's job, while the second container's role is to adjust the PGID
of the first container. This adjustment ensures that metrics from the correct processes running in the first container are accurately collected. Please note that the PGID may not always be correct. It's essential to ensure that the PGID corresponds to the processes running within the container.
kind: ConfigMap
apiVersion: v1
metadata:
name: docker-stress
data:
stress.sh: |
#!/bin/bash
export PGID_FILE=$3
docker run -d --rm -e NUMBER=$2 -e TIME=$1 jlabtsai/stress:latest > /dev/null
## find the last container id
export CONTAINER_ID=$(docker ps -l -q)
docker inspect -f '{{.State.Pid}}' $CONTAINER_ID > $3
sleep $1
---
kind: ConfigMap
apiVersion: v1
metadata:
name: get-pgid
data:
stress.sh: |
#!/bin/bash
sleep 3
cp $1 $2
---
apiVersion: v1
kind: Pod
metadata:
name: some-name # Job Name Here
spec:
containers:
- name: c1
image: docker-stress
command: ["bash"]
args: ["300", "2", "~/default/some-name/containers/c1/p"] # "default" is the namespace of the pod. "some-name" is the pod name.
volumeMounts:
- name: docker-stress
mountPath: docker-stress
resources:
limits:
cpu: "2"
memory: 1Gi
requests:
cpu: "1" # Number of CPUs Here as well
memory: 1Gi # Memory Here
- name: c2
image: get-pgid
command: ["bash"]
args: ["~/default/some-name/containers/c1/p", "~/default/some-name/containers/c1/pgid"] # "default" is the namespace of the pod. "some-name" is the pod name.
volumeMounts:
- name: get-pgid
mountPath: get-pgid
resources:
limits:
cpu: "2"
memory: 1Gi
requests:
cpu: "1" # Number of CPUs Here as well
memory: 1Gi # Memory Here
volumes:
- name: docker-stress
configMap:
name: docker-stress
- name: get-pgid
configMap:
name: get-pgid
nodeSelector:
kubernetes.io/role: agent
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
# Below are the labels for the node, corresponding to the jiriaf labels
- key: jiriaf.nodetype
operator: In
values:
- "cpu"
- key: jiriaf.site
operator: In
values:
- "jiriaf"
# Below should be commented out if the JIRIAF_WALLTIME is set to 0
###
- key: jiriaf.alivetime
operator: Gt
values:
- "30"
###
tolerations:
- key: "virtual-kubelet.io/provider"
value: "mock"
effect: "NoSchedule"
restartPolicy: Never