From epsciwiki

Latest revision as of 16:53, 17 September 2024

Deploy Prometheus Monitoring with Prometheus Operator

This guide outlines the deployment process for a custom Prometheus monitoring setup using the Prometheus Operator.

Prerequisites

Ensure your Kubernetes cluster has:

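  1. Prometheus Operator (https://github.com/prometheus-operator/kube-prometheus)
  2. Kubernetes Metrics Server (https://github.com/kubernetes-sigs/metrics-server#installation)
  3. Helm (https://helm.sh/docs/intro/install/)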

Deployment Flow Chart

Below is a visual representation of the Prometheus deployment process:

[Figure: Prometheus Deployment Flow Chart]

This flow chart illustrates the key steps in deploying Prometheus monitoring using the Prometheus Operator.

Deployment Steps

1. Setup Environment

Clone the repository and navigate to the prom folder:

git clone https://github.com/JeffersonLab/jiriaf-test-platform.git
cd jiriaf-test-platform/main/prom

2. Install Prometheus Operator

Instead of using Helm, we'll use the community-maintained manifests from the kube-prometheus project:

a. Clone the kube-prometheus repository:

git clone --depth 1 https://github.com/prometheus-operator/kube-prometheus.git /tmp/kube-prometheus

b. Copy the manifests to your current directory:

cp -R /tmp/kube-prometheus/manifests .

c. Create the Custom Resource Definitions (CRDs) and Prometheus Operator:

kubectl create -f ./manifests/setup/

d. Apply the remaining manifests:

kubectl create -f ./manifests/

e. Verify the installation:

kubectl -n monitoring get pods
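
The kube-prometheus quickstart also waits for the CRDs to be established between steps c and d, so that the remaining manifests are not submitted before the API server recognizes the new resource types. The following is a minimal sketch of steps c through e with that wait added. The function name, the run helper, and the DRY_RUN variable are hypothetical conveniences for this example and are not part of kubectl or kube-prometheus.

```shell
#!/usr/bin/env bash
# Sketch only: DRY_RUN, run, and install_kube_prometheus are illustrative names.

run() {
  # Print the command when DRY_RUN=1; otherwise execute it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

install_kube_prometheus() {
  dir="${1:-./manifests}"
  # Step c: create the CRDs and the operator.
  run kubectl create -f "$dir/setup/" || return 1
  # Extra step: wait until every CRD reports the Established condition.
  run kubectl wait --for condition=Established --all CustomResourceDefinition --namespace=monitoring || return 1
  # Step d: create the remaining manifests.
  run kubectl create -f "$dir/" || return 1
  # Step e: list the pods to verify the installation.
  run kubectl -n monitoring get pods
}
```

Calling install_kube_prometheus ./manifests with DRY_RUN=1 only prints the four commands, which is a cheap way to review them before running against a real cluster.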

3. Configure Values

Edit values.yaml to set your specific configuration:

Deployment:
  name: <project-id>
  namespace: default

PersistentVolume:
  node: jiriaf2302-control-plane
  path: /var/prom
  size: 5Gi

Prometheus:
  serviceaccount: prometheus-k8s
  namespace: monitoring

Key configurations:

  • Deployment.name: Used for the job name, the persistent volume path, and service monitor selection
  • Deployment.namespace: Sets the job namespace and the namespace selected for monitoring
  • PersistentVolume.*: Configures storage for Prometheus data
  • Prometheus.*: Sets Prometheus server details

Note: Only servicemonitors in the default namespace are monitored. Monitoring additional namespaces requires extra configuration; see the Prometheus Operator documentation on customizations (https://github.com/prometheus-operator/kube-prometheus/blob/main/docs/customizations/monitoring-additional-namespaces.md) for details.
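
Before installing, it can help to render the chart locally and inspect what the values produce. The sketch below wraps helm template and turns extra key=value arguments into --set overrides, as an alternative to editing values.yaml. The function name preview_chart, the run helper, and the DRY_RUN variable are hypothetical names for this example. It assumes the chart lives in prom/ relative to the current directory, as in the install command of this guide.

```shell
#!/usr/bin/env bash
# Sketch only: preview_chart, run, and DRY_RUN are illustrative names.

run() {
  # Print the command when DRY_RUN=1; otherwise execute it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

preview_chart() {
  id="$1"
  shift
  # Rewrite each remaining key=value argument as "--set key=value".
  n=$#
  while [ "$n" -gt 0 ]; do
    set -- "$@" --set "$1"
    shift
    n=$((n - 1))
  done
  # helm template renders the manifests without touching the cluster.
  run helm template "$id-prom" prom/ --set "Deployment.name=$id" "$@"
}
```

For example, preview_chart my-project PersistentVolume.size=10Gi renders the chart with a larger volume while leaving values.yaml unchanged.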

4. Install the Custom Prometheus Helm Chart

Run the following command, replacing <project-id> with your identifier:

helm install <project-id>-prom prom/ --set Deployment.name=<project-id>

Example:

ID=jlab-100g-nersc-ornl
helm install $ID-prom prom/ --set Deployment.name=$ID
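
Because the project identifier becomes part of the Helm release name, a malformed identifier only fails at install time. The sketch below checks the derived name first. It uses a deliberately conservative rule (lowercase letters, digits, and hyphens, at most 53 characters), which is an assumption of this example and is stricter than what Helm itself accepts. The function names, the run helper, and the DRY_RUN variable are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: all function names and DRY_RUN are illustrative.

run() {
  # Print the command when DRY_RUN=1; otherwise execute it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Derive the release name used in this guide: <project-id>-prom.
release_name() {
  printf '%s-prom' "$1"
}

# Conservative check: lowercase alphanumerics and hyphens, starting and
# ending with an alphanumeric, no longer than 53 characters.
valid_release_name() {
  name="$1"
  [ "${#name}" -le 53 ] || return 1
  printf '%s' "$name" | grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?$'
}

install_prom_chart() {
  id="$1"
  rel="$(release_name "$id")"
  if ! valid_release_name "$rel"; then
    echo "invalid release name: $rel" >&2
    return 1
  fi
  run helm install "$rel" prom/ --set "Deployment.name=$id"
}
```

With the identifier from the example above, install_prom_chart jlab-100g-nersc-ornl runs the same helm install command after the name check passes.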

5. Verify Deployment

Check that all components are running:

kubectl get pods -n monitoring
kubectl get pv
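
The two commands above require reading the output by eye. As a rough sketch, the same check can be scripted: wait for the pods to become Ready, then report any persistent volume whose phase is not Bound. The function names are hypothetical, and the 300-second timeout is an arbitrary choice for this example.

```shell
#!/usr/bin/env bash
# Sketch only: unbound_pvs and check_deployment are illustrative names.

# Read "NAME STATUS" pairs on stdin (one per line) and print the names
# of volumes whose status is anything other than "Bound".
unbound_pvs() {
  awk 'NF >= 2 && $2 != "Bound" { print $1 }'
}

check_deployment() {
  # Block until every pod in the namespace is Ready, or time out.
  # Note: pods that have run to completion never become Ready, so this
  # assumes the namespace only contains long-running pods.
  kubectl wait --for=condition=Ready pods --all -n monitoring --timeout=300s || return 1
  # custom-columns gives a stable two-column layout for parsing.
  pending="$(kubectl get pv --no-headers \
    -o custom-columns=NAME:.metadata.name,STATUS:.status.phase | unbound_pvs)"
  if [ -n "$pending" ]; then
    echo "persistent volumes not bound: $pending" >&2
    return 1
  fi
}
```

check_deployment returns a non-zero status if the pods do not become Ready in time or if any volume is left unbound, which makes it usable in a larger setup script.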

6. Access Grafana Dashboard

a. Find the Grafana service:

kubectl get svc -n monitoring

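b. Set up port forwarding, using the service name and port reported in the previous step. The command below assumes a service named prometheus-operator-grafana on port 80; the kube-prometheus manifests from step 2 typically create a service named grafana on port 3000 instead: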

kubectl port-forward svc/prometheus-operator-grafana -n monitoring 3000:80

c. Access Grafana at http://localhost:3000 (default credentials: admin/admin or admin/prom-operator)
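
Since the Grafana service name depends on how the stack was installed, steps a and b can be combined so that the name is looked up rather than typed. This is a minimal sketch. The function names are hypothetical, and matching on the substring "grafana" is an assumption that may pick the wrong service if several match.

```shell
#!/usr/bin/env bash
# Sketch only: pick_grafana_svc and forward_grafana are illustrative names.

# Read "kubectl get svc -o name" output on stdin and print the first
# entry whose name contains "grafana" (for example "service/grafana").
pick_grafana_svc() {
  grep -i 'grafana' | head -n 1
}

forward_grafana() {
  local_port="${1:-3000}"
  remote_port="${2:-80}"
  svc="$(kubectl get svc -n monitoring -o name | pick_grafana_svc)"
  if [ -z "$svc" ]; then
    echo "no Grafana service found in namespace monitoring" >&2
    return 1
  fi
  # kubectl accepts the TYPE/NAME form returned by "-o name".
  kubectl port-forward "$svc" -n monitoring "$local_port:$remote_port"
}
```

Pass the service port as the second argument when it differs from 80, for example forward_grafana 3000 3000.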

7. Remove Prometheus Helm Chart (if needed)

Note: This removes the persistent volume claim, and the stored data will be lost.

# 1. Remove the persistent volume claim
kubectl delete pvc -n monitoring prometheus-<project-id>-db-prometheus-<project-id>-0
# 2. Remove the Prometheus Helm Chart
helm uninstall <project-id>-prom
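
Because removal destroys data, it may be worth guarding the two commands behind an explicit confirmation. The sketch below builds the claim name from the project identifier using the pattern shown above and refuses to run unless CONFIRM=yes is set. The function names and the CONFIRM variable are hypothetical. The --wait=false flag is an addition of this example so that the claim deletion does not block while a pod still uses the volume.

```shell
#!/usr/bin/env bash
# Sketch only: pvc_name, remove_prom_chart, and CONFIRM are illustrative.

# Build the claim name: prometheus-<project-id>-db-prometheus-<project-id>-0
pvc_name() {
  printf 'prometheus-%s-db-prometheus-%s-0' "$1" "$1"
}

remove_prom_chart() {
  id="$1"
  if [ "${CONFIRM:-no}" != "yes" ]; then
    echo "refusing to delete $(pvc_name "$id"); set CONFIRM=yes to proceed" >&2
    return 1
  fi
  # 1. Remove the persistent volume claim (returns without waiting).
  kubectl delete pvc -n monitoring "$(pvc_name "$id")" --wait=false || return 1
  # 2. Remove the Prometheus Helm chart.
  helm uninstall "$id-prom"
}
```

Running CONFIRM=yes remove_prom_chart my-project performs both steps; without CONFIRM=yes the function only prints the claim it would have deleted.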

Components Deployed

  • Prometheus Server (prometheus.yaml)
  • Persistent Volume for data storage (prom-pv.yaml)
  • Empty directory creation job (prom-create_emptydir.yaml)

Integration with Workflows

This setup is designed to monitor services and jobs created by your workflow system.

Advanced Configuration

For further customization, refer to the Helm chart templates and values.yaml. Ensure your cluster has the necessary permissions and resources for persistent volumes and Prometheus server operation.

Troubleshooting

If you encounter issues:

  1. Check pod status: kubectl get pods -n monitoring
  2. View pod logs: kubectl logs <pod-name> -n monitoring
  3. Ensure the persistent volume is correctly bound: kubectl get pv
  4. Verify Prometheus configuration: kubectl get prometheus -n monitoring -o yaml