Prometheus + Grafana
How KubeHero plugs into kube-prometheus-stack and what you get for free.
KubeHero is designed to ride your existing Prometheus + Grafana stack. The chart ships three resource types that plug into kube-prometheus-stack:
- ServiceMonitor — tells Prometheus where to scrape our /metrics
- PrometheusRule — chargeback recording rules + alerts
- ConfigMap dashboards — picked up by Grafana's sidecar via the grafana_dashboard=1 label
Prerequisites
You need the prometheus-operator CRDs in the cluster — installing the kube-prometheus-stack chart covers this. The KubeHero chart detects monitoring.coreos.com/v1 and only emits the resources when those CRDs exist.
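In Helm terms, that conditional emission is typically done with the built-in `.Capabilities` object. A sketch of what such a guard looks like (the template path and surrounding structure are illustrative, not taken from the chart):

```yaml
# templates/servicemonitor.yaml (hypothetical path)
# Emit the ServiceMonitor only when the chart is enabled AND the
# prometheus-operator API group is registered in the cluster.
{{- if and .Values.prometheus.enabled (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1") }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ include "kubehero.fullname" . }}-collector
{{- end }}
```

Because the guard runs at template-render time, installing KubeHero before kube-prometheus-stack silently skips these resources; re-run `helm upgrade` after the CRDs exist.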
Values that wire it
```yaml
# values.yaml for kubehero
prometheus:
  enabled: true
  scrapeInterval: "30s"
  scrapeTimeout: "10s"
  # Match whatever label your Prometheus CR uses for ServiceMonitor selection.
  # kube-prometheus-stack's default is the release name of its own install.
  release: "kube-prometheus-stack"
  extraLabels: {}

grafana:
  enabled: true
  sidecarLabel: "grafana_dashboard"
  sidecarLabelValue: "1"
  folder: "KubeHero"
  dashboards:
    chargeback: true
    fleet: true
    gpu: true
```
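With those values, the rendered ServiceMonitor would look roughly like the sketch below (the selector labels and port name are illustrative assumptions, not taken from the chart):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubehero-collector
  namespace: kubehero-system
  labels:
    release: kube-prometheus-stack    # from prometheus.release
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kubehero-collector   # hypothetical selector
  endpoints:
    - port: metrics                   # hypothetical port name
      path: /metrics
      interval: 30s                   # prometheus.scrapeInterval
      scrapeTimeout: 10s              # prometheus.scrapeTimeout
```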
What gets installed
Inspect after `helm install`:

```shell
kubectl -n kubehero-system get servicemonitor,prometheusrule
kubectl -n monitoring get configmap -l grafana_dashboard=1 | grep kubehero
```
You should see:
- servicemonitor/kubehero-collector
- servicemonitor/kubehero-control-plane
- prometheusrule/kubehero-chargeback
- 3 ConfigMaps (chargeback / fleet / gpu dashboards)
Troubleshooting discovery
If Prometheus isn't scraping us:

- Confirm the `release:` label in `values.yaml` matches your Prometheus CR's `spec.serviceMonitorSelector.matchLabels.release`. A mismatch means Prometheus ignores our ServiceMonitor.
- Or disable label-based selection in your Prometheus:

```yaml
# kube-prometheus-stack values
prometheus:
  prometheusSpec:
    serviceMonitorSelectorNilUsesHelmValues: false
    ruleSelectorNilUsesHelmValues: false
```

- Check the Prometheus web UI → Status → Targets — kubehero-collector should be UP.
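To see which label your Prometheus CR actually selects on, inspect its spec; with kube-prometheus-stack defaults it looks roughly like this (namespace and trimming assumed):

```yaml
# Trimmed output of: kubectl -n monitoring get prometheus -o yaml
spec:
  serviceMonitorSelector:
    matchLabels:
      release: kube-prometheus-stack   # must match our prometheus.release value
```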
Remote-write to Grafana Cloud
If your metrics ultimately land in Grafana Cloud rather than a self-hosted Grafana:
```yaml
# kube-prometheus-stack values.yaml
prometheus:
  prometheusSpec:
    remoteWrite:
      - url: https://prometheus-blocks-prod-us-central1.grafana.net/api/prom/push
        basicAuth:
          username: { name: grafana-cloud-creds, key: username }
          password: { name: grafana-cloud-creds, key: password }
```
Our recording rules evaluate in your self-hosted Prometheus and arrive at Grafana Cloud pre-aggregated, so dashboard queries are fast even on the hosted side.
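If hosted ingest cost matters, `remoteWrite` entries also accept `writeRelabelConfigs`, so you could forward only the pre-aggregated series. A sketch, assuming the `kubehero:` recording-rule prefix covers everything your hosted dashboards query:

```yaml
remoteWrite:
  - url: https://prometheus-blocks-prod-us-central1.grafana.net/api/prom/push
    writeRelabelConfigs:
      # Keep only KubeHero recording-rule output; drop the raw series.
      - sourceLabels: [__name__]
        regex: "kubehero:.*"
        action: keep
```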
Shipped dashboards
Chargeback by team
Five panels: hourly rate time series per team, 30-day projected spend bar gauge, nodepool breakdown, GPU idle cost per team, top-20 workloads table.
Fleet
Headline stats: hourly burn, recoverable / month, per-cluster time series.
GPU panel
Utilization heatmap + per-GPU idle-hour cost ranking.
Disable any of them:
```yaml
grafana:
  dashboards:
    chargeback: true
    fleet: true
    gpu: false
```
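For reference, each shipped dashboard is just a ConfigMap the Grafana sidecar watches for the configured label. A trimmed sketch — the ConfigMap name, the folder annotation key, and the JSON body are illustrative, and whether folders use an annotation at all depends on your sidecar configuration:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubehero-dashboard-fleet       # hypothetical name
  namespace: monitoring
  labels:
    grafana_dashboard: "1"             # sidecarLabel = sidecarLabelValue
  annotations:
    grafana_folder: "KubeHero"         # folder key varies by sidecar setup
data:
  fleet.json: |
    {"title": "Fleet", "panels": []}
```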
Custom PromQL for your own dashboards
```promql
# Monthly projected spend per team
kubehero:team_cost_usd:rate30d

# Rank workloads by hourly recoverable $
topk(20, sum(rate(kubehero_pod_recoverable_usd_per_second[5m])) by (namespace, pod)) * 3600

# CVEs priced — workload cost × severity (requires Posture integration)
kubehero_pod_cost_usd_per_second * on (pod, namespace) group_left
  max(kubehero_workload_cve_severity_score) by (pod, namespace)
```
See Metrics reference for the complete label set.