KubeHero docs

Stack

Every dependency, every integration, every OSS project we ride instead of rebuild.

KubeHero's thesis: don't rebuild what OSS already nailed. Our chart ships our own services; everything else is either consumed (Prometheus, Grafana, DCGM) or installed alongside us via stack-install.sh.

Storage

| Role | Choice | Why this and not others |
| --- | --- | --- |
| Time-series | ClickHouse (Altinity operator) | Columnar, billion-point compression. Same engine Cloudflare / PostHog / SigNoz / Grafana Cloud run on. Postgres can't keep up at our event rate. |
| Metadata + audit | PostgreSQL via CloudNativePG | CNCF sandbox. Operator-managed, backed up to S3 automatically. Beats Bitnami's chart for production. |
| Cache + rate-limit | Valkey | After Redis's 2024 relicensing, Valkey is the OSS answer. DragonflyDB advertises up to 25× Redis throughput but is newer; Valkey is the safer choice for v0. |
| Cold archive | S3 / Azure Blob / GCS via Parquet | Cheap forever-storage of detailed pod-seconds. DuckDB or ClickHouse queries them on demand. |

If you'd rather not self-manage these, each can be pointed at a managed equivalent (e.g. Neon for Postgres, ClickHouse Cloud, Upstash for Valkey) via Helm values.

Auth

Dex (CNCF sandbox) is the OIDC proxy. Connectors to Okta, Azure AD, Google Workspace, GitHub, GitLab, and LDAP ship in the standard distribution. We never see passwords. See Integrations · Identity.

Observability

We ride kube-prometheus-stack; we don't fight it.

  • Prometheus — scrapes our /metrics, runs our PrometheusRule
  • Grafana — our 3 ConfigMap dashboards auto-load via the sidecar
  • Alertmanager — routes chargeback alerts to Slack / PagerDuty / OpsGenie
  • Optional: Loki (logs), Tempo (traces), Pyroscope (continuous profiling → flame graphs in workload drill-in)

Security (Posture view sources)

| Tool | Role | License / governance |
| --- | --- | --- |
| Trivy Operator | CVE + misconfig scans on running workloads | Apache 2 · CNCF-adjacent |
| Falco | Runtime anomaly detection | CNCF graduated |
| Tetragon | eBPF-based runtime security | CNCF sandbox · Isovalent |
| Microsoft Defender for Cloud / AWS Inspector v2 / GCP SCC | Cloud posture + findings | vendor APIs |
| Pod Security Standards | Built-in admission baseline | upstream K8s |

We correlate findings against workload cost, so an $18k/mo workload with an unpatched critical CVE ranks higher than either fact alone would rank it.
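One way to picture this correlation is a score that multiplies monthly cost by a severity weight. The sketch below is illustrative only: the `Workload` type, the weights in `severityWeight`, and the `Score` and `RankByRisk` functions are hypothetical names and values, not KubeHero's actual scoring.

```go
package main

import "sort"

// Workload is a hypothetical record joining cost data with the worst open
// finding reported for that workload.
type Workload struct {
	Name           string
	MonthlyCostUSD float64
	WorstSeverity  string // "", "LOW", "MEDIUM", "HIGH", or "CRITICAL"
}

// severityWeight holds assumed multipliers; real weights would need tuning.
var severityWeight = map[string]float64{
	"":         1,
	"LOW":      1.5,
	"MEDIUM":   2,
	"HIGH":     4,
	"CRITICAL": 8,
}

// Score multiplies cost by the severity weight, so cost and risk compound.
// Unknown severities fall back to a weight of 1.
func Score(w Workload) float64 {
	weight, ok := severityWeight[w.WorstSeverity]
	if !ok {
		weight = 1
	}
	return w.MonthlyCostUSD * weight
}

// RankByRisk returns a copy of ws sorted by descending score.
// The input slice is left untouched.
func RankByRisk(ws []Workload) []Workload {
	out := make([]Workload, len(ws))
	copy(out, ws)
	sort.SliceStable(out, func(i, j int) bool {
		return Score(out[i]) > Score(out[j])
	})
	return out
}
```

With these assumed weights, an $18k/mo workload carrying a critical finding scores 144,000, ahead of a clean $18k/mo workload (18,000) and a $2k/mo workload with the same critical finding (16,000).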

Secrets

External Secrets Operator — bridges AWS Secrets Manager / Azure Key Vault / GCP Secret Manager / HashiCorp Vault → Kubernetes Secrets. Most mature clusters already run it.

Certs

cert-manager — weekly mTLS rotation for agent ↔ control plane. We don't ship our own PKI.

Per-cloud integrations

| Cloud | Auth | Pricing | Security | Autoscaler signal |
| --- | --- | --- | --- | --- |
| AWS | IRSA | Pricing API + Savings Plans + Spot | Inspector v2 + GuardDuty + Security Hub | Karpenter, Cluster Autoscaler |
| GCP | Workload Identity | Cloud Billing → BigQuery + CUD recommender | Security Command Center | GKE Autoscaler |
| Azure | Workload Identity | Cost Management + Retail Prices + RIs/SPs | Defender for Cloud | AKS Autoscaler |

Each cloud is a drop-in adapter behind a single Go interface — adding Oracle / IBM / Alibaba later is a new file.

Autoscaler signals (read-only)

We read signals from whichever autoscaler is already running; we never replace it.

  • Karpenter (AWS, expanding to Azure)
  • Cluster Autoscaler (all clouds, older)
  • KEDA (event-driven autoscaling)
  • VPA (Vertical Pod Autoscaler — we sanity-check our rightsizing against its recommendations)
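The VPA sanity check in the list above could be as simple as comparing two recommendations within a tolerance. This is a minimal sketch under stated assumptions: the `Recommendation` type, the `AgreesWithVPA` function, and the relative-difference rule are hypothetical, and it does not call the VPA API.

```go
package main

import "math"

// Recommendation is a hypothetical pair of resource requests.
// Values are assumed to be non-negative.
type Recommendation struct {
	CPUMillicores int64
	MemoryMiB     int64
}

// relDiff returns |a-b| / max(a, b), or 0 when both values are zero.
func relDiff(a, b int64) float64 {
	hi := math.Max(float64(a), float64(b))
	if hi == 0 {
		return 0
	}
	return math.Abs(float64(a)-float64(b)) / hi
}

// AgreesWithVPA reports whether our rightsizing recommendation is within
// the given relative tolerance of VPA's for both CPU and memory.
func AgreesWithVPA(ours, vpa Recommendation, tolerance float64) bool {
	return relDiff(ours.CPUMillicores, vpa.CPUMillicores) <= tolerance &&
		relDiff(ours.MemoryMiB, vpa.MemoryMiB) <= tolerance
}
```

A disagreement would not block anything, since the autoscaler signals are read-only; it would only flag the recommendation for a closer look.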

What we deliberately DO NOT adopt

  • OpenCost / Kubecost allocation engine — their accuracy ceiling is our baseline. We offer an importer for their labels if a customer wants continuity.
  • Kyverno / Gatekeeper — admission-level. Our CRDs are resource-level. Orthogonal concerns.
  • Temporal — heavy. Add when we need durable long-running workflows, not before.

Install it all

# interactive — prompts for each dep
./infra/demo/stack-install.sh

# non-interactive full stack
./infra/demo/stack-install.sh --all

# just kubehero + kube-prometheus-stack (rest must already be present)
./infra/demo/stack-install.sh --core-only

Every block in values.yaml has embedded: false + external: { ... } — point at your existing deployment, or flip embedded to true and install via the script above.
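To make the embedded-versus-external switch concrete, here is a sketch of how a service might pick its endpoint from one such block. Only `embedded` and `external` come from the text above; the `Host` and `Port` fields, the `Dependency` and `External` types, and the `Endpoint` method are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// External holds connection details for an existing deployment.
// Host and Port are assumed field names.
type External struct {
	Host string
	Port int
}

// Dependency mirrors one values.yaml block with its embedded flag and
// external settings.
type Dependency struct {
	Embedded bool
	External External
}

// Endpoint returns the address to dial. With embedded set, it uses the
// in-cluster address passed by the caller; otherwise it requires an
// external host to be configured.
func (d Dependency) Endpoint(embeddedAddr string) (string, error) {
	if d.Embedded {
		return embeddedAddr, nil
	}
	if d.External.Host == "" {
		return "", errors.New("embedded is false but external.host is empty")
	}
	return fmt.Sprintf("%s:%d", d.External.Host, d.External.Port), nil
}
```

Failing loudly when neither option is configured avoids a silent fallback to a dependency that was never installed.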