KubeHero docs

Architecture

How the collector, control plane, and operator fit together.

KubeHero is four services plus an open CRD surface. The split is designed so you can run any subset — full stack in one cluster or a federated hub-and-edge topology — without losing functionality.

Services

ServiceLanguageWhere it runsResponsibility
collectorGo + eBPFCustomer cluster (DaemonSet)Per-pod CPU / memory / GPU attribution, cgroup-accurate. Streams compressed telemetry to the control plane.
control-planeGo + Connect-RPCYour cluster (hub)Ingests telemetry, evaluates policies, serves the dashboard and CLI.
pricing-engineGo (CronJob)Your clusterPulls AKS/GKE/EKS pricing daily, normalizes across clouds and lifecycles (on-demand, spot, savings-plan, committed).
operatorGo + kubebuilderCustomer clusterReconciles BudgetPolicy, CeilingPolicy, RightsizingPolicy CRDs. Never runs enforcement without humanArm.

Data plane

  • Telemetry: 1-second ticks from the collector, batched into 5-second gRPC frames, LZ4 compressed. Target overhead: < 0.5% CPU, < 50 MiB RSS per node.
  • Storage: ClickHouse for time-series, PostgreSQL for metadata (users, orgs, policies, audit log).
  • Transport: mTLS end-to-end, via cert-manager with your own CA.

The three CRDs

  • BudgetPolicy — declarative spending intent (ceiling, scope, escalation).
  • CeilingPolicy — burn-rate triggered enforcement that references a BudgetPolicy.
  • RightsizingPolicy — how aggressively to recommend or apply workload right-sizing.

See the CRD reference for every field.

Licensing

Open source. Apache 2.0 for the agent, CLI, collector, cost-model library, and proto schemas. BSL 1.1 for the control plane, operator, pricing engine, and dashboard. Source at github.com/kubehero-io/platform.