Kubernetes CPU Metrics in the kubeletstats Receiver: Transition from .cpu.utilization to .cpu.usage

The OpenTelemetry Collector’s kubeletstats receiver is a crucial component for collecting Kubernetes node, pod and container metrics. To improve metric accuracy and adhere to OpenTelemetry semantic conventions, we are updating how CPU metrics are named and emitted.

This blog post explains the motivation behind this change, its impact on users, the role of the feature gate introduced for it, and how to migrate.

Why This Change?

Historically, the kubeletstats receiver emitted CPU metrics named with a .cpu.utilization suffix, such as:

  • k8s.node.cpu.utilization
  • k8s.pod.cpu.utilization
  • container.cpu.utilization

These metrics actually report absolute CPU usage in cores. They are derived from the Kubernetes Kubelet’s UsageNanoCores field, which measures CPU usage in nanocores rather than as a share of any capacity.

The term utilization generally refers to a relative metric, typically expressed as a ratio or percentage of used CPU against total CPU capacity or limits. Using .cpu.utilization for absolute usage values conflicts with the OpenTelemetry semantic conventions and can confuse users and tooling that expect utilization metrics to be relative.
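The difference between the two quantities can be made concrete with a small sketch. It is illustrative only: the function names and the sample reading are hypothetical, and the only fact taken from the text is that the Kubelet value is expressed in nanocores.

```python
NANOCORES_PER_CORE = 1_000_000_000


def usage_cores(usage_nano_cores: int) -> float:
    """Absolute CPU usage in cores, from a UsageNanoCores-style reading."""
    return usage_nano_cores / NANOCORES_PER_CORE


def utilization(usage_nano_cores: int, limit_cores: float) -> float:
    """Relative utilization: usage as a fraction of a limit or capacity in cores."""
    if limit_cores <= 0:
        raise ValueError("limit_cores must be positive")
    return usage_cores(usage_nano_cores) / limit_cores


# Hypothetical reading: 250 million nanocores on a container limited to 2 cores.
sample = 250_000_000
print(usage_cores(sample))       # 0.25 (cores, an absolute value)
print(utilization(sample, 2.0))  # 0.125 (a ratio of the limit)
```

The first value is what the legacy .cpu.utilization metrics actually carried. The second is what the name suggests.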

What Is Changing?

To address this semantic mismatch, we introduced new .cpu.usage metrics that correctly represent raw CPU usage values:

  • k8s.node.cpu.usage
  • k8s.pod.cpu.usage
  • container.cpu.usage

At the same time, the legacy .cpu.utilization metrics have been marked for deprecation.

Feature Gate: receiver.kubeletstats.enableCPUUsageMetrics

Note that the .cpu.utilization metrics have so far been enabled by default. To manage the transition smoothly, a feature gate named receiver.kubeletstats.enableCPUUsageMetrics was introduced:

  • Alpha (v0.111.0): The feature gate was introduced but disabled by default. Users had to enable it explicitly to receive .cpu.usage metrics in place of .cpu.utilization.
  • Beta (v0.125.0): The feature gate was promoted to beta and enabled by default. In this state:
    • The .cpu.usage metrics are emitted by default.
    • Attempts to enable .cpu.utilization metrics will fail.
    • Users can explicitly disable the feature gate to temporarily restore the deprecated .cpu.utilization metrics if needed.
  • Stable (Upcoming): The feature gate will remain in beta for several releases to allow the community ample time to adapt. The plan and discussion for moving it to stable is tracked in Issue #39650.
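As a rough sketch of how the gate can be toggled, the helper below builds the Collector's --feature-gates command-line argument, where a leading "-" disables a gate and no prefix (or "+") enables it. The helper name is hypothetical, and how the argument reaches your Collector binary (container args, a Helm value, a systemd unit) depends on your deployment.

```python
GATE = "receiver.kubeletstats.enableCPUUsageMetrics"


def feature_gate_arg(gate: str, enabled: bool) -> str:
    """Build a --feature-gates argument; a '-' prefix disables the gate."""
    prefix = "" if enabled else "-"
    return f"--feature-gates={prefix}{gate}"


# Temporarily restore the deprecated .cpu.utilization metrics on v0.125.0+.
print(feature_gate_arg(GATE, enabled=False))

# Opt in to the .cpu.usage metrics on versions where the gate is still alpha.
print(feature_gate_arg(GATE, enabled=True))
```

Disabling the gate is a stopgap for the beta period, not a long-term setting, since the legacy metrics are scheduled for removal.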

What Does This Mean for You?

Impact on Existing Users

  • If you upgrade to v0.125.0 or later, the Collector will emit .cpu.usage metrics by default.
  • Any monitoring dashboards, alerting rules, or queries relying on .cpu.utilization metrics will break or not function as expected.
  • The deprecated .cpu.utilization metrics are planned for eventual removal, so updating is necessary for long-term compatibility.

Recommended Actions

  1. Audit your observability pipelines for references to .cpu.utilization metrics.
  2. Update dashboards, alerts, and queries to use the new .cpu.usage metrics.
  3. Test the new metrics by enabling the feature gate in staging (or rely on the default enabled state in v0.125.0+).
  4. Plan your migration timeline considering that .cpu.utilization will be removed in future releases.
  5. Stay engaged with the OpenTelemetry community via GitHub issues and PR discussions.
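
The audit step can be partly automated. The sketch below scans text such as a dashboard export or an alert rule file for the legacy names and suggests replacements. It assumes the legacy metrics are k8s.node.cpu.utilization, k8s.pod.cpu.utilization and container.cpu.utilization, and that some backends rewrite dots as underscores. The function name and the sample input are hypothetical.

```python
import re

# Matches the dotted names and the underscore form some backends produce.
# Limit-based metrics such as k8s.pod.cpu_limit_utilization are not matched.
LEGACY = re.compile(r"\b(?:k8s[._]node|k8s[._]pod|container)[._]cpu[._]utilization")


def find_legacy_references(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, legacy name, suggested name) for each match."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in LEGACY.finditer(line):
            old = match.group(0)
            new = old[: -len("utilization")] + "usage"
            hits.append((lineno, old, new))
    return hits


# Hypothetical snippet from a dashboard or alert definition.
sample = """\
expr: avg(k8s_pod_cpu_utilization) > 0.8
metric: container.cpu.utilization
metric: k8s.pod.cpu_limit_utilization
"""

for lineno, old, new in find_legacy_references(sample):
    print(f"line {lineno}: {old} -> {new}")
```

Treat the output as a list of places to review rather than as a safe search-and-replace. Thresholds written for a ratio may need rethinking once the query reads an absolute value in cores.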

Why Keep the Feature Gate in Beta?

The decision to keep the feature gate in beta for multiple releases is driven by:

  • The critical nature of kubeletstats CPU metrics for many production observability pipelines.
  • The need to allow users and vendors ample time to adapt and update their tooling.
  • The opportunity to gather feedback and address any unexpected issues before the change becomes permanent.

This approach minimizes disruption and helps ensure a smooth transition for everyone.

Final Thoughts

The transition from .cpu.utilization to .cpu.usage metrics in the kubeletstats receiver is an important step to ensure that Kubernetes metrics conform to semantic best practices. We appreciate the community’s patience and collaboration as we make these improvements.

If you have questions, want to share feedback, or need help migrating, please join us on the CNCF Slack.

Thank you for helping us build clearer, more reliable Kubernetes observability!