
Cluster Diagnostics

Use Diagnostics to investigate live Kubernetes-level issues inside a cluster. This page is separate from Cluster Settings → Events:

  • Diagnostics focuses on cluster health, Kubernetes warning events, K3s control plane signals, and problematic pods.
  • Events focuses on Edka cluster operation telemetry such as provisioning, updates, and deletions.

The Diagnostics page combines several operational views:

  • Warning Summary: current warnings derived from node readiness, pod health, Kubernetes warning events, and K3s metrics.
  • Kubernetes Events: cluster warning events for fast triage.
  • K3s Signals: control plane indicators such as certificate expiry, snapshot health, etcd state, and API pressure.
  • Pods Requiring Attention: pods with concrete issues such as ImagePullBackOff, ErrImagePull, CrashLoopBackOff, or other container failures.

The summary card is the fastest way to see whether a cluster needs attention.

Warnings can be raised from:

  • nodes that are not Ready
  • pods in failing or degraded states
  • warning events reported by Kubernetes
  • K3s control plane signals

Warnings also appear in the Overview page inside Capacity & Health so you can jump into Diagnostics directly from the cluster summary.
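As a sketch of the first warning source, node readiness can be derived from the conditions list on Kubernetes Node objects. This is illustrative only, not Edka's implementation; the dict shape mirrors the Kubernetes API, and the helper name is made up.

```python
# Illustrative only: derive "node not Ready" warnings from the
# conditions list of Kubernetes Node objects.

def not_ready_nodes(nodes):
    """Return names of nodes whose Ready condition is not True."""
    flagged = []
    for node in nodes:
        conditions = node.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        if ready is None or ready.get("status") != "True":
            flagged.append(node["metadata"]["name"])
    return flagged
```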

The Kubernetes Events panel is the cluster-level event console for troubleshooting runtime issues.

You can:

  • filter to warnings only
  • switch scope by namespace
  • refresh the feed on demand

Typical issues you can catch here include:

  • image pull failures
  • certificate or challenge failures
  • scheduling problems
  • repeated backoff events
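The warnings-only and per-namespace filters above amount to a simple pass over the raw event list. A minimal sketch, assuming events shaped like Kubernetes Event objects (the function name is illustrative):

```python
# Illustrative only: filter Kubernetes events down to warnings,
# optionally scoped to a single namespace, as the panel does.

def warning_events(events, namespace=None):
    """Keep events with type == "Warning", optionally in one namespace."""
    out = []
    for ev in events:
        if ev.get("type") != "Warning":
            continue
        if namespace and ev["metadata"]["namespace"] != namespace:
            continue
        out.append(ev)
    return out
```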

This is especially useful when the pod phase is too generic on its own. For example, a pod can still report the generic Kubernetes phase Pending while Diagnostics surfaces the underlying issue, such as ImagePullBackOff.

K3s Signals highlights control plane health indicators without requiring you to inspect node-by-node metrics manually.

It includes:

  • certificate expiry windows
  • etcd snapshot success versus failure
  • etcd leader visibility
  • in-flight API requests and 5xx trends
  • etcd request latency summaries

If metrics are unavailable, Diagnostics shows that state explicitly.
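As an example of one such indicator, a certificate expiry window reduces to the days remaining before the certificate's not-after timestamp. This is a sketch of the idea, not how Edka computes it:

```python
# Illustrative only: days remaining until a certificate's
# not-after timestamp, the basis of an expiry-window warning.
from datetime import datetime, timezone

def days_until_expiry(not_after, now=None):
    """not_after is an RFC 3339 timestamp such as "2026-01-01T00:00:00Z"."""
    expiry = datetime.fromisoformat(not_after.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return (expiry - now).days
```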

This section groups pods by concrete failure signals instead of only showing the raw pod phase.

Examples include:

  • CrashLoopBackOff
  • ImagePullBackOff
  • ErrImagePull
  • terminated containers with errors
  • pods that remain pending because they cannot schedule

For each problematic pod, Diagnostics shows the pod message, restart counts, and container-level reasons when available.
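The grouping described above can be sketched as a pass over each pod's container statuses. The field shapes mirror the Kubernetes API; the function name and the exact bucketing are illustrative, not Edka's implementation:

```python
# Illustrative only: group pods by concrete container failure
# reason, keeping restart counts and messages where available.

def pods_requiring_attention(pods):
    """Map failure reason -> list of (pod name, restart count, message)."""
    groups = {}
    for pod in pods:
        for cs in pod.get("status", {}).get("containerStatuses", []):
            state = cs.get("state", {})
            waiting = state.get("waiting")
            terminated = state.get("terminated")
            reason = message = None
            if waiting and waiting.get("reason") in (
                "CrashLoopBackOff", "ImagePullBackOff", "ErrImagePull",
            ):
                reason, message = waiting["reason"], waiting.get("message")
            elif terminated and terminated.get("exitCode", 0) != 0:
                reason = terminated.get("reason", "Error")
                message = terminated.get("message")
            if reason:
                groups.setdefault(reason, []).append(
                    (pod["metadata"]["name"], cs.get("restartCount", 0), message)
                )
    return groups
```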

A few operational notes:

  • Diagnostics is available as a top-level cluster section.
  • The page is intended for users with cluster visibility who need fast debugging context.
  • Kubernetes-backed diagnostics become available once cluster Kubernetes access is ready.
  • Snapshot-based sections can be refreshed manually.

Use Diagnostics when you want to answer questions like:

  • Why is this pod failing?
  • Why are there warnings in the cluster?
  • Are K3s control plane signals healthy?
  • Which namespace is producing warning events?

Use Cluster Settings → Events when you want to answer questions like:

  • What happened during provisioning?
  • Did a cluster update fail?
  • Which lifecycle operation produced this error?
  • What progress events were emitted during deletion or upgrade?