Cluster Alerts
Use Alerts inside Clusters -> Observability to manage cluster-scoped metric alerts.
The Alerts workspace stores alert definitions in Edka and evaluates them from the cluster metrics store. Firing alerts feed into the same global Notifications system as built-in cluster, Kubernetes, add-on, and storage checks.
Requirements
Custom alerts require VictoriaMetrics on the cluster. If VictoriaMetrics is not installed, you can still open the Alerts tab, but rules wait for the metrics store before they can evaluate.
Edka Agent evaluates saved rules inside the cluster and reports the latest series state back to Edka.
Some alert packs have extra metric requirements:
- deployment, cluster health, and storage packs use kube-state-metrics, cAdvisor, or kubelet metrics
- PostgreSQL packs use CloudNativePG metrics
- NAT Gateway packs use Edka NAT Gateway metrics
- host infrastructure packs use Node Exporter metrics
Alert rules
Open Alerts -> Rules to create, inspect, edit, or delete custom alert rules.
Each rule includes:
- PromQL query
- comparator and threshold
- `for` duration
- evaluation interval
- severity: `critical`, `warning`, or `info`
- labels and annotations
- optional routing labels such as `team` or `owner`
Rules can be enabled or disabled without deleting them. Disabled rules remain saved but are not evaluated.
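As a rough illustration of how the fields above fit together, the sketch below models a rule as a plain Python dataclass. This is not Edka's actual schema or API: the class name, field names, the set of comparators, and the example rule are all hypothetical. The query uses the standard kubelet volume metrics only as an example.

```python
from dataclasses import dataclass, field

# Hypothetical model of the rule fields listed above; not Edka's real schema.
SEVERITIES = ("critical", "warning", "info")


@dataclass
class AlertRule:
    name: str
    query: str              # PromQL expression
    comparator: str         # assumed values: ">", ">=", "<", "<=", "==", "!="
    threshold: float
    for_seconds: int        # the "for" duration
    interval_seconds: int   # evaluation interval
    severity: str = "warning"
    labels: dict = field(default_factory=dict)       # includes routing labels
    annotations: dict = field(default_factory=dict)
    enabled: bool = True

    def __post_init__(self):
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")


def rules_to_evaluate(rules):
    """Disabled rules stay saved but are skipped during evaluation."""
    return [r for r in rules if r.enabled]


# Example rule: PVC usage above 85% for five minutes, routed to a team.
rule = AlertRule(
    name="pvc-usage-high",
    query="kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes",
    comparator=">",
    threshold=0.85,
    for_seconds=300,
    interval_seconds=60,
    severity="warning",
    labels={"team": "platform"},
)
```

Setting `enabled=False` on a rule in this sketch keeps it in the saved list while `rules_to_evaluate` leaves it out, which mirrors the enable/disable behavior described above.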
Test before saving
Use Test rule from the editor to run the PromQL query and threshold against VictoriaMetrics before saving.
The test result shows:
- how many series were returned
- how many series matched the comparator and threshold
- the labels, values, and per-series match status for the returned series (up to the first several)
This is useful for checking metric names, label filters, and threshold units before the rule starts producing notifications.
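The shape of a test result can be sketched as follows. This is a minimal illustration under stated assumptions, not Edka's implementation: `preview_rule`, its parameters, the comparator set, and the preview limit of 5 are hypothetical, and the series are passed in directly instead of being queried from VictoriaMetrics.

```python
import operator

# Assumed comparator set; the real editor may offer different options.
COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}


def preview_rule(series, comparator, threshold, preview_limit=5):
    """Summarize how query results compare against a threshold.

    series: list of (labels, value) pairs, as an instant query might return.
    preview_limit: hypothetical cap on how many series are shown in detail.
    """
    compare = COMPARATORS[comparator]
    results = [
        {"labels": labels, "value": value, "matched": compare(value, threshold)}
        for labels, value in series
    ]
    return {
        "returned": len(results),
        "matched": sum(1 for r in results if r["matched"]),
        "preview": results[:preview_limit],
    }


# Two PVC usage ratios in the range 0..1 (made-up values).
series = [({"pvc": "data-0"}, 0.91), ({"pvc": "data-1"}, 0.40)]
ratio_result = preview_rule(series, ">", 0.85)    # one series matches
percent_result = preview_rule(series, ">", 85)    # unit mismatch: none match
```

The second call shows the kind of threshold-unit mistake a test run is meant to catch: the query returns ratios, so a threshold written as a percentage never matches.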
Alert states
The Rules view summarizes current evaluation state:
- OK: enabled rules with no matching alert series
- Pending: rules with matching series that have not reached their `for` duration yet
- Firing: rules whose matching series reached the configured `for` duration
- Disabled: saved rules excluded from evaluation
- Not evaluated: rules that have not reported a state yet
Rule rows also show the last evaluation time, matching series count, and any evaluation error reported by the evaluator.
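The states above can be read as a simple decision order. The sketch below is one plausible way to derive them, not the evaluator's actual logic: the function name and parameters are hypothetical, and it assumes a rule counts as Firing as soon as any one matching series has reached the `for` duration.

```python
def rule_state(enabled, last_evaluated_at, matching_seconds, for_seconds):
    """Derive a display state for one rule.

    enabled: whether the rule is enabled.
    last_evaluated_at: timestamp of the last reported evaluation, or None.
    matching_seconds: for each currently matching series, how long it has
        matched so far (empty list if nothing matches).
    for_seconds: the rule's "for" duration.
    """
    if not enabled:
        return "Disabled"
    if last_evaluated_at is None:
        return "Not evaluated"
    if not matching_seconds:
        return "OK"
    # Assumption: one series past the "for" duration is enough to fire.
    if any(seconds >= for_seconds for seconds in matching_seconds):
        return "Firing"
    return "Pending"
```

For example, a rule with a 300 second `for` duration and one series that has matched for 120 seconds would be Pending in this sketch, and would move to Firing once that series has matched for 300 seconds.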
Alert packs
Open Alerts -> Packs to install predefined rules as editable cluster alert rules.
Current packs include:
- Deployments: rollout, availability, and pod restart alerts
- Postgres: connection pressure, replication lag, blocked backends, and long transactions for CloudNativePG clusters
- Storage: high and critical persistent volume claim usage
- Cluster Health: node readiness, pod phase, and container resource alerts
- NAT Gateway: availability, failover, route, egress, security, and conntrack alerts
- Host Infrastructure: host root filesystem usage, inode usage, and sustained CPU alerts based on Node Exporter metrics
Pack rules are installed as normal custom rules. You can edit or delete them after installation.
If a pack is already partially installed, Edka can install the missing rules. Rules with name conflicts are skipped instead of overwriting unrelated custom rules.
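The partial-install behavior can be sketched as a planning step that sorts each pack rule into one of three buckets. This is an illustration only: the function, the pack and rule names, and the idea that each saved rule records which pack it came from are assumptions, not documented Edka behavior.

```python
def plan_pack_install(pack_id, pack_rule_names, existing):
    """Decide what to do with each rule in a pack.

    existing: maps saved rule name -> originating pack id, or None for a
        rule created by hand (hypothetical bookkeeping).
    """
    plan = {"install": [], "already_installed": [], "skipped_conflict": []}
    for name in pack_rule_names:
        if name not in existing:
            plan["install"].append(name)
        elif existing[name] == pack_id:
            plan["already_installed"].append(name)
        else:
            # Same name, different origin: leave the existing rule untouched.
            plan["skipped_conflict"].append(name)
    return plan


# Hypothetical cluster state: one storage rule already installed, plus an
# unrelated custom rule that happens to share a pack rule's name.
existing = {"pvc-usage-high": "storage", "pvc-usage-critical": None}
plan = plan_pack_install(
    "storage",
    ["pvc-usage-high", "pvc-usage-critical", "pvc-inodes-high"],
    existing,
)
```

Here only `pvc-inodes-high` would be installed. The hand-written `pvc-usage-critical` rule is skipped, not overwritten.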
Node Exporter and host alerts
The host infrastructure pack depends on the Node Exporter add-on and VictoriaMetrics.
When node monitoring is enabled for VictoriaMetrics, Edka installs the Node Exporter add-on and configures VictoriaMetrics to scrape host metrics. Once both Node Exporter and VictoriaMetrics are installed, Edka can install the host infrastructure alert pack automatically.
Host alerts cover node-level pressure that Kubernetes object state alone cannot show, such as root filesystem usage, inode usage, and sustained CPU pressure.
Notifications
Firing custom alert rules appear in global notifications with the severity and labels from the rule. When the evaluator later reports that a rule no longer matches, the notification occurrence resolves and remains available in notification history until retention expires.
Read/unread state is separate from alert resolution.
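One way to picture the separation between read state and resolution is to model them as two independent fields. The sketch below is hypothetical: the class, its fields, and the choice to measure retention from the resolution time are assumptions made for illustration, not Edka's data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class NotificationOccurrence:
    rule_name: str
    severity: str
    labels: dict = field(default_factory=dict)
    read: bool = False                   # toggled by the user
    resolved_at: Optional[float] = None  # set when the rule stops matching

    def resolve(self, now):
        """Mark the occurrence resolved; read state is left unchanged."""
        self.resolved_at = now

    def mark_read(self):
        """Mark the occurrence read; resolution is left unchanged."""
        self.read = True


def in_history(occurrence, now, retention_seconds):
    """Assumption: retention is counted from the moment of resolution."""
    if occurrence.resolved_at is None:
        return True
    return now - occurrence.resolved_at < retention_seconds


occurrence = NotificationOccurrence("pvc-usage-high", "warning", {"team": "platform"})
occurrence.resolve(now=1000.0)  # resolved, but still unread
```

In this model a resolved occurrence can remain unread, and a read occurrence can still be firing.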