Metrics (Prometheus)
Pleiades exposes Prometheus-compatible metrics at a configurable HTTP endpoint when enabled. All metrics include constant labels cluster and node when these are set in the config.
Enable endpoint
- Metrics are available at http://<listenAddr>:<port>/metrics.
Metric catalog (namespace: gslbd)
- dns
  - gslbd_dns_requests_total
- gitops
  - gslbd_gitops_fetch_total{result}
  - gslbd_gitops_verify_total{result}
  - gslbd_gitops_apply_total{result}
  - gslbd_gitops_last_apply_info{sha,signer}: value is 1 for the last applied commit
- state (NATS/JetStream)
  - gslbd_state_nats_connected (0/1)
  - gslbd_state_nats_published_total{type}
  - gslbd_state_nats_received_total{type}
  - gslbd_state_kv_put_total{bucket,result}
  - gslbd_state_kv_get_total{bucket,result}
  - gslbd_state_merge_lag_ms (histogram)
  - gslbd_state_active_members (gauge)
- health
  - gslbd_health_endpoints_total{family}
  - gslbd_health_endpoints_healthy{family}
- dnssec (emitted only when dnssec.enabled: true)
  - gslbd_dnssec_sign_latency_seconds (histogram): per-RRset signing latency; target p99 < 1 ms
  - gslbd_dnssec_key_days_remaining{key} (gauge): days until KSK/ZSK expiry; label value is ksk or zsk
  - gslbd_dnssec_response_bytes (histogram): signed response size in bytes; responses > 1232 bytes trigger TC=1 truncation
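To make the catalog notation concrete, the sketch below renders one sample line in the Prometheus text exposition format, using gslbd_gitops_last_apply_info as an "info"-style series whose value is always 1 and whose labels carry the data. The helper formatSample and all label values (including the cluster and node constant labels) are hypothetical; this is not the project's own encoder, which presumably relies on a Prometheus client library.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// escaper applies the label-value escaping used by the text format:
// backslash, double quote, and newline.
var escaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// formatSample renders one sample line, sorting label names so the
// output is stable regardless of map iteration order.
func formatSample(name string, labels map[string]string, value float64) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+`="`+escaper.Replace(labels[k])+`"`)
	}
	v := strconv.FormatFloat(value, 'g', -1, 64)
	if len(pairs) == 0 {
		return name + " " + v
	}
	return name + "{" + strings.Join(pairs, ",") + "} " + v
}

func main() {
	// Hypothetical values: the commit SHA, signer, cluster, and node
	// names below are invented for illustration.
	fmt.Println(formatSample("gslbd_gitops_last_apply_info", map[string]string{
		"cluster": "c1",
		"node":    "n1",
		"sha":     "abc123",
		"signer":  "alice",
	}, 1))
}
```

Because the value never changes, dashboards typically read the sha and signer labels from this series rather than its number.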
Scrape example
scrape_configs:
  - job_name: 'gslbd'
    scrape_interval: 15s
    static_configs:
      - targets: ['gslbd-hostname:9090']
Sample alerts
groups:
  - name: gslbd
    rules:
      - alert: GslbdNATSDisconnected
        expr: gslbd_state_nats_connected == 0
        for: 2m
      - alert: GslbdMergeLagHigh
        expr: histogram_quantile(0.95, sum(rate(gslbd_state_merge_lag_ms_bucket[5m])) by (le)) > 2000
        for: 5m
      - alert: GslbdActiveMembersZero
        expr: gslbd_state_active_members == 0
        for: 5m
      - alert: GslbdDNSSECKeyExpiryWarning
        expr: gslbd_dnssec_key_days_remaining < 30
        for: 1h
        labels:
          severity: warning
      - alert: GslbdDNSSECKeyExpiryCritical
        expr: gslbd_dnssec_key_days_remaining < 7
        for: 1h
        labels:
          severity: critical
Implementation references
- internal/metrics/* for registerers, collectors, and HTTP server.
- Metrics are registered with const labels via metrics.InitLabels(clusterID, nodeID) in main.