FAQ

Frequently asked questions about Nexus GSLB.

General

- What is Nexus GSLB?
  A DNS-based global server load balancer that answers A/AAAA queries using round-robin selection and health awareness.
- Does it support IPv6?
  Yes. Include IPv6 addresses in loadbalancer.endpoints to enable AAAA responses.

Health checks

- What types of health checks are supported?
  tcp, http, icmp, script, and webhook. HTTP checks support TLS, an expected status code, a body substring match, and a custom Host/SNI header. ICMP checks require CAP_NET_RAW. Script checks can store the script body in the database (scriptContent), so it is replicated automatically to all cluster nodes via NATS and needs no filesystem distribution. See Health Checks for full details.
- Are HTTPS certificates verified?
  Yes, by default, when health.http.tls: true is set. Set insecureSkipVerify: true only for lab or testing environments.
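To make the simplest check type concrete, the following is a minimal sketch of what a tcp-style probe does: attempt a connection and report success or failure within a timeout. It is an illustration only, not the gslbd implementation; the function name `tcp_check` and its parameters are hypothetical.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    Hypothetical helper for illustration; not part of Nexus GSLB.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

A refused connection fails immediately, while a silently dropped one takes the full timeout, which is why the probe timeout matters for detection latency.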

Configuration

- Where is the config file?
  The default is /etc/gslb/config.yaml; override it with the -config flag.
- Can I manage config via Git?
  Yes. Enable GitOps by setting gitops.repoURL. Signed commits are required by default.

State synchronization

- How do nodes share health information?
  Via NATS + JetStream, using subjects and a KV store with TTL. Health state changes are published immediately on each transition, so peers converge in under 200 ms. A 30 s catch-up ticker recovers any missed signals. Policies control how local and global health are merged.
- What policy should I use?
  The default, prefer-local, is conservative. Use global-quorum for stronger cross-region consensus.

Failover

- How fast does Nexus GSLB fail over?
  Failover has three independent layers:
  (1) Detection: gslbd stops returning a failed IP after, on average, checkInterval/2 + probe_timeout; with checkInterval: 1s this is ~1.5 s.
  (2) Peer convergence: other cluster nodes learn of the change in under 200 ms via NATS event-driven publish.
  (3) Client cache: clients that already have the IP cached keep using it until their DNS TTL expires (configurable per service, default 60 s).
  For anycast deployments, BGP RHI withdraws routes in the same callback as detection, independent of TTL. See the Failover latency section of the Performance page for config tables and the scripts/measure-failover.sh tool.
- Can Nexus GSLB achieve sub-second failover like NS1 or Akamai?
  Sub-second server-side failover (gslbd stops returning the IP) is achievable with checkInterval: 1s and timeout: 500ms. Sub-second client-side failover additionally requires a very low per-service ttl (e.g. ttl: 1). The sub-second figures marketed by NS1 and Akamai refer to their anycast/BGP architecture, in which the client resolver reaches a nearby PoP and client DNS caching is bypassed entirely. Nexus GSLB achieves equivalent behaviour in anycast deployments via BGP RHI.
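The detection and client-cache layers can be put into numbers with a short worked calculation. This is a sketch under stated assumptions, not a measurement: it assumes the failure occurs at a uniformly random point between two probes and that the failing probe takes `probe_duration` seconds (at most the configured timeout). The function names are hypothetical.

```python
from typing import Tuple


def detection_latency(check_interval: float, probe_duration: float) -> Tuple[float, float]:
    """Return (average, worst_case) seconds until a failed endpoint is detected.

    Assumption: the failure lands uniformly at random within the check
    interval, so the average wait for the next probe is check_interval / 2.
    """
    average = check_interval / 2 + probe_duration
    worst_case = check_interval + probe_duration
    return average, worst_case


def client_failover_bound(check_interval: float, probe_duration: float, ttl: float) -> float:
    """Rough upper bound for a client that cached the IP just before detection.

    Assumption: the client honours the TTL exactly and re-resolves on expiry.
    """
    _, worst_detection = detection_latency(check_interval, probe_duration)
    return worst_detection + ttl


if __name__ == "__main__":
    # checkInterval: 1s with a failing probe that runs for 1 s
    print(detection_latency(1.0, 1.0))            # (1.5, 2.0)
    # Same interval, 500 ms probe, default 60 s TTL
    print(client_failover_bound(1.0, 0.5, 60.0))  # 61.5
```

The calculation shows why lowering checkInterval and the probe timeout only helps the server side: with the default 60 s TTL, the client-side bound is dominated by the cache lifetime.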

Metrics

- How do I expose Prometheus metrics?
  Enable metrics.enablePrometheus: true and scrape http://<host>:9090/metrics (default port 9090).
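For a quick check without a Prometheus server, the endpoint can be fetched and parsed by hand. The sketch below uses only the Python standard library and handles the common cases of the Prometheus text format; the helper names are hypothetical, and no specific Nexus GSLB metric names are assumed.

```python
from typing import Dict
from urllib.request import urlopen


def metrics_url(host: str, port: int = 9090) -> str:
    """Build the scrape URL; 9090 is the default port stated above."""
    return f"http://{host}:{port}/metrics"


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse Prometheus text exposition lines into {series: value}.

    Simplified: skips comments and ignores an optional trailing timestamp.
    """
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a HELP/TYPE comment
        if "}" in line:
            # Labelled series: everything up to the last '}' is the series id.
            head, _, tail = line.rpartition("}")
            series = head + "}"
        else:
            series, _, tail = line.partition(" ")
        fields = tail.split()
        if fields:
            samples[series] = float(fields[0])
    return samples


def fetch_metrics(host: str, port: int = 9090, timeout: float = 2.0) -> Dict[str, float]:
    """Fetch and parse the metrics endpoint of one node."""
    with urlopen(metrics_url(host, port), timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Calling `fetch_metrics("<host>")` against a node with metrics enabled returns a dictionary that can be inspected for health-related series.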

Security

- How are GitOps changes secured?
  Through GPG-signed commit verification, optionally restricted to an allowlist of signer fingerprints.
- How do clients authenticate to NATS?
  Use TLS client certificates or NATS accounts/JWT. See Security Guide.

Operations

- Why are no records returned sometimes?
  Most likely, all endpoints are unhealthy for that address family, or the health policy excludes them. Check health metrics and logs.
- How do I run on port 53 without root?
  Grant the bind capability with setcap 'cap_net_bind_service=+ep' /usr/local/bin/gslbd and run gslbd as a non-root user.
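The empty-answer case described above follows directly from health-aware selection per address family. The sketch below illustrates the idea with a hypothetical `Pool` class: endpoints are filtered by family (A means IPv4, AAAA means IPv6) and by health, then rotated round-robin. It is an assumption about the general technique, not the gslbd code, and the real server may order or limit answers differently.

```python
import ipaddress
from typing import Iterable, List


class Pool:
    """Hypothetical endpoint pool with health-aware round-robin answers."""

    def __init__(self, endpoints: Iterable[str]) -> None:
        self.endpoints = [ipaddress.ip_address(e) for e in endpoints]
        self._next = 0  # rotation counter for round-robin

    def answer(self, qtype: str, healthy: Iterable[str]) -> List[str]:
        if qtype not in ("A", "AAAA"):
            raise ValueError(f"unsupported query type: {qtype}")
        version = 4 if qtype == "A" else 6
        # Normalize the healthy set so textual IPv6 variants compare equal.
        healthy_set = {ipaddress.ip_address(h) for h in healthy}
        candidates = [
            ip for ip in self.endpoints
            if ip.version == version and ip in healthy_set
        ]
        if not candidates:
            return []  # no healthy endpoint for this family: empty answer
        start = self._next % len(candidates)
        self._next += 1
        rotated = candidates[start:] + candidates[:start]
        return [str(ip) for ip in rotated]


if __name__ == "__main__":
    pool = Pool(["192.0.2.10", "192.0.2.11", "2001:db8::10"])
    healthy = {"192.0.2.10", "192.0.2.11"}  # the IPv6 endpoint is down
    print(pool.answer("A", healthy))     # ['192.0.2.10', '192.0.2.11']
    print(pool.answer("A", healthy))     # ['192.0.2.11', '192.0.2.10']
    print(pool.answer("AAAA", healthy))  # []
```

In this example the AAAA query returns nothing even though the service is up over IPv4, which is the situation to look for in health metrics and logs.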