Architecture Overview

Pleiades is a DNS-based Global Server Load Balancer (GSLB). It answers A/AAAA queries and selects backend IPs using a health-aware, client-IP-aware load balancer. Configuration is stored in SQLite and managed via a REST API. Runtime state (health, membership, config) is synchronized across geo-distributed nodes via NATS + JetStream. GitOps enables declarative configuration from a Git repository. Metrics are exposed for Prometheus. Licensing enforces request rate limits.

Core components

DNS Server (internal/dns)

- Listens for DNS queries for a configured domain.
- Extracts the DNS client's IP from the UDP remote address.
- Service-aware resolution: on each query, the DNS server first looks up the queried name in the services table via GetServiceByDomain. If a matching service has a pool assigned, that pool's enabled members are used (round-robin per service). For domains not found in the DB, it falls back to the static load balancer.
- Passes the client IP to Balancer.GetNextIPv4For/GetNextIPv6For for client-aware fallback routing.
- Enforces licensing limits before answering.
- Emits Prometheus metrics.
- NewServer accepts a *storage.DB, which may be nil; DB resolution is skipped when it is nil.

TUI (cmd/gslbctl, internal/tui)

- Standalone binary; connects directly to the SQLite database (no API server required).
- Built with Bubble Tea, Lip Gloss, and Bubbles.
- Tab navigation between Pools and Services.
- Pool view: list, create, rename, delete; drill into a pool for the Members, Health Check, and Geo Rules sub-screens.
- Service view: list (with assigned pool column), create, edit (pre-filled), delete. Pool assignment uses a dedicated picker screen that shows unassigned pools first (A–Z), then already-assigned pools (A–Z, with annotation).
- Launch with: ./gslbctl -db /var/lib/gslbd/gslbd.db

Load Balancer (internal/loadbalancer)

- Algorithm interface: Next() net.IP for stateless selection.
- ClientAwareAlgorithm interface: SortedCandidatesFor(clientIP net.IP) []net.IP for client-IP-aware selection.
- RoundRobin: equal distribution; implements Algorithm.
- WeightedRoundRobin: smooth weighted distribution; implements Algorithm.
- GeoIPAlgorithm: sorts candidates by haversine great-circle distance from the client IP. Uses MaxMind GeoLite2 (.mmdb) for geolocation. Supports manual EndpointLocations overrides for private/DC IPs not in the DB. Implements ClientAwareAlgorithm and io.Closer.
- MapFileAlgorithm: CIDR prefix matching; the matched endpoint sorts first. Supports hot-reload via UpdateRules. Implements ClientAwareAlgorithm.
- Balancer: wraps an algorithm, integrates with a health provider, and exposes GetNextIPv4For/GetNextIPv6For. When the algorithm implements ClientAwareAlgorithm, candidates are iterated in preference order. Member mutations from the REST API are reflected immediately in the running balancer.
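The distance-based ordering that GeoIPAlgorithm is described as performing can be sketched as follows. This is only an illustration, not the project's code: the location type, haversineKm, and sortByDistance are hypothetical names, coordinates are hard-coded rather than read from a GeoLite2 database, and endpoints are keyed by string for brevity.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// location is a latitude/longitude pair in degrees (hypothetical type).
type location struct{ Lat, Lon float64 }

const earthRadiusKm = 6371.0

// haversineKm returns the great-circle distance between a and b in kilometres.
func haversineKm(a, b location) float64 {
	toRad := func(deg float64) float64 { return deg * math.Pi / 180 }
	dLat := toRad(b.Lat - a.Lat)
	dLon := toRad(b.Lon - a.Lon)
	h := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(toRad(a.Lat))*math.Cos(toRad(b.Lat))*math.Sin(dLon/2)*math.Sin(dLon/2)
	// Clamp to 1 so rounding error near antipodal points cannot produce NaN.
	return 2 * earthRadiusKm * math.Asin(math.Min(1, math.Sqrt(h)))
}

// sortByDistance returns endpoint keys ordered nearest-first from the client.
// Ties are broken by key so the result is deterministic.
func sortByDistance(client location, endpoints map[string]location) []string {
	names := make([]string, 0, len(endpoints))
	for name := range endpoints {
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool {
		di := haversineKm(client, endpoints[names[i]])
		dj := haversineKm(client, endpoints[names[j]])
		if di != dj {
			return di < dj
		}
		return names[i] < names[j]
	})
	return names
}

func main() {
	// Example coordinates; in the real system these would come from the
	// .mmdb lookup or from EndpointLocations overrides.
	endpoints := map[string]location{
		"192.0.2.10": {Lat: 52.37, Lon: 4.90},   // Amsterdam
		"192.0.2.20": {Lat: 40.71, Lon: -74.01}, // New York
		"192.0.2.30": {Lat: 35.68, Lon: 139.69}, // Tokyo
	}
	client := location{Lat: 48.86, Lon: 2.35} // Paris
	fmt.Println(sortByDistance(client, endpoints))
}
```

Returning a full ordering rather than a single winner matches the ClientAwareAlgorithm contract above: the Balancer can then walk the list and skip unhealthy or wrong-family candidates.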

Storage (internal/storage)

- Pure-Go SQLite via modernc.org/sqlite (no CGO required).
- WAL mode, foreign keys enabled, cascade deletes.
- Schema: pools, members (cascade from pool), services (SET NULL pool_id on pool delete), health_checks (UNIQUE per pool, cascade), geo_rules (cascade from pool).
- Open(path) runs migrate() to create/upgrade the schema on startup.
- All IDs are 32-char hex strings from crypto/rand.
- ErrNotFound sentinel for missing rows.
- GetServiceByDomain(ctx, domain): case-insensitive, trailing-dot-tolerant lookup used by the DNS server.
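Two of the storage conventions above are easy to illustrate in isolation: 32-character hex IDs drawn from crypto/rand, and a domain lookup key that ignores case and a trailing dot. The sketch below assumes 16 random bytes per ID (which yields 32 hex characters); newID and normalizeDomain are hypothetical helper names, not the package's actual functions.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"strings"
)

// newID returns 16 random bytes encoded as a 32-character hex string.
func newID() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// normalizeDomain lowercases a DNS name and strips one trailing dot, so that
// "App.Example.COM." and "app.example.com" map to the same lookup key.
func normalizeDomain(name string) string {
	return strings.ToLower(strings.TrimSuffix(strings.TrimSpace(name), "."))
}

func main() {
	id, err := newID()
	if err != nil {
		fmt.Println("id generation failed:", err)
		return
	}
	fmt.Println(len(id), id)
	fmt.Println(normalizeDomain("App.Example.COM."))
}
```

Normalizing on the Go side is one option; the same effect could be achieved in SQL with a case-insensitive collation, and the source does not say which approach the project uses.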

REST API (internal/api)

- Go 1.22 http.NewServeMux() with {id} path patterns.
- 18 routes covering full CRUD for pools, members, services, health checks, and geo rules.
- Input validation: IP addresses (net.ParseIP), port range (1–65535), CIDR blocks (net.ParseCIDR).
- Member create/update/delete mutations are reflected immediately in the running Balancer.
- Empty collections return [] (not null) for JSON compatibility.
- See docs/API.md for the full route reference.

Health Checker (internal/health)

- Active health checks (TCP or HTTP/HTTPS) at intervals with timeouts.
- Thread-safe status store; exposes IsHealthy(net.IP).
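The combination of a TCP probe and a thread-safe status store might look roughly like the sketch below. Only IsHealthy(net.IP) is named in the source; statusStore, Set, and probeTCP are hypothetical, and treating unknown endpoints as unhealthy is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"sync"
	"time"
)

// statusStore holds the last-known health per IP, guarded by an RWMutex so
// the DNS path can read while the checker writes.
type statusStore struct {
	mu     sync.RWMutex
	status map[string]bool
}

func newStatusStore() *statusStore {
	return &statusStore{status: make(map[string]bool)}
}

// Set records the latest probe result for ip.
func (s *statusStore) Set(ip net.IP, healthy bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.status[ip.String()] = healthy
}

// IsHealthy reports the last-known status; unknown IPs are treated as
// unhealthy here (an assumption).
func (s *statusStore) IsHealthy(ip net.IP) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.status[ip.String()]
}

// probeTCP reports whether a TCP connection to ip:port succeeds within timeout.
func probeTCP(ip net.IP, port int, timeout time.Duration) bool {
	addr := net.JoinHostPort(ip.String(), strconv.Itoa(port))
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	// Probe a throwaway local listener so the example needs no external host.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	addr := ln.Addr().(*net.TCPAddr)

	store := newStatusStore()
	store.Set(addr.IP, probeTCP(addr.IP, addr.Port, time.Second))
	fmt.Println("healthy:", store.IsHealthy(addr.IP))
}
```

Keying the map by ip.String() keeps the 4-byte and 16-byte forms of the same IPv4 address from being stored as separate entries.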

GitOps Controller (internal/gitops)

- Polls a Git repo at intervals.
- Enforces GPG-signed commit policy.
- Validates and applies config (endpoints, health settings) with safe rollback (last-good retained in-process).

State Sync (internal/state)

- NATS connection + JetStream KV buckets for health and membership.
- Publishes local health and heartbeats; subscribes to global events and KV snapshot.
- Maintains GlobalHealthView and provides composite health policy (prefer-local, local-only, global-any-healthy, global-quorum).
- Config sync (configsync_nats.go): fully implemented.
  - PublishCluster: writes YAML + metadata to KV and publishes to a JetStream stream with Pleiades-Version, Pleiades-Commit, and Content-Type headers.
  - WatchCluster: watches KV for changes; a nil entry signals that initial values are done; delete ops emit type:"delete" events; buffered channel of 16.
  - SnapshotCluster: KV Get; returns nil YAML (not an error) if the key is not found.
  - EnsureConfigStream / EnsureConfigKVBucket: eagerly create the stream and KV bucket on startup.
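The source names four composite health policies but does not define them. The sketch below shows one plausible reading, inferred from the names alone, so every rule in it should be treated as an assumption: local-only trusts just this node's report, prefer-local falls back to other nodes only when there is no local report, global-any-healthy needs a single healthy report, and global-quorum needs a strict majority. The policy type and the healthy function are hypothetical.

```go
package main

import "fmt"

// policy names mirror the strings listed above; the semantics are assumed.
type policy string

const (
	preferLocal      policy = "prefer-local"
	localOnly        policy = "local-only"
	globalAnyHealthy policy = "global-any-healthy"
	globalQuorum     policy = "global-quorum"
)

// healthy decides whether one endpoint should be served, given per-node
// reports for that endpoint (node name -> reported healthy) and the name of
// the local node. A node absent from the map has not reported yet.
func healthy(p policy, localNode string, reports map[string]bool) bool {
	local, known := reports[localNode]
	up := 0
	for _, ok := range reports {
		if ok {
			up++
		}
	}
	switch p {
	case localOnly:
		return known && local
	case preferLocal:
		if known {
			return local
		}
		return up > 0
	case globalAnyHealthy:
		return up > 0
	case globalQuorum:
		return up*2 > len(reports) // strict majority of reporting nodes
	default:
		return false // unknown policy: fail closed
	}
}

func main() {
	reports := map[string]bool{"eu": false, "us": true, "ap": true}
	for _, p := range []policy{preferLocal, localOnly, globalAnyHealthy, globalQuorum} {
		fmt.Println(p, healthy(p, "eu", reports))
	}
}
```

Whether quorum should be computed over reporting nodes or over all known cluster members is a real design question that this sketch does not settle.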

Metrics (internal/metrics)

- Prometheus endpoint and collectors. All metrics carry cluster and node labels when configured.

Licensing (internal/licensing)

- Validates HMAC-SHA256 signed license tokens and enforces RPS limits in the DNS path.
- Secret loaded from the PLEIADES_LICENSE_SECRET env var or the config file.
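To make the licensing mechanics concrete, here is a sketch of HMAC-SHA256 token verification paired with a simple per-second request counter. The token layout (payload, a dot, then a hex signature), the payload contents, and the fixed-window limiter are all assumptions chosen for illustration; the source does not specify the real token format or limiting algorithm.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
	"sync"
	"time"
)

// sign returns payload + "." + hex(HMAC-SHA256(secret, payload)).
// This layout is hypothetical.
func sign(secret []byte, payload string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(payload))
	return payload + "." + hex.EncodeToString(mac.Sum(nil))
}

// verify checks the signature and returns the payload if it is authentic.
func verify(secret []byte, token string) (string, error) {
	i := strings.LastIndex(token, ".")
	if i < 0 {
		return "", errors.New("malformed token")
	}
	payload, sigHex := token[:i], token[i+1:]
	sig, err := hex.DecodeString(sigHex)
	if err != nil {
		return "", errors.New("malformed signature")
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(payload))
	// hmac.Equal compares in constant time to avoid timing side channels.
	if !hmac.Equal(sig, mac.Sum(nil)) {
		return "", errors.New("signature mismatch")
	}
	return payload, nil
}

// rpsLimiter allows at most limit requests per wall-clock second
// (a fixed-window counter, used here for simplicity).
type rpsLimiter struct {
	mu     sync.Mutex
	limit  int
	window int64 // Unix second of the current window
	count  int
}

// allow reports whether a request arriving at nowUnix is within the limit.
func (l *rpsLimiter) allow(nowUnix int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if nowUnix != l.window {
		l.window, l.count = nowUnix, 0
	}
	if l.count >= l.limit {
		return false
	}
	l.count++
	return true
}

func main() {
	secret := []byte("example-secret") // would come from PLEIADES_LICENSE_SECRET
	token := sign(secret, "rps=2")
	payload, err := verify(secret, token)
	fmt.Println(payload, err)

	lim := &rpsLimiter{limit: 2}
	now := time.Now().Unix()
	fmt.Println(lim.allow(now), lim.allow(now), lim.allow(now))
}
```

A fixed window permits short bursts at window boundaries; a token bucket would smooth that out at the cost of slightly more state.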

Data flows

DNS query path

1. Request enters the DNS server; client IP is extracted from the UDP remote address.
2. Licensing check for RPS.
3. DB lookup: GetServiceByDomain(qname). If a service with an assigned pool is found, enabled members of that pool are used (round-robin per service, per IP family).
4. If there is no DB match, the Balancer selects the next healthy IP from the static endpoint set: if the algorithm is a ClientAwareAlgorithm, SortedCandidatesFor(clientIP) is called and candidates are iterated in preference order, filtered by IP family and health.
5. Response is constructed as an A/AAAA answer.
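The fallback selection step, which walks candidates in preference order and keeps the first one that matches the requested IP family and is healthy, can be sketched as below. firstHealthy, parseAll, and downSet are hypothetical names; the real Balancer exposes GetNextIPv4For/GetNextIPv6For, whose internals are not shown in the source.

```go
package main

import (
	"fmt"
	"net"
)

// downSet is a stand-in health provider: IPs listed in it are unhealthy.
type downSet map[string]bool

// IsHealthy reports true for every IP not marked as down.
func (d downSet) IsHealthy(ip net.IP) bool { return !d[ip.String()] }

// parseAll converts textual addresses to net.IP, skipping invalid ones.
func parseAll(addrs ...string) []net.IP {
	ips := make([]net.IP, 0, len(addrs))
	for _, a := range addrs {
		if ip := net.ParseIP(a); ip != nil {
			ips = append(ips, ip)
		}
	}
	return ips
}

// firstHealthy walks candidates in preference order and returns the first
// one of the requested family that the health provider accepts, or nil.
func firstHealthy(candidates []net.IP, wantIPv4 bool, isHealthy func(net.IP) bool) net.IP {
	for _, ip := range candidates {
		isV4 := ip.To4() != nil
		if isV4 != wantIPv4 {
			continue // wrong family for this query type (A vs AAAA)
		}
		if isHealthy(ip) {
			return ip
		}
	}
	return nil
}

func main() {
	// Candidates as they might come back from SortedCandidatesFor(clientIP).
	candidates := parseAll("10.0.0.1", "2001:db8::1", "10.0.0.2")
	down := downSet{"10.0.0.1": true}

	fmt.Println("A:   ", firstHealthy(candidates, true, down.IsHealthy))
	fmt.Println("AAAA:", firstHealthy(candidates, false, down.IsHealthy))
}
```

A nil result corresponds to the case where no endpoint qualifies; how the DNS server answers in that case is not stated in the source.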

Health checking

1. Checker loops across endpoints at the configured interval.
2. Performs a TCP or HTTP(S) probe.
3. Stores last-known status in memory.

REST API → Balancer live update

1. API handler calls db.CreateMember / db.UpdateMember / db.DeleteMember.
2. Handler immediately calls balancer.AddEndpoint, SetWeight, or RemoveEndpoint.
3. Next DNS query picks up the change without a restart.

GitOps reconciliation

1. Fetch repo; verify GPG signature.
2. Parse YAML; validate configuration.
3. Apply: replace endpoint set; recreate checker if settings changed.

State synchronization

1. Publisher emits local health and heartbeats; writes KV with TTL.
2. Subscriber snapshots KV (health + membership), then subscribes to subjects.
3. GlobalHealthView merges per-node reports and membership; provider policy drives decisions.

UML

- See diagrams in docs/diagrams/:
  - GSLB_Components.puml (component diagram)
  - GSLB_Classes.puml (class diagram)
  - seq_DNS_Query.puml (sequence: DNS resolution)
  - seq_Health_Checks.puml (sequence: health loop)
  - seq_GitOps_Reconcile.puml (sequence: GitOps apply)
  - seq_State_Sync.puml (sequence: state sync and quorum)

Operations at a glance

- Configure via YAML (local file or fetched by GitOps).
- Manage pools/members/services at runtime via the REST API; changes take effect immediately.
- Optional NATS settings enable global state sync and config distribution.
- Prometheus metrics are served at /metrics when enabled.
- Secure-by-default HTTP health checks (TLS verification on unless explicitly disabled).