Internal DNS Authority and Service Endpoint Model¶
Purpose¶
Define the clean DNS model for HybridOps so workloads and operators do not depend on raw service IPs during normal operation, failover, or failback.
This standard complements:
- DR Execution and Access Model
- DR Runner Control-Plane Anti-Drift Note
- Decision-Driven DR Orchestration Contract
1. Required split¶
HybridOps MUST separate:
- DNS authority
- the system that hosts authoritative zones and answers for service records
- DNS cutover automation
- the system that updates records during failover, failback, or traffic shift
- Source-of-truth metadata
- the system that records IPAM, ownership, and desired endpoint data
These are related but not the same component.
2. Default authority model¶
HybridOps SHOULD use:
- Cloudflare DNS for public internet zones
- examples:
hybridops.techdocs.hybridops.techlearn.hybridops.tech
- shared internal authoritative DNS for private platform and workload zones
- recommended first implementation: PowerDNS Authoritative
Reason:
- Cloudflare is well-suited for public zones and public web properties
- internal DR and service discovery should remain provider-neutral
- internal DNS must work across on-prem, GCP, Azure, and AWS
- the first internal authority should not force the product into one cloud's DNS service
3. Recommended internal authority¶
The recommended first internal DNS authority for HybridOps is:
- PowerDNS Authoritative
- hosted in the shared control plane
- managed through its API
Recommended first topology:
- primary writable PowerDNS on a dedicated shared control-plane host in Hetzner
- secondary read-only PowerDNS on-prem via AXFR/IXFR
- optional additional secondary in a cloud control plane later
This keeps internal DNS:
- outside the primary on-prem failure domain
- close to the shared control plane and runner plane
- available locally on-prem for steady-state reads even when the shared link is degraded
This is preferred over:
- NetBox as the DNS engine
- GCP Cloud DNS as the primary internal authority
- Azure DNS as the primary internal authority
- Route 53 as the primary internal authority
Those remain valid adapters or forwarders later, but not the first shared authority.
4. NetBox boundary¶
NetBox SHOULD remain:
- IPAM and infrastructure source of truth
- endpoint metadata source
- ownership and inventory source
NetBox SHOULD NOT be treated as the first authoritative DNS engine for HybridOps DR.
If richer DNS modeling is desired later, it MAY be added as metadata support, but HybridOps should still keep authoritative internal DNS and DNS cutover as separate concerns.
5. Service endpoint rule¶
Applications and platform services MUST consume stable DNS names, not raw service IPs.
Examples:
postgres.dev.hyops.internalnetbox.shared.hyops.internalargocd.dev.hyops.internalthanos.shared.hyops.internal
That means:
- normal operation points the name at the primary endpoint
- failover points the same name at the DR endpoint
- failback points the same name back to the restored primary site
6. Zone model¶
HybridOps SHOULD standardize on:
- one shared internal zone for private service endpoints
- one clear naming scheme for env-scoped records
Reference pattern:
- zone:
hyops.internal - env-scoped records:
postgres.dev.hyops.internalpostgres.staging.hyops.internalpostgres.prod.hyops.internal- shared service records:
netbox.shared.hyops.internalvault.shared.hyops.internal
7. Automation rule¶
HybridOps DNS automation SHOULD operate in two layers:
- service modules and DR blueprints publish the desired endpoint contract
- DNS automation updates the authoritative DNS service accordingly
The existing dns-routing capability belongs to layer 2.
It is cutover/update automation, not the DNS authority itself.
8. Forwarding and resolution¶
Cloud and on-prem environments MAY use provider-native forwarders or conditional resolvers to reach the shared internal authority.
Examples:
- on-prem recursive resolvers forwarding
hyops.internal - GCP forwarding zones forwarding
hyops.internal - Azure/AWS equivalents later
The authority remains shared; forwarding is per-environment resolution plumbing.
Recommended first resolution posture:
- on-prem recursive resolvers forward
hyops.internalto the shared PowerDNS primary/secondary set - GCP and later Azure/AWS use forwarding or conditional resolver rules for
hyops.internal - applications never query the authoritative servers directly
9. Product rules¶
HybridOps MUST follow these rules:
- Public web DNS and internal platform DNS MUST remain separate.
- DR-critical service endpoints MUST use DNS names, not raw node IPs.
- Internal DNS authority MUST live outside the primary on-prem failure domain.
- DNS cutover automation MUST NOT be confused with DNS hosting.
- The tarball MUST ship contracts and automation hooks, but MUST NOT depend on a public website as the runtime DNS source of truth.
10. Current recommended path¶
For the current HybridOps product stage:
- keep Cloudflare for public zones
- stand up shared PowerDNS primary on a dedicated Hetzner shared control-plane host
- add an on-prem PowerDNS secondary
- standardize
hyops.internalnaming - let DR blueprints publish endpoint DNS names
- let DNS cutover automation update the shared internal authority
This is the cleanest path that remains tarball-safe, cross-cloud, and supportable.
The first executable control path for step 5 is now:
platform/network/dns-routing- with
provider: powerdns-api - using the shared internal PowerDNS authority as the authoritative target
- consuming the shared PowerDNS authority state by default, with explicit endpoint overrides reserved for break-glass cases