DR Execution and Access Model
Purpose: Define the clean execution and access model for HybridOps disaster recovery and workload burst operations so recovery does not depend on an operator workstation.
This standard applies to:
- DR blueprints
- burst-to-cloud blueprints
- decision service integrations
- automation runners that invoke hyops
It complements:
- Decision-Driven DR Orchestration Contract
- Internal DNS and Cutover Contract
- Runner-Local DR Execution Model
- DR Runner Control-Plane Anti-Drift Note
0. Provider composition rule
HybridOps SHOULD model runner enablement as four separate concerns:
- provider-specific egress adapter
- provider-specific compute placement
- provider-specific access adapter
- generic runner bootstrap
That means:
- platform/linux/ops-runner remains generic
- GCP/Azure/AWS/Proxmox handle VM placement in their own modules
- provider-specific egress is handled separately from runner bootstrap
- runner blueprints compose these layers instead of hiding them inside one module
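The composition rule above can be sketched as a small data model. This is a minimal illustration only: the class name RunnerBlueprint, its field names, and the adapter labels such as "gcp-cloud-nat" are hypothetical and are not the shipped HybridOps schema. Only platform/linux/ops-runner comes from this standard.

```python
from dataclasses import dataclass


# Hypothetical model: field names and adapter labels are illustrative,
# not the shipped HybridOps blueprint schema.
@dataclass(frozen=True)
class RunnerBlueprint:
    egress_adapter: str      # provider-specific, e.g. NAT for outbound access
    compute_placement: str   # provider-specific VM placement module
    access_adapter: str      # provider-specific access path
    runner_bootstrap: str = "platform/linux/ops-runner"  # generic layer

    def layers(self) -> list[str]:
        """Return the four layers in the order this section lists them."""
        return [
            self.egress_adapter,
            self.compute_placement,
            self.access_adapter,
            self.runner_bootstrap,
        ]


# Two providers differ in three layers but share the generic bootstrap.
gcp_runner = RunnerBlueprint("gcp-cloud-nat", "gcp-vm", "gcp-iap")
azure_runner = RunnerBlueprint("azure-nat-gateway", "azure-vm", "bastion-explicit")
```

Keeping the bootstrap layer as a shared default makes it visible when a blueprint tries to hide provider logic inside the generic module.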
0.1 GCP project-role rule
When HybridOps uses GCP, documentation and inputs SHOULD distinguish project roles instead of implying that one GCP project owns every DR asset.
Preferred roles:
- host/network project
    - Shared VPC
    - Cloud Router / NAT
    - private runner placement
- control project
    - env-scoped secret authority adapters such as GCP Secret Manager
    - env-scoped object repositories such as backup buckets
    - other env-scoped control artifacts
- workload project
    - optional future location for env-scoped compute or service projects when Shared VPC attachment is intentionally in use
Norms:
- runner placement MAY live in the host/network project
- secrets and object repositories MAY live in the control project
- documents and blueprints MUST NOT collapse these roles into a generic project_id story when different projects are intentionally used
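One way to keep the roles distinct in inputs is to name each project by role and resolve assets through that mapping. The sketch below is an assumption for illustration: GcpProjectRoles, resolve_project, the asset kind labels, and the project IDs are hypothetical, and the default placement simply follows the "MAY" norms above.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical input shape: one field per project role instead of a
# single generic project_id.
@dataclass(frozen=True)
class GcpProjectRoles:
    host_network_project: str
    control_project: str
    workload_project: Optional[str] = None  # optional future role


def resolve_project(roles: GcpProjectRoles, asset_kind: str) -> str:
    """Map an asset kind to the project role that owns it."""
    owners = {
        "shared_vpc": roles.host_network_project,
        "cloud_nat": roles.host_network_project,
        "runner": roles.host_network_project,     # MAY live here per the norms
        "secret_store": roles.control_project,    # MAY live here per the norms
        "backup_bucket": roles.control_project,
    }
    if asset_kind not in owners:
        raise KeyError(f"no project role defined for asset kind: {asset_kind}")
    return owners[asset_kind]


roles = GcpProjectRoles(
    host_network_project="example-net-host",
    control_project="example-env-control",
)
```

An unknown asset kind fails loudly rather than falling back to a default project, which is the behaviour the MUST NOT norm is guarding against.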
1. Default execution rule
HybridOps DR and burst workflows MUST default to runner-local execution, not workstation-direct execution.
Meaning:
- the workflow is triggered by policy/decision
- execution happens from a controlled runner in or near the chosen target environment
- the operator workstation is optional and must not be the assumed control plane
- runner provisioning and runner bootstrap are distinct concerns
This is the default product posture for SMEs, schools, and enterprise customers.
2. Planes
HybridOps SHOULD model DR and burst around four planes.
2.1 Decision plane
- evaluates health, policy, and target selection
- emits a decision artifact
- does not perform infrastructure changes directly
2.2 Execution plane
- runs HyOps blueprints/modules
- captures evidence
- applies approvals and safety gates
2.3 Connectivity plane
- provides private reachability between sites and clouds during normal operation
- may use BGP/IPsec, WireGuard, or equivalent overlay/underlay patterns
- must not be assumed to survive an on-prem outage during DR
2.4 Workload plane
- reconciles stateless workloads from GitOps sources
- consumes restored/promoted data services
- performs cutover after data and platform readiness checks pass
3. Access modes
HybridOps SHOULD support explicit access modes for cloud and DR targets.
3.1 runner-local (preferred default)
The automation runner has private L3 reachability to the target subnet or VPC.
Use when:
- running DR in GCP, Azure, or AWS
- bursting workloads into cloud
- executing from a shared control plane
Advantages:
- no per-VM public IP requirement
- cleanest enterprise posture
- strongest fit for pipeline-driven operations
Norm:
- a runner-local target still needs explicit outbound egress for bootstrap and tool delivery
3.2 private-overlay
Reachability is provided through a private inter-site path such as:
- BGP over IPsec
- Cloud VPN
- WireGuard overlay
- equivalent routed hybrid connectivity
Use when:
- on-prem and cloud need continuous hybrid connectivity
- workload burst depends on private east-west paths
Norm:
- useful for normal hybrid operation
- not sufficient alone as the DR execution assumption, because on-prem may be unavailable
3.3 bastion-explicit
An explicit bastion is provided by contract and used intentionally.
Use when:
- runner-local private reachability is not yet available
- a controlled hop host is acceptable
Norms:
- bastion usage MUST be explicit
- auto-inferred bastions MUST NOT be assumed for cloud DR targets
3.4 gcp-iap
Google Cloud IAP TCP forwarding may be used as a provider-specific access mode.
Use when:
- targets are private-only GCE VMs
- the product is operating in GCP
Norm:
- this is valid as a provider-specific enhancement
- it MUST remain optional, not the cross-cloud default
3.5 public-ephemeral
Temporary public access to target VMs for a drill or constrained fallback.
Norms:
- drill-only or break-glass
- MUST be time-bounded
- MUST use tightly scoped firewall rules
- MUST NOT be the shipped default for DR database nodes
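The five access modes and their norms lend themselves to an explicit enum plus a validation step. The following is a minimal sketch under stated assumptions: AccessConfig, its fields (provider, bastion_host, public_ttl_minutes), and validate_access are hypothetical names, not a HybridOps API. Only the mode names come from this section.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AccessMode(Enum):
    RUNNER_LOCAL = "runner-local"
    PRIVATE_OVERLAY = "private-overlay"
    BASTION_EXPLICIT = "bastion-explicit"
    GCP_IAP = "gcp-iap"
    PUBLIC_EPHEMERAL = "public-ephemeral"


# Hypothetical config shape; runner-local is the default mode.
@dataclass
class AccessConfig:
    mode: AccessMode = AccessMode.RUNNER_LOCAL
    provider: str = "gcp"
    bastion_host: Optional[str] = None
    public_ttl_minutes: Optional[int] = None


def validate_access(cfg: AccessConfig) -> list[str]:
    """Return norm violations for the chosen access mode (empty if none)."""
    errors = []
    if cfg.mode is AccessMode.BASTION_EXPLICIT and not cfg.bastion_host:
        errors.append("bastion-explicit requires an explicitly configured bastion")
    if cfg.mode is AccessMode.GCP_IAP and cfg.provider != "gcp":
        errors.append("gcp-iap is provider-specific and only valid on GCP")
    if cfg.mode is AccessMode.PUBLIC_EPHEMERAL and not cfg.public_ttl_minutes:
        errors.append("public-ephemeral must be time-bounded")
    return errors
```

For example, `validate_access(AccessConfig(mode=AccessMode.BASTION_EXPLICIT))` reports the missing bastion instead of silently inferring one.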
4. Product rules
HybridOps MUST follow these rules.
- Cloud DR blueprints SHOULD default to private-only compute nodes.
- Cloud DR workflows MUST NOT assume the operator laptop can reach private cloud IPs.
- ssh_proxy_jump_auto or equivalent convenience logic MUST be limited to on-prem or local-lab scenarios.
- Cloud DR and burst workflows SHOULD prefer runner-local execution.
- Public IP on every DR VM MUST NOT be the default product posture.
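These rules could be enforced as a lint step over blueprint inputs. The sketch below assumes a blueprint is available as a plain dictionary. The keys target_kind, assign_public_ip, and execution are hypothetical; ssh_proxy_jump_auto is the setting named in the rules, but its actual input shape is an assumption here.

```python
def lint_blueprint(bp: dict) -> list[str]:
    """Return product-rule violations for a cloud DR blueprint (empty if none)."""
    violations = []
    # Hypothetical key: distinguishes cloud targets from on-prem or local lab.
    if bp.get("target_kind") != "cloud":
        return violations

    if bp.get("assign_public_ip", False):
        violations.append("cloud DR nodes should default to private-only")
    if bp.get("ssh_proxy_jump_auto", False):
        violations.append("ssh_proxy_jump_auto is limited to on-prem or local-lab")
    if bp.get("execution", "runner-local") == "workstation-direct":
        violations.append("cloud DR must not assume workstation reachability")
    return violations


compliant = {"target_kind": "cloud", "execution": "runner-local"}
risky = {
    "target_kind": "cloud",
    "assign_public_ip": True,
    "ssh_proxy_jump_auto": True,
    "execution": "workstation-direct",
}
```

Here `lint_blueprint(compliant)` returns an empty list, while `lint_blueprint(risky)` reports all three violations.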
5. Control-plane posture
HybridOps SHOULD use a shared control plane outside the primary on-prem failure domain.
The shared control plane may host:
- workflow runner
- evidence collection
- decision service
- policy/approval engine
- future secret sync orchestration
This control plane is distinct from:
- the source on-prem site
- the DR target site
Reference shape:
- create the runner host with a platform VM blueprint or module
- bootstrap the runner host with the shipped runner bootstrap module
- use a cloud-side runner for failover and an on-prem runner for failback
- then execute DR or burst workflows from that runner
6. Decision-service boundary
Decision service MUST:
- observe
- evaluate
- select
- emit a decision
Decision service MUST NOT:
- directly create infrastructure
- directly mutate cloud or on-prem resources
- bypass workflow approvals and evidence collection
Execution runners consume decision outputs and invoke HyOps.
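The boundary can be illustrated with a side-effect-free decision function and a separate runner step that gates invocation on approval. This is a sketch, not the decision contract: the Decision fields, the decide and execute functions, and the injected invoke callable are all hypothetical, and the actual HyOps invocation is deliberately left abstract.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical decision artifact; the real contract may carry more fields.
@dataclass(frozen=True)
class Decision:
    target: str
    action: str
    reason: str


def decide(health: dict[str, bool], candidates: list[str]) -> Decision:
    """Decision plane: observe, evaluate, select, emit. No side effects."""
    for target in candidates:
        if health.get(target, False):
            return Decision(target, "failover", f"{target} reported healthy")
    raise RuntimeError("no healthy DR target available")


def execute(
    decision: Decision,
    approved: bool,
    invoke: Callable[[Decision], None],
) -> list[str]:
    """Execution plane: apply the approval gate, invoke, record evidence."""
    evidence = [f"decision received: {decision.action} -> {decision.target}"]
    if not approved:
        evidence.append("blocked: approval missing")
        return evidence
    invoke(decision)  # stands in for the runner invoking HyOps
    evidence.append("invoked")
    return evidence
```

Because decide only returns a value, it cannot bypass approvals or evidence collection; every infrastructure change has to pass through execute.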
7. Data and secret handling
During DR:
- data recovery/promotion must happen before workload cutover
- secrets sync must be a separate, explicit phase
- secret authority transitions must be documented and evidence-backed
External secret sync SHOULD plug into the execution plane after the DR target has been selected and before application cutover. HashiCorp Vault is the preferred neutral authority; cloud-native secret stores remain valid adapters.
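The ordering constraints in this section can be checked mechanically. The sketch below encodes only the constraints stated above: secret sync follows target selection, and workload cutover follows both data recovery and secret sync. The phase names and the check_order function are hypothetical, and no order is assumed between data recovery and secret sync.

```python
# Phase names are illustrative labels, not a shipped workflow schema.
REQUIRES = {
    "secret_sync": {"target_selected"},
    "workload_cutover": {"data_recovery", "secret_sync"},
}


def check_order(phases: list[str]) -> list[str]:
    """Return an error for each phase that ran before its prerequisites."""
    done = set()
    errors = []
    for phase in phases:
        missing = REQUIRES.get(phase, set()) - done
        if missing:
            errors.append(f"{phase} requires {', '.join(sorted(missing))} first")
        done.add(phase)
    return errors


good = ["target_selected", "data_recovery", "secret_sync", "workload_cutover"]
bad = ["target_selected", "secret_sync", "workload_cutover", "data_recovery"]
```

Here `check_order(good)` returns an empty list, and `check_order(bad)` flags that workload cutover ran before data recovery.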
8. Recommended default product shape
For the target market:
- default posture: private-only DR targets, runner-local execution
- normal hybrid posture: BGP/IPsec or equivalent private interconnect
- drill fallback: explicit bastion or temporary public access only when justified
This gives the cleanest story for:
- SMEs
- schools
- enterprise customers who want a credible upgrade path later