Decision-Driven DR Orchestration Contract

Purpose: Define the execution contract between the decision service, workflow runners, connectivity/access modes, and HyOps DR/burst automation.

1. Scope

This contract defines:

  • what the decision service is allowed to output
  • what the execution runner must do
  • which access models are acceptable
  • how DR and burst runs are guarded

It does not define provider-specific implementation details for:

  • Cloud VPN
  • IAP
  • Bastion hosts
  • secret managers

2. Execution boundary

The decision service and the execution runner are separate components.

Norms:

  • decision service MUST emit a decision artifact
  • execution runner MUST consume that artifact and execute HyOps
  • decision service MUST NOT directly apply infrastructure or workload changes

3. Required decision fields

At minimum, the decision artifact SHOULD contain:

  • decision_id
  • decision_type
      • burst
      • failover
      • failback
      • hold
  • target_cloud
      • gcp
      • azure
      • aws
      • none
  • dr_mode
      • restore
      • warm_promote
      • managed_promote
      • deny
  • execution_plane
      • runner-local
      • private-overlay
      • bastion-explicit
      • gcp-iap
      • public-ephemeral
  • requires_approval
  • rationale
  • repo_state_ref or equivalent state-driven repository reference

Optional but recommended:

  • workload_profile_ref
  • secondary_backup_copy_required
  • secret_sync_profile
  • cutover_policy
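
As an illustration only, the field list above could be modelled and checked as follows. This is a minimal sketch, not the shipped schema: it assumes that the values listed after each field name are that field's allowed values, and the Python types (for example, `requires_approval` as a boolean), the class name `DecisionArtifact`, and the helper `validate_decision` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Allowed values, read from the field list above (grouping is an assumption).
ALLOWED = {
    "decision_type": {"burst", "failover", "failback", "hold"},
    "target_cloud": {"gcp", "azure", "aws", "none"},
    "dr_mode": {"restore", "warm_promote", "managed_promote", "deny"},
    "execution_plane": {
        "runner-local", "private-overlay", "bastion-explicit",
        "gcp-iap", "public-ephemeral",
    },
}


@dataclass(frozen=True)
class DecisionArtifact:
    # Minimum fields; the types are assumptions for illustration.
    decision_id: str
    decision_type: str
    target_cloud: str
    dr_mode: str
    execution_plane: str
    requires_approval: bool
    rationale: str
    repo_state_ref: str
    # Optional but recommended fields.
    workload_profile_ref: Optional[str] = None
    secondary_backup_copy_required: Optional[bool] = None
    secret_sync_profile: Optional[str] = None
    cutover_policy: Optional[str] = None


def validate_decision(artifact: DecisionArtifact) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for name in ("decision_id", "rationale", "repo_state_ref"):
        if not getattr(artifact, name):
            errors.append(f"{name}: must not be empty")
    for name, allowed in ALLOWED.items():
        value = getattr(artifact, name)
        if value not in allowed:
            errors.append(f"{name}: unsupported value {value!r}")
    return errors
```

Returning every problem at once, rather than stopping at the first, lets a runner report all of them in a single rejection.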

4. Required runner behavior

The execution runner MUST:

  1. validate the decision artifact
  2. validate approvals and policy gates
  3. confirm access-mode prerequisites
  4. invoke the selected HyOps blueprint/module path
  5. capture evidence
  6. emit normalized success/failure outputs
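
The six steps above can be sketched as a small step runner that stops at the first failure and always returns a normalized result. This is an assumption-laden illustration: `RunResult`, `run_steps`, and the step names are hypothetical, and the callables stand in for real validation, HyOps invocation, and evidence capture.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class RunResult:
    status: str = "success"          # "success" or "failure"
    failed_step: str = ""            # name of the step that raised, if any
    detail: str = ""                 # error message of the failing step
    completed_steps: List[str] = field(default_factory=list)


Step = Tuple[str, Callable[[Dict], None]]


def run_steps(decision: Dict, steps: List[Step]) -> RunResult:
    """Run steps in order; stop at the first exception and report it."""
    result = RunResult()
    for name, step in steps:
        try:
            step(decision)
        except Exception as exc:  # normalize any failure into the result
            result.status = "failure"
            result.failed_step = name
            result.detail = str(exc)
            return result
        result.completed_steps.append(name)
    return result
```

A caller would pass the steps in the order listed above, for example `[("validate_decision", ...), ("validate_approvals", ...), ("confirm_access_mode", ...)]`, so that a later step never runs after an earlier one fails.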

The execution runner MUST NOT silently downgrade:

  • from private-only access to public IPs
  • from explicit approval to implicit approval
  • from state-driven repo selection to hardcoded bucket/container names
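
One way to enforce the no-silent-downgrade rule is to compare what the decision requested with what the runner actually resolved, and refuse to continue on any weakening. The sketch below assumes a hypothetical `ExecutionPosture` record with three booleans; the real runner may represent these properties differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionPosture:
    private_only: bool        # no public IPs in the access path
    explicit_approval: bool   # approval was granted explicitly
    state_driven_repo: bool   # repo selected from state, not hardcoded names


class SilentDowngradeError(RuntimeError):
    """Raised when the resolved posture is weaker than the requested one."""


def assert_no_downgrade(requested: ExecutionPosture,
                        resolved: ExecutionPosture) -> None:
    violations = []
    if requested.private_only and not resolved.private_only:
        violations.append("private-only access -> public IPs")
    if requested.explicit_approval and not resolved.explicit_approval:
        violations.append("explicit approval -> implicit approval")
    if requested.state_driven_repo and not resolved.state_driven_repo:
        violations.append("state-driven repo -> hardcoded bucket/container")
    if violations:
        raise SilentDowngradeError("refusing downgrade: " + "; ".join(violations))
```

The check is one-directional on purpose: a resolved posture that is stricter than the request passes, while any weakening raises instead of being applied quietly.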

5. Access-mode contract

The runner MUST treat access mode as an explicit contract, not an implicit guess.

The first concrete bootstrap path for this contract is:

  • a private cloud runner provisioned by a dedicated blueprint
      • current shipped example: networking/gcp-ops-runner@v1
  • host bootstrap/install performed by the generic runner module
      • current shipped example: platform/linux/ops-runner

Accepted modes

  • runner-local
  • private-overlay
  • bastion-explicit
  • gcp-iap
  • public-ephemeral

Norms

  • runner-local SHOULD be the default for cloud DR and burst
  • bastion-explicit MUST include an explicit bastion host definition
  • public-ephemeral MUST be treated as break-glass or drill-only
  • convenience auto-bastion logic MUST NOT be used as the control-plane contract for cloud DR
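
The norms above could be checked roughly as follows. This is a sketch under stated assumptions: it reads the access mode from the decision's `execution_plane` field, and the `bastion_host` key and the `break_glass` / `drill` flags are hypothetical names. A missing mode is rejected rather than defaulted, so the runner never guesses.

```python
from typing import Mapping

ACCEPTED_MODES = frozenset({
    "runner-local", "private-overlay", "bastion-explicit",
    "gcp-iap", "public-ephemeral",
})


class AccessModeError(ValueError):
    pass


def check_access_mode(decision: Mapping, *, break_glass: bool = False,
                      drill: bool = False) -> str:
    """Return the access mode if it satisfies the norms, else raise."""
    mode = decision.get("execution_plane")
    if mode not in ACCEPTED_MODES:
        raise AccessModeError(f"unsupported or missing access mode: {mode!r}")
    if mode == "bastion-explicit" and not decision.get("bastion_host"):
        raise AccessModeError(
            "bastion-explicit requires an explicit bastion host definition")
    if mode == "public-ephemeral" and not (break_glass or drill):
        raise AccessModeError(
            "public-ephemeral is limited to break-glass or drill runs")
    return mode
```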

6. Safety gates

Before execution starts, the runner MUST fail clearly when any of the following are missing:

  • approval required by policy
  • runtime secrets or vault access
  • target access mode prerequisites
  • repository state or backup source
  • split-brain or fencing preconditions for failover/promotion
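
A preflight along these lines could collect every missing gate and fail with one clear message before any change is made. The sketch is illustrative: the `checks` keys are hypothetical, and treating `warm_promote` and `managed_promote` as the "promotion" cases that need fencing is an assumption.

```python
from typing import List, Mapping

# Assumption: these dr_mode values count as "promotion" for fencing purposes.
PROMOTION_MODES = {"warm_promote", "managed_promote"}


class PreflightError(RuntimeError):
    def __init__(self, missing: List[str]):
        self.missing = list(missing)
        super().__init__("preflight failed, missing: " + ", ".join(self.missing))


def preflight(decision: Mapping, checks: Mapping) -> None:
    """Raise PreflightError listing every gate that is not satisfied."""
    missing = []
    if decision.get("requires_approval") and not checks.get("approval_granted"):
        missing.append("approval required by policy")
    if not checks.get("secrets_available"):
        missing.append("runtime secrets or vault access")
    if not checks.get("access_mode_ready"):
        missing.append("target access mode prerequisites")
    if not checks.get("repo_state_available"):
        missing.append("repository state or backup source")
    needs_fencing = (decision.get("decision_type") == "failover"
                     or decision.get("dr_mode") in PROMOTION_MODES)
    if needs_fencing and not checks.get("fencing_confirmed"):
        missing.append("split-brain or fencing preconditions")
    if missing:
        raise PreflightError(missing)
```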

7. Workload and data ordering

The execution runner SHOULD follow this sequence:

  1. decide
  2. approve
  3. validate access mode
  4. provision/prepare target
  5. restore or promote data tier
  6. sync required secrets
  7. reconcile stateless workloads
  8. cut traffic
  9. re-enable backup/observability
  10. publish evidence
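
The sequence above can be encoded as an ordered enumeration so that a planned run is checked before it starts. This is a minimal sketch with hypothetical names (`Phase`, `is_in_contract_order`). Because the ordering is a SHOULD, the check only rejects phases that appear out of order or repeated; it does not require every phase to be present.

```python
from enum import IntEnum
from typing import Sequence


class Phase(IntEnum):
    DECIDE = 1
    APPROVE = 2
    VALIDATE_ACCESS_MODE = 3
    PREPARE_TARGET = 4
    RESTORE_OR_PROMOTE_DATA = 5
    SYNC_SECRETS = 6
    RECONCILE_STATELESS = 7
    CUT_TRAFFIC = 8
    REENABLE_BACKUP_OBSERVABILITY = 9
    PUBLISH_EVIDENCE = 10


def is_in_contract_order(plan: Sequence[Phase]) -> bool:
    """True when each phase comes strictly after the previous one."""
    return all(a < b for a, b in zip(plan, plan[1:]))
```

For example, a plan that cuts traffic before restoring the data tier fails the check, while a plan that skips secret sync but keeps the remaining phases in order passes.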

8. Evidence requirements

Every DR or burst execution MUST capture:

  • decision artifact
  • resolved access mode
  • selected blueprint/module refs
  • evidence path
  • guard confirmations
  • resulting target endpoints
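
As a sketch of what a captured evidence record might look like, the six items above map onto one serializable structure. The class name, field names, and types here are hypothetical, and the completeness check simply treats empty values as missing.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class EvidenceRecord:
    decision_artifact: Dict
    resolved_access_mode: str
    blueprint_module_refs: List[str]
    evidence_path: str
    guard_confirmations: Dict[str, bool]
    target_endpoints: List[str]


def missing_evidence(record: EvidenceRecord) -> List[str]:
    """Return the names of evidence items that are empty or absent."""
    return [name for name, value in asdict(record).items()
            if value in (None, "", [], {})]


def to_json(record: EvidenceRecord) -> str:
    # Sorted keys keep the serialized evidence stable across runs.
    return json.dumps(asdict(record), sort_keys=True)
```

A runner could call `missing_evidence` before publishing and treat a non-empty result as a failed run.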

9. Product posture

This contract exists so HybridOps can remain:

  • tarball-safe
  • runner-driven, not workstation-driven
  • supportable for SMEs and schools
  • extensible to enterprise workflows later