Provision GCP Ops Runner (HyOps Blueprint)¶
Purpose: Provision and bootstrap a private GCP runner in the hub core subnet so DR and burst workflows can run from inside the target cloud network.
Owner: Platform engineering / SRE
Trigger: Initial cloud control-plane bootstrap, DR pipeline preparation, or runner rebuild
Impact: Ensures private-subnet egress for the runner subnet, creates one private runner VM in GCP, and bootstraps the HybridOps runner toolchain on it for workflow or CI-driven execution
Severity: P2
Pre-reqs: hyops init gcp is ready for the target env; the org/gcp/project-factory, org/gcp/wan-hub-network, and org/gcp/wan-cloud-router states exist; and GCP init has discovered or been given a real SSH public key.
Rollback strategy: Run platform/linux/ops-runner#gcp_ops_runner_bootstrap with runner_state: absent if needed, then destroy the platform/gcp/platform-vm#gcp_ops_runner state instance or rerun with corrected inputs.
Context¶
Blueprint ref: networking/gcp-ops-runner@v1
Location: hybridops-core/blueprints/networking/gcp-ops-runner@v1/blueprint.yml
Step flow:
org/gcp/wan-cloud-nat#gcp_ops_runner_egress → platform/gcp/platform-vm#gcp_ops_runner → platform/linux/ops-runner#gcp_ops_runner_bootstrap
This blueprint is the preferred cloud-side execution bootstrap for:
- execution_plane: runner-local DR blueprints
- cloud burst workflows
- shared CI or pipeline execution near the target VPC
The runner blueprint intentionally pins an Ubuntu LTS image for the execution host. The generic GCP VM module defaults to Rocky Linux, but the runner control-plane path is standardized on Ubuntu so the current toolchain bootstrap remains predictable and tarball-safe.

For a private runner, inbound IAP access is not enough on its own. This blueprint composes the provider-specific egress layer explicitly by ensuring Cloud NAT for the hub core subnet before the runner VM is bootstrapped. NAT is scoped to the selected runner subnet rather than becoming an implicit blanket network rule.
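The three-step composition can be pictured roughly like this. This is a hypothetical sketch of the structure, not the shipped blueprint.yml verbatim; consult blueprints/networking/gcp-ops-runner@v1/blueprint.yml for the real field names:

```yaml
# Illustrative sketch only; field names are assumptions.
steps:
  - module: org/gcp/wan-cloud-nat
    instance: gcp_ops_runner_egress      # Cloud NAT scoped to the runner subnet only
  - module: platform/gcp/platform-vm
    instance: gcp_ops_runner             # private VM, Ubuntu LTS image pinned
  - module: platform/linux/ops-runner
    instance: gcp_ops_runner_bootstrap   # runner toolchain bootstrap over IAP SSH
```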
GCP project roles¶
This blueprint is easiest to understand when project roles are explicit.
- host/network project: derived from network_state_ref; owns the Shared VPC, router, NAT, and runner VM placement
- control project: typically derived from org/gcp/project-factory; owns env-scoped control artifacts such as GCP Secret Manager and backup buckets
- workload project: optional later role for env-specific service projects when Shared VPC attachment is intentionally enabled
This means it is normal for the GCP runner VM to appear in the host/network project while DR secrets or backup repositories for the same env live in the control project.
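As an illustration, the role split might look like this for a dev env (project names here are hypothetical examples, not values the blueprint produces):

```yaml
# Hypothetical example of the project-role split for one env.
projects:
  host_network:                     # from network_state_ref
    example: dev-hub-network        # Shared VPC, router, NAT, runner VM
  control:                          # from org/gcp/project-factory
    example: dev-control            # Secret Manager, backup buckets
  workload:                         # optional, only with Shared VPC attachment
    example: null
```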
Preconditions and safety checks¶
- Validate upstream cloud state:
hyops state show --env dev --module org/gcp/project-factory
hyops state show --env dev --module org/gcp/wan-hub-network
hyops state show --env dev --module org/gcp/wan-cloud-router
- Validate blueprint definition and preflight:
hyops blueprint validate --ref networking/gcp-ops-runner@v1
hyops blueprint preflight --env dev --ref networking/gcp-ops-runner@v1
- Ensure you replace the remaining environment-specific placeholders before first deploy:
    - zone
- The blueprint intentionally consumes project and network/subnet from state, so you do not need to duplicate:
    - project_id
    - network
    - subnetwork
For the shared GCP runner, the VM project is derived from network_state_ref.
That keeps the runner in the hub-network host project by default instead of assuming a separate service project attachment.
This is intentional and should not be read as "all env control resources must
also live in that same project."
The VM step uses ssh_keys_from_init: true.
If you prefer to override the SSH key explicitly, set:
- ssh_keys_from_init: false
- ssh_keys: [...]
Do not set both at the same time; HyOps fails fast on that conflict.
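A minimal sketch of the two mutually exclusive configurations (the exact ssh_keys entry format is an assumption here and depends on your setup):

```yaml
# Default: consume the key discovered by `hyops init gcp`.
ssh_keys_from_init: true

# Explicit override (never combine with the above; HyOps fails fast):
# ssh_keys_from_init: false
# ssh_keys:
#   - "<user>:<ssh-public-key>"   # entry format is an assumption
```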
The shipped blueprint also includes the allow-iap-ssh network tag.
That tag aligns with the default firewall policy created by org/gcp/wan-hub-network
so the bootstrap step can use GCP IAP without adding a public IP.
The bootstrap step expects either:
- the installed hyops wrapper, which already exports HYOPS_CORE_ROOT, or
- an explicit HYOPS_CORE_ROOT=/path/to/unpacked/hybridops-core when running from a source checkout
That keeps the blueprint tarball-safe while still working for development checkouts.
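That fallback behavior can be sketched in plain shell parameter expansion, assuming the source-checkout placeholder path from above:

```shell
# Use HYOPS_CORE_ROOT when already exported (e.g. by the hyops wrapper),
# otherwise fall back to an explicit source-checkout path.
HYOPS_CORE_ROOT="${HYOPS_CORE_ROOT:-/path/to/unpacked/hybridops-core}"
echo "core root: ${HYOPS_CORE_ROOT}"
```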
If you intend to run TFC-backed blueprints from this runner, also initialize Terraform Cloud in the selected local runtime first:
hyops init terraform-cloud --env dev --with-cli-login
Runner dispatch now projects that local Terraform Cloud auth into the remote job
automatically when it is available.
For runner-local GCP workflows, Terraform Cloud remains the backend for state.
HyOps-managed workspaces are expected to execute in local mode on the runner,
not as remote Terraform Cloud runs.
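The auth projection can be sketched as follows. This is an illustrative helper, not the actual HyOps dispatch code; it assumes the standard Terraform CLI credentials file format (credentials.tfrc.json) and Terraform's documented TF_TOKEN_&lt;hostname&gt; environment-variable convention:

```python
import json
from pathlib import Path

def project_tfc_auth(credentials_path):
    """Map a Terraform CLI credentials file to TF_TOKEN_* env vars.

    Hypothetical sketch only. Terraform honors TF_TOKEN_<hostname>
    variables, with periods in the hostname encoded as underscores
    and hyphens as double underscores.
    """
    path = Path(credentials_path)
    if not path.exists():
        return {}  # no local TFC auth to project into the remote job
    creds = json.loads(path.read_text())
    env = {}
    for host, entry in creds.get("credentials", {}).items():
        token = entry.get("token")
        if token:
            var = "TF_TOKEN_" + host.replace("-", "__").replace(".", "_")
            env[var] = token
    return env
```

For app.terraform.io this yields a TF_TOKEN_app_terraform_io variable, which Terraform picks up on the runner without any credentials file being copied.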
Steps¶
- Copy the shipped blueprint and set real values
hyops blueprint init --env dev \
--ref networking/gcp-ops-runner@v1 \
--dest-name gcp-ops-runner.yml
# edit ~/.hybridops/envs/dev/config/blueprints/gcp-ops-runner.yml and set a real zone
# by default the blueprint uses ssh_keys_from_init=true and consumes the key from ~/.hybridops/envs/<env>/meta/gcp.ready.json
hyops blueprint validate --file "$HOME/.hybridops/envs/dev/config/blueprints/gcp-ops-runner.yml"
hyops blueprint preflight --env dev --file "$HOME/.hybridops/envs/dev/config/blueprints/gcp-ops-runner.yml"
- Execute the blueprint
hyops blueprint deploy --env dev \
--file "$HOME/.hybridops/envs/dev/config/blueprints/gcp-ops-runner.yml" \
--execute
- Verify published state
hyops state show --env dev --module org/gcp/wan-cloud-nat#gcp_ops_runner_egress
hyops state show --env dev --module platform/gcp/platform-vm#gcp_ops_runner
hyops state show --env dev --module platform/linux/ops-runner#gcp_ops_runner_bootstrap
Expected:
- org/gcp/wan-cloud-nat#gcp_ops_runner_egress is status: ok
- platform/gcp/platform-vm#gcp_ops_runner is status: ok
- platform/linux/ops-runner#gcp_ops_runner_bootstrap is status: ok
- one VM named like platform-shared-runner-01
- private IP inside the hub core subnet
- HybridOps installed under /opt/hybridops/core by default
- Validate the runner dispatch path itself
hyops runner blueprint preflight --env dev \
--runner-state-ref platform/linux/ops-runner#gcp_ops_runner_bootstrap \
--file "$HOME/.hybridops/envs/dev/config/blueprints/gcp-ops-runner.yml"
Expected:
- local evidence under ~/.hybridops/envs/dev/logs/runner/<run_id>/
- runner_prepare, runner_upload, runner_extract, runner_exec, runner_download, and runner_cleanup evidence files
- final summary: runner=platform/linux/ops-runner#gcp_ops_runner_bootstrap status=ok
If the remote workflow depends on secrets already stored in the selected runtime vault, or temporarily present in your local shell, forward them explicitly per run. hyops runner resolves --sync-env keys from the runtime vault first and lets shell env override when needed:
hyops runner blueprint preflight --env dev \
--runner-state-ref platform/linux/ops-runner#gcp_ops_runner_bootstrap \
--sync-env PATRONI_SUPERUSER_PASSWORD \
--sync-env PATRONI_REPLICATION_PASSWORD \
--sync-env PG_BACKUP_GCS_SA_JSON \
--file "$HOME/.hybridops/envs/dev/config/blueprints/dr-postgresql-ha-failover-gcp.yml"
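The resolution order described above can be sketched as a simple merge. The helper below is hypothetical and stands in for the real hyops runner internals, with a plain dict standing in for the runtime vault cache:

```python
import os

def resolve_sync_env(keys, vault_values, shell_env=None):
    """Resolve --sync-env keys: runtime vault first, shell env overrides.

    Illustrative helper only, not the HyOps implementation.
    """
    shell_env = os.environ if shell_env is None else shell_env
    resolved = {}
    for key in keys:
        if key in shell_env:          # shell env wins when present
            resolved[key] = shell_env[key]
        elif key in vault_values:     # otherwise fall back to the vault cache
            resolved[key] = vault_values[key]
        # keys absent from both are simply not forwarded
    return resolved

# Example: vault holds both passwords, shell temporarily overrides one.
vault = {"PATRONI_SUPERUSER_PASSWORD": "from-vault",
         "PATRONI_REPLICATION_PASSWORD": "from-vault"}
shell = {"PATRONI_SUPERUSER_PASSWORD": "from-shell"}
print(resolve_sync_env(["PATRONI_SUPERUSER_PASSWORD",
                        "PATRONI_REPLICATION_PASSWORD",
                        "PG_BACKUP_GCS_SA_JSON"], vault, shell))
# → {'PATRONI_SUPERUSER_PASSWORD': 'from-shell', 'PATRONI_REPLICATION_PASSWORD': 'from-vault'}
```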
If you want the runtime vault cache refreshed from an external secret authority before the runner job is staged, use one of these patterns.
HashiCorp Vault:
VAULT_ADDR=https://vault.example.com \
VAULT_TOKEN=... \
hyops runner blueprint preflight --env dev \
--runner-state-ref platform/linux/ops-runner#gcp_ops_runner_bootstrap \
--secret-source vault \
--secret-scope dr \
--sync-env PATRONI_SUPERUSER_PASSWORD \
--sync-env PATRONI_REPLICATION_PASSWORD \
--sync-env PG_BACKUP_GCS_SA_JSON \
--file "$HOME/.hybridops/envs/dev/config/blueprints/dr-postgresql-ha-failover-gcp.yml"
GCP Secret Manager:
hyops runner blueprint preflight --env dev \
--runner-state-ref platform/linux/ops-runner#gcp_ops_runner_bootstrap \
--secret-source gsm \
--secret-scope dr \
--gsm-project-state-ref org/gcp/project-factory \
--sync-env PATRONI_SUPERUSER_PASSWORD \
--sync-env PATRONI_REPLICATION_PASSWORD \
--sync-env PG_BACKUP_GCS_SA_JSON \
--file "$HOME/.hybridops/envs/dev/config/blueprints/dr-postgresql-ha-failover-gcp.yml"
By default the GSM path resolves the project from env-scoped GCP state/config.
For a cleaner control-plane split, prefer --gsm-project-state-ref org/gcp/project-factory
so the runner can stay in the host/network project while secrets stay in the
env control project. Use --gsm-project-id only when you need to override that
explicitly.
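The documented precedence for picking the Secret Manager project can be sketched as follows, assuming a hypothetical resolve_gsm_project helper (the real CLI internals are not shown here):

```python
def resolve_gsm_project(explicit_project_id=None, project_state_ref=None,
                        state_lookup=None, env_default=None):
    """Pick the GCP project used for Secret Manager lookups.

    Hypothetical helper mirroring the documented precedence:
    --gsm-project-id beats --gsm-project-state-ref, which beats the
    env-scoped default derived from GCP state/config.
    """
    if explicit_project_id:                  # --gsm-project-id override
        return explicit_project_id
    if project_state_ref and state_lookup:   # --gsm-project-state-ref
        return state_lookup(project_state_ref)
    return env_default                       # env-scoped default

# Example: secrets resolve to the env control project while the runner
# stays in the host/network project. Project names are hypothetical.
states = {"org/gcp/project-factory": "dev-control-project"}
print(resolve_gsm_project(project_state_ref="org/gcp/project-factory",
                          state_lookup=states.get,
                          env_default="dev-host-project"))
# → dev-control-project
```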
How this fits DR¶
Use this runner as the execution host for cloud DR and burst workflows that declare:
execution_plane: runner-local
That keeps the product posture clean:
- private target VMs by default
- no workstation dependency
- no public IP per target node
- no cloud DR dependency on convenience bastion inference
The shipped GCP runner blueprint already composes the egress adapter, the VM lifecycle, and the runner bootstrap lifecycle.
Use platform/linux/ops-runner directly only when you are bootstrapping a runner outside this blueprint or doing day-2 maintenance.
Once the runner exists, prefer hyops runner blueprint ... for cloud DR and burst execution instead of invoking the target blueprint from the workstation.