Terraform Execution Modes and HCP Workspace Governance for Multi-Cloud and On-Prem¶
Status: Accepted
1. Context¶
HybridOps.Studio uses Terraform and Terragrunt across:
- On-prem infrastructure (currently Proxmox, future additional on-prem platforms).
- Cloud platforms (Azure today, GCP/AWS later).
HCP Terraform is used as:
- The system of record for Terraform state.
- The control plane for policy enforcement (Sentinel/OPA), RBAC, and audit history.
- The workspace catalogue for platform components.
However, execution constraints differ by platform:
- Proxmox and other on-prem APIs live on private networks that HCP Terraform runners cannot reach.
- Cloud providers are reachable from HCP Terraform’s SaaS runners and benefit from fully managed remote execution.
The platform must:
- Keep all state and governance centralised in HCP Terraform.
- Allow on-prem plans/applies to run locally on the control node while still using HCP Terraform workspaces and state.
- Avoid manual drift, where a workspace execution mode is left in an invalid/undesired state.
2. Decision¶
-
HCP Terraform is the single source of truth for Terraform state, runs, and policy across HybridOps.Studio.
-
Execution mode is selected by platform type:
-
On-prem (Proxmox and similar):
- Workspaces use
localexecution. - Terraform runs from the control node / Terragrunt runner inside the private network.
- Workspaces still use the HCP backend for state and governance.
- Workspaces use
-
Cloud (Azure, GCP, AWS):
- Workspaces use
remoteexecution. - Plans/applies run on HCP Terraform infrastructure.
- Workspaces use
-
Workspace execution mode is enforced by automation:
-
A script,
configure-workspace-execution-mode.sh, is the single implementation point for setting execution mode. - Proxmox Terragrunt roots add an
after_hookonterraform initthat calls this script to enforcelocalexecution for their workspace. -
Cloud Terragrunt roots may enforce or validate
remoteexecution in a similar way. -
Workspace naming remains consistent:
-
Workspace names follow the pattern:
hybridops-<provider>-<rel_path>
- The same naming is used by Terragrunt, the execution-mode script, and documentation.
3. Rationale¶
- Network reachability: HCP Terraform runners cannot reach private Proxmox APIs. Local execution on the control node is the only viable option while still using HCP as backend.
- Governance: Keeping state, policies, and run metadata in one HCP organisation avoids split-brain between local state and SaaS state.
- Repeatability: Centralising execution-mode logic in a single script prevents ad hoc per-workspace configuration.
- Clarity for assessors and reviewers: The model is easy to explain: “HCP is always the source of truth; execution mode is chosen by where the API lives.”
4. Execution model¶
4.1 Workspace execution-mode script¶
Location:
control/tools/provision/terragrunt/configure-workspace-execution-mode.sh
Responsibilities:
- Ensure a given HCP Terraform workspace exists with the desired execution mode (
localorremote). - Be idempotent: safe to run on every
terraform initfrom Terragrunt. - Log outcome clearly for evidence and debugging.
Usage pattern (called by Terragrunt, not directly by users):
./configure-workspace-execution-mode.sh <workspace_name> <org> <execution_mode>
workspace_name– HCP Terraform workspace name.org– HCP Terraform organisation (default:hybridops-studio).execution_mode–localorremote.
Prerequisites:
- HCP Terraform credentials via
init-tfc-env.sh: - Populates
~/.terraform.d/credentials.tfrc.json. - Network connectivity from the control node to
app.terraform.io(or the relevant HCP endpoint).
4.2 Terragrunt integration for Proxmox¶
Proxmox live root (infra/terraform/live-v1/onprem/proxmox) includes a terraform block with an after_hook on init:
terraform {
after_hook "configure_workspace_execution_mode" {
commands = ["init"]
execute = [
"${get_repo_root()}/control/tools/provision/terragrunt/configure-workspace-execution-mode.sh",
"hybridops-${local.provider}-${replace(local.rel_path, "/", "-")}",
"hybridops-studio",
"local"
]
}
}
This guarantees that:
- The workspace exists with the expected name.
- Execution mode is always corrected to
localbefore any plan/apply. - HCP still holds the backend state and history for Proxmox modules.
4.3 Cloud workspaces¶
Cloud roots (for example, Azure) use:
remoteexecution, where plans and applies run on HCP Terraform infrastructure.- The same naming convention and backend organisation.
- Optional use of the same script to assert/repair
remoteexecution mode if needed.
5. Implications and risks¶
Positive¶
- Consistent governance: Single HCP org for state and policy across on-prem and cloud.
- Clear separation of concerns: Execution location is decoupled from state storage.
- Operational safety: Proxmox workspaces can never be accidentally left in
remoteexecution mode, which would fail at runtime. - Auditable: The execution-mode script and Terragrunt hooks provide repeatable behaviour and loggable evidence.
Risks / trade-offs¶
- More moving parts: Adds another helper script to the provisioning toolchain.
- Partial failure scenarios: If the execution-mode script fails,
initmay still succeed locally; operators must read logs and fix the workspace manually. - HCP dependency: Even local execution for Proxmox depends on HCP availability for state operations.
Mitigations:
- Keep the script small, tested, and documented under
control/tools/provision/terragrunt. - Make failure paths explicit and log clearly when workspace execution mode could not be reconciled.
- Provide a runbook for diagnosing HCP workspace issues when
terraform inithooks fail.
6. Alternatives considered¶
-
HCP-only remote execution for all platforms
Rejected: Proxmox and similar on-prem APIs are not reachable from public HCP runners. -
Local state for on-prem, HCP for cloud only
Rejected: Splits the state/governance model; makes cross-environment reasoning and evidence more complex. -
Per-workspace manual configuration in the HCP UI
Rejected: Error-prone, hard to automate, difficult to reproduce for assessors or colleagues.
The chosen approach keeps HCP as the common control plane while adapting execution to network realities.
7. Validation and evidence¶
- Proxmox SDN module (
onprem/proxmox/core/00-foundation/network-sdn): - Runs via Terragrunt with Proxmox workspace in
localexecution mode. - Uses HCP Terraform backend for state.
- Proxmox Packer templates and env init scripts:
- Use the same naming conventions and backend organisation.
- Provide logs and evidence artefacts under
output/for assessors. - Future evidence items will include screenshots or CLI transcripts showing:
- HCP workspace in
localmode with successful runs for Proxmox. - HCP workspace in
remotemode for Azure modules with the same naming scheme.
8. References¶
- Terraform and Terragrunt layout – Backend, workspace naming, and state strategy.
- Terragrunt tools README – Workspace execution-mode helper and usage.
- ADR-0016 – Packer Cloud-Init VM Templates for Proxmox.
- ADR-0601 – Nornir + Ansible Hybrid Automation.
- HCP Terraform – Execution modes and workspaces.
- Terragrunt – terraform block and hooks.
Maintainer: HybridOps.Studio
License: MIT-0 for code, CC BY 4.0 for documentation unless otherwise stated.