Terraform Execution Modes and HCP Workspace Governance for Multi-Cloud and On-Prem

Status: Accepted

1. Context

HybridOps.Studio uses Terraform and Terragrunt across:

  • On-prem infrastructure (currently Proxmox, future additional on-prem platforms).
  • Cloud platforms (Azure today, GCP/AWS later).

HCP Terraform is used as:

  • The system of record for Terraform state.
  • The control plane for policy enforcement (Sentinel/OPA), RBAC, and audit history.
  • The workspace catalogue for platform components.

However, execution constraints differ by platform:

  • Proxmox and other on-prem APIs live on private networks that HCP Terraform runners cannot reach.
  • Cloud providers are reachable from HCP Terraform’s SaaS runners and benefit from fully managed remote execution.

The platform must:

  • Keep all state and governance centralised in HCP Terraform.
  • Allow on-prem plans/applies to run locally on the control node while still using HCP Terraform workspaces and state.
  • Avoid manual drift, in which a workspace's execution mode is left in an invalid or undesired state.

2. Decision

  1. HCP Terraform is the single source of truth for Terraform state, runs, and policy across HybridOps.Studio.

  2. Execution mode is selected by platform type:

    • On-prem (Proxmox and similar):
      • Workspaces use local execution.
      • Terraform runs from the control node / Terragrunt runner inside the private network.
      • Workspaces still use the HCP backend for state and governance.
    • Cloud (Azure, GCP, AWS):
      • Workspaces use remote execution.
      • Plans/applies run on HCP Terraform infrastructure.

  3. Workspace execution mode is enforced by automation:

    • A script, configure-workspace-execution-mode.sh, is the single implementation point for setting execution mode.
    • Proxmox Terragrunt roots add an after_hook on terraform init that calls this script to enforce local execution for their workspace.
    • Cloud Terragrunt roots may enforce or validate remote execution in a similar way.

  4. Workspace naming remains consistent:

    • Workspace names follow the pattern hybridops-<provider>-<rel_path>.
    • The same naming is used by Terragrunt, the execution-mode script, and documentation.
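The naming pattern above can be sketched in shell. This mirrors the replace() call used in the Terragrunt hook; the provider and module path values here are illustrative examples, not an exhaustive list:

```shell
# Derive a workspace name from a provider and a module path by
# replacing path separators with dashes (same transform as the
# Terragrunt replace() call in the after_hook).
provider="proxmox"
rel_path="core/00-foundation/network-sdn"
workspace="hybridops-${provider}-$(printf '%s' "$rel_path" | tr '/' '-')"
echo "$workspace"  # hybridops-proxmox-core-00-foundation-network-sdn
```

The same derivation applies to cloud roots, only with a different provider segment (for example, azure).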

3. Rationale

  • Network reachability: HCP Terraform runners cannot reach private Proxmox APIs. Local execution on the control node is the only viable option while still using HCP as backend.
  • Governance: Keeping state, policies, and run metadata in one HCP organisation avoids split-brain between local state and SaaS state.
  • Repeatability: Centralising execution-mode logic in a single script prevents ad hoc per-workspace configuration.
  • Clarity for assessors and reviewers: The model is easy to explain: “HCP is always the source of truth; execution mode is chosen by where the API lives.”

4. Execution model

4.1 Workspace execution-mode script

Location:

  • control/tools/provision/terragrunt/configure-workspace-execution-mode.sh

Responsibilities:

  • Ensure a given HCP Terraform workspace exists with the desired execution mode (local or remote).
  • Be idempotent: safe to run on every terraform init from Terragrunt.
  • Log outcome clearly for evidence and debugging.
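The responsibilities above can be sketched as follows. This is a hypothetical outline, not the actual script: the function names and control flow are assumptions, while the endpoints and the execution-mode attribute come from the HCP Terraform Workspaces API:

```shell
# Sketch: ensure a workspace exists with the desired execution mode.
# PATCH is idempotent (a no-op when the mode already matches), so this
# is safe to run on every `terraform init`; fall back to POST (create)
# when the workspace does not yet exist.
TFC_API="${TFC_API:-https://app.terraform.io/api/v2}"

# Build the JSON:API payload for creating or patching a workspace.
ws_payload() {
  name="$1"; mode="$2"
  printf '{"data":{"type":"workspaces","attributes":{"name":"%s","execution-mode":"%s"}}}' \
    "$name" "$mode"
}

ensure_workspace() {
  ws="$1"; org="$2"; mode="$3"; token="$4"
  curl -fsS -X PATCH \
    -H "Authorization: Bearer ${token}" \
    -H "Content-Type: application/vnd.api+json" \
    -d "$(ws_payload "$ws" "$mode")" \
    "${TFC_API}/organizations/${org}/workspaces/${ws}" \
  || curl -fsS -X POST \
    -H "Authorization: Bearer ${token}" \
    -H "Content-Type: application/vnd.api+json" \
    -d "$(ws_payload "$ws" "$mode")" \
    "${TFC_API}/organizations/${org}/workspaces"
}
```

Echoing the outcome of each branch (reconciled, created, or failed) gives the loggable evidence the responsibilities call for.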

Usage pattern (called by Terragrunt, not directly by users):

```shell
./configure-workspace-execution-mode.sh <workspace_name> <org> <execution_mode>
```

  • workspace_name – HCP Terraform workspace name.
  • org – HCP Terraform organisation (default: hybridops-studio).
  • execution_mode – local or remote.

Prerequisites:

  • HCP Terraform credentials via init-tfc-env.sh, which populates ~/.terraform.d/credentials.tfrc.json.
  • Network connectivity from the control node to app.terraform.io (or the relevant HCP endpoint).
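A minimal preflight check for the first prerequisite could look like this. The function name is an assumption; the credentials file path is the standard location the Terraform CLI reads:

```shell
# Hypothetical preflight: confirm the credentials that init-tfc-env.sh
# populates are present before any Terragrunt hook needs them.
check_tfc_credentials() {
  cred_file="${HOME}/.terraform.d/credentials.tfrc.json"
  if [ ! -f "$cred_file" ]; then
    echo "missing ${cred_file}; run init-tfc-env.sh first" >&2
    return 1
  fi
  # Cheap sanity check: the file should carry a token for the HCP endpoint.
  if ! grep -q 'app.terraform.io' "$cred_file"; then
    echo "no app.terraform.io entry in ${cred_file}" >&2
    return 1
  fi
  echo "credentials ok"
}
```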

4.2 Terragrunt integration for Proxmox

Proxmox live root (infra/terraform/live-v1/onprem/proxmox) includes a terraform block with an after_hook on init:

```hcl
terraform {
  after_hook "configure_workspace_execution_mode" {
    commands = ["init"]
    execute  = [
      "${get_repo_root()}/control/tools/provision/terragrunt/configure-workspace-execution-mode.sh",
      "hybridops-${local.provider}-${replace(local.rel_path, "/", "-")}",
      "hybridops-studio",
      "local"
    ]
  }
}
```

This guarantees that:

  • The workspace exists with the expected name.
  • Execution mode is always corrected to local before any plan/apply.
  • HCP still holds the backend state and history for Proxmox modules.

4.3 Cloud workspaces

Cloud roots (for example, Azure) use:

  • remote execution, where plans and applies run on HCP Terraform infrastructure.
  • The same naming convention and backend organisation.
  • Optional use of the same script to assert/repair remote execution mode if needed.
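One way to exercise that optional enforcement is to mirror the Proxmox hook with the mode flipped. This is a sketch assuming cloud roots expose the same locals; it is not taken from the actual cloud configuration:

```hcl
terraform {
  after_hook "assert_remote_execution_mode" {
    commands = ["init"]
    execute  = [
      "${get_repo_root()}/control/tools/provision/terragrunt/configure-workspace-execution-mode.sh",
      "hybridops-${local.provider}-${replace(local.rel_path, "/", "-")}",
      "hybridops-studio",
      "remote"
    ]
  }
}
```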

5. Implications and risks

Positive

  • Consistent governance: Single HCP org for state and policy across on-prem and cloud.
  • Clear separation of concerns: Execution location is decoupled from state storage.
  • Operational safety: Proxmox workspaces can never be accidentally left in remote execution mode, which would fail at runtime.
  • Auditable: The execution-mode script and Terragrunt hooks provide repeatable behaviour and loggable evidence.

Risks / trade-offs

  • More moving parts: Adds another helper script to the provisioning toolchain.
  • Partial failure scenarios: If the execution-mode script fails, init may still succeed locally; operators must read logs and fix the workspace manually.
  • HCP dependency: Even local execution for Proxmox depends on HCP availability for state operations.

Mitigations:

  • Keep the script small, tested, and documented under control/tools/provision/terragrunt.
  • Make failure paths explicit and log clearly when workspace execution mode could not be reconciled.
  • Provide a runbook for diagnosing HCP workspace issues when terraform init hooks fail.

6. Alternatives considered

  1. HCP-only remote execution for all platforms
    Rejected: Proxmox and similar on-prem APIs are not reachable from public HCP runners.

  2. Local state for on-prem, HCP for cloud only
    Rejected: Splits the state/governance model; makes cross-environment reasoning and evidence more complex.

  3. Per-workspace manual configuration in the HCP UI
    Rejected: Error-prone, hard to automate, difficult to reproduce for assessors or colleagues.

The chosen approach keeps HCP as the common control plane while adapting execution to network realities.

7. Validation and evidence

  • Proxmox SDN module (onprem/proxmox/core/00-foundation/network-sdn):
    • Runs via Terragrunt with the Proxmox workspace in local execution mode.
    • Uses the HCP Terraform backend for state.
  • Proxmox Packer templates and env init scripts:
    • Use the same naming conventions and backend organisation.
    • Provide logs and evidence artefacts under output/ for assessors.
  • Future evidence items will include screenshots or CLI transcripts showing:
    • The HCP workspace in local mode with successful runs for Proxmox.
    • The HCP workspace in remote mode for Azure modules with the same naming scheme.

Maintainer: HybridOps.Studio
License: MIT-0 for code, CC BY 4.0 for documentation unless otherwise stated.