Skip to content

Infrastructure as Code Engine: Terraform with Terragrunt Composition

Status

Accepted — Terraform is the IaC execution engine; Terragrunt is the composition layer for live environment trees.


1. Context

HybridOps.Core executes infrastructure changes as contract-driven modules via a stable operator interface (hyops). Many modules require an IaC engine that can:

  • provision across Proxmox, Hetzner, Azure, and GCP using a consistent workflow,
  • separate reusable building blocks from environment-specific configuration,
  • emit verifiable artefacts (plans, applies, outputs) suitable for audit and support,
  • scale from controlled development in HybridOps.Workbench to supported module execution in HybridOps.Core.

Terraform provides the widest provider coverage and the most recognisable execution model for operators and reviewers. Terragrunt provides disciplined composition and DRY environment wiring for “live trees” without pushing business logic into wrapper files.


2. Decision

  1. Terraform is the standard IaC execution engine in HybridOps.
  2. Terragrunt is the standard composition layer for live environment trees (layered stacks, remote state wiring, and DRY configuration).
  3. HybridOps.Core modules do not embed environment composition logic. Modules call a Terraform driver that:
  4. executes Terraform directly for single-module runs, or
  5. executes Terragrunt when a module is backed by a composed live stack.
  6. Evidence and outputs are first-class:
  7. plans/applies and module outputs are captured as module evidence,
  8. downstream modules consume outputs via declared dependencies (contract model).

3. Scope

In scope:

  • Terraform execution for infrastructure provisioning in HybridOps modules.
  • Terragrunt usage to compose environment stacks where a layered live tree is required.
  • Driver-based execution model (HybridOps.Core invokes IaC via drivers, not ad-hoc scripts).

Out of scope:

  • Selecting specific cloud services (AKS vs GKE, database engines, etc.).
  • Kubernetes desired-state reconciliation tooling (Argo CD / Flux), covered by separate ADRs.
  • Introducing alternative IaC engines (Pulumi, CDKTF, cloud-native templates) without a superseding ADR.

4. Rationale

4.1 Why Terraform

  • Broad, mature provider ecosystem across target platforms.
  • Declarative model supports repeatability and evidence capture (plan/apply outputs, state transitions, deterministic inputs).
  • Recognisable operational model for assessors, hiring reviewers, and engineering teams.
  • Fits well with contract-driven modules: inputs → execution → outputs → probes.

4.2 Why Terragrunt (as composition, not business logic)

  • Reduces duplication of backend/provider wiring across multiple stacks.
  • Supports layered environment composition (foundation → platform → services).
  • Encourages thin stack entrypoints while keeping business logic in Terraform modules and module contracts.
  • Improves operator UX in large trees without turning wrappers into an application framework.

5. Alternatives considered

5.1 Terraform without Terragrunt

Rejected for multi-environment trees due to repeated backend/provider wiring and higher drift risk between stacks.

5.2 Cloud-native templates only (Bicep / ARM, Deployment Manager, etc.)

Rejected due to fragmented execution model across on-prem and multi-cloud targets.

5.3 Pulumi / general-purpose IaC engines

Deferred. May be revisited if provider coverage, operator ergonomics, or governance requirements change.


6. Consequences

6.1 Positive

  • Single IaC execution model across hybrid targets.
  • Clear separation:
  • Terraform modules are reusable building blocks,
  • Terragrunt composes environment stacks when needed,
  • HybridOps.Core modules remain contract-driven and portable.
  • Strong evidence story: plans/applies/outputs become standard artefacts for validation and support.

6.2 Negative / trade-offs

  • Additional dependency and operational surface area when Terragrunt is used.
  • Version management becomes important; changes to Terraform/Terragrunt versions must be deliberate and tested.
  • Operators must understand when a module runs Terraform directly vs via a Terragrunt-backed live stack.

7. Implementation notes

  • HybridOps.Core invokes IaC via drivers (for example drivers/iac/terraform/).
  • Workbench retains full environment trees for development and composition; promoted modules reference stable driver entrypoints.
  • Module contracts declare required inputs and dependency outputs; the runtime injects dependency outputs into the execution context.
  • Module evidence should include:
  • plan/apply output,
  • resulting outputs (outputs.json where relevant),
  • any state or inventory exports required by probes.

8. References

  • Link module-specific runbooks/HOWTOs here once they exist (avoid speculative links).

draft reference: ## 8. References