
LXC Containers for Lightweight Workloads on Proxmox

Status

Accepted — LXC containers are used for lightweight, non-critical helper workloads on Proxmox.
Core control-plane components (ctrl-01), shared databases (PostgreSQL) and Kubernetes nodes run on full VMs, per ADR-0012, ADR-0501 and ADR-0202/ADR-0204 respectively.


1. Context

HybridOps.Studio runs on an on-premises hypervisor (initially Proxmox, but patterns are portable to other enterprise hypervisors) and needs to balance:

  • Realistic enterprise patterns (full VMs for critical components, clear OS baselines).
  • Homelab constraints (single node or small clusters, limited CPU/RAM).
  • A credible DR story where:
    • Control plane and stateful tiers are VM-based and portable.
    • Lightweight helpers do not consume disproportionate resources.

Earlier experiments used LXC too aggressively, including:

  • Running PostgreSQL in LXC (ADR-0013, now superseded by ADR-0501).
  • Considering LXC for broader control-plane workloads.

Subsequent ADRs clarified:

  • ADR-0012 — ctrl-01 runs as a VM with cloud-init.
  • ADR-0017 — OS baseline (Rocky/Ubuntu/Windows) for infra and control layers.
  • ADR-0202 / ADR-0204 — RKE2 runs on Rocky Linux VMs on enterprise hypervisors.
  • ADR-0501 — PostgreSQL runs on a dedicated VM with DR replication, not in LXC.

This ADR narrows the LXC story to something that is both realistic and safe:
helpers, tools and teaching workloads only, not shared infrastructure state.


2. Decision

HybridOps.Studio adopts the following pattern:

  • Full VMs (via Packer + cloud-init per ADR-0016) are used for:
    • Control plane (ctrl-01), PostgreSQL, RKE2 nodes, NetBox (initially) and other core services.
  • LXC containers on Proxmox are used only for:
    • Lightweight, non-critical helper services.
    • Academy / demo workloads that benefit from higher density.
    • Short-lived tools that do not hold authoritative state.

Typical in-scope LXC workloads:

  • Docs / site preview containers.
  • Log shippers, small exporters, or protocol test helpers.
  • Small demo apps that do not store critical data.
  • “Utility shells” for teaching Linux/networking concepts.

Explicitly out of scope for LXC:

  • ctrl-01 and other control nodes (see ADR-0012).
  • RKE2 control-plane and worker nodes (see ADR-0204).
  • Shared PostgreSQL instances or other primary databases (see ADR-0501).
  • Anything considered part of the authoritative state or DR tier.

LXC provisioning is:

  • Driven via Terraform (Proxmox provider) or Proxmox API directly.
  • Kept deliberately simpler than the VM pipeline (no Packer for LXC images).
  • Documented as a complementary option, not a primary building block.
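
As a sketch of the "Proxmox API directly" path, a minimal wrapper might just build the form payload for the Proxmox VE create-container endpoint (`POST /api2/json/nodes/{node}/lxc`). All concrete values here (VMID, hostname, template path, storage and bridge names) are illustrative assumptions, not values from this repository:

```python
# Minimal sketch of a Proxmox API wrapper for helper containers.
# Field names follow the Proxmox VE LXC create endpoint; every
# concrete value below is an illustrative assumption.

def lxc_create_payload(vmid, hostname, template,
                       cores=1, memory_mb=512, disk_gb=4,
                       storage="local-lvm", bridge="vmbr0"):
    """Build the form payload for creating a helper LXC container."""
    return {
        "vmid": vmid,
        "hostname": hostname,
        "ostemplate": template,            # a standard Proxmox template, not a Packer image
        "cores": cores,
        "memory": memory_mb,               # Proxmox expects MiB
        "rootfs": f"{storage}:{disk_gb}",  # rootfs on Proxmox storage
        "net0": f"name=eth0,bridge={bridge},ip=dhcp",
        "unprivileged": 1,                 # helpers do not need privileged mode
        "tags": "helper;non-critical",     # mirror the NetBox inventory marking
    }


payload = lxc_create_payload(
    vmid=210,
    hostname="docs-preview-01",
    template="local:vztmpl/rockylinux-9-default_20240101_amd64.tar.xz",
)
# A real wrapper would POST this to
# https://<proxmox-host>:8006/api2/json/nodes/<node>/lxc
# with an API token; that network call is deliberately omitted here.
```

Keeping the wrapper to payload construction plus one authenticated POST is what keeps this path "deliberately simpler" than the Packer-based VM pipeline.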

3. Rationale

3.1 Why keep LXC at all?

  • Density for helpers and teaching
    • Homelab resources are finite; LXC allows more “small helpers” and demo nodes without ballooning VM count.
  • Realism with guard rails
    • Many enterprises still run light tooling (exporters, small services) on less isolated platforms, but keep the core control plane and stateful services on full VMs or managed services.
  • Operational clarity
    • By explicitly scoping LXC to helpers, the platform story stays clean:
      • VMs = control, state, cluster nodes.
      • LXC = helpers and demos.

3.2 Why not use LXC for PostgreSQL or control plane?

  • Isolation and DR expectations
    • Control-node and primary database patterns must be easily portable between Proxmox, VMware and cloud.
    • VM images and VM-level snapshots are more portable and better understood in enterprise DR.
  • Kernel and cgroup subtleties
    • LXC introduces host-kernel coupling that can surprise people during upgrades and DR tests.
  • Evidence clarity
    • For assessors, it is cleaner to say:
      • “Authoritative state lives on a VM with DR replication” (ADR-0501), and
      • “Kubernetes is stateless compute atop that.”

4. Consequences

4.1 Positive consequences

  • Clear, opinionated boundary
    • VMs for control/state, LXC for helpers only.
  • Resource efficiency
    • Docs previews, small exporters and Academy demo nodes can all run in LXCs without burning VM slots.
  • Better storytelling
    • Easier to tell a clean DR and portability story: “If a hypervisor dies, we care about restoring VMs; LXCs are nice-to-have.”

4.2 Negative consequences / risks

  • Two provisioning models
    • The team must understand both VM and LXC provisioning flows.
  • Potential misuse
    • Without discipline, someone might again place stateful workloads into LXC “because it’s lighter”.
  • Kernel coupling
    • LXC shares the host kernel, so a kernel regression can affect many helpers at once (acceptable for non-critical roles, but still a consideration).

Mitigations:

  • Document the decision matrix (VM vs LXC) in OS / platform guides.
  • Require architectural review before putting any new workload into LXC.
  • Keep LXC inventory clearly marked in NetBox / docs as “helper / non-critical”.
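
The VM-vs-LXC decision matrix above can be encoded as a small helper so reviews start from the same rules. The field names (`holds_authoritative_state`, etc.) are hypothetical, chosen here to mirror the criteria in sections 1 and 2 of this ADR:

```python
# Hypothetical encoding of this ADR's VM-vs-LXC decision matrix.
# Field names are illustrative; the rules mirror the in/out-of-scope
# lists in section 2.

from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    holds_authoritative_state: bool = False  # DR tier, per ADR-0501
    is_control_plane: bool = False           # ctrl-01 and friends, per ADR-0012
    is_cluster_node: bool = False            # RKE2 nodes, per ADR-0204


def choose_platform(w: Workload) -> str:
    """Return 'vm' for control/state/cluster workloads, 'lxc' for helpers."""
    if w.holds_authoritative_state or w.is_control_plane or w.is_cluster_node:
        return "vm"
    return "lxc"


assert choose_platform(Workload("db-01", holds_authoritative_state=True)) == "vm"
assert choose_platform(Workload("docs-preview")) == "lxc"
```

A function like this does not replace the architectural review, but it gives the review a single place where the boundary is written down.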

5. Alternatives considered

  • No LXC at all (VMs only)
    • Simpler to reason about, but:
      • Reduces density on a small homelab node.
      • Makes some Academy examples more expensive to run concurrently.
  • “LXC-first” pattern (including databases and control plane)
    • Rejected:
      • Harder to tell a portable DR story.
      • Kernel/cgroup edge cases for tools like Kubernetes, Longhorn, etc.
      • Conflicts with ADR-0012 and ADR-0501.
  • Move all helper workloads into RKE2 pods instead of LXC
    • Conceptually attractive, but:
      • Some helpers are explicitly pre-cluster or used to debug the cluster itself.
      • You still want a place to run tooling when RKE2 is unhealthy.

6. Implementation notes

  • LXC containers are provisioned via:
    • Terraform proxmox_lxc resources, or
    • A small Proxmox API wrapper, where appropriate for demos.
  • Base images:
    • Prefer the same OS family as the VM baseline (Rocky / Ubuntu) per ADR-0017.
    • Use Proxmox standard templates rather than building LXC images with Packer.
  • Storage:
    • For helpers and demos, rootfs and any small data live on Proxmox storage.
    • Do not use LXC for authoritative data; shared databases and critical state live on VMs per ADR-0501.
  • Inventory:
    • NetBox and docs should mark LXC nodes clearly as role = helper / tier = non-critical.
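
The inventory rule above is also easy to lint. The sketch below checks a flat inventory export for LXC nodes missing the helper/non-critical marking; the record shape is a hypothetical flat dict, not the NetBox API schema:

```python
# Illustrative check that every LXC entry in an inventory export is
# marked role=helper / tier=non-critical, per this ADR. The record
# shape is a hypothetical flat dict, not the NetBox API schema.

def unmarked_lxc_nodes(inventory):
    """Return names of LXC nodes missing the helper/non-critical marking."""
    return [
        node["name"]
        for node in inventory
        if node.get("platform") == "lxc"
        and not (node.get("role") == "helper"
                 and node.get("tier") == "non-critical")
    ]


inventory = [
    {"name": "ctrl-01", "platform": "vm", "role": "control"},
    {"name": "docs-preview-01", "platform": "lxc",
     "role": "helper", "tier": "non-critical"},
    {"name": "exporter-01", "platform": "lxc", "role": "helper"},  # tier missing
]
# exporter-01 is the only node flagged: its tier marking is missing.
```

Running such a check in CI against a NetBox export would catch helpers that drift out of the agreed marking.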

7. Operational impact and validation

Operational impact:

  • The platform team must:
    • Monitor LXC node resource usage to avoid contention with core VMs.
    • Ensure that no critical services are silently moved into LXC.
    • Keep a simple runbook for creating, updating and retiring helper containers.
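
The contention check in the first bullet can be sketched as a simple memory-budget test. The 10% host headroom figure and the memory-only view are assumptions for illustration, not tuned values:

```python
# Illustrative memory-budget check for LXC helpers on a Proxmox node.
# The headroom fraction and the memory-only focus are assumptions,
# not values from this repository.

def lxc_fits_budget(host_total_mb, vm_reserved_mb, lxc_allocations_mb,
                    headroom=0.10):
    """True if LXC allocations fit after core VMs and host headroom."""
    budget = host_total_mb - vm_reserved_mb - host_total_mb * headroom
    return sum(lxc_allocations_mb) <= budget


# 64 GiB node with 48 GiB reserved for core VMs: a few small helpers fit.
print(lxc_fits_budget(65536, 49152, [512, 1024, 512]))
```

In practice the same arithmetic could run against live `memory` values from the Proxmox API before a new helper container is approved.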

Validation:

  • Runbooks (to be created or updated):
    • runbook_lxc-container-provisioning.md — create/update/destroy helper containers.
  • Evidence:
    • Screenshots and logs in ../evidence/evidence-02-platform-lxc-lightweight-workloads.md showing:
      • Helper services running in LXC.
      • Core services (ctrl-01, db-01, RKE2 nodes) running as VMs.
  • Review trigger:
    • Revisit this ADR if:
      • The platform moves away from Proxmox to another hypervisor where LXC is not available, or
      • All helper workloads move into RKE2 permanently, making LXC redundant.

8. References

  • ADR-0012 — ctrl-01 runs as a VM with cloud-init.
  • ADR-0016 — Packer + cloud-init VM image pipeline.
  • ADR-0017 — OS baseline (Rocky / Ubuntu / Windows) for infra and control layers.
  • ADR-0202 / ADR-0204 — RKE2 on Rocky Linux VMs on enterprise hypervisors.
  • ADR-0501 — PostgreSQL on a dedicated VM with DR replication (supersedes ADR-0013).

Maintainer: HybridOps.Studio
License: MIT-0 for code, CC-BY-4.0 for documentation unless otherwise stated.