# LXC Containers for Lightweight Workloads on Proxmox

## Status
Accepted — LXC containers are used for lightweight, non-critical helper workloads on Proxmox.
Core control-plane components (ctrl-01), shared databases (PostgreSQL) and Kubernetes nodes run on full VMs, per ADR-0012, ADR-0202 and ADR-0204.
## 1. Context
HybridOps.Studio runs on an on-premises hypervisor (initially Proxmox, but patterns are portable to other enterprise hypervisors) and needs to balance:
- Realistic enterprise patterns (full VMs for critical components, clear OS baselines).
- Homelab constraints (single node or small clusters, limited CPU/RAM).
- A credible DR story where:
  - Control plane and stateful tiers are VM-based and portable.
  - Lightweight helpers do not consume disproportionate resources.
Earlier experiments used LXC too aggressively, including:
- Running PostgreSQL in LXC (ADR-0013, now superseded by ADR-0501).
- Considering LXC for broader control-plane workloads.
Subsequent ADRs clarified:
- ADR-0012 — `ctrl-01` runs as a VM with cloud-init.
- ADR-0017 — OS baseline (Rocky/Ubuntu/Windows) for infra and control layers.
- ADR-0202 / ADR-0204 — RKE2 runs on Rocky Linux VMs on enterprise hypervisors.
- ADR-0501 — PostgreSQL runs on a dedicated VM with DR replication, not in LXC.
This ADR narrows the LXC story to something that is both realistic and safe:
helpers, tools and teaching workloads only, not shared infrastructure state.
## 2. Decision
HybridOps.Studio adopts the following pattern:
- Full VMs (via Packer + cloud-init per ADR-0016) are used for:
  - Control plane (`ctrl-01`), PostgreSQL, RKE2 nodes, NetBox (initially) and other core services.
- LXC containers on Proxmox are used only for:
  - Lightweight, non-critical helper services.
  - Academy / demo workloads that benefit from higher density.
  - Short-lived tools that do not hold authoritative state.
Typical in-scope LXC workloads:
- Docs / site preview containers.
- Log shippers, small exporters, or protocol test helpers.
- Small demo apps that do not store critical data.
- “Utility shells” for teaching Linux/networking concepts.
Explicitly out of scope for LXC:
- `ctrl-01` and other control nodes (see ADR-0012).
- RKE2 control-plane and worker nodes (see ADR-0204).
- Shared PostgreSQL instances or other primary databases (see ADR-0501).
- Anything considered part of the authoritative state or DR tier.
LXC provisioning is:
- Driven via Terraform (Proxmox provider) or Proxmox API directly.
- Kept deliberately simpler than the VM pipeline (no Packer for LXC images).
- Documented as a complementary option, not a primary building block.
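The "small Proxmox API wrapper" option above can be sketched as a payload builder for the Proxmox VE container-creation endpoint (`POST /api2/json/nodes/{node}/lxc`). This is a minimal illustration, not the project's actual wrapper: the function name, defaults and example values are hypothetical, and the HTTP call itself is left out.

```python
"""Illustrative sketch of a small Proxmox API wrapper for helper LXCs.

The endpoint and field names follow the Proxmox VE API
(POST /api2/json/nodes/{node}/lxc); everything else is a hypothetical example.
"""


def lxc_create_payload(
    vmid: int,
    hostname: str,
    ostemplate: str,
    cores: int = 1,
    memory_mb: int = 512,
    rootfs: str = "local-lvm:4",
    bridge: str = "vmbr0",
) -> dict:
    """Build the form parameters for creating an unprivileged helper container."""
    return {
        "vmid": vmid,
        "hostname": hostname,
        "ostemplate": ostemplate,       # e.g. a Proxmox standard template
        "cores": cores,
        "memory": memory_mb,            # MiB
        "rootfs": rootfs,               # storage:size_in_GiB
        "net0": f"name=eth0,bridge={bridge},ip=dhcp",
        "unprivileged": 1,              # helpers never need privileged LXC
        "tags": "helper;non-critical",  # mirrors the inventory marking below
    }


payload = lxc_create_payload(
    vmid=210,
    hostname="docs-preview",
    ostemplate="local:vztmpl/rockylinux-9-default_amd64.tar.xz",  # example name
)
# A real wrapper would POST this to https://<proxmox>:8006/api2/json/nodes/<node>/lxc
# with an API token; HTTP handling is deliberately omitted from this sketch.
```

Keeping the wrapper this thin is the point: the LXC path stays deliberately simpler than the Packer + cloud-init VM pipeline.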
## 3. Rationale

### 3.1 Why keep LXC at all?
- Density for helpers and teaching
  - Homelab resources are finite; LXC allows more “small helpers” and demo nodes without ballooning the VM count.
- Realism with guard rails
  - Many enterprises still run light tooling (exporters, small services) on less isolated platforms, while core control plane and stateful services remain on full VMs or managed services.
- Operational clarity
  - By explicitly scoping LXC to helpers, the platform story stays clean:
    - VMs = control, state, cluster nodes.
    - LXC = helpers and demos.
### 3.2 Why not use LXC for PostgreSQL or control plane?
- Isolation and DR expectations
  - Control-node and primary database patterns must be easily portable between Proxmox, VMware and cloud.
  - VM images and VM-level snapshots are more portable and better understood in enterprise DR.
- Kernel and cgroup subtleties
  - LXC introduces host-kernel coupling that can surprise people during upgrades and DR tests.
- Evidence clarity
  - For assessors, it is cleaner to say:
    - “Authoritative state lives on a VM with DR replication” (ADR-0501),
    - “Kubernetes is stateless compute atop that.”
## 4. Consequences

### 4.1 Positive consequences
- Clear, opinionated boundary
  - VMs for control/state, LXC for helpers only.
- Resource efficiency
  - Docs previews, small exporters and Academy demo nodes can all run in LXCs without burning VM slots.
- Better storytelling
  - Easier to tell a clean DR and portability story: “If a hypervisor dies, we care about restoring VMs; LXCs are nice-to-have.”
### 4.2 Negative consequences / risks
- Two provisioning models
  - The team must understand both VM and LXC provisioning flows.
- Potential misuse
  - Without discipline, someone might again place stateful workloads into LXC “because it’s lighter”.
- Kernel coupling
  - LXC shares the host kernel, so kernel regressions can affect many helpers at once (acceptable for non-critical roles, but still a consideration).
Mitigations:
- Document the decision matrix (VM vs LXC) in OS / platform guides.
- Require architectural review before putting any new workload into LXC.
- Keep LXC inventory clearly marked in NetBox / docs as “helper / non-critical”.
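The VM-vs-LXC decision matrix from the mitigations above can be expressed as a tiny placement rule. This is an illustrative sketch only: the `Workload` fields and function name are hypothetical, not an existing tool, but the logic encodes exactly the boundary this ADR draws.

```python
"""Illustrative VM-vs-LXC placement rule; names and fields are hypothetical."""

from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    holds_authoritative_state: bool  # databases, source-of-truth inventories, ...
    control_plane: bool              # ctrl-01, RKE2 nodes, etc.
    critical: bool                   # part of the DR tier


def placement(w: Workload) -> str:
    """Return "vm" or "lxc" per this ADR: VMs for control, state and
    critical tiers; LXC only for non-critical helpers."""
    if w.holds_authoritative_state or w.control_plane or w.critical:
        return "vm"
    return "lxc"


assert placement(Workload("postgres", True, False, True)) == "vm"     # ADR-0501
assert placement(Workload("ctrl-01", False, True, True)) == "vm"      # ADR-0012
assert placement(Workload("docs-preview", False, False, False)) == "lxc"
```

Encoding the matrix as a rule (rather than prose alone) makes the architectural-review step above easy to apply consistently.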
## 5. Alternatives considered
- No LXC at all (VMs only)
  - Simpler to reason about, but:
    - Reduces density on a small homelab node.
    - Makes some Academy examples more expensive to run concurrently.
- “LXC-first” pattern (including databases and control plane)
  - Rejected:
    - Harder to tell a portable DR story.
    - Kernel/cgroup edge cases for tools like Kubernetes, Longhorn, etc.
    - Conflicts with ADR-0012 and ADR-0501.
- Move all helper workloads into RKE2 pods instead of LXC
  - Conceptually attractive, but:
    - Some helpers are explicitly pre-cluster or used to debug the cluster itself.
    - You still want a place to run tooling when RKE2 is unhealthy.
## 6. Implementation notes
- LXC containers are provisioned via:
  - Terraform `proxmox_lxc` resources, or
  - A small Proxmox API wrapper, where appropriate for demos.
- Base images:
  - Prefer the same OS family as the VM baseline (Rocky / Ubuntu) per ADR-0017.
  - Use Proxmox standard templates rather than building LXC images with Packer.
- Storage:
  - For helpers and demos, the rootfs and any small data live on Proxmox storage.
  - Do not use LXC for authoritative data; shared databases and critical state live on VMs per ADR-0501.
- Inventory:
  - NetBox and docs should mark LXC nodes clearly as `role = helper` / `tier = non-critical`.
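The inventory-marking rule above lends itself to a simple automated check. The sketch below is hypothetical (a real check would query the NetBox API; the record shape and function name are invented here) but shows the intent: any LXC node not marked helper/non-critical is flagged.

```python
"""Sketch of an inventory check for the helper/non-critical LXC marking.

Hypothetical record shape and function name; a real implementation would
read nodes from NetBox via its API instead of an in-memory list.
"""


def mismarked_lxc_nodes(inventory: list) -> list:
    """Return names of LXC nodes whose role/tier marking violates this ADR."""
    return [
        node["name"]
        for node in inventory
        if node.get("platform") == "lxc"
        and (node.get("role") != "helper" or node.get("tier") != "non-critical")
    ]


inventory = [
    {"name": "docs-preview", "platform": "lxc", "role": "helper", "tier": "non-critical"},
    {"name": "db-02", "platform": "lxc", "role": "database", "tier": "critical"},  # violation
    {"name": "ctrl-01", "platform": "vm", "role": "control", "tier": "critical"},  # VMs not checked
]
assert mismarked_lxc_nodes(inventory) == ["db-02"]
```

Run periodically (or in CI against a NetBox export), this catches the "someone quietly moved state into LXC" failure mode described in section 4.2.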
## 7. Operational impact and validation
Operational impact:
- Platform team must:
- Monitor LXC node resource usage to avoid contention with core VMs.
- Ensure that no critical services are silently moved into LXC.
- Keep a simple runbook for creating, updating and retiring helper containers.
Validation:
- Runbooks (to be created or updated):
  - `runbook_lxc-container-provisioning.md` — create/update/destroy helper containers.
- Evidence:
  - Screenshots and logs in `../evidence/evidence-02-platform-lxc-lightweight-workloads.md` showing:
    - Helper services running in LXC.
    - Core services (ctrl-01, db-01, RKE2 nodes) running as VMs.
- Review trigger:
- Revisit this ADR if:
- The platform moves away from Proxmox to another hypervisor where LXC is not available, or
- All helper workloads move into RKE2 permanently, making LXC redundant.
## 8. References
- ADR-0012 – Control Node Runs as a VM (Cloud-Init); LXC Reserved for Light Helpers
- ADR-0016 – Adopt Packer + Cloud-Init for VM Template Standardization
- ADR-0017 – Operating System Baseline for HybridOps.Studio
- ADR-0202 – Adopt RKE2 as Primary Runtime for Platform and Applications
- ADR-0204 – RKE2 Runs on Rocky VMs on Enterprise Hypervisors
- ADR-0501 – PostgreSQL on Dedicated VM with DR Replication
Maintainer: HybridOps.Studio
License: MIT-0 for code, CC-BY-4.0 for documentation unless otherwise stated.