Secrets Management Strategy for Hybrid Kubernetes & Platform Workloads
Status
Superseded by ADR-0020. This ADR remains as historical context for the original governance pattern and the move to external secret stores plus pull-based synchronisation.
1. Context
HybridOps.Studio spans:
- On-prem RKE2 clusters running on enterprise hypervisors (ADR-0202, ADR-0204).
- Supporting platform services (for example Jenkins, NetBox, PostgreSQL) and future cloud clusters.
Early iterations used:
- Raw Kubernetes `Secret` manifests with inline values in non-production.
- Ad-hoc `.env` files during bootstrap phases.
This does not scale or audit well:
- Secrets can leak into Git history or local artefacts.
- Rotation requires manual edits across multiple places.
- It is hard to prove where secret values live and how they are controlled.
The platform requires a governed approach that:
- Keeps secret values out of Git.
- Uses external, audit-capable secret stores as the steady-state source of truth.
- Works across on-prem and cloud deployments.
- Integrates with GitOps and CI/CD workflows.
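The first requirement, keeping secret values out of Git, can be checked mechanically. The following is a minimal sketch of such a guard, not part of the platform's tooling: the function names are hypothetical, the patterns are heuristics rather than a complete secret scanner, and it assumes that files ending in `.vault.env` are encrypted bundles.

```python
import re

# Heuristic patterns; illustrative only, not a complete secret scanner.
KIND_SECRET = re.compile(r"^kind:\s*Secret\s*$", re.MULTILINE)
INLINE_VALUES = re.compile(r"^(data|stringData):\s*$", re.MULTILINE)


def is_plaintext_env(path):
    """True for .env-style files, excluding encrypted bundles."""
    name = path.rsplit("/", 1)[-1]
    # Assumption: names ending in ".vault.env" are encrypted bundles.
    return name.endswith(".env") and not name.endswith(".vault.env")


def has_inline_secret(text):
    """True if any YAML document in `text` is a Secret carrying values."""
    for doc in re.split(r"^---\s*$", text, flags=re.MULTILINE):
        if KIND_SECRET.search(doc) and INLINE_VALUES.search(doc):
            return True
    return False


def find_inline_secret_risks(files):
    """Return sorted paths that look like they carry secret values.

    `files` maps a repository-relative path to its text content.
    """
    return sorted(
        path
        for path, text in files.items()
        if is_plaintext_env(path) or has_inline_secret(text)
    )
```

An `ExternalSecret` manifest passes this check because its `data:` list is nested under `spec` and holds only references, whereas a raw `Secret` with top-level `data:` or `stringData:` is flagged.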
2. Decision
HybridOps.Studio adopts the following secrets management strategy:
- External secret stores are authoritative for steady-state secret values
  - Azure Key Vault is the primary store for application and platform secret values (ADR-0020, ADR-0502).
  - Kubernetes `Secret` objects are runtime projections and are not primary storage.
- Kubernetes receives secrets via pull-based synchronisation
  - External Secrets Operator (ESO) syncs secret values from Azure Key Vault into Kubernetes `Secret` objects (ADR-0502).
  - Git contains `ExternalSecret` resources and wiring, not secret values.
- Git holds references, not values
  - Plaintext `.env` files are restricted to local bootstrap connectivity and are not committed.
  - An encrypted vault bundle (`control/secrets.vault.env`) is permitted to enable non-interactive bootstrap, CI, and DR automation. It must not be used as a steady-state application secret store.
- Technology-specific implementation lives in category ADRs
  - ADR-0502 defines the primary RKE2 implementation (AKV + ESO).
  - ADR-0020 defines the overall hierarchy (AKV steady state, encrypted vault bundle for bootstrap/CI/DR automation, SOPS optional policy artefact).
This ADR defines the governance pattern. Implementation specifics are handled in ADR-0502 and ADR-0020.
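To illustrate what "references, not values" looks like in practice, here is a minimal sketch that builds an `ExternalSecret` manifest. The helper name, the store name `akv-platform`, the namespace, and the Key Vault entry name are all hypothetical, and the `apiVersion` shown is an assumption that depends on the ESO release installed; the concrete wiring is defined in ADR-0502.

```python
import json


def external_secret(name, namespace, store_name, key_map, refresh="1h"):
    """Build an ExternalSecret manifest as a plain dict.

    `key_map` maps a key in the resulting Kubernetes Secret to the name of
    the remote entry in the external store. No secret value appears here.
    """
    return {
        # Assumption: adjust to the API version served by the installed ESO.
        "apiVersion": "external-secrets.io/v1beta1",
        "kind": "ExternalSecret",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "refreshInterval": refresh,
            "secretStoreRef": {"name": store_name, "kind": "ClusterSecretStore"},
            # The operator creates or updates a Secret with this name.
            "target": {"name": name},
            "data": [
                {"secretKey": secret_key, "remoteRef": {"key": remote_key}}
                for secret_key, remote_key in sorted(key_map.items())
            ],
        },
    }


# Hypothetical example: one key pulled from one Key Vault entry.
manifest = external_secret(
    name="netbox-db",
    namespace="netbox",
    store_name="akv-platform",
    key_map={"password": "netbox-db-password"},
)
print(json.dumps(manifest, indent=2))
```

The printed JSON is also valid YAML, so it could be committed as-is. Everything in it is safe to store in Git because it names where a value lives without containing the value.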
3. Rationale
- Security and auditability
  - External secret stores provide RBAC and audit logs.
  - Rotation is performed at the source of truth rather than across multiple manifests.
- Separation of concerns
  - Git holds desired wiring; secret stores hold values.
  - Clusters and pipelines consume secrets but do not own the source of truth.
- Hybrid readiness
  - The approach applies consistently across on-prem and cloud clusters.
- Evidence and traceability
  - Clear artefact trail for secret creation, sync, and consumption without exposing secret values.
4. Consequences
4.1 Positive consequences
- Reduced risk of secret leakage into Git repositories.
- Consistent pattern across environments: external store + operator.
- Clear ADR layering:
- Governance pattern (this ADR, historical).
- Concrete implementation (ADR-0502).
- End-to-end secrets hierarchy and runner-driven automation model (ADR-0020).
4.2 Negative consequences / risks
- Operational dependency on external secret stores and the operator.
- Bootstrap requires an initial trust anchor (service principal, token, or runner identity).
- Additional documentation required for onboarding, rotation, and validation flows.
Mitigations:
- Runbooks for operator bootstrap, identity setup, and end-to-end validation.
- CI workflows that validate secret sync health without printing secret values.
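A CI check of sync health only needs resource names and status conditions, never the secret payload. The sketch below is one hedged way to do it: the function name is hypothetical, and it assumes the input is the `items` list from `kubectl get externalsecrets -A -o json`, where ESO reports a `Ready` condition on each resource.

```python
def unsynced(external_secrets):
    """Return 'namespace/name' for each ExternalSecret that is not Ready.

    Only metadata and status conditions are read, so no secret value is
    ever loaded or printed.
    """
    failing = []
    for item in external_secrets:
        meta = item.get("metadata", {})
        conditions = item.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            failing.append(f"{meta.get('namespace', '')}/{meta.get('name', '')}")
    return sorted(failing)
```

A pipeline step could parse the `kubectl` JSON output, pass its `items` to `unsynced`, print the returned names, and fail the job when the list is non-empty.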
5. Alternatives considered
- Plain Kubernetes Secrets with Git-stored values: rejected due to leakage risk and poor audit trail.
- Per-cluster bespoke approaches: rejected due to operational complexity and inconsistent teaching/evidence story.
- Single universal vault product as baseline: deferred; AKV-based patterns carry less operational weight and align with target deployments.
6. Implementation notes
- Platform ADRs
  - ADR-0502 defines the primary implementation: Azure Key Vault + External Secrets Operator for RKE2 workloads.
  - ADR-0020 defines bootstrap/CI/DR automation via encrypted vault bundle and optional policy-driven SOPS artefacts.
- Code and configuration
  - `deploy/*/secrets/` holds `ExternalSecret` resources (or equivalent) and secret wiring.
  - ESO is installed as part of the RKE2 platform bootstrap.
- Bootstrap and automation inputs
  - `control/secrets.vault.env` may be used as a short-scope encrypted bundle for non-interactive automation.
  - Decryption occurs at runtime only and is scoped to a single command execution.
- Validation and evidence
  - Evidence 4 captures KMS entries, ESO sync behaviour, and workloads consuming secrets without exposing values.
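The "decrypt at runtime, scoped to a single command" rule can be sketched as a small wrapper. This is an illustration under stated assumptions, not the platform's runner: `run_with_secrets` and `parse_env` are hypothetical names, the decryption step is passed in as a callable because the bundle's encryption tooling is defined in ADR-0020, and the plaintext is assumed to be `KEY=VALUE` lines.

```python
import os
import subprocess
import sys


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip()
    return env


def run_with_secrets(decrypt, argv):
    """Run one command with decrypted values visible only to that command.

    `decrypt` is a caller-supplied callable returning the bundle plaintext
    (hypothetical; the real decryption tooling is out of scope here).
    The parent process environment is never modified, and nothing is
    written to disk.
    """
    child_env = {**os.environ, **parse_env(decrypt())}
    # Output is captured so the caller decides what, if anything, to log.
    return subprocess.run(
        argv, env=child_env, capture_output=True, text=True, check=False
    )


# Example with a stand-in decryptor; the child only reports the length.
result = run_with_secrets(
    lambda: "# demo bundle\nDEMO_TOKEN=abc123\n",
    [sys.executable, "-c", "import os; print(len(os.environ['DEMO_TOKEN']))"],
)
```

Because the values exist only in the child's environment, they disappear when the command exits, which matches the short-scope intent of the bundle.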
7. Operational impact and validation
Operational impact:
- Secrets are managed primarily in the external secret store.
- `.env` files and ad-hoc secrets remain temporary bootstrap artefacts only.
- Operator health is included in monitoring and validation workflows.
Validation:
- Runbooks under `../ops/runbooks/security/` describe onboarding and rotation flows.
- Proof artefacts referenced here and in ADR-0502 capture end-to-end secret sync and consumption.
8. References
- ADR-0001 – ADR Process & Conventions
- ADR-0202 – Adopt RKE2 as Primary Runtime for Platform and Applications
- ADR-0204 – RKE2 Runs on Rocky VMs on Enterprise Hypervisors
- ADR-0502 – Use External Secrets Operator with Azure Key Vault for Application Secrets
- ADR-0020 – Secrets Strategy
- ADR-0600 – Environment Guard Framework
Maintainer: HybridOps.Studio
License: MIT-0 for code, CC-BY-4.0 for documentation unless otherwise stated.