External Secrets Operator with GCP Secret Manager for On-Prem Platform Workloads

This ADR documents the decision to use GCP Secret Manager as the ESO backend for the on-prem RKE2 cluster's platform workloads. ADR-0502 standardises on ESO as the Kubernetes secrets operator; this ADR specifies the backend selection for on-prem platform workloads. The backend differs because on-prem platform workloads are cost-aware and GCP is already present in the deployment picture for DR and cloud burst paths.


1. Context

ADR-0502 established External Secrets Operator as the mechanism for projecting external secret store values into Kubernetes. That ADR uses Azure Key Vault as the backend.

The on-prem RKE2 cluster hosts platform workloads that include:

  • Keycloak (identity provider) with social login credentials for Google and Microsoft
  • NetBox database credentials
  • Platform service shared secrets (shared keys, theme configuration)

For these workloads, GCP Secret Manager is the more appropriate backend because:

  • GCP is already in the deployment picture. The platform uses GCP for DR paths (Cloud SQL, cloud burst), and a GCP project is already provisioned and managed.
  • Azure is not present in every deployment target. Standardising on Azure Key Vault would introduce an Azure dependency even where the operator does not use Azure for DR or cloud workloads, whereas GCP is already required for the on-prem DR lane.
  • Cost profile. Reusing the existing GCP project for Secret Manager carries lower operational overhead than provisioning a separate AKV instance solely for on-prem platform secrets.
  • Consistent secret naming and governance. The HybridOps GSM sync tool (hyops secrets gsm-sync) and the tools/secrets/gsm/map/allowed.csv allowlist already define the authoritative naming convention for platform secrets in GCP Secret Manager.

2. Decision

For on-prem RKE2 platform workloads:

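  1. GCP Secret Manager is the ESO backend. A ClusterSecretStore named gcp-secret-manager is deployed with ESO (which runs in the external-secrets namespace) and targets the GCP project associated with the deployment environment. The store itself is a cluster-scoped resource, so ExternalSecret resources in application namespaces can reference it by name.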

  2. ESO is deployed as an ArgoCD application with sync wave ordering. The operator runs in wave 1; the ClusterSecretStore (and any namespace-level secret stores) run in wave 2. ExternalSecret resources in application namespaces have no explicit wave and rely on ESO's reconciliation loop to converge after the store is ready.

  3. The bootstrap trust anchor is a GCP Service Account key, stored in the HybridOps bootstrap vault as HYOPS_GSM_SA_KEY_JSON. This key is provisioned into the cluster by the platform/k8s/gsm-bootstrap module, which runs as part of the onprem/rke2-workloads@v1 blueprint immediately after the ArgoCD bootstrap step.

  4. Secret naming follows the GSM allowlist convention. All platform secrets are named hyops-{env}-platform-{KEY} in GCP Secret Manager and are managed through hyops secrets gsm-persist and hyops secrets gsm-sync.

  5. ADR-0502 (AKV + ESO) remains valid for deployments where Azure is the primary cloud provider and AKV is the preferred backend. These two patterns are not mutually exclusive: ESO supports multiple concurrent SecretStore backends within a single cluster.
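
As an illustration of items 1 and 2, the shell sketch below renders a ClusterSecretStore manifest of the kind this decision describes. It is not taken from the workloads repository: the apiVersion (external-secrets.io/v1beta1; newer ESO releases also serve v1), the key name secret-access-credentials, the namespace of the credentials secret, and the function name render_cluster_secret_store are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: print a ClusterSecretStore manifest for GCP Secret Manager.
# Assumptions (not from the source repo): apiVersion, the key name inside
# gsm-sa-credentials, the secret's namespace, and this function's name.
render_cluster_secret_store() {
  project_id="$1"   # GCP project ID for the deployment environment
  cat <<EOF
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: gcp-secret-manager
  annotations:
    argocd.argoproj.io/sync-wave: "2"   # applied after the ESO operator (wave 1)
spec:
  provider:
    gcpsm:
      projectID: ${project_id}
      auth:
        secretRef:
          secretAccessKeySecretRef:
            name: gsm-sa-credentials
            namespace: external-secrets
            key: secret-access-credentials
EOF
}
```

In practice the manifest would live in Git and be reconciled by ArgoCD; the function form is used here only so the project ID substitution is visible.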


3. Rationale

3.1 Why GCP Secret Manager over Azure Key Vault for on-prem?

  • GCP presence is guaranteed by the DR and cloud burst lanes. A GCP project is a hard dependency of the HybridOps platform; AKV is not present in every deployment shape.
  • Using a backend that is already funded and managed avoids provisioning a second cloud secret store solely for on-prem use.
  • The GSM sync tooling already implements the allowlist, naming conventions, and secret rotation pipeline. Adding AKV as a second backend would require duplicate tooling or a bridge.

3.2 Why a dedicated module (platform/k8s/gsm-bootstrap) rather than a manual step?

The bootstrap trust anchor (gsm-sa-credentials Kubernetes secret) must exist before ArgoCD can reconcile the ClusterSecretStore. A dedicated module in the blueprint chain eliminates the remaining manual kubectl create secret step, ensures the secret is created idempotently from the bootstrap vault, and produces run evidence consistent with other blueprint steps. The alternative (a post-deploy manual step) is incompatible with the non-interactive and repeatable delivery model HybridOps requires.
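
The idempotent creation described above can be sketched in shell. This is not the module's implementation (the module runs Ansible); it only shows an equivalent shape. The key name secret-access-credentials, the external-secrets namespace, and the function name render_gsm_sa_secret are assumptions.

```shell
#!/bin/sh
# Sketch: render the gsm-sa-credentials Secret from the bootstrap vault value.
# Assumptions: key name, namespace, and function name are illustrative only.
render_gsm_sa_secret() {
  # $1: SA key JSON (for example the value of HYOPS_GSM_SA_KEY_JSON)
  # Secret .data values must be base64-encoded on a single line.
  encoded="$(printf '%s' "$1" | base64 | tr -d '\n')"
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: gsm-sa-credentials
  namespace: external-secrets
type: Opaque
data:
  secret-access-credentials: ${encoded}
EOF
}

# Usage (kubectl apply creates or updates, so re-runs are safe):
#   render_gsm_sa_secret "$HYOPS_GSM_SA_KEY_JSON" | kubectl apply -f -
```

Piping to kubectl apply rather than using kubectl create secret is what makes a re-run a no-op or an in-place update instead of an "already exists" error.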

3.3 Sync wave ordering

ESO CRDs must exist before the ClusterSecretStore manifest is applied. Wave ordering enforces this:

Wave  Resource
1     ESO operator (Helm chart via ArgoCD Application)
2     ClusterSecretStore (gcp-secret-manager)
none  ExternalSecret resources (application-level, no wave; converge via ESO reconciliation)
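
To check the ordering above against real manifests, a small helper can read the Argo CD sync-wave annotation from a file. Argo CD treats a resource without the annotation as wave 0 within its own Application, which the helper mirrors. The function name sync_wave_of is hypothetical, and the parsing is a line-based sketch that handles unquoted or double-quoted values only.

```shell
#!/bin/sh
# Sketch: print the Argo CD sync wave declared in a manifest file.
# Falls back to 0, the Argo CD default when no annotation is present.
sync_wave_of() {
  wave="$(sed -n 's|^[[:space:]]*argocd\.argoproj\.io/sync-wave:[[:space:]]*"\{0,1\}\(-\{0,1\}[0-9][0-9]*\)"\{0,1\}.*$|\1|p' "$1" | head -n 1)"
  printf '%s\n' "${wave:-0}"
}

# Usage:
#   sync_wave_of path/to/manifest.yaml
```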

4. Consequences

4.1 Positive

  • Single bootstrap command runs the full chain: VMs → RKE2 → ArgoCD → GSM secret → GitOps takeover.
  • No manual kubectl steps required after initial HYOPS_GSM_SA_KEY_JSON vault population.
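  • Secret rotation in GCP Secret Manager propagates to the cluster automatically on the next ESO refresh, which is governed by each ExternalSecret's refreshInterval (5 minutes in this deployment; ESO itself defaults to 1h when the field is omitted).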
  • Pattern is consistent with the broader GSM tooling already in hybridops-core.

4.2 Negative / trade-offs

  • GCP dependency. The on-prem cluster requires a reachable GCP project and a valid SA key to start reconciling secrets. In a full GCP outage, ESO cannot refresh secrets (though existing Kubernetes secrets remain until they are manually deleted).
  • Bootstrap key management. HYOPS_GSM_SA_KEY_JSON is stored in the bootstrap vault. Rotation requires updating the vault entry and re-running the gsm-bootstrap module step.
  • Dual backend documentation. Operators need to know which backend applies to which deployment shape. ADR-0502 covers AKV; this ADR covers GCP SM. The distinction must be clear in runbooks and onboarding material.

5. Implementation

5.1 Blueprint integration

The platform/k8s/gsm-bootstrap module is the final step in onprem/rke2-workloads@v1:

template_image_rocky9 → rke2_vms → rke2_cluster → gitops_workloads → gsm_bootstrap

The kubeconfig_path is imported automatically from gitops_workloads outputs.

5.2 Workloads repository structure

apps/platform/external-secrets/   # ESO operator ArgoCD Application (wave 1)
apps/platform/secret-stores/      # ClusterSecretStore (wave 2)
apps/platform/keycloak/           # ExternalSecret → platform-keycloak-secrets

GCP project ID is environment-specific and is applied via a Kustomize strategic merge patch in each cluster overlay:

apps/platform/secret-stores/manifests/overlays/onprem/cluster-secret-store-patch.yaml
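
When reviewing an overlay, it can help to confirm which project ID a patch actually sets. The sketch below extracts the projectID value from a patch file; it assumes the patch sets spec.provider.gcpsm.projectID on a single line, and the function name patch_project_id is hypothetical.

```shell
#!/bin/sh
# Sketch: print the projectID set in a ClusterSecretStore patch file.
# Assumes a line of the form "projectID: <value>", optionally double-quoted.
patch_project_id() {
  sed -n 's/^[[:space:]]*projectID:[[:space:]]*"\{0,1\}\([^"[:space:]]*\)"\{0,1\}[[:space:]]*$/\1/p' "$1" | head -n 1
}

# Usage:
#   patch_project_id apps/platform/secret-stores/manifests/overlays/onprem/cluster-secret-store-patch.yaml
```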

5.3 Operator bootstrap sequence

The ESO trust anchor is established through a structured sequence of hyops commands. Each step is a prerequisite for the next.

# 1. Bootstrap operator ADC and provision required GCP roles.
#    Grants the Terraform SA roles/editor, roles/resourcemanager.projectIamAdmin,
#    and roles/secretmanager.admin on the project. Sets the project-level
#    iam.disableServiceAccountKeyCreation override using operator ADC.
hyops init gcp --env <env> --with-cli-login

# 2. Provision the GCP project boundary (idempotent on re-run).
hyops apply --env <env> --module org/gcp/project-factory

# 3. Create the ESO service account with roles/secretmanager.secretAccessor.
hyops apply --env <env> --module org/gcp/gsm-eso-sa

# 4. Generate an SA key and write it to the bootstrap vault.
hyops init gcp --env <env> --force --with-eso-sa

# 5. Provision the gsm-sa-credentials Kubernetes secret from the vault.
hyops apply --env <env> --module platform/k8s/gsm-bootstrap

Steps 1–4 use operator ADC credentials. Step 5 runs Ansible locally against the RKE2 cluster kubeconfig. In the onprem/rke2-workloads@v1 blueprint, steps 4–5 are the final two steps following cluster and ArgoCD bootstrap.
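
Because each step is a prerequisite for the next, the sequence can be wrapped so that it stops at the first failure. The sketch below chains the five commands listed above with &&. The function name and the HYOPS override variable are hypothetical; the override exists only so the sequence can be dry-run (for example with HYOPS="echo hyops").

```shell
#!/bin/sh
# Sketch: run the five bootstrap steps in order, stopping at the first failure.
# HYOPS is a hypothetical override for dry runs; it defaults to the hyops CLI.
bootstrap_eso_trust_anchor() {
  env_name="$1"
  hyops_cmd="${HYOPS:-hyops}"
  # $hyops_cmd is left unquoted on purpose so "echo hyops" splits into words.
  $hyops_cmd init gcp --env "$env_name" --with-cli-login &&
  $hyops_cmd apply --env "$env_name" --module org/gcp/project-factory &&
  $hyops_cmd apply --env "$env_name" --module org/gcp/gsm-eso-sa &&
  $hyops_cmd init gcp --env "$env_name" --force --with-eso-sa &&
  $hyops_cmd apply --env "$env_name" --module platform/k8s/gsm-bootstrap
}

# Usage:
#   bootstrap_eso_trust_anchor dev
```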

5.4 Required bootstrap vault keys

Key                    Description
HYOPS_GSM_SA_KEY_JSON  GCP SA key JSON for eso-gsm-reader with roles/secretmanager.secretAccessor on the target project. Written by hyops init gcp --with-eso-sa.
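
Before provisioning the cluster secret, a quick sanity check on the vault value can catch an empty or wrong-typed entry. The sketch below checks for fields that a GCP service account key file contains (type, project_id, private_key, client_email) using grep only. It does not validate the key against GCP, and the function name check_sa_key_json is hypothetical.

```shell
#!/bin/sh
# Sketch: shallow structural check of a GCP service account key JSON string.
# This does not prove the key is valid or active, only that it looks like one.
check_sa_key_json() {
  key_json="$1"
  [ -n "$key_json" ] || { echo "SA key JSON is empty" >&2; return 1; }
  for field in type project_id private_key client_email; do
    printf '%s\n' "$key_json" | grep -q "\"$field\"[[:space:]]*:" ||
      { echo "missing field: $field" >&2; return 1; }
  done
  printf '%s\n' "$key_json" |
    grep -q '"type"[[:space:]]*:[[:space:]]*"service_account"' ||
    { echo "type is not service_account" >&2; return 1; }
}

# Usage:
#   check_sa_key_json "$HYOPS_GSM_SA_KEY_JSON" && echo "key looks well-formed"
```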

5.5 Secret allowlist

All platform secrets managed through this path are registered in:

hybridops-core/tools/secrets/gsm/map/allowed.csv

Secrets are provisioned to GCP Secret Manager with:

hyops secrets gsm-persist --env <env> --scope shared
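
The naming convention from the Decision section (hyops-{env}-platform-{KEY}) and the allowlist lookup can be sketched as two small helpers. The layout of allowed.csv is not described in this ADR, so the lookup assumes the key is the first comma-separated column; that layout, the example key name in the usage comment, and both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: build a GSM secret name following hyops-{env}-platform-{KEY}.
gsm_secret_name() {
  # $1: environment, $2: KEY
  printf 'hyops-%s-platform-%s\n' "$1" "$2"
}

# Sketch: succeed if KEY appears in the allowlist CSV.
# Assumption: the key is the first comma-separated column; the real
# allowed.csv layout may differ.
is_allowed_key() {
  # $1: KEY, $2: path to allowed.csv
  awk -F, -v key="$1" '$1 == key { found = 1 } END { exit !found }' "$2"
}

# Usage (EXAMPLE_KEY is a placeholder, not a real allowlist entry):
#   is_allowed_key EXAMPLE_KEY tools/secrets/gsm/map/allowed.csv &&
#     gsm_secret_name dev EXAMPLE_KEY
```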

6. Operational considerations

  • SA key rotation: Generate a new key and re-provision the cluster secret:

    hyops init gcp --env <env> --force --with-eso-sa
    hyops apply --env <env> --module platform/k8s/gsm-bootstrap

This creates a new key, overwrites the bootstrap vault entry, and updates the gsm-sa-credentials Kubernetes secret. The previous key remains active in GCP IAM until explicitly removed. After confirming ESO is reconciling successfully with the new key, delete the old key:

  gcloud iam service-accounts keys list --iam-account eso-gsm-reader@<project>.iam.gserviceaccount.com
  gcloud iam service-accounts keys delete <old-key-id> --iam-account eso-gsm-reader@<project>.iam.gserviceaccount.com
  • GCP project ID change: Update the kustomize patch in apps/platform/secret-stores/manifests/overlays/<cluster>/cluster-secret-store-patch.yaml and commit. ArgoCD will reconcile.
  • DR readiness: Confirm ESO sync health is included in platform DR drills. ESO operator logs and ExternalSecret status conditions are the primary signal.
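
For key rotation checks and DR drills, ExternalSecret status conditions can be reduced to a pass/fail signal. The sketch below filters "namespace/name STATUS" lines and reports any ExternalSecret whose Ready condition is not True. The function name is hypothetical, and the kubectl invocation in the usage comment is one possible way to produce the input, not a command taken from the HybridOps runbooks.

```shell
#!/bin/sh
# Sketch: read "namespace/name STATUS" lines on stdin, print entries whose
# Ready status is not "True", and return non-zero if any are found.
not_ready_externalsecrets() {
  awk 'NF > 0 && $2 != "True" { print $1; bad = 1 } END { exit bad }'
}

# Usage (assumes kubectl access to the cluster):
#   kubectl get externalsecrets.external-secrets.io -A \
#     -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name} {.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}' \
#     | not_ready_externalsecrets
```

An empty output with a zero exit status means every ExternalSecret reported Ready at the time of the check, which is a reasonable gate before deleting the old SA key.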

Maintainer: HybridOps
License: MIT-0 for code, CC-BY-4.0 for documentation unless otherwise stated.