Burst Web Platform on GKE

Overview

Burst Web Platform on GKE adds warm cloud capacity to a public service without changing its hostname or delivery model. The burst origin stays ready in advance, and traffic moves only when the control plane calls for it.

This showcase covers capacity control, not migration or disaster recovery.

Case study

  • Context: the public web platform was served from a single origin, with no mechanism to add cloud capacity without changing the public hostname or interrupting service during the transition.
  • Challenge: burst capacity had to be ready before the event. Provisioning under pressure was not an acceptable operating model.
  • Approach: GKE hosts the burst origin via gcp/gke-burst@v1. Cloudflare steers the same hostname (burst.hybridops.tech) between primary and burst origins. A decision service evaluates pressure and health signals before changing traffic state. Failback is an explicit operator action once pressure clears.
  • Outcome: the recorded burst exercise completed cleanly. Traffic steering was confirmed via traffic.status: live-ok, the burst origin was verified before the event, primary service was restored cleanly after, and the hostname remained unchanged throughout.

The recording of the exercise shows the Grafana event view, the decision-service trigger, same-host steering into burst capacity, and the controlled return to primary.

Outcome

The result is a public web platform that can add capacity without changing its address or identity.

  • One public hostname remains in use throughout the event.
  • Burst capacity is activated under measured pressure instead of being left to an unplanned cutover.
  • Traffic returns to primary cleanly after the event, and the return stays under operator control.

Operating model

This is a warm burst model.

  • Burst capacity is provisioned before the event and verified in service.
  • Cloudflare steers the same hostname between the primary origin and the burst origin.
  • The decision service evaluates pressure and health signals before traffic state changes.
  • Failback is explicit and reviewable once pressure clears.

The pattern is deliberately focused on the public edge and control path. It suits services that need short-duration capacity expansion without introducing a second public front door.
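
The control rules above can be sketched as a small state function. This is a minimal illustration, not the decision service's actual implementation: the `Signals` fields, the `pressure_threshold` default of 0.8, and the `operator_failback` flag are all hypothetical names and values chosen for the example. It shows two properties of the operating model: traffic only moves to burst when the warm origin is healthy, and the return to primary needs both cleared pressure and an explicit operator request.

```python
from dataclasses import dataclass
from enum import Enum


class TrafficState(Enum):
    PRIMARY = "primary"    # all traffic on the primary origin
    BALANCED = "balanced"  # hostname shared between primary and burst origins


@dataclass(frozen=True)
class Signals:
    # All three fields are assumptions made for this sketch.
    pressure: float        # normalized load on the primary origin, 0.0 to 1.0
    primary_healthy: bool  # latest probe result for the primary origin
    burst_healthy: bool    # latest probe result for the burst origin


def decide(current: TrafficState, signals: Signals,
           operator_failback: bool = False,
           pressure_threshold: float = 0.8) -> TrafficState:
    """Return the next traffic state for one evaluation cycle."""
    under_pressure = (signals.pressure >= pressure_threshold
                      or not signals.primary_healthy)

    if current is TrafficState.PRIMARY:
        # Shift only if the pre-warmed burst origin is verified healthy.
        if under_pressure and signals.burst_healthy:
            return TrafficState.BALANCED
        return TrafficState.PRIMARY

    # Already balanced: failback needs cleared pressure AND an operator request.
    if operator_failback and not under_pressure:
        return TrafficState.PRIMARY
    return TrafficState.BALANCED
```

Keeping the function pure (state and signals in, next state out) makes each decision easy to log and review, which fits the requirement that failback be explicit and reviewable.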

Architecture

Figure: burst web operating model showing one Cloudflare hostname, decision-driven steering, and traffic shifting between the primary origin and a pre-warmed GKE burst origin.

The burst origin is pre-warmed and verified before any traffic event. One hostname remains in use throughout; the decision service controls when traffic shifts, and failback to primary is an explicit operator action once pressure clears.

Activation sequence

  1. The primary origin serves normal traffic on the shared hostname.
  2. Observability and probes detect pressure or degraded service conditions.
  3. The decision service shifts the shared hostname into balanced delivery across the primary and burst origins.
  4. The burst origin absorbs the excess load; once the primary recovers, traffic returns to it through an explicit failback.
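
Step 3 can be illustrated with a small origin selector. This is a sketch only: it does not use or reproduce any Cloudflare API, and the origin labels, the `burst_share` parameter, and the hash-based bucketing are assumptions made for the example. It shows one way a proxy on the shared hostname could split requests between origins while balanced, and send everything to primary otherwise.

```python
import hashlib

# Hypothetical origin labels; real origins would be addresses or pool names.
PRIMARY_ORIGIN = "primary"
BURST_ORIGIN = "burst"


def pick_origin(state: str, request_key: str, burst_share: float = 0.5) -> str:
    """Choose an origin for one request on the shared hostname.

    state        -- "primary" or "balanced" (assumed state names)
    request_key  -- any stable per-request or per-client string
    burst_share  -- fraction of traffic sent to burst while balanced
    """
    if state != "balanced":
        return PRIMARY_ORIGIN

    # Hash the key into a bucket in [0, 1) so the same key always
    # lands on the same origin for a given burst_share.
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return BURST_ORIGIN if bucket < burst_share else PRIMARY_ORIGIN
```

Hashing a stable key instead of drawing a random number keeps a given client on one origin during the event, which tends to make the recorded behaviour easier to reproduce and review.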

Platform state

Figure: Grafana burst control dashboard showing probe health, pressure, degraded state, and steering posture during the recorded event.
Figure: Cloudflare Workers metrics for the burst web proxy handling the public hostname during the recorded event window.
Figure: public web experience on the shared hostname during the recorded burst event.

IP addresses, hostnames, and instance identifiers visible in screenshots and recordings reflect the ephemeral infrastructure provisioned during the recorded exercise.

Component               | Module                                      | Last run (UTC)       | Status
GKE burst cluster       | platform/gcp/gke-cluster#gke_burst_cluster  | 2026-03-18T11:54:47Z | ok
Argo CD bootstrap       | platform/k8s/argocd-bootstrap               | 2026-03-20T23:59:23Z | ok
Runtime bundle delivery | platform/k8s/runtime-bundle-secret          | 2026-03-21T00:07:25Z | ok

Run records are retained as part of the platform execution log for review and audit.
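
A review of these records could look something like the sketch below. The stored format of the execution log is not described here, so the `RunRecord` shape and the helper names are assumptions; the three rows are copied from the table above. The sketch checks that no component reports a status other than ok and finds the most recent run.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RunRecord:
    # Hypothetical record shape mirroring the table columns.
    component: str
    module: str
    last_run: datetime
    status: str


def parse_record(component: str, module: str, last_run: str, status: str) -> RunRecord:
    # Swap the trailing "Z" for an explicit offset so that
    # datetime.fromisoformat also accepts it before Python 3.11.
    ts = datetime.fromisoformat(last_run.replace("Z", "+00:00"))
    return RunRecord(component, module, ts, status)


RECORDS = [
    parse_record("GKE burst cluster",
                 "platform/gcp/gke-cluster#gke_burst_cluster",
                 "2026-03-18T11:54:47Z", "ok"),
    parse_record("Argo CD bootstrap",
                 "platform/k8s/argocd-bootstrap",
                 "2026-03-20T23:59:23Z", "ok"),
    parse_record("Runtime bundle delivery",
                 "platform/k8s/runtime-bundle-secret",
                 "2026-03-21T00:07:25Z", "ok"),
]


def failing(records: list[RunRecord]) -> list[RunRecord]:
    """Return every record whose status is not 'ok'."""
    return [r for r in records if r.status != "ok"]


def latest(records: list[RunRecord]) -> RunRecord:
    """Return the record with the most recent run timestamp."""
    return max(records, key=lambda r: r.last_run)
```

For the rows above, `failing(RECORDS)` is empty and `latest(RECORDS)` is the runtime bundle delivery run.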

Implementation

  • Burst workload: GKE hosts the burst origin.
  • Traffic control: Cloudflare keeps one public hostname while applying the steering policy.
  • Control loop: the decision service acts on observed pressure and health conditions.
  • Observability: Grafana and Thanos provide the operator view of control state and signal history.
  • GitOps discipline: the public and private workload paths remain separately managed.
  • Runtime delivery: private application payloads are delivered through the runtime-bundle path rather than being mixed into the public docs surface.

Key components

  • Burst baseline blueprint: gcp/gke-burst@v1
  • Front door steering: platform/network/cloudflare-traffic-steering#showcase_burst_balancer
  • Decision control: platform/network/decision-service
  • Observability: platform/network/edge-observability
  • Public GitOps root: platform/k8s/argocd-bootstrap#gke_burst_gitops
  • Private GitOps root: platform/k8s/argocd-bootstrap#gke_burst_web_gitops
  • Secret store bootstrap: platform/k8s/gcp-secret-store#gke_burst_secret_store
  • Runtime payload delivery: platform/k8s/runtime-bundle-secret#showcase_burst_web_runtime_dev_burst

Where it fits

  • customer and partner portals
  • launch and campaign traffic spikes
  • product documentation and public web surfaces
  • hybrid delivery patterns where the public edge must move first

References

Implementation references
  • platform/network/cloudflare-traffic-steering#showcase_burst_balancer
  • platform/network/decision-service
  • platform/network/edge-observability
  • platform/k8s/argocd-bootstrap#gke_burst_gitops
  • platform/k8s/argocd-bootstrap#gke_burst_web_gitops
  • platform/k8s/runtime-bundle-secret#showcase_burst_web_runtime_dev_burst

What was verified

The recorded HybridOps v1.0.1 burst exercise verified same-host traffic steering, authenticated observability, decision-service state, and a clean return to primary.