
PostgreSQL DR Product Lanes

Purpose: Define the clean product and architecture split for PostgreSQL disaster recovery in HybridOps so that the platform remains supportable, tarball-safe, and commercially clear for SMEs, schools, and enterprise customers.

This standard complements:

It is not an ADR. It describes the current product and architecture boundary that shipped modules and blueprints should follow.

1. Product lanes

HybridOps SHOULD present PostgreSQL DR in three lanes.

1.1 Baseline lane: self-managed restore DR

Target fit:

  • SMEs
  • schools
  • cost-sensitive customers

Architecture:

  • On-prem PostgreSQL HA primary (Patroni + etcd)
  • pgBackRest backup and WAL archive to one object repository
  • DR restore to self-managed cloud VMs
  • Controlled failback to on-prem
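The backup leg of this lane is a standard pgBackRest setup pointed at one object repository. As a minimal sketch only: the stanza name (`main`), bucket name, endpoint, and paths below are placeholders for illustration, not values shipped by `platform/onprem/postgresql-ha-backup`.

```ini
; /etc/pgbackrest/pgbackrest.conf -- illustrative fragment; all names
; ("main", "example-dr-repo") are placeholders, not shipped defaults.
[global]
repo1-type=s3
repo1-s3-bucket=example-dr-repo
repo1-s3-endpoint=s3.eu-west-1.amazonaws.com
repo1-s3-region=eu-west-1
repo1-path=/pgbackrest
repo1-retention-full=2

[main]
pg1-path=/var/lib/postgresql/16/main
```

On the cluster side, WAL archiving is wired through PostgreSQL's standard `archive_command`, e.g. `archive_command = 'pgbackrest --stanza=main archive-push %p'`.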

Commercial posture:

  • default offer
  • lowest steady-state cost
  • strongest packaged/tarball fit

1.2 Premium lane: managed warm-standby DR

Target fit:

  • customers with tighter RTO/RPO requirements
  • customers willing to pay for lower recovery time and higher operational sophistication

Architecture:

  • On-prem PostgreSQL HA primary
  • one managed cloud PostgreSQL standby target
  • controlled promotion during DR
  • controlled failback by reseed or reverse replication
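"Controlled promotion" in this lane implies an explicit, guarded sequence rather than automatic failover. The sketch below illustrates that ordering under stated assumptions; the function and step names are hypothetical, not part of any shipped HybridOps contract.

```python
# Illustrative DR promotion sequence for the managed warm-standby lane.
# All step names are hypothetical; the real runbook is defined by the
# shipped promotion blueprint.

def plan_dr_promotion(primary_reachable: bool, standby_lag_known: bool) -> list[str]:
    """Return the ordered DR promotion steps, refusing to promote
    while the on-prem primary still answers (split-brain guard)."""
    if primary_reachable:
        raise RuntimeError("primary still reachable; DR promotion refused")
    steps = ["record declared-disaster evidence"]
    if not standby_lag_known:
        steps.append("capture last-known replication position")
    steps += [
        "promote managed standby to read-write",
        "repoint application endpoints to the managed instance",
        "fence the old on-prem primary before any failback",
    ]
    return steps
```

The split-brain guard is the design point: promotion is only planned once the primary is confirmed unreachable, which keeps the decision explicit and auditable.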

Commercial posture:

  • premium or enterprise add-on
  • not the default packaged story

1.3 Advanced lane: multi-cloud resilience

Target fit:

  • customers with explicit policy or regulatory requirements
  • customers with operational capacity for multi-cloud DR

Architecture:

  • one active warm standby target at a time
  • optional secondary backup copy in a second cloud
  • decision-service-driven target selection, introduced later (see section 7)

Commercial posture:

  • advanced offering
  • not required for baseline product success

2. Default product rule

HybridOps MUST ship and document the baseline self-managed restore lane as the default PostgreSQL DR path.

Rationale:

  • simplest to explain
  • easiest to support
  • lowest cost for the target market
  • best fit for tarball-first product packaging
  • already aligned with current module and blueprint contracts

Managed PostgreSQL DR MUST be implemented as a separate lane, not as a hidden variation of the self-managed restore path.

3. Module and blueprint boundaries

3.1 Baseline lane

Baseline DR SHOULD use these contracts:

  • backup configuration:
      • platform/onprem/postgresql-ha-backup
  • object repository:
      • org/gcp/object-repo
      • org/aws/object-repo
      • org/azure/object-repo
  • failover restore blueprint:
      • dr/postgresql-ha-failover-gcp@v1
      • future equivalent restore blueprints for other clouds
  • failback blueprint:
      • dr/postgresql-ha-failback-onprem@v1

This lane SHOULD consume repo_state_ref and inventory_state_ref rather than duplicating bucket names or VM IPs.
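Consuming state refs instead of literals can be pictured as below. The state-document layout (`outputs.bucket_name`) is an assumption for illustration; the real contract is whatever the shipped object-repo module emits.

```python
# Sketch of consuming a repo_state_ref instead of hardcoding bucket names.
# The state layout shown ("outputs" / "bucket_name") is an assumed shape,
# not a documented HybridOps schema.
import json

def resolve_repo_bucket(repo_state: dict) -> str:
    """Pull the backup bucket name out of a repo state document,
    so blueprints never embed literal bucket names."""
    return repo_state["outputs"]["bucket_name"]

# Usage: the state document would normally come from the object-repo
# module's published output, not an inline string.
state = json.loads('{"outputs": {"bucket_name": "dr-backups-eu"}}')
bucket = resolve_repo_bucket(state)
```

The same pattern applies to `inventory_state_ref` for VM addresses: the blueprint resolves identities at run time rather than freezing them into its own inputs.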

3.2 Premium managed lane

Managed DR SHOULD be modeled as separate modules and blueprints, for example:

  • managed database provisioning:
      • org/gcp/cloudsql-postgresql or equivalent
  • source preparation:
      • platform/onprem/postgresql-dr-source
  • managed standby/replication setup:
      • org/gcp/cloudsql-external-replica or equivalent
  • DR promotion blueprint:
      • dr/postgresql-cloudsql-promote-gcp@v1
  • failback blueprint:
      • dr/postgresql-cloudsql-failback-onprem@v1

These names are target-state examples, not currently shipped contracts.

Managed DR MUST NOT overload:

  • platform/onprem/postgresql-ha
  • dr/postgresql-ha-failover-gcp@v1

with provider-managed database semantics.

4. Failback rules

4.1 Baseline self-managed lane

Failback SHOULD remain:

  • maintenance-window based
  • explicit
  • evidence-driven

The on-prem target SHOULD be treated as a rebuilt or re-seeded cluster, not as an attempt to resume an old ambiguous timeline.

4.2 Managed lane

Failback from managed PostgreSQL SHOULD default to one of:

  • controlled export / reseed into a fresh on-prem leader, then rebuild replicas
  • reverse replication into on-prem when explicitly implemented and tested

Managed failback MUST be treated as a new lineage / reseed operation unless reverse replication is an explicitly shipped and validated feature.
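The rule above reduces to a small default-deny decision, sketched here with a hypothetical flag name; no such flag is a shipped contract.

```python
def choose_failback_mode(reverse_replication_validated: bool) -> str:
    """Default managed failback to a fresh-lineage reseed; allow reverse
    replication only when it is an explicitly shipped, tested feature.
    The flag name is illustrative, not a real HybridOps setting."""
    if reverse_replication_validated:
        return "reverse-replication"
    return "reseed-new-lineage"
```

Treating reseed as the default keeps timeline lineage unambiguous: the on-prem cluster restarts from a known point rather than inheriting an uncertain replication history.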

5. Destructive operations

Backup repository purge MUST NOT be bundled into normal rerun or routine destroy flows.

Destructive backup cleanup SHOULD be:

  • separate
  • explicit
  • opt-in
  • clearly confirmed

Routine destroy for backup configuration MAY disable schedules or cluster-side configuration, but SHOULD NOT imply deletion of repository data.

6. RTO/RPO positioning

Baseline self-managed restore DR:

  • higher RTO than warm standby
  • RPO bounded by backup/WAL position
  • best price-to-operability balance

Premium managed warm standby DR:

  • lower RTO
  • better RPO, typically bounded by replication lag
  • higher steady-state cost
  • more complex failback

HybridOps marketing and documentation SHOULD describe these tradeoffs explicitly and avoid presenting managed DR as “better in every way.”

7. Decision service boundary

Decision service SHOULD be introduced only after the baseline and premium DR lanes are independently proven.

Decision service SHOULD select among already-tested options, for example:

  • target cloud
  • DR mode
  • whether to enable secondary backup copy

Decision service MUST NOT be used to compensate for unproven or underspecified DR paths.
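The boundary can be expressed as a default-deny selection over a fixed set of proven options. The target names and set below are hypothetical illustrations, not a statement of which paths are currently validated.

```python
# Hypothetical set of DR targets with independently proven blueprints;
# the actual contents would be whatever HybridOps has shipped and tested.
VALIDATED_TARGETS = {"gcp", "aws"}

def select_dr_target(preferred: str,
                     validated: set[str] = VALIDATED_TARGETS) -> str:
    """A decision service chooses only among already-proven options;
    an unproven target is an error, never a silent fallback."""
    if preferred not in validated:
        raise ValueError(f"DR target '{preferred}' has no validated blueprint")
    return preferred
```

Raising on an unvalidated target, rather than improvising a path, is exactly the MUST NOT above: the decision service routes between tested lanes, it does not paper over missing ones.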