Linux Edge WAN with strongSwan and FRR for Hybrid Cloud Connectivity
Status
Accepted: a Linux-based WAN edge using strongSwan (swanctl) for route-based IPsec and FRR for BGP is adopted as the standard pattern for site-to-cloud connectivity.
1. Context
HybridOps.Studio requires reliable, cost-effective connectivity between on-premises/colocation sites and cloud providers (GCP, Azure). Commercial SD-WAN appliances add licensing cost and vendor lock-in that are not justified at the platform's scale.
Requirements:
- Route-based IPsec compatible with cloud-native VPN gateways (GCP HA VPN, Azure VPN Gateway)
- Dynamic routing via BGP for automatic failover and prefix exchange
- Narrow traffic selectors to protect management traffic on shared hosts
- Deterministic configuration via Ansible for repeatability and evidence
Constraints:
- Must run on standard Linux (Debian/Ubuntu) without proprietary software
- Must support dual-tunnel HA patterns matching cloud VPN gateway designs
- Configuration must be auditable and version-controlled
2. Decision
Adopt strongSwan with swanctl configuration and FRR for BGP as the standard WAN edge stack:
- IPsec: strongSwan swanctl with route-based VTI interfaces
- Routing: FRR BGPd with strict prefix-list filtering
- Tunnels: Dual VTI interfaces per site for HA (matches GCP HA VPN / Azure active-active)
- Traffic selectors: Narrow selectors based on advertised/imported prefixes
- Automation: Ansible roles wan_edge (configuration) and wan_validate (verification)
3. Rationale
strongSwan + swanctl over other IPsec implementations:
- Native Linux, widely deployed, active maintenance
- swanctl provides declarative configuration (vs legacy ipsec.conf)
- VTI support for route-based tunnels required by cloud providers
- Mark-based SA selection avoids policy conflicts with management traffic
FRR over other routing daemons:
- Industry-standard BGP implementation
- Integrated vtysh for operational familiarity
- Prefix-list and route-map support for policy control
- Active community, Debian/Ubuntu packages available
Dual-tunnel HA pattern:
- Matches GCP HA VPN interface model (two tunnels, two inside /30s)
- Provides redundancy without complex failover scripts
- BGP handles path selection automatically
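
The "two tunnels, two inside /30s" layout can be illustrated with Python's standard ipaddress module. This is a minimal sketch under stated assumptions: the 169.254.0.0/29 parent block and the convention that the cloud side takes the first host address of each /30 are illustrative choices, not values taken from this ADR.

```python
import ipaddress

def inside_subnets(parent, tunnels=2):
    """Carve one /30 per tunnel out of a parent block and name both ends."""
    block = ipaddress.ip_network(parent)
    plans = []
    for index, subnet in enumerate(block.subnets(new_prefix=30)):
        if index >= tunnels:
            break
        # A /30 has exactly two usable host addresses.
        cloud, edge = subnet.hosts()
        plans.append({
            "tunnel": index,
            "subnet": str(subnet),
            "cloud": str(cloud),  # assumed: cloud router takes the first host
            "edge": str(edge),    # assumed: Linux edge takes the second host
        })
    if len(plans) < tunnels:
        raise ValueError(f"{parent} cannot hold {tunnels} /30 subnets")
    return plans

# Hypothetical link-local parent block, used for illustration only.
plans = inside_subnets("169.254.0.0/29")
for plan in plans:
    print(plan)
```

Each resulting pair would serve as the BGP session endpoints for one tunnel, so losing a tunnel takes down only the session bound to its /30.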
Narrow traffic selectors:
- Prevents IPsec policies from capturing management (SSH) traffic
- Allows shared-host deployments where tunnel and management coexist
- Matches effective behavior in production (only routed prefixes traverse tunnel)
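
The effect of narrowing selectors can be shown with a small check. This is a simplified sketch: it only tests whether a single address falls inside a selector, whereas real IPsec policies match on source/destination pairs and marks. The management address and prefixes are hypothetical.

```python
import ipaddress

def selectors_capturing(selectors, address):
    """Return the traffic selectors whose range contains the given address."""
    ip = ipaddress.ip_address(address)
    return [ts for ts in selectors if ip in ipaddress.ip_network(ts)]

management_ip = "192.0.2.10"                 # hypothetical SSH management address
wide = ["0.0.0.0/0"]                         # catch-all selector
narrow = ["10.10.0.0/16", "10.20.0.0/16"]    # hypothetical routed prefixes only

wide_hits = selectors_capturing(wide, management_ip)
narrow_hits = selectors_capturing(narrow, management_ip)

print("wide selectors capturing management:", wide_hits)
print("narrow selectors capturing management:", narrow_hits)
```

With the catch-all selector the management address is covered by the IPsec policy; with selectors limited to the routed prefixes it is not.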
4. Consequences
4.1 Positive consequences
- Zero licensing cost for WAN edge functionality
- Consistent configuration across sites via Ansible
- Cloud-provider agnostic (same pattern for GCP, Azure, AWS)
- Full observability via standard Linux tools (ip xfrm, vtysh, journalctl)
- Testable locally with WAN simulator before production deployment
4.2 Negative consequences / risks
- Requires Linux networking expertise for troubleshooting
- No vendor support; community and internal knowledge required
- BGP misconfiguration can cause routing loops or blackholes
- IPsec rekeying during high traffic may cause brief packet loss
Mitigations:
- wan_validate role provides automated health checks
- Strict prefix-lists prevent route leaks
- Smoke tests validate configuration before production apply
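
How a strict prefix-list limits route leaks can be modelled in a few lines. This sketch uses exact-match allow-listing only; FRR prefix-lists also support ge/le length ranges, which are not modelled here, and all prefixes shown are hypothetical.

```python
import ipaddress

def filter_prefixes(received, allowed):
    """Split received prefixes into (accepted, rejected) by exact-match allow-list."""
    allow = {ipaddress.ip_network(prefix) for prefix in allowed}
    accepted, rejected = [], []
    for prefix in received:
        target = accepted if ipaddress.ip_network(prefix) in allow else rejected
        target.append(prefix)
    return accepted, rejected

allowed = ["10.20.0.0/16"]                                   # hypothetical import policy
received = ["10.20.0.0/16", "0.0.0.0/0", "10.20.1.0/24"]     # hypothetical peer advertisements

accepted, rejected = filter_prefixes(received, allowed)
print("accepted:", accepted)
print("rejected:", rejected)
```

A leaked default route or an unexpected more-specific prefix is dropped because it does not match an allowed entry exactly.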
5. Alternatives considered
Commercial SD-WAN (Cisco, Fortinet, Palo Alto)
- Rejected: Licensing cost prohibitive for platform scale
- Rejected: Vendor lock-in conflicts with multi-cloud strategy
WireGuard
- Rejected: No native BGP integration
- Rejected: Not supported by GCP/Azure VPN gateways for site-to-cloud
OpenVPN
- Rejected: TLS-based, not compatible with cloud IPsec gateways
- Rejected: Performance inferior to kernel IPsec
LibreSwan
- Considered: Similar capability to strongSwan
- Rejected: Less active development, smaller community
6. Implementation notes
Ansible roles:
- hybridops.network.wan_edge: strongSwan, VTI, FRR configuration
- hybridops.network.wan_validate: IPsec, BGP, route, reachability checks
Key files:
- roles/wan_edge/templates/swanctl.conf.j2: IPsec configuration
- roles/wan_edge/templates/frr.conf.j2: BGP configuration
- roles/wan_edge/defaults/main.yml: tunable defaults
Configuration flow:
- Packages installed (strongswan-swanctl, frr)
- VTI interfaces created with marks matching IPsec SA
- swanctl.conf rendered with narrow traffic selectors
- frr.conf rendered with prefix-lists and peer-group
- Services enabled, handlers restart on config change
- CHILD_SAs verified installed before completion
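
The step "VTI interfaces created with marks matching IPsec SA" relies on one value being used in two places: the VTI key and the CHILD_SA marks. The sketch below derives both from a single input so they cannot drift apart. It is illustrative only: the function name, interface name, addresses, and mark value are hypothetical, and the actual role templates may be structured differently.

```python
def vti_plan(name, mark, local_ip, remote_ip):
    """Build the iproute2 commands and swanctl mark settings for one tunnel."""
    return {
        "commands": [
            # The VTI key must equal the mark set on the CHILD_SA.
            f"ip link add {name} type vti local {local_ip} remote {remote_ip} key {mark}",
            f"ip link set {name} up",
        ],
        # mark_in / mark_out are per-child options in swanctl.conf.
        "child_sa": {"mark_in": mark, "mark_out": mark},
    }

# Hypothetical values from documentation address ranges.
plan = vti_plan("vti0", 42, "198.51.100.10", "203.0.113.20")
for command in plan["commands"]:
    print(command)
print(plan["child_sa"])
```

Because the kernel only hands packets carrying the matching mark to the SA, traffic that never enters the VTI interface is left untouched by the tunnel's policies.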
7. Operational impact and validation
Validation role checks:
- CHILD_SA count matches tunnel count
- No SAs in transient state (REKEYING, DELETING)
- BGP neighbors established (not Active/Idle)
- Accepted prefix count >= 1 per neighbor
- Expected routes present in BGP table
- End-to-end ping to remote loopbacks
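
The checks above can be expressed as a single evaluation over already-collected state. This is a hedged sketch, not the wan_validate implementation: the input dictionaries are a hypothetical, pre-parsed shape (they do not mirror any real CLI output format), and the ping check is omitted because it needs live network access.

```python
TRANSIENT_STATES = {"REKEYING", "DELETING"}

def evaluate_edge(child_sas, bgp_peers, bgp_routes, tunnel_count, expected_routes):
    """Return a list of failure messages; an empty list means all checks passed."""
    failures = []

    # CHILD_SA count must match the number of tunnels.
    if len(child_sas) != tunnel_count:
        failures.append(f"expected {tunnel_count} CHILD_SAs, found {len(child_sas)}")

    # No SA may be in a transient state.
    for sa in child_sas:
        if sa["state"] in TRANSIENT_STATES:
            failures.append(f"CHILD_SA {sa['name']} is in transient state {sa['state']}")

    # Every neighbor must be Established and have accepted at least one prefix.
    for address, peer in sorted(bgp_peers.items()):
        if peer["state"] != "Established":
            failures.append(f"BGP neighbor {address} is {peer['state']}")
        elif peer["accepted_prefixes"] < 1:
            failures.append(f"BGP neighbor {address} has no accepted prefixes")

    # Every expected route must be present in the BGP table.
    for route in sorted(set(expected_routes) - set(bgp_routes)):
        failures.append(f"expected route {route} missing from BGP table")

    return failures

healthy = evaluate_edge(
    child_sas=[{"name": "tun0", "state": "INSTALLED"}, {"name": "tun1", "state": "INSTALLED"}],
    bgp_peers={
        "169.254.0.1": {"state": "Established", "accepted_prefixes": 3},
        "169.254.0.5": {"state": "Established", "accepted_prefixes": 3},
    },
    bgp_routes=["10.20.0.0/16"],
    tunnel_count=2,
    expected_routes=["10.20.0.0/16"],
)

degraded = evaluate_edge(
    child_sas=[{"name": "tun0", "state": "REKEYING"}],
    bgp_peers={"169.254.0.1": {"state": "Active", "accepted_prefixes": 0}},
    bgp_routes=[],
    tunnel_count=2,
    expected_routes=["10.20.0.0/16"],
)

print("healthy:", healthy)
print("degraded:", degraded)
```

Returning every failure rather than stopping at the first one gives an operator the full picture from a single run.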
Smoke test:
- Local WAN simulator with two VMs
- Exercises full IPsec + BGP + routing chain
- Run via make test.local ROLE=wan_edge
Production monitoring:
- swanctl --list-sas for IPsec state
- vtysh -c "show bgp summary" for BGP state
- Prometheus exporters available for both (future enhancement)
8. References
Related ADRs:
- ADR-0101 – VLAN Allocation Strategy
- ADR-0109 – NCC Primary Hub with Routed Azure Spoke
- ADR-0601 – Nornir-Ansible Hybrid Automation
- ADR-0606 – Ansible Collections Release Process
External:
Maintainer: HybridOps.Studio
License: MIT-0 for code, CC-BY-4.0 for documentation