# Operate Proxmox SDN (core/onprem/network-sdn)

## Demonstration

Recorded walkthrough of a full SDN deploy and validation sequence using this runbook.

## Reference Scenario

The Reusable Proxmox SDN Foundation reference scenario shows a live SDN deployment result with structured run records and validation output.
Operational procedures for managing the Proxmox SDN and DHCP configuration produced by:

- module: `core/onprem/network-sdn`
- driver: Terragrunt (internal), executed via `hyops` for structured run records
- Terraform module: `hybridops-tech/sdn/proxmox` (pinned in packs)
Conventions:

- `<PROXMOX_HOST>` is the Proxmox management endpoint (IP or DNS name).
- Prefer `hyops` commands over raw `terragrunt` to keep state and run records consistent.
- SDN `zone_name` is env-scoped to avoid collisions when using `--env <env>`.
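The env-scoping convention can be illustrated with a small helper. Note that the single-letter env prefix (`shared` → `shybzone`, `dev` → `dhybzone`, as seen in the validation sections below) is an inference from examples in this runbook, not a documented contract; verify the effective zone ID on your node.

```sh
#!/bin/sh
# Illustrative sketch only -- the env-letter prefix scheme is inferred from
# examples elsewhere in this runbook (e.g. shybzone for --env shared).
effective_zone() {
  env="$1"; zone="$2"
  # First letter of the env name is prepended to the base zone_name.
  prefix=$(printf '%s' "$env" | cut -c1)
  printf '%s%s\n' "$prefix" "$zone"
}

effective_zone shared hybzone  # -> shybzone
```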
Recommended operating modes:

- **host-routed mode** (`enable_host_l3 = true`, `enable_snat = true`, `enable_dhcp = true`): use for bootstrap foundations, academy, labs, and small-site deployments.
- **edge-routed mode** (`enable_host_l3 = false`, `enable_snat = false`, `enable_dhcp = false`): use when VyOS or another edge tier owns north-south routing and egress.
Brownfield caution:

- Do not point the module at manually created SDN objects and assume it will safely “adopt” them.
- Safe brownfield options are:
    - create a new module-owned zone/VNet set, or
    - import/cut over existing SDN objects into Terraform state deliberately.
- `destroy` is safe for the zone managed in module state; it is not a safe takeover mechanism for hand-built SDN.
## 1. Deployment

### 1.1 Initial deployment

```bash
# Optional: validate readiness and inputs before mutating the target
hyops preflight --env <env> --target proxmox
hyops preflight --env <env> --strict --module core/onprem/network-sdn --inputs <inputs.yml>

# Review changes (recommended)
hyops plan --env <env> --module core/onprem/network-sdn --inputs <inputs.yml>

# Apply configuration
hyops apply --env <env> --module core/onprem/network-sdn --inputs <inputs.yml>
```
This provisions (defaults; adjust for your inputs):

- SDN zone derived from `inputs.zone_name` (default base: `hybzone`, env-scoped in Proxmox when `--env` is set).
- VNets and subnets (default: `vnetmgmt`, `vnetobs`, `vnetdata`, `vnetlab`).
- Optional host-level L3 gateways, NAT, and dnsmasq-based DHCP, depending on `enable_host_l3`, `enable_snat`, and `enable_dhcp` in module inputs.
### 1.2 Update configuration

Typical updates:

- Add or adjust VNets/subnets.
- Change DHCP ranges or DNS domain.
- Enable/disable host L3, NAT, or DHCP.

Edit your inputs overlay file (passed via `--inputs`), then:

```bash
hyops plan --env <env> --module core/onprem/network-sdn --inputs <inputs.yml>
hyops apply --env <env> --module core/onprem/network-sdn --inputs <inputs.yml>
```

HybridOps will converge SDN objects and host configuration in place.
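A minimal inputs overlay might look like the following sketch. The top-level flags match those documented in this runbook, but the subnet-list key names (`vnets`, `subnet`, `dhcp_range_start`, `dhcp_range_end`) are illustrative assumptions; check the pinned module's variable definitions for the exact schema.

```yaml
# Hypothetical overlay sketch -- verify key names against the module's inputs.
zone_name: hybzone
enable_host_l3: true
enable_snat: true
enable_dhcp: true
vnets:
  - name: vnetmgmt
    subnet: 10.10.0.0/24
    dhcp_range_start: 10.10.0.100
    dhcp_range_end: 10.10.0.200
```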
### 1.2.1 Force host-side reconcile (same-input drift recovery)

If host-side SDN state drifts (for example a `vnet*` interface exists but the expected gateway IP is missing) and your topology inputs are unchanged, use the explicit reconcile token instead of changing unrelated inputs:

```bash
HYOPS_INPUT_host_reconcile_nonce="$(date -u +%Y%m%dT%H%M%SZ)" \
  hyops apply --env shared --module core/onprem/network-sdn --inputs <inputs.yml>
```

This forces the host-side gateway/NAT/DHCP setup scripts to re-run while keeping the SDN topology inputs unchanged.
### 1.3 Destroy (lab only)

```bash
hyops destroy --env <env> --module core/onprem/network-sdn --inputs <inputs.yml>
```

Warning: This removes SDN VNets, subnets, and host DHCP/NAT configuration. Only use for lab tear-down or controlled rebuilds.

Destroy scope is zone-scoped, not a blanket Proxmox network wipe. The module removes:

- the SDN zone/VNet/subnet objects it manages
- gateway addresses derived from that zone's gateway state
- NAT rules tagged for that zone
- DHCP units/config for that zone

It does not intentionally remove unrelated bridges, unrelated SDN zones, or NAT/DHCP state outside the managed zone.
## 2. Validation

### 2.1 Quick health check

```bash
# Inspect latest module outputs for this env
cat ~/.hybridops/envs/<env>/state/modules/core__onprem__network-sdn/latest.json

# Check SDN zones on Proxmox
ssh root@<PROXMOX_HOST> 'pvesh get /cluster/sdn/zones'

# Check VNet bridges (defaults; adjust for your vnet names)
ssh root@<PROXMOX_HOST> 'ip link show | grep -E "vnet(mgmt|obs|data|lab)"'

# Check gateways and routes
ssh root@<PROXMOX_HOST> 'ip -4 addr show | grep "inet 10\."'
ssh root@<PROXMOX_HOST> 'ip route | grep 10\.'
```

Expected:

- Zone present and `active` (note: the zone ID may be env-scoped, for example `dhybzone`).
- Bridges for configured VNets (default: `vnetmgmt`, `vnetobs`, `vnetdata`, `vnetlab`).
- Gateways for configured subnets (default: `10.10.0.1`, `10.11.0.1`, `10.12.0.1`, `10.50.0.1`) when host L3 is enabled.
### 2.2 DHCP validation (when enabled)

```bash
# List per-VNet DHCP units
ssh root@<PROXMOX_HOST> 'systemctl list-units "dnsmasq@hybridops-sdn-dhcp-*"'

# Check DHCP listeners on port 67
ssh root@<PROXMOX_HOST> 'ss -ulpn | grep ":67" || true'

# Inspect generated per-VNet config
ssh root@<PROXMOX_HOST> 'ls -1 /etc/dnsmasq.d/dhcp-hybridops-sdn-dhcp-*.conf || true'
```

For each DHCP-enabled subnet you should see:

- A corresponding `dnsmasq@hybridops-sdn-dhcp-*.service` unit.
- A matching `dhcp-hybridops-sdn-dhcp-*.conf` file.
### 2.3 Test from a VM

On a VM attached to `vnetmgmt`:

```bash
# 1. Gateway reachable
ping 10.10.0.1

# 2. Internet reachable (NAT working)
ping 8.8.8.8

# 3. DNS resolution
nslookup google.com || dig google.com

# 4. DHCP lease obtained
ip addr show | grep "inet 10.10."
```

If DHCP is disabled but L3+NAT are enabled, you should still see:

- Default route via `10.10.0.1` (when configured statically).
- Successful ping to `8.8.8.8`.
### 2.4 Current shared-lane verification

Use these checks when you need current proof of the shared SDN baseline rather than just module deployment history:

```bash
hyops show module core/onprem/network-sdn --env shared
ssh root@10.10.0.1 'hostname && pvesh get /cluster/sdn/zones --output-format json'
ssh root@10.10.0.1 'ip -brief link | grep -E "vnet(data|dev|ddev|mgmt)"'
```

Expected:

- the shared SDN module is `status=ok`
- the Proxmox SDN zone `shybzone` is present
- the expected `vnet*` interfaces are `UP`
## 3. DHCP and L3/NAT behaviour

The SDN module treats L3, NAT, and DHCP as separate host-side concerns:

- `enable_host_l3` controls whether gateways (`.1` per subnet) are configured.
- `enable_snat` controls NAT out via the uplink (typically `vmbr0`).
- `enable_dhcp` controls creation of per-VNet dnsmasq units.
### 3.1 DHCP behaviour at a glance

Per-subnet behaviour is driven by the combination of:

- Global flags: `enable_host_l3`, `enable_dhcp`.
- Subnet fields: `dhcp_range_start`, `dhcp_range_end`, optional `dhcp_enabled`.
| `enable_host_l3` | `enable_dhcp` | Subnet flags / ranges | Result |
|---|---|---|---|
| `false` | `false` | anything | Pure L2 SDN only. No host gateways, no NAT, no DHCP. |
| `true` | `false` | ranges optional | Host has `.1` gateway per subnet, optional SNAT, no DHCP. |
| `true` | `true` | `dhcp_range_start` + `dhcp_range_end`, `dhcp_enabled` omitted | DHCP enabled for that subnet (implicit, “ranges = on”). |
| `true` | `true` | ranges set, `dhcp_enabled = true` | DHCP enabled for that subnet (explicit). |
| `true` | `true` | ranges set, `dhcp_enabled = false` | DHCP disabled; ranges treated as documentation only. |
Guardrails enforced by the module:

- `enable_dhcp = true` requires `enable_host_l3 = true` so dnsmasq can bind to VNet interfaces.
- If `dhcp_enabled = true`, both `dhcp_range_start` and `dhcp_range_end` must be set.
## 4. Routine DHCP management

### 4.1 Service-level view

```bash
# List all HybridOps SDN DHCP units
ssh root@<PROXMOX_HOST> 'systemctl list-units "dnsmasq@hybridops-sdn-dhcp-*"'

# Focus on a single VNet (replace <ZONE_NAME> with the effective zone ID, for example dhybzone)
ssh root@<PROXMOX_HOST> 'systemctl status dnsmasq@hybridops-sdn-dhcp-vnetmgmt-<ZONE_NAME>-10-10-0-0-24.service'
```
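The per-subnet unit name appears to follow a predictable pattern, which a small helper can derive. The naming scheme (vnet, zone, then the CIDR with `.` and `/` mapped to `-`) is inferred from the single example above, so confirm it against the units actually present on your node:

```sh
#!/bin/sh
# Hypothetical helper -- derives a per-subnet dnsmasq unit name.
# The scheme is inferred from the example in this section; verify on your node.
dhcp_unit_name() {
  vnet="$1"; zone="$2"; cidr="$3"
  # Map '.' and '/' in the CIDR to '-' (10.10.0.0/24 -> 10-10-0-0-24).
  printf 'dnsmasq@hybridops-sdn-dhcp-%s-%s-%s.service\n' \
    "$vnet" "$zone" "$(printf '%s' "$cidr" | tr './' '--')"
}

dhcp_unit_name vnetmgmt dhybzone 10.10.0.0/24
# -> dnsmasq@hybridops-sdn-dhcp-vnetmgmt-dhybzone-10-10-0-0-24.service
```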
4.2 Restart DHCP for all SDN VNets¶
ssh root@<PROXMOX_HOST> '
systemctl list-unit-files "dnsmasq@hybridops-sdn-dhcp-*" --no-legend | awk "{print \$1}" | while read -r unit; do
[ -n "$unit" ] || continue
systemctl restart "$unit" || true
done
'
### 4.3 Inspect leases (pattern)

Lease files are per-instance. A typical location pattern:

```bash
ssh root@<PROXMOX_HOST> '
  ls -1 /var/lib/misc/dnsmasq.hybridops-sdn-dhcp-*leases 2>/dev/null || true
'
```

If needed, document the exact lease file names once the first leases have been issued on your node.
## 5. Troubleshooting

### 5.1 DHCP unit will not start

Symptoms

- `systemctl status dnsmasq@hybridops-sdn-dhcp-…` shows `failed`.
- Logs report `unknown interface vnet…` or `FAILED to start up`.
- GUI shows DHCP status as failed for the SDN zone.

Diagnosis

```bash
ssh root@<PROXMOX_HOST> '
  systemctl status "dnsmasq@hybridops-sdn-dhcp-*" --no-pager
  journalctl -u "dnsmasq@hybridops-sdn-dhcp-*" -n 50 --no-pager
  ip link show | grep -E "vnet(mgmt|obs|data|lab)" || true
'
```
### 5.2 Proxmox GUI shows SDN error even though traffic works

Current expected fix path:

- use `hybridops-tech/sdn/proxmox` v0.1.5 or newer
- rerun `core/onprem/network-sdn`

What changed in v0.1.5:

- the host-side status helper now normalises generated `vnet*` stanzas in `/etc/network/interfaces.d/sdn` to `inet static` with the derived gateway address
- Proxmox SDN status then agrees with the actual host gateway state instead of showing a false red `error`

This is non-destructive. It does not change the declared topology; it only keeps the generated host interface file and the live gateway state in agreement.
Common causes

- `pvesh set /cluster/sdn` was run while VNets were still converging.
- VNet interfaces were removed or renamed.
- Another DHCP server is already bound to port 67.

Fix (pattern)

1. Confirm VNets exist:

    ```bash
    ssh root@<PROXMOX_HOST> '
      pvesh get /cluster/sdn/zones
      pvesh get /cluster/sdn/vnets
    '
    ```

2. Re-apply the stack so gateways and DHCP units are recreated:

    ```bash
    hyops apply --env <env> --module core/onprem/network-sdn --inputs <inputs.yml>
    ```

3. If a conflicting DHCP service is present, disable it:

    ```bash
    ssh root@<PROXMOX_HOST> '
      ss -ulpn | grep ":67" || true
      systemctl stop isc-dhcp-server 2>/dev/null || true
      systemctl disable isc-dhcp-server 2>/dev/null || true
    '
    ```
### 5.3 VNet bridges missing or stale

Symptoms

- `ip link show` does not list expected `vnet*` interfaces.
- Proxmox SDN UI shows VNets in `error` or `deleted` state.
- After `hyops destroy`, kernel interfaces persist.

Diagnosis

```bash
ssh root@<PROXMOX_HOST> '
  pvesh get /cluster/sdn
  ip link show | grep vnet || true
'
```

Fix (lab-safe pattern)

```bash
ssh root@<PROXMOX_HOST> '
  for v in vnetmgmt vnetobs vnetdata vnetlab; do
    ip link set "$v" down 2>/dev/null || true
    ip link delete "$v" 2>/dev/null || true
  done
  ifreload -a || true
  pvesh set /cluster/sdn || true
'
```

Use with care if the node runs other SDN zones or VNets with different naming conventions.
### 5.4 GUI mismatch vs actual state

On some Proxmox VE builds, the SDN GUI can show stale information about gateways or DHCP status.

- Prefer CLI checks (`pvesh`, `ip addr`, `iptables -t nat -S`, `systemctl`, `ss`).
- Treat the GUI as advisory rather than authoritative.
## 6. Operational recommendations

- Prefer `hyops plan`/`apply`-driven changes to SDN over manual edits.
- Reserve `hyops destroy` for full lab tear-downs.
- Keep DHCP optional in environments where another IPAM/DHCP solution is used.
- Capture CLI output from validation and troubleshooting commands as part of the run record for observability and DR runbooks.
## 7. Recommended operating modes

Use the SDN module in one of two intentional ways.

### 7.1 Host-routed mode

Best for:

- bootstrap foundations
- academy or lab environments
- smaller sites where the Proxmox host can safely provide L3/NAT/DHCP

Typical input posture:

```yaml
enable_host_l3: true
enable_snat: true
enable_dhcp: true
```

In this mode the Proxmox node owns:

- gateway IPs on `vnet*`
- optional SNAT via the uplink bridge
- optional dnsmasq DHCP
### 7.2 Edge-routed mode

Best for:

- production environments
- HybridOps WAN/VyOS edge designs
- sites where north-south routing should not live on the hypervisor

Typical input posture:

```yaml
enable_host_l3: false
enable_snat: false
enable_dhcp: false
```

In this mode Proxmox SDN is limited to segmentation and bridge/VLAN orchestration, while routing and DHCP are delegated to the edge/network layer.

Preferred production guidance:

- let the edge tier own egress and north-south routing
- use Proxmox SDN mainly for segmentation unless you explicitly want host-routed subnets
## 8. Brownfield adoption

If the site already has manually created Proxmox SDN objects:

- do not simply point the module at the same names and assume destroy will sort it out
- either create a new module-owned zone, or import the existing SDN objects into state as part of a planned cutover

Destroy is scoped to the zone managed by the module state, but that still means it will remove the SDN objects and host-side services for that module-owned zone.
## 9. Destroy semantics

`hyops destroy --module core/onprem/network-sdn` removes only the SDN zone and host-side services owned by that module instance.

It does remove, for that zone:

- module-managed SDN zone/VNet/subnet objects
- gateway IPs for that zone
- NAT rules tagged for that zone
- dnsmasq DHCP units/configs for that zone

It does not intentionally remove:

- unrelated Proxmox bridges
- unrelated SDN zones
- NAT rules without the module's zone tags
- DHCP services/configs for other zones
## Further reading

- HOWTO: Proxmox SDN with Terraform: full authoring guide covering inputs, VLAN strategy, and brownfield adoption (Academy)