Operate Proxmox SDN (core/onprem/network-sdn)
Operational procedures for managing the Proxmox SDN and DHCP configuration produced by:
- Module: core/onprem/network-sdn
- Driver: Terragrunt (internal), executed via hyops for evidence-first runs
- Terraform module: hybridops-studio/sdn/proxmox (pinned in packs)
Conventions:
- <PROXMOX_HOST> is the Proxmox management endpoint (IP or DNS name).
- Prefer hyops commands over raw terragrunt to keep state/evidence consistent.
- SDN zone_name is env-scoped to avoid collisions when using --env <env>.
1. Deployment
1.1 Initial deployment
# Optional: validate readiness and inputs before mutating the target
hyops preflight --env <env> --target proxmox
hyops preflight --env <env> --strict --module core/onprem/network-sdn --inputs <inputs.yml>
# Review changes (recommended)
hyops plan --env <env> --module core/onprem/network-sdn --inputs <inputs.yml>
# Apply configuration
hyops apply --env <env> --module core/onprem/network-sdn --inputs <inputs.yml>
This provisions (defaults; adjust for your inputs):
- SDN zone derived from inputs.zone_name (default base: hybzone, env-scoped in Proxmox when --env is set).
- VNets and subnets (default: vnetmgmt, vnetobs, vnetdata, vnetlab).
- Optional host-level L3 gateways, NAT, and dnsmasq-based DHCP, depending on enable_host_l3, enable_snat, and enable_dhcp in module inputs.
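The preflight, plan, and apply sequence above can be wrapped so that a failed step stops the run before anything is mutated. This is a minimal sketch, not part of hyops: the function name sdn_deploy and the HYOPS_BIN override are hypothetical and exist only so the sequence can be dry-run.

```shell
# Sketch: run preflight, plan, then apply, stopping at the first failure.
# HYOPS_BIN is a hypothetical override (not a hyops feature), for example
# HYOPS_BIN=echo prints the commands instead of running them.
MODULE="core/onprem/network-sdn"

sdn_deploy() {
    env_name=$1
    inputs=$2
    hyops_bin=${HYOPS_BIN:-hyops}

    # Validate the target and the module inputs before mutating anything.
    "$hyops_bin" preflight --env "$env_name" --target proxmox || return 1
    "$hyops_bin" preflight --env "$env_name" --strict --module "$MODULE" --inputs "$inputs" || return 1

    # Print the change set; abort if the plan itself fails.
    "$hyops_bin" plan --env "$env_name" --module "$MODULE" --inputs "$inputs" || return 1

    # Apply only after every earlier step succeeded.
    "$hyops_bin" apply --env "$env_name" --module "$MODULE" --inputs "$inputs"
}
```

The sketch does not pause between plan and apply. When a human review of the plan is required, run the plan step on its own first.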
1.2 Update configuration
Typical updates:
- Add or adjust VNets/subnets.
- Change DHCP ranges or DNS domain.
- Enable/disable host L3, NAT, or DHCP.
Edit your inputs overlay file (passed via --inputs), then:
hyops plan --env <env> --module core/onprem/network-sdn --inputs <inputs.yml>
hyops apply --env <env> --module core/onprem/network-sdn --inputs <inputs.yml>
HybridOps will converge SDN objects and host configuration in place.
1.2.1 Force host-side reconcile (same-input drift recovery)
If host-side SDN state drifts (for example a vnet* interface exists but the
expected gateway IP is missing) and your topology inputs are unchanged, use the
explicit reconcile token instead of changing unrelated inputs:
HYOPS_INPUT_host_reconcile_nonce="$(date -u +%Y%m%dT%H%M%SZ)" \
hyops apply --env shared --module core/onprem/network-sdn --inputs <inputs.yml>
This forces the host-side gateway/NAT/DHCP setup scripts to re-run while keeping the SDN topology inputs unchanged.
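When the reconcile is run as part of an incident, it helps to record which nonce was used. The sketch below generates the nonce once, logs it to stderr, and passes it to the same apply command. The function names and the HYOPS_BIN override are hypothetical; only the HYOPS_INPUT_host_reconcile_nonce variable and the apply command come from this runbook.

```shell
# Sketch: generate the reconcile nonce once, log it, then pass it to hyops.
# HYOPS_BIN is a hypothetical override for dry runs; it is not a hyops feature.
make_reconcile_nonce() {
    # UTC timestamp in the same format as the command above.
    date -u +%Y%m%dT%H%M%SZ
}

sdn_force_reconcile() {
    env_name=$1
    inputs=$2
    nonce=$(make_reconcile_nonce)

    # Record the nonce on stderr so it can be kept with the run evidence.
    echo "host_reconcile_nonce=$nonce" >&2

    HYOPS_INPUT_host_reconcile_nonce="$nonce" \
        "${HYOPS_BIN:-hyops}" apply --env "$env_name" \
        --module core/onprem/network-sdn --inputs "$inputs"
}
```

Example call: sdn_force_reconcile shared <inputs.yml>.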
1.3 Destroy (lab only)
hyops destroy --env <env> --module core/onprem/network-sdn --inputs <inputs.yml>
Warning: This removes SDN VNets, subnets, and host DHCP/NAT configuration. Only use for lab tear-down or controlled rebuilds.
2. Validation
2.1 Quick health check
# Inspect latest module outputs for this env
cat ~/.hybridops/envs/<env>/state/modules/core__onprem__network-sdn/latest.json
# Check SDN zones on Proxmox
ssh root@<PROXMOX_HOST> 'pvesh get /cluster/sdn/zones'
# Check VNet bridges (defaults; adjust for your vnet names)
ssh root@<PROXMOX_HOST> 'ip link show | grep -E "vnet(mgmt|obs|data|lab)"'
# Check gateways and routes
ssh root@<PROXMOX_HOST> 'ip -4 addr show | grep "inet 10\."'
ssh root@<PROXMOX_HOST> 'ip route | grep "10\."'
Expected:
- Zone present and active (note: zone ID may be env-scoped, for example dhybzone).
- Bridges for configured VNets (default: vnetmgmt, vnetobs, vnetdata, vnetlab).
- Gateways for configured subnets (default: 10.10.0.1, 10.11.0.1, 10.12.0.1, 10.50.0.1) when host L3 is enabled.
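The bridge check can be turned into a pass/fail test rather than a visual scan. The following sketch reads ip -o link show output on stdin and prints every expected VNet that is absent. The function name missing_vnets is hypothetical.

```shell
# Sketch: report which expected VNet bridges are absent from `ip -o link show`
# output read on stdin. Prints one missing name per line; returns 1 if any.
missing_vnets() {
    # Field 2 of each one-line record is the interface name; drop any @peer suffix.
    links=$(awk -F': ' '{ sub(/@.*/, "", $2); print $2 }')
    rc=0
    for vnet in "$@"; do
        if ! printf '%s\n' "$links" | grep -qxF "$vnet"; then
            echo "$vnet"
            rc=1
        fi
    done
    return "$rc"
}
```

Example call: ssh root@<PROXMOX_HOST> 'ip -o link show' | missing_vnets vnetmgmt vnetobs vnetdata vnetlab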
2.2 DHCP validation (when enabled)
# List per-VNet DHCP units
ssh root@<PROXMOX_HOST> 'systemctl list-units "dnsmasq@hybridops-sdn-dhcp-*"'
# Check DHCP listeners on port 67
ssh root@<PROXMOX_HOST> 'ss -ulpn | grep ":67" || true'
# Inspect generated per-VNet config
ssh root@<PROXMOX_HOST> 'ls -1 /etc/dnsmasq.d/dhcp-hybridops-sdn-dhcp-*.conf || true'
For each DHCP-enabled subnet you should see:
- A corresponding dnsmasq@hybridops-sdn-dhcp-*.service unit.
- A matching dhcp-hybridops-sdn-dhcp-*.conf file.
2.3 Test from a VM
On a VM attached to vnetmgmt:
# 1. Gateway reachable
ping -c 3 10.10.0.1
# 2. Internet reachable (NAT working)
ping -c 3 8.8.8.8
# 3. DNS resolution
nslookup google.com || dig google.com
# 4. DHCP lease obtained
ip addr show | grep "inet 10\.10\."
If DHCP is disabled but L3+NAT are enabled, you should still see:
- Default route via 10.10.0.1 (when configured statically).
- Successful ping to 8.8.8.8.
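The VM-side checks above can be run as one pass that prints PASS or FAIL per check. This is a sketch under the default addressing (gateway 10.10.0.1 on vnetmgmt). The function names are hypothetical, and ping -W assumes the Linux iputils ping.

```shell
# Sketch: run each VM-side check once and print PASS/FAIL per check.
run_check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS  $label"
    else
        echo "FAIL  $label"
        return 1
    fi
}

vm_network_checks() {
    gateway=${1:-10.10.0.1}   # default vnetmgmt gateway; pass another if needed
    failed=0
    run_check "gateway reachable"  ping -c 3 -W 2 "$gateway" || failed=1
    run_check "internet reachable" ping -c 3 -W 2 8.8.8.8    || failed=1
    run_check "DNS resolution"     nslookup google.com       || failed=1
    # Only meaningful when DHCP is enabled for the subnet.
    run_check "address in 10.10.x.x" \
        sh -c 'ip -4 addr show | grep -q "inet 10\.10\."'    || failed=1
    return "$failed"
}
```

Run vm_network_checks on the VM. A non-zero return means at least one check failed.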
3. DHCP and L3/NAT behaviour
The SDN module treats L3, NAT, and DHCP as separate host-side concerns:
- enable_host_l3 controls whether gateways (.1 per subnet) are configured.
- enable_snat controls NAT out via the uplink (typically vmbr0).
- enable_dhcp controls creation of per-VNet dnsmasq units.
3.1 DHCP behaviour at a glance
Per-subnet behaviour is driven by the combination of:
- Global flags: enable_host_l3, enable_dhcp.
- Subnet fields: dhcp_range_start, dhcp_range_end, optional dhcp_enabled.
| enable_host_l3 | enable_dhcp | Subnet flags / ranges | Result |
|---|---|---|---|
| false | false | anything | Pure L2 SDN only. No host gateways, no NAT, no DHCP. |
| true | false | ranges optional | Host has .1 gateway per subnet, optional SNAT, no DHCP. |
| true | true | dhcp_range_start + dhcp_range_end, dhcp_enabled omitted | DHCP enabled for that subnet (implicit, “ranges = on”). |
| true | true | ranges set, dhcp_enabled = true | DHCP enabled for that subnet (explicit). |
| true | true | ranges set, dhcp_enabled = false | DHCP disabled; ranges treated as documentation only. |
Guardrails enforced by the module:
- enable_dhcp = true requires enable_host_l3 = true so dnsmasq can bind to VNet interfaces.
- If dhcp_enabled = true, both dhcp_range_start and dhcp_range_end must be set.
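The table and guardrails can be read as a small decision function. The sketch below is an illustration of that logic, not the module's implementation, and the function name subnet_dhcp_state is hypothetical. One case is not covered by the table: with enable_dhcp = true, no dhcp_enabled flag, and an incomplete range, the sketch assumes DHCP stays off.

```shell
# Sketch of the per-subnet DHCP decision described above.
# Arguments: enable_host_l3 enable_dhcp range_start range_end dhcp_enabled
# Pass an empty string for an unset range or an omitted dhcp_enabled.
# Prints "enabled", "disabled", or "invalid: <reason>".
subnet_dhcp_state() {
    l3=$1; dhcp=$2; start=$3; end=$4; per_subnet=$5

    # Guardrail: global DHCP needs host L3 so dnsmasq can bind to the VNet.
    if [ "$dhcp" = "true" ] && [ "$l3" != "true" ]; then
        echo "invalid: enable_dhcp requires enable_host_l3"; return 1
    fi
    # Guardrail: an explicit opt-in needs both ends of the range.
    if [ "$per_subnet" = "true" ] && { [ -z "$start" ] || [ -z "$end" ]; }; then
        echo "invalid: dhcp_enabled requires both range bounds"; return 1
    fi
    # Global switch off, or explicit per-subnet opt-out: no DHCP.
    if [ "$dhcp" != "true" ] || [ "$per_subnet" = "false" ]; then
        echo "disabled"; return 0
    fi
    # Explicit opt-in, or implicit "ranges = on" when both bounds are set.
    if [ -n "$start" ] && [ -n "$end" ]; then
        echo "enabled"; return 0
    fi
    # Assumption (not stated in the table): incomplete range, no flag -> off.
    echo "disabled"
}
```

For example, subnet_dhcp_state true true 10.10.0.100 10.10.0.200 "" corresponds to the third table row (the range values are illustrative).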
4. Routine DHCP management
4.1 Service-level view
# List all HybridOps SDN DHCP units
ssh root@<PROXMOX_HOST> 'systemctl list-units "dnsmasq@hybridops-sdn-dhcp-*"'
# Focus on a single VNet (replace <ZONE_NAME> with the effective zone ID, for example dhybzone)
ssh root@<PROXMOX_HOST> 'systemctl status dnsmasq@hybridops-sdn-dhcp-vnetmgmt-<ZONE_NAME>-10-10-0-0-24.service'
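The unit name in the status command above appears to combine the VNet name, the effective zone ID, and the subnet CIDR with dots and the slash replaced by hyphens. The sketch below builds a name on that assumption. The layout is inferred from the single example in this section, not from the module source, so confirm it against systemctl list-units on your node. The function name dhcp_unit_name is hypothetical.

```shell
# Sketch: build the per-VNet DHCP unit name from vnet, zone ID, and subnet CIDR.
# The naming layout is inferred from one example and may not hold everywhere.
dhcp_unit_name() {
    vnet=$1
    zone=$2
    cidr=$3
    # 10.10.0.0/24 -> 10-10-0-0-24
    slug=$(printf '%s\n' "$cidr" | sed 's|[./]|-|g')
    printf 'dnsmasq@hybridops-sdn-dhcp-%s-%s-%s.service\n' "$vnet" "$zone" "$slug"
}
```

Example call: ssh root@<PROXMOX_HOST> "systemctl status $(dhcp_unit_name vnetmgmt <ZONE_NAME> 10.10.0.0/24)"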
4.2 Restart DHCP for all SDN VNets
ssh root@<PROXMOX_HOST> '
systemctl list-unit-files "dnsmasq@hybridops-sdn-dhcp-*" --no-legend | awk "{print \$1}" | while read -r unit; do
[ -n "$unit" ] || continue
systemctl restart "$unit" || true
done
'
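The loop above ignores restart failures. Where a failed restart should be visible, a variant along these lines may help: it reads unit names on stdin, reports each result, and returns non-zero if any restart failed. The function name restart_units and the SYSTEMCTL override are hypothetical and exist only for dry runs.

```shell
# Sketch: restart every unit named on stdin and report failures instead of
# hiding them. SYSTEMCTL is a hypothetical override used for dry runs.
restart_units() {
    failed=0
    while read -r unit; do
        [ -n "$unit" ] || continue
        # </dev/null keeps the restart command from consuming the unit list.
        if "${SYSTEMCTL:-systemctl}" restart "$unit" </dev/null; then
            echo "restarted $unit"
        else
            echo "FAILED    $unit"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

On the Proxmox host, feed it the same unit list as the loop above, for example the first column of systemctl list-unit-files "dnsmasq@hybridops-sdn-dhcp-*" --no-legend.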
4.3 Inspect leases (pattern)
Lease files are per-instance. A typical location pattern:
ssh root@<PROXMOX_HOST> '
ls -1 /var/lib/misc/dnsmasq.hybridops-sdn-dhcp-*leases 2>/dev/null || true
'
If needed, document exact lease file names once the first leases have been issued on your node.
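dnsmasq lease files hold one lease per line in the order expiry time (epoch seconds), MAC address, IP address, hostname, client ID. A minimal sketch for summarising them follows. The function name list_leases is hypothetical, and the file location is still the pattern above rather than a confirmed path.

```shell
# Sketch: summarise dnsmasq lease lines read on stdin as "IP MAC hostname".
# Lines starting with "duid" (DHCPv6 server ID) and short lines are skipped.
# A hostname of "*" means the client did not send one.
list_leases() {
    awk '$1 != "duid" && NF >= 4 { print $3, $2, $4 }'
}
```

Example call: ssh root@<PROXMOX_HOST> 'cat /var/lib/misc/dnsmasq.hybridops-sdn-dhcp-*leases' | list_leases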
5. Troubleshooting
5.1 DHCP unit will not start
Symptoms
- systemctl status dnsmasq@hybridops-sdn-dhcp-… shows failed.
- Logs report unknown interface vnet… or FAILED to start up.
- GUI shows DHCP status as failed for the SDN zone.
Diagnosis
ssh root@<PROXMOX_HOST> '
systemctl status "dnsmasq@hybridops-sdn-dhcp-*" --no-pager
journalctl -u "dnsmasq@hybridops-sdn-dhcp-*" -n 50 --no-pager
ip link show | grep -E "vnet(mgmt|obs|data|lab)" || true
'
Common causes
- pvesh set /cluster/sdn was run while VNets were still converging.
- VNet interfaces were removed or renamed.
- Another DHCP server is already bound to port 67.
Fix (pattern)
- Confirm VNets exist:
ssh root@<PROXMOX_HOST> '
pvesh get /cluster/sdn/zones
pvesh get /cluster/sdn/vnets
'
- Re-apply the stack so gateways and DHCP units are recreated:
hyops apply --env <env> --module core/onprem/network-sdn --inputs <inputs.yml>
- If a conflicting DHCP service is present, disable it:
ssh root@<PROXMOX_HOST> '
ss -ulpn | grep ":67" || true
systemctl stop isc-dhcp-server 2>/dev/null || true
systemctl disable isc-dhcp-server 2>/dev/null || true
'
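To see which process is holding port 67 before stopping anything, the ss output can be reduced to process names. The sketch below reads ss -ulpn output on stdin. The function name port67_listeners is hypothetical. Any name other than dnsmasq is a likely conflict.

```shell
# Sketch: list the process names bound to UDP port 67, from `ss -ulpn`
# output read on stdin. Output is sorted and de-duplicated.
port67_listeners() {
    # The local address is field 4, or field 5 when ss prints a Netid column.
    awk '$4 ~ /:67$/ || $5 ~ /:67$/' |
        grep -o '("[^"]*"' |   # each process entry starts with ("name"
        tr -d '("' |
        sort -u
}
```

Example call: ssh root@<PROXMOX_HOST> 'ss -ulpn' | port67_listeners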
5.2 VNet bridges missing or stale
Symptoms
- ip link show does not list expected vnet* interfaces.
- Proxmox SDN UI shows VNets in error or deleted state.
- After hyops destroy, kernel interfaces persist.
Diagnosis
ssh root@<PROXMOX_HOST> '
pvesh get /cluster/sdn
ip link show | grep vnet || true
'
Fix (lab-safe pattern)
ssh root@<PROXMOX_HOST> '
for v in vnetmgmt vnetobs vnetdata vnetlab; do
ip link set "$v" down 2>/dev/null || true
ip link delete "$v" 2>/dev/null || true
done
ifreload -a || true
pvesh set /cluster/sdn || true
'
Use with care if the node runs other SDN zones or VNets with different naming conventions.
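On nodes that carry other VNets, it can be safer to print the cleanup commands for review before running them. The sketch below emits the same commands as the fix above for an explicit list of VNet names and skips anything that is not purely alphanumeric. The function name vnet_cleanup_plan is hypothetical.

```shell
# Sketch: print the cleanup commands for the given VNet names without
# running them, so the list can be reviewed first.
vnet_cleanup_plan() {
    for v in "$@"; do
        case $v in
            ''|*[!A-Za-z0-9]*)
                # Refuse names that could smuggle shell syntax into the plan.
                echo "skipping invalid name: $v" >&2
                continue
                ;;
        esac
        printf 'ip link set %s down 2>/dev/null || true\n' "$v"
        printf 'ip link delete %s 2>/dev/null || true\n' "$v"
    done
    printf 'ifreload -a || true\n'
    printf 'pvesh set /cluster/sdn || true\n'
}
```

After review, the output can be piped to the host, for example: vnet_cleanup_plan vnetmgmt vnetlab | ssh root@<PROXMOX_HOST> sh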
5.3 GUI mismatch vs actual state
On some Proxmox VE builds, the SDN GUI can show stale information about gateways or DHCP status.
- Prefer CLI checks (pvesh, ip addr, iptables -t nat -S, systemctl, ss).
- Treat the GUI as advisory rather than authoritative.
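As one example of a CLI check, the NAT state can be confirmed from iptables -t nat -S output instead of the GUI. The sketch below looks for a POSTROUTING rule whose source is the given subnet. Whether the module writes MASQUERADE or SNAT rules is not stated in this runbook, so the sketch accepts either. The function name has_nat_rule is hypothetical.

```shell
# Sketch: return 0 if `iptables -t nat -S` output (stdin) contains a
# POSTROUTING rule that NATs the given source subnet.
# Assumption: the rule uses -s <subnet> and a MASQUERADE or SNAT target.
has_nat_rule() {
    subnet=$1
    awk -v s="$subnet" '
        $1 == "-A" && $2 == "POSTROUTING" {
            src = ""; target = ""
            for (i = 3; i < NF; i++) {
                if ($i == "-s") src = $(i + 1)
                if ($i == "-j") target = $(i + 1)
            }
            if (src == s && (target == "MASQUERADE" || target == "SNAT")) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Example call: ssh root@<PROXMOX_HOST> 'iptables -t nat -S' | has_nat_rule 10.10.0.0/24 && echo "NAT rule present"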
6. Operational recommendations
- Prefer SDN changes driven by hyops plan/apply over manual edits.
- Reserve hyops destroy for full lab tear-downs.
- Keep DHCP optional in environments where another IPAM/DHCP solution is used.
- Capture CLI output from validation and troubleshooting commands as evidence for observability and DR runbooks.
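One way to capture CLI output as evidence is a small wrapper that runs a command and writes its combined output to a timestamped file. This is a sketch only: the function name capture_evidence, the directory, and the file naming are hypothetical and are not a hyops convention.

```shell
# Sketch: run a command and save its combined stdout/stderr to a
# timestamped file under the given directory. Prints the file path.
capture_evidence() {
    dir=$1
    label=$2
    shift 2
    mkdir -p "$dir" || return 1
    file="$dir/$(date -u +%Y%m%dT%H%M%SZ)-$label.log"
    {
        echo "# command: $*"
        # Keep going on failure so the exit status is recorded too.
        "$@" 2>&1 && rc=0 || rc=$?
        echo "# exit status: $rc"
    } >"$file"
    echo "$file"
}
```

Example call: capture_evidence ./evidence sdn-zones ssh root@<PROXMOX_HOST> 'pvesh get /cluster/sdn/zones'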