HOWTO: Deploy Observability Stack (Prometheus/Grafana/Loki) in VLAN 11¶
Purpose: Deploy a unified Prometheus + Grafana + Loki stack in VLAN 11 so you can observe dev/staging/prod clusters from a single pane of glass.
Difficulty: Intermediate
Prerequisites:
- VLAN 11 (Observability) configured as per ADR‑0101 and reachable from management.
- Proxmox VM templates ready (Linux base with cloud-init).
- Basic familiarity with systemd services and Docker/Podman or native binaries.
Demo / Walk-through¶
▶ Watch the observability stack walk-through
If the embed does not load, use the direct link:
Open on YouTube
1. Context¶
This HOWTO complements decisions captured in:
- ADR-0101 – VLAN Allocation Strategy
- ADR-0103 – Inter-VLAN Firewall Policy
- ADR-0401 – Unified Observability with Prometheus
- Network Architecture overview
The observability stack lives in VLAN 11 and:
- Scrapes metrics from workloads in VLANs 20/30/40.
- Serves dashboards to authorised users in management VLAN 10.
- Stores logs centrally via Loki to correlate incidents across environments.
This guide uses VM-per-component (Prometheus, Grafana, Loki) but can be adapted to containers or Kubernetes later.
2. Lab Assumptions¶
Replace the examples with your real values.
2.1 Networks and Hosts¶
| Item | Example value |
|---|---|
| VLAN ID | 11 |
| Subnet | 10.11.0.0/24 |
| Gateway (Proxmox) | 10.11.0.1 |
| Prometheus VM | 10.11.0.10 |
| Grafana VM | 10.11.0.11 |
| Loki VM | 10.11.0.12 |
| Management jump host | 10.10.0.10 |
2.2 Access and Tools¶
- SSH access from VLAN 10 to VLAN 11.
- Terraform + Ansible available on your management host.
- DNS or
/etc/hostsentries forprometheus.lab,grafana.lab,loki.lab(optional but recommended).
3. Provision the VMs in VLAN 11¶
You can use Terraform modules or Proxmox GUI; the important part is:
- Each VM attaches to the Proxmox bridge for VLAN 11.
- Static IPs are assigned according to ADR‑0104 (Terraform IPAM is preferred).
3.1 Using Terraform IPAM + VM module (recommended)¶
module "ipam_observability" {
source = "../../modules/proxmox/ipam"
allocations = {
prometheus-01 = { vlan = 11, offset = 10 } # 10.11.0.10
grafana-01 = { vlan = 11, offset = 11 } # 10.11.0.11
loki-01 = { vlan = 11, offset = 12 } # 10.11.0.12
}
}
module "vm_observability" {
source = "../../modules/proxmox/vm"
vms = {
prometheus-01 = {
ipv4_address = module.ipam_observability.ipv4_addresses["prometheus-01"]
ipv4_gateway = module.ipam_observability.gateways[11]
# other CPU/RAM/template settings
}
# grafana-01, loki-01...
}
}
Apply and verify that all three VMs:
ping -c3 10.11.0.10
ping -c3 10.11.0.11
ping -c3 10.11.0.12
4. Install and Configure Prometheus¶
4.1 Install Prometheus¶
On prometheus-01:
sudo useradd --no-create-home --shell /usr/sbin/nologin prometheus || true
sudo mkdir -p /etc/prometheus /var/lib/prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.55.0/prometheus-2.55.0.linux-amd64.tar.gz
tar -xzf prometheus-2.55.0.linux-amd64.tar.gz
cd prometheus-2.55.0.linux-amd64
sudo cp prometheus promtool /usr/local/bin/
sudo cp -r consoles console_libraries /etc/prometheus/
4.2 Base configuration¶
/etc/prometheus/prometheus.yml:
global:
scrape_interval: 15s
evaluation_interval: 30s
scrape_configs:
- job_name: 'node-dev'
static_configs:
- targets:
- '10.20.0.10:9100'
- '10.20.0.11:9100'
labels:
environment: dev
- job_name: 'node-staging'
static_configs:
- targets:
- '10.30.0.10:9100'
labels:
environment: staging
- job_name: 'node-prod'
static_configs:
- targets:
- '10.40.0.10:9100'
labels:
environment: prod
4.3 Systemd service¶
/etc/systemd/system/prometheus.service:
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
ExecStart=/usr/local/bin/prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus --web.listen-address=0.0.0.0:9090
[Install]
WantedBy=multi-user.target
sudo chown -R prometheus:prometheus /etc/prometheus /var/lib/prometheus
sudo systemctl daemon-reload
sudo systemctl enable --now prometheus
Validate:
curl -s http://10.11.0.10:9090/metrics | head
5. Install and Configure Grafana¶
On grafana-01 (Ubuntu example):
sudo apt-get update
sudo apt-get install -y apt-transport-https software-properties-common wget
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt-get update
sudo apt-get install -y grafana
sudo systemctl enable --now grafana-server
Add Prometheus data source via UI (or provisioning):
- URL:
http://10.11.0.10:9090 - Access:
Server
Create an “Environment Overview” dashboard with:
- Variable
environment(dev|staging|prod). - Panels for CPU, memory, node count filtered by label.
6. Install and Configure Loki (Optional but Recommended)¶
On loki-01:
wget https://github.com/grafana/loki/releases/download/v3.1.0/loki-linux-amd64.zip
unzip loki-linux-amd64.zip
sudo mv loki-linux-amd64 /usr/local/bin/loki
sudo chmod +x /usr/local/bin/loki
Minimal config /etc/loki/loki.yml (single-node):
auth_enabled: false
server:
http_listen_port: 3100
ingester:
lifecycler:
ring:
kvstore:
store: inmemory
replication_factor: 1
chunk_idle_period: 5m
schema_config:
configs:
- from: 2024-01-01
store: boltdb
object_store: filesystem
schema: v11
index:
prefix: index_
period: 24h
storage_config:
boltdb:
directory: /var/lib/loki/index
filesystem:
directory: /var/lib/loki/chunks
Create systemd unit similar to Prometheus and start Loki on :3100.
Configure Grafana Loki data source pointing at http://10.11.0.12:3100.
7. Firewall and Connectivity Validation¶
Ensure firewall rules from ADR‑0103 allow:
- From VLAN 11 to dev/staging/prod on exporter ports (e.g. 9100, 10250, etc.).
- From VLAN 10 (management) to VLAN 11 on 9090 (Prometheus), 3000 (Grafana), 3100 (Loki, if required).
Basic checks:
# From management jump host
curl -I http://10.11.0.10:9090
curl -I http://10.11.0.11:3000
# From prometheus-01
curl -I http://10.20.0.10:9100
In Grafana, confirm:
- Prometheus data source shows as green.
- Environment dashboards show metrics for dev/staging/prod.
8. Validation Checklist¶
- [ ] Prometheus VM up in VLAN 11 and reachable on
:9090. - [ ] Grafana VM up in VLAN 11 and reachable on
:3000. - [ ] (Optional) Loki VM up in VLAN 11 and reachable on
:3100. - [ ] Firewall allows Prometheus to scrape exporters in VLANs 20/30/40.
- [ ] Grafana dashboards show environment-labelled metrics.
- [ ] Prometheus rules / alerts fire correctly in test scenarios.
References¶
- ADR‑0101 – VLAN Allocation Strategy
- ADR‑0103 – Inter-VLAN Firewall Policy
- ADR‑0401 – Unified Observability with Prometheus
- Network Architecture
Maintainer: HybridOps.Studio
License: MIT-0 for code, CC-BY-4.0 for documentation unless otherwise stated.