A Year and a Half of Proxmox

What started as a single Tailscale container grew into a 12+ service environment spanning personal services, home networking, observability, and a production deployment target for my own software.

The Short Version

I've been running a Proxmox homelab for about a year and a half. It didn't start as a "homelab." It started as one container doing one useful thing. Every era since has been driven by a concrete problem I wanted to solve, or a piece of infrastructure I wanted to stop misunderstanding.

The lab now serves me, my roommate, and my baseball team. It has a real observability stack, real network segmentation, and real rollback plans. This is the history of how it got there.

Timeline

Late 2024 · Rosita
The Beginning

Installed Proxmox on a spare box at Rosita, the name my roommates and I gave the house. First CT was Tailscale, so I could VPN into the apartment from anywhere. Second CT was Pi-hole, running network-wide DNS ad-blocking for the whole place. That was the entire lab for a while: two containers doing useful household work.

Early 2025 · First VM
Someone Else Depends On It

Spun up a Windows 11 VM for my roommate Matthew to run his trading automations. First time the lab was serving someone besides me, and the first time an outage actually mattered to anyone else. Uptime stopped being theoretical.

June 2025 · Felton
The Move + Immich

Rack moved with me from SLO to Felton. Stood up Immich as a self-hosted photo library, originally to host photos for my Cal Poly club baseball team. Later added a dedicated 300 GiB volume for personal photo backup so the team library and my own photos weren't fighting for space.

Late 2025 · Early 2026
The DevOps Era

This is where the lab stopped being "a few useful services" and became a real learning environment. Each addition forced me to learn something I'd been hand-waving over:

  • n8n (LXC) for scheduled automation, including a weekly apt-upgrade job that SSHes into every host in the fleet.
  • ntopng for flow-level network visibility. Finally being able to see what my network was actually doing.
  • Home Assistant Green (physical device) for home automation.
  • Samba video share (LXC) as a macOS-friendly editing scratch volume.

April 2026
Network Redesign + Observability Stack

The biggest level-up so far. Four things happened back-to-back:

  • Migrated the network off a consumer R9000 router onto pfSense CE 2.7.2 running as a VM. Configured VLANs for LAN, IoT, Servers, and Guest. Cutover was April 13.
  • Redesigned the IP scheme from scattered (.9, .12, .14, .17–.20) to sequential (.2–.11) with DHCP static reservations for every service. Every host has a predictable address now.
  • Built a ClickHouse + Grafana observability stack from scratch. ClickHouse on a dedicated 200 GiB disk with a single database, homelab. Grafana CT with provisioned dashboards: "Host Overview" and "Outage Timeline."
  • Exposed ClickHouse publicly via Cloudflare Tunnel at clickhouse.nicolod.org, gated to an INSERT-only user. Pivoted here from Tailscale Funnel after discovering it blocks datacenter egress, which had been silently breaking my heartbeat Worker.

The heartbeat itself: a Cloudflare Worker (homelab-heartbeat) fires on a cron trigger every minute and writes a row into homelab.heartbeats. Independent external uptime verification instead of the lab self-reporting.

What the Lab Looks Like Today

  • Hypervisor · Proxmox VE 9.1.7
  • Compute · 24 logical CPUs
  • Guests · 12+ CT / VM
  • Network · 192.168.1.0/24 · VLANs
  • Switch · TL-SG108E (managed, trunked)
  • Edge / Firewall · pfSense CE 2.7.2 (VM)
  • Warehouse Disk · ClickHouse · 200 GiB
  • Public Ingress · Cloudflare Tunnel only

Fleet-wide SSH: my ed25519 key is authorized on every CT and VM I operate, so I can hop into any service from my workstation with zero friction. Different users per host: CTs use root@, VMs use non-root service accounts like clickhouse@ and devhost@.
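The root@-for-CTs, service-account-for-VMs convention is easy to encode. A minimal sketch — the host names and addresses below are illustrative, not the real fleet map:

```python
# Per-host SSH user convention: CTs get root@, VMs get a non-root
# service account. Only that split comes from the post; the inventory
# entries here are made up.
FLEET = {
    "pihole":     {"ip": "192.168.1.2", "kind": "ct"},
    "grafana":    {"ip": "192.168.1.3", "kind": "ct"},
    "clickhouse": {"ip": "192.168.1.4", "kind": "vm", "user": "clickhouse"},
    "devhost":    {"ip": "192.168.1.5", "kind": "vm", "user": "devhost"},
}

def ssh_target(name: str) -> str:
    """Return the user@host string for a fleet member."""
    host = FLEET[name]
    user = "root" if host["kind"] == "ct" else host["user"]
    return f"{user}@{host['ip']}"
```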

No port-forwarding, no exposed IP. Cloudflare Tunnel is the only public ingress. Everything is version-controlled configs with explicit rollback plans. I've learned the hard way that "I'll remember how I set this up" is a lie.

Services

The 12+ containers and VMs that make up the current fleet.

pfSense · Running

Edge router + firewall running as a VM. VLANs for LAN, IoT, Servers, and Guest. Replaced the consumer R9000 in April 2026.

ClickHouse · Running

Time-series warehouse (LTS 26.3) on a dedicated 200 GiB disk. The homelab database stores heartbeats, host metrics, and outage timeline data.

Grafana · Running

LXC with provisioned dashboards ("Host Overview" and "Outage Timeline") backed by the ClickHouse native-protocol datasource.

n8n · Running

Scheduled workflow runner. Hosts the Daily Homelab Report and the weekly fleet-wide apt-upgrade job.

DevHost (Zedi) · Running

VM 110 · production deployment target for Zedi. Rust + Axum + Postgres in Docker. My own software, running on my own infrastructure.

Pi-hole · Running

Network-wide DNS and ad-blocking. 14k+ queries/day, ~3% blocked.

ntopng · Running

Flow-level network visibility. When something on the LAN is acting up, this is where I go first.

Tailscale · Running

Exit node + subnet router. Private mesh access to internal services from anywhere.

Immich · Running

Self-hosted photo backup: the baseball team library plus a dedicated 300 GiB volume for personal photos.

Home Assistant Green · Running

Physical device for home automation. Not a VM, but a dedicated box living on the IoT VLAN.

Samba Video Share · Running

LXC serving a macOS-friendly SMB share. Scratch volume for editing without filling up the MacBook.

Traderbot · Running

Windows 11 VM running my roommate's trading automations 24/7. The first "someone-else-depends-on-this" workload.

Observability

The goal: stop guessing whether the lab is healthy. Three pieces work together: a time-series warehouse I control (ClickHouse), a dashboard layer (Grafana), and an external heartbeat that verifies the lab from outside itself (Cloudflare Worker).

The whole reason ClickHouse is publicly reachable is so my heartbeat Worker can insert into it. The auth posture is narrow on purpose: one Cloudflare Tunnel hostname (clickhouse.nicolod.org), authenticated as an INSERT-only user. No SELECT, no DROP, no schema access. Nothing a leaked credential could exfiltrate.
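A minimal sketch of that grant posture in ClickHouse DDL. The user's name and auth method are assumptions; only the INSERT-only scope on homelab.heartbeats comes from the post:

```sql
-- Hypothetical user name and password; the post doesn't name either.
CREATE USER heartbeat IDENTIFIED WITH sha256_password BY '...';

-- The only privilege granted: inserting into the heartbeat table.
GRANT INSERT ON homelab.heartbeats TO heartbeat;

-- No SELECT, no DDL, no other tables. Grants are additive, so nothing
-- needs revoking -- the user simply never receives anything else.
```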

ClickHouse

LTS 26.3 on its own VM with a 200 GiB dedicated disk. A single database, homelab, stores heartbeats, Telegraf metrics, and outage records.

Grafana

"Host Overview" and "Outage Timeline" dashboards, both provisioned as code. Backed by the ClickHouse native-protocol datasource.

homelab-heartbeat (Cloudflare Worker)

Worker that inserts a row every minute, on a cron trigger, over Cloudflare Tunnel. External verification: if rows stop appearing, the lab is unreachable from outside, no matter what it reports about itself.
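With one beat per minute, outage windows fall straight out of gaps between consecutive timestamps — roughly what the "Outage Timeline" dashboard visualizes. A sketch, with an assumed 90-second threshold (one beat plus slack; the post doesn't state the exact cutoff):

```python
from datetime import datetime, timedelta

def outage_windows(beats, max_gap=timedelta(seconds=90)):
    """Derive (start, end) outage windows from sorted heartbeat timestamps.

    Any gap between consecutive beats longer than max_gap counts as an
    outage. The 90 s default is an assumption, not from the post.
    """
    gaps = []
    for prev, cur in zip(beats, beats[1:]):
        if cur - prev > max_gap:
            gaps.append((prev, cur))
    return gaps

# Beats at 9:00, 9:01, 9:02, then silence until 9:09 -> one 7-minute gap.
beats = [datetime(2026, 4, 18, 9, m) for m in (0, 1, 2, 9, 10)]
```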

Automation

Two n8n workflows do most of the day-to-day fleet ops. Both run from a dedicated n8n LXC with an ed25519 key authorized across every host.

Daily Homelab Report · 9:00 AM

Polls every host, rolls up health + storage + network + security into a single message, and posts it to Telegram. If anything is off, I see it before I open my laptop.

Weekly Fleet Upgrade

SSHes into every capable CT and VM, runs apt update && apt upgrade, reports back. Kernel and dependency drift is a boring problem, and automation is the right answer.
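The upgrade pass can be sketched as a loop over SSH targets. This is an illustrative Python equivalent of what the n8n workflow does, not its actual implementation; the -y flag and BatchMode are assumptions added so the remote command can run unattended:

```python
import subprocess

def upgrade_command(target: str) -> list[str]:
    """Build the non-interactive upgrade invocation for one user@host.

    BatchMode=yes makes ssh fail fast instead of prompting, which is
    what you want from a scheduled job; -y keeps apt from prompting too.
    """
    remote = "apt update && apt upgrade -y"
    return ["ssh", "-o", "BatchMode=yes", target, remote]

def upgrade_fleet(targets):
    """Run upgrades sequentially, collecting (target, returncode) pairs."""
    results = []
    for t in targets:
        proc = subprocess.run(upgrade_command(t), capture_output=True, text=True)
        results.append((t, proc.returncode))
    return results
```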

daily-report · n8n · sample
🟢 Homelab Daily · Apr 18, 2026, 9:00 AM
Status: OK · 0 warning · 0 critical

🖥️ Hosts & Health
  • Guests 12/13 up
  • PVE uptime 11d · PVE 9.1.7
  • Telegraf 6/6 reporting
  • Backups 24h 0 OK

💾 Storage
  • TANK (zfs) 443.9 / 899.3 GB (49.4% · +0.0 GB/24h)
  • clickhouse 250 / 1863 GB (13.4% · +0.0 GB/24h)
  • local-lvm 8.5 / 137.4 GB (6.2%)

📈 Per-host anomalies
  • clickhouse load peak 10.5

🌐 Network & Security
  • WAN online · unchanged
  • Gateway 8.0ms avg · 0% loss
  • PF firewall 1,920 block entries in window
  • PF auth 0 failed login patterns
  • DNS 14,634 queries · 3.3% blocked
  • Top blocked 192.168.1.5: 186
  • Quiet hosts traderbot, tailscale, immich, n8n, pihole, ntopng, clawdbot, devhost, grafana, video-share, pfsense, immich (expected), arbitrage-bot (expected)

Cloudflare Ingress

Cloudflare Tunnel is the only way into the lab from the public internet. No port-forwarding on the router, no exposed WAN IP, no services listening on the open internet. Each tunnel is a narrow, encrypted connection from a specific service to a specific hostname.

  • clickhouse.nicolod.org · INSERT-only
  • os.nicolod.org · NicoOS (Workers)
  • nicolod.org · portfolio
  • homelab-heartbeat · Worker cron

I pivoted here from Tailscale Funnel after finding out it blocks datacenter egress, which silently broke my heartbeat Worker. Cloudflare Tunnel is free, works from anywhere, and keeps the attack surface at "one auth-gated hostname per service."
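The ingress side of that posture, as a cloudflared config sketch. The tunnel ID, credentials path, and local port are assumptions (8123 is ClickHouse's default HTTP port); only the hostname comes from the post:

```yaml
# Hypothetical cloudflared config; <tunnel-id> stands in for the real ID.
tunnel: <tunnel-id>
credentials-file: /etc/cloudflared/<tunnel-id>.json
ingress:
  - hostname: clickhouse.nicolod.org
    service: http://localhost:8123   # ClickHouse HTTP interface
  - service: http_status:404         # reject everything else
```

The catch-all 404 rule is what keeps the tunnel narrow: anything that isn't the one expected hostname gets nothing.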

Why I Built It

Infrastructure-as-a-learning-tool. Reading about pfSense or ClickHouse is not the same as cutting your home network over to a new router at 11 PM and needing it back up before anyone notices. You remember the thing you built under pressure for real people.

The lab is also the production deployment target for my own software. Zedi runs on DevHost (VM 110) against my real financial data. If I can't keep my own infrastructure up, I can't credibly ship software to anyone else.

Tech Stack

Everything currently load-bearing.

Proxmox VE 9.1 · pfSense CE 2.7.2 · ClickHouse 26.3 · Grafana · Telegraf · n8n · ZFS · LVM · Docker · Linux (Debian / Ubuntu) · Windows 11 VM · Cloudflare Tunnel · Cloudflare Workers · Cloudflare R2 · Cloudflare DNS · Tailscale · Immich · Pi-hole · ntopng · Home Assistant · Samba · TL-SG108E · VLAN trunking · SSH (ed25519, fleet-wide) · Bash · Python · Node.js