A Year and a Half of Proxmox
What started as a single Tailscale container grew into a 12+ service environment spanning personal services, home networking, observability, and a production deployment target for my own software.
The Short Version
I've been running a Proxmox homelab for about a year and a half. It didn't start as a "homelab." It started as one container doing one useful thing. Every era since has been driven by a concrete problem I wanted to solve, or a piece of infrastructure I wanted to stop misunderstanding.
The lab now serves me, my roommate, and my baseball team. It has a real observability stack, real network segmentation, and real rollback plans. This is the history of how it got there.
Timeline
Installed Proxmox on a spare box at Rosita, the name my roommates and I gave our house. First CT was Tailscale, so I could VPN into the apartment from anywhere. Second CT was Pi-hole, running network-wide DNS ad-blocking for the whole place. That was the entire lab for a while: two containers doing useful household work.
Spun up a Windows 11 VM for my roommate Matthew to run his trading automations. First time the lab was serving someone besides me, and the first time an outage actually mattered to anyone else. Uptime stopped being theoretical.
Rack moved with me from SLO to Felton. Stood up Immich as a self-hosted photo library, originally to host photos for my Cal Poly club baseball team. Later added a dedicated 300 GiB volume for personal photo backup so the team library and my own photos weren't fighting for space.
This is where the lab stopped being "a few useful services" and became a real learning environment. Each addition forced me to learn something I'd been hand-waving over:
- n8n (LXC) for scheduled automation, including a weekly apt-upgrade job that SSHes into every host in the fleet.
- ntopng for flow-level network visibility. Finally being able to see what my network was actually doing.
- Home Assistant Green (physical device) for home automation.
- Samba video share (LXC) as a macOS-friendly editing scratch volume.
The biggest level-up so far. Four things happened back-to-back:
- Migrated the network off a consumer R9000 router onto pfSense CE 2.7.2 running as a VM. Configured VLANs for LAN, IoT, Servers, and Guest. Cutover was April 13.
- Redesigned the IP scheme from scattered (.9, .12, .14, .17–.20) to sequential (.2–.11) with DHCP static reservations for every service. Every host has a predictable address now.
- Built a ClickHouse + Grafana observability stack from scratch. ClickHouse on a dedicated 200 GiB disk, database homelab. Grafana CT with provisioned dashboards: "Host Overview" and "Outage Timeline."
- Exposed ClickHouse publicly via Cloudflare Tunnel at clickhouse.nicolod.org, gated to an INSERT-only user. Pivoted here from Tailscale Funnel after discovering it blocks datacenter egress, which had been silently breaking my heartbeat Worker.
The heartbeat itself: a Cloudflare Worker (homelab-heartbeat) runs on a cron trigger every minute and writes a row into homelab.heartbeats. Independent external uptime verification instead of the lab self-reporting.
What the Lab Looks Like Today
Fleet-wide SSH: my ed25519 key is authorized on every CT and VM I operate, so I can hop into any service from my workstation with zero friction. Different users per host: CTs use root@, VMs use non-root service accounts like clickhouse@ and devhost@.
Cloudflare Tunnel is the only public ingress: no port-forwarding, no exposed IP. Configs are version-controlled with explicit rollback plans. I've learned the hard way that "I'll remember how I set this up" is a lie.
Services
The 12+ containers and VMs that make up the current fleet.
pfSense
Edge router + firewall running as a VM. VLANs for LAN, IoT, Servers, and Guest. Replaced the consumer R9000 in April 2026.
ClickHouse
Time-series warehouse (LTS 26.3) on a dedicated 200 GiB disk. Database homelab stores heartbeats, host metrics, and outage timeline data.
Grafana
LXC with provisioned dashboards ("Host Overview" and "Outage Timeline") backed by the ClickHouse native-protocol datasource.
n8n
Scheduled workflow runner. Hosts the daily Homelab report and the weekly fleet-wide apt-upgrade job.
DevHost (Zedi)
VM 110 · production deployment target for Zedi. Rust + Axum + Postgres in Docker. My own software, running on my own infrastructure.
Pi-hole
Network-wide DNS and ad-blocking. 14k+ queries/day, ~3% blocked.
ntopng
Flow-level network visibility. When something on the LAN is acting up, this is where I go first.
Tailscale
Exit node + subnet router. Private mesh access to internal services from anywhere.
Immich
Self-hosted photo backup: baseball team library plus a dedicated 300 GiB volume for personal photos.
Home Assistant Green
Physical device for home automation. Not a VM, but a dedicated box living on the IoT VLAN.
Samba Video Share
LXC serving a macOS-friendly SMB share. Scratch volume for editing without filling up the MacBook.
Traderbot
Windows 11 VM running my roommate's trading automations 24/7. The first "someone-else-depends-on-this" workload.
Observability
The goal: stop guessing whether the lab is healthy. Three pieces work together: a time-series warehouse I control (ClickHouse), a dashboard layer (Grafana), and an external heartbeat that verifies the lab from outside itself (Cloudflare Worker).
The whole reason ClickHouse is publicly reachable is so my heartbeat Worker can insert into it. The auth posture is narrow on purpose: one Cloudflare Tunnel hostname (clickhouse.nicolod.org), authenticated as an INSERT-only user. No SELECT, no DROP, no schema access. Nothing a leaked credential could exfiltrate.
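For illustration, here's roughly how such a user gets provisioned over ClickHouse's HTTP interface (port 8123). This is a sketch, not the production script: the user name heartbeat_writer, the internal address, and the env var names are placeholders; only the INSERT-only grant on homelab.heartbeats reflects the actual posture.

```ts
// Hypothetical one-off admin script, run from inside the LAN rather than
// through the tunnel. Node 18+ (global fetch, top-level await in ESM).
const CLICKHOUSE_URL = "http://192.168.1.5:8123/"; // assumed internal address

async function run(query: string): Promise<string> {
  const res = await fetch(CLICKHOUSE_URL, {
    method: "POST",
    headers: {
      "X-ClickHouse-User": "default",
      "X-ClickHouse-Key": process.env.CH_ADMIN_PASSWORD ?? "",
    },
    body: query,
  });
  if (!res.ok) throw new Error(`ClickHouse ${res.status}: ${await res.text()}`);
  return res.text();
}

// The user can write rows into homelab.heartbeats and do nothing else:
// no SELECT, no DDL, no access to other tables.
await run(
  `CREATE USER IF NOT EXISTS heartbeat_writer IDENTIFIED BY '${process.env.HEARTBEAT_PASSWORD ?? ""}'`,
);
await run(`GRANT INSERT ON homelab.heartbeats TO heartbeat_writer`);
```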
ClickHouse
LTS 26.3 on its own VM with a 200 GiB dedicated disk. Database homelab. Stores heartbeats, Telegraf metrics, and outage records.
Grafana
"Host Overview" and "Outage Timeline" dashboards, both provisioned as code. Backed by the ClickHouse native-protocol datasource.
homelab-heartbeat (Cloudflare Worker)
Cron-every-minute Worker that inserts a row over Cloudflare Tunnel. External verification: if the Worker stops inserting, I know before the lab realizes it's down.
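A minimal sketch of what that Worker could look like, written as a module Worker hitting ClickHouse's HTTP interface through the tunnel. The heartbeats schema, the user name, and the secret binding are my assumptions, not the production code; types come from @cloudflare/workers-types.

```ts
// Sketch of homelab-heartbeat. The cron trigger lives in wrangler.toml:
// crons = ["* * * * *"].
interface Env {
  CLICKHOUSE_PASSWORD: string;
}

export default {
  async scheduled(_controller: ScheduledController, env: Env): Promise<void> {
    const res = await fetch("https://clickhouse.nicolod.org/", {
      method: "POST",
      headers: {
        "X-ClickHouse-User": "heartbeat_writer", // the INSERT-only user
        "X-ClickHouse-Key": env.CLICKHOUSE_PASSWORD,
      },
      // One row per minute; a gap in this table *is* the outage signal.
      body: "INSERT INTO homelab.heartbeats (ts, source) VALUES (now(), 'cf-worker')",
    });
    if (!res.ok) {
      // Failures land in Workers logs; Grafana sees the missing rows.
      console.error(`heartbeat insert failed: ${res.status} ${await res.text()}`);
    }
  },
} satisfies ExportedHandler<Env>;
```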
Automation
Two n8n workflows do most of the day-to-day fleet ops. Both run from a dedicated n8n LXC with an ed25519 key authorized across every host.
Daily Homelab Report · 9:00 AM
Polls every host, rolls up health + storage + network + security into a single message, and posts it to Telegram. If anything is off, I see it before I open my laptop.
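The rollup itself is built from n8n nodes; the delivery step amounts to one call to the Telegram Bot API's sendMessage endpoint. A standalone sketch, with the bot token, chat ID, and sample message as placeholders:

```ts
// Delivery step only: post a pre-built report string to Telegram.
async function postReport(report: string): Promise<void> {
  const url = `https://api.telegram.org/bot${process.env.BOT_TOKEN}/sendMessage`;
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ chat_id: process.env.CHAT_ID, text: report }),
  });
  if (!res.ok) throw new Error(`Telegram API ${res.status}: ${await res.text()}`);
}

// Illustrative message; the real report rolls up health, storage,
// network, and security per host.
await postReport("Homelab 9:00 report: all hosts healthy, no outages overnight.");
```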
Weekly Fleet Upgrade
SSHes into every capable CT and VM, runs apt update && apt upgrade, reports back. Kernel and dependency drift is a boring problem, and automation is the right answer.
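Outside of n8n, the same job is a short loop. A standalone sketch, assuming an illustrative host list and the root@/service-account convention described above (the -y flag is my addition so upgrades run unattended):

```ts
// Standalone equivalent of the weekly n8n job. Node 18+, ESM.
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const ssh = promisify(execFile);

// Placeholder fleet; CTs as root@, VMs as their service accounts.
const hosts = ["root@pihole", "root@grafana", "root@n8n", "clickhouse@clickhouse"];

for (const host of hosts) {
  try {
    // BatchMode fails fast if the ed25519 key isn't accepted,
    // instead of hanging on a password prompt.
    const { stdout } = await ssh("ssh", [
      "-o", "BatchMode=yes",
      host,
      "apt update && apt upgrade -y",
    ]);
    console.log(`${host}: upgraded\n${stdout}`);
  } catch (err) {
    console.error(`${host}: upgrade failed`, err);
  }
}
```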
Cloudflare Ingress
Cloudflare Tunnel is the only way into the lab from the public internet. No port-forwarding on the router, no exposed WAN IP, no services listening on the open internet. Each tunnel is a narrow, encrypted connection from a specific service to a specific hostname.
- clickhouse.nicolod.org · INSERT-only
- os.nicolod.org · NicoOS (Workers)
- nicolod.org · portfolio
- homelab-heartbeat · Worker cron
I pivoted here from Tailscale Funnel after finding out it blocks datacenter egress, which silently broke my heartbeat Worker. Cloudflare Tunnel is free, works from anywhere, and keeps the attack surface at "one auth-gated hostname per service."
Why I Built It
Infrastructure-as-a-learning-tool. Reading about pfSense or ClickHouse is not the same as cutting your home network over to a new router at 11 PM and needing it back up before anyone notices. You remember the thing you built under pressure for real people.
The lab is also the production deployment target for my own software. Zedi runs on DevHost (VM 110) against my real financial data. If I can't keep my own infrastructure up, I can't credibly ship software to anyone else.
Tech Stack
Everything currently load-bearing.