zenodotus280

SYSLOG 25-W03

FreeBSD ZFS drive replacement via geom disk/part IDs; PBS co-hosted with PVE for rotating backups; homelab rack reshuffle and RAM reallocation


Avalon Drive Failure and Replacement

Just after a successful ZFS scrub on my FreeBSD NAS (Avalon), I woke up to a fully faulted drive. As a Linux user, my go-to would be lsblk, but...

-sh: lsblk: not found

On FreeBSD the equivalent is geom disk list, which gives something quite different:

Geom name: da0
Providers:
1. Name: da0
  Mediasize: 8001563222016 (7.3T)
  Sectorsize: 512
  Stripesize: 4096
  Stripeoffset: 0
  Mode: r1w1e3
  descr: SEAGATE ST8000NM0075
  lunid: 5000c500858ba407
  ident: ZA12JDML
  rotationrate: 7200
  fwsectors: 63
  fwheads: 255

With this I can cross-reference the "lunid" and "ident" fields against my own cheat sheets to identify the correct drive to offline... but there is another way: geom part list exposes the ZFS partition label needed for the zpool replace command. I have a script that returns something like this:

da0: zfs-c2038ba77f95e6a4 5000c500858ba407 ZA12JDML
da1: zfs-7c82cd2a41324d33 5000c500858bc6ef ZA12JYD2
da2: zfs-8403658091857737 5000c500858b8543 ZA12JE19
da3: zfs-b3f624caad3fe7ba 5000c500858b736b ZA12JEF1
da4: zfs-a5158f526261ddad 5000c500858b9cbf ZA12JZ88
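
For reference, a minimal sketch of what such a script can look like (untested as written; it assumes da* disks and the zfs-* GPT labels that OpenZFS puts on whole-disk pool members):

#!/bin/sh
# For each da* disk: pull the zfs-* GPT label from geom part list
# and the lunid/ident fields from geom disk list, one line per disk.
for disk in $(sysctl -n kern.disks); do
    case "$disk" in
        da*) ;;
        *) continue ;;
    esac
    label=$(geom part list "$disk" | awk '/label: zfs-/ {print $2; exit}')
    lunid=$(geom disk list "$disk" | awk '/lunid:/ {print $2; exit}')
    ident=$(geom disk list "$disk" | awk '/ident:/ {print $2; exit}')
    echo "$disk: $label $lunid $ident"
done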

With this I can match the physical drive (identified by its ZA* serial) to the device name FreeBSD assigns. I swapped in a spare drive, ran zpool replace VAULT zfs-0x50abc zfs-0x50xyz using the ZFS partition labels shown above, and let resilvering do its thing.
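
Two stock zpool commands make it easy to keep an eye on things while the resilver runs:

zpool status VAULT    # resilver progress, scan speed, and ETA
zpool status -x       # reports "all pools are healthy" once it completes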

Proxmox Backup Server Alongside PVE

I decided to install PBS (Proxmox Backup Server) alongside my existing Proxmox host on "node-berlin". The main attraction: a recent PBS feature lets me attach and detach external drives easily for rotating backups, and that server has a single 5.25" bay that takes a front-loading 3.5" hard drive. I was never able to pass the SATA port through to a VM in full, so I knew I'd need direct hardware access to pull this off. I didn't expect to be able to run PVE on port 8006 and PBS on port 8007 side by side, but I am! They operate independently, with no functional overlap beyond sharing the boot OS's storage. I highly recommend trying this before virtualizing PBS, since it also leaves open the option of adding the machine as another PVE node.
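
For anyone curious, the install is just the PBS package repo layered onto the existing PVE host. A rough sketch, assuming a Debian 12 "bookworm" base (adjust the suite name to match your PVE version):

# The Proxmox release key is already trusted on a PVE host.
echo "deb http://download.proxmox.com/debian/pbs bookworm pbs-no-subscription" \
    > /etc/apt/sources.list.d/pbs.list
apt update && apt install proxmox-backup-server
# PVE keeps serving its UI on :8006; the PBS web UI comes up on :8007.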

In my opinion, the ideal homelab PVE cluster is two nodes: one handling the entire workload and the second on standby. A third node can run primarily as a PBS server, with the option to test VMs and LXCs there before moving them to the main node.

Physical Rearrangement and RAM Shuffling

I made time to move my entire cluster onto a lower shelf so I could better manage cables and hot-swap nodes. My plan:

  • Migrate all production VMs off "node-cairo" to the new cluster (sketched below).
  • Shut down node-cairo to reclaim 16GB of RAM.
  • Shuffle memory between "hydra2" and "hydra3" to see if I can get more total capacity recognized. (Some motherboards can be finicky with slot pairings.)
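
The drain step is straightforward with stock qm commands. A hedged sketch, run on node-cairo ("hydra2" is an illustrative target; substitute the actual destination node):

# Live-migrate every running VM off this node.
for vmid in $(qm list | awk 'NR>1 && $3 == "running" {print $1}'); do
    qm migrate "$vmid" hydra2 --online
done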

I also used an extra set of bungee cords to secure the nodes, so they wouldn't topple if bumped... low-tech rackmounting.
