I’ve had some issues with my kubernetes node lately, basically a few random crashes. A bit inconvenient, as it’s summertime. As I am writing this, I am at our cabin and the kubernetes node is down.
But wait a minute: doesn’t this blog run on kubernetes? Yes, it does. But I do have a backup.
A while back, I switched to zfs-sync-based backups of my node. This means all my file systems exist on the backup node, though not necessarily with the same names and so on. So I couldn’t let the downtime pass without trying to bring up services on the backup node!
Bringing up kubernetes
I have made some bootstrap scripts in the kubernetes-bootstrap repo that I plan to use for bringing up kubernetes and just enough services to let argocd do its job. Now, I wasn’t planning on bringing up things from scratch, but the scripts that install k3s were still worth testing. I found one spelling error (testing is always good), but other than that, installing a bare, uninitialized k3s worked pretty well. Then I put the configuration in place, mounted the file system with the etcd database, and fired it up.
It failed.
2025-07-30T06:45:25.840968+00:00 remote k3s[4186]: time="2025-07-30T06:45:25Z" level=info msg="Failed to test etcd connection: this server is a not a member of the etcd cluster. Found [hassio-616dc712=https://192.168.1.153:2380], expect: hassio-616dc712=https://192.168.1.240:2380"
It seems I just need to reset the node state, which turned out to be pretty simple:
k3s server --cluster-reset
This will do its stuff, relabel things within etcd, and then I can start k3s normally.
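For completeness, the whole dance looks roughly like this; a sketch, assuming a standard systemd-based k3s install, not the exact commands from my session:
# stop the running (failing) k3s before resetting
systemctl stop k3s
# reset the embedded etcd membership to just this node, keeping the data
k3s server --cluster-reset
# start k3s normally again
systemctl start k3s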
Now, I’m not home free. I have tons of leftover pods like this:
unifi unpoller-579fff97fd-dkzjt 1/1 Terminating 1 (22d ago) 35d
These are seemingly running, but they are on my old node. My new node is added with the node name remote, and these pods run on hassio. The cluster knows about hassio, but since it’s down, there’s no way to check whether or not these pods are actually alive.
It’s possible quorums or similar mechanisms could save me, but with two nodes (my real node and my new one), any meaningful quorum is hard to achieve. But I know the pods are down, and there’s no cleanup to do with respect to mounts, containers or anything else, so I’ll just go ahead and force delete them.
kubectl delete --force -n unifi pod unpoller-579fff97fd-dkzjt
Now, they can restart on the new node.
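There were quite a few of these, so rather than going pod by pod, something like this should clean up everything the API server still places on the dead node; a sketch, using the old node name hassio:
# list every pod still bound to the unreachable node
kubectl get pods -A --field-selector spec.nodeName=hassio
# force delete them all, without waiting for the dead kubelet
kubectl get pods -A --field-selector spec.nodeName=hassio \
  -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name --no-headers \
  | while read ns name; do
      kubectl delete pod --force --grace-period=0 -n "$ns" "$name"
    done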
Next issue: storage. All my PVCs were created on hassio. While I do have the file systems locally on remote, kubernetes doesn’t know that. So, I need to see what tricks I can pull. I could probably recreate things from scratch on new volumes and copy the data over, but that’s more time-consuming. I could also have synced my data onto zpools that were named the same, but the primary purpose is backup and not a DR node, so storage flexibility is more important.
So: the metadata for my storage basically has two errors: it thinks the data is on a different node, and the pool path is wrong. For a PV, this is how it looks:
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: zfs.csi.openebs.io
    volume.kubernetes.io/provisioner-deletion-secret-name: ""
    volume.kubernetes.io/provisioner-deletion-secret-namespace: ""
  creationTimestamp: "2025-07-02T12:42:02Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-73f9fc83-2188-40ec-817d-c13604f3616a
  resourceVersion: "19862352"
  uid: e1aab6fa-6634-4583-83f4-b238675d7e69
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: syncthing-config-pvc
    namespace: syncthing
    resourceVersion: "19862300"
    uid: 73f9fc83-2188-40ec-817d-c13604f3616a
  csi:
    driver: zfs.csi.openebs.io
    fsType: zfs
    volumeAttributes:
      openebs.io/cas-type: localpv-zfs
      openebs.io/poolname: nasdisk/k3s
      storage.kubernetes.io/csiProvisionerIdentity: 1751064728233-9252-zfs.csi.openebs.io
    volumeHandle: pvc-73f9fc83-2188-40ec-817d-c13604f3616a
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: openebs.io/nodeid
          operator: In
          values:
          - hassio
  persistentVolumeReclaimPolicy: Retain
  storageClassName: zfs-storage-nas
  volumeMode: Filesystem
status:
  lastPhaseTransitionTime: "2025-07-02T12:42:02Z"
  phase: Bound
In addition, there is a ZFSVolume resource:
apiVersion: zfs.openebs.io/v1
kind: ZFSVolume
metadata:
  creationTimestamp: "2025-07-02T12:42:01Z"
  finalizers:
  - zfs.openebs.io/finalizer
  generation: 2
  labels:
    kubernetes.io/nodename: hassio
  name: pvc-73f9fc83-2188-40ec-817d-c13604f3616a
  namespace: openebs
  resourceVersion: "19862331"
  uid: d3879115-36c3-4e80-8b9e-ee2463c928f0
spec:
  capacity: "1073741824"
  fsType: zfs
  ownerNodeID: hassio
  poolName: nasdisk/k3s
  quotaType: quota
  volumeType: DATASET
status:
  state: Ready
In the first one, the fields are immutable, so I’m in a bit of trouble: I can’t update them. However, if I delete the PV, I can recreate it with the data changed, and it will happily find the existing dataset on disk. However, when I do:
kubectl delete pv pvc-73f9fc83-2188-40ec-817d-c13604f3616a
This hangs on the finalizer – I am, of course, not able to clean things up on the hassio node, as I can’t contact it. This is actually fine: I plan to scratch this DR environment once I bring hassio back up, which should have all the data intact, but I still need to be able to change things on the DR node.
There is a trick: If I do
kubectl edit pv pvc-73f9fc83-2188-40ec-817d-c13604f3616a
I can now delete the finalizer, and the delete command returns. The PV is gone. So, I recreate it with these changes:
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: zfs.csi.openebs.io
    volume.kubernetes.io/provisioner-deletion-secret-name: ""
    volume.kubernetes.io/provisioner-deletion-secret-namespace: ""
  creationTimestamp: "2025-07-02T12:42:02Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-73f9fc83-2188-40ec-817d-c13604f3616a
  resourceVersion: "19862352"
  uid: e1aab6fa-6634-4583-83f4-b238675d7e69
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: syncthing-config-pvc
    namespace: syncthing
    resourceVersion: "19862300"
    uid: 73f9fc83-2188-40ec-817d-c13604f3616a
  csi:
    driver: zfs.csi.openebs.io
    fsType: zfs
    volumeAttributes:
      openebs.io/cas-type: localpv-zfs
      openebs.io/poolname: backup/encrypted/nasdisk/k3s
      storage.kubernetes.io/csiProvisionerIdentity: 1751064728233-9252-zfs.csi.openebs.io
    volumeHandle: pvc-73f9fc83-2188-40ec-817d-c13604f3616a
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: openebs.io/nodeid
          operator: In
          values:
          - remote
  persistentVolumeReclaimPolicy: Retain
  storageClassName: zfs-storage-nas
  volumeMode: Filesystem
status:
  lastPhaseTransitionTime: "2025-07-02T12:42:02Z"
  phase: Bound
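As an aside, the finalizer removal and the recreation can also be scripted instead of going through kubectl edit; a sketch, assuming the manifest above, with server-set fields like resourceVersion, uid and status stripped, is saved as pv-remote.yaml (a file name I just made up):
# drop the pv-protection finalizer so the pending delete completes
kubectl patch pv pvc-73f9fc83-2188-40ec-817d-c13604f3616a \
  --type merge -p '{"metadata":{"finalizers":null}}'
# recreate the PV pointing at the local pool and node
kubectl apply -f pv-remote.yaml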
The ZFSVolume I can simply edit, doing basically the same changes. And bingo – kubernetes finds my data again!
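For reference, those ZFSVolume edits can also be expressed as a single patch; a sketch with my pool and node names:
# point the ZFSVolume at the local pool and the new node
kubectl -n openebs patch zfsvolume pvc-73f9fc83-2188-40ec-817d-c13604f3616a \
  --type merge \
  -p '{"metadata":{"labels":{"kubernetes.io/nodename":"remote"}},"spec":{"ownerNodeID":"remote","poolName":"backup/encrypted/nasdisk/k3s"}}'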
Now, it’s a bit tricky – and I still haven’t figured out all the nuances – to make the pods fully use them. I might have to unmount the datasets with zfs umount, and do various tricks like deleting stuff under the container directories, but eventually, the pods find the data. I’ll update this article if I ever find out the proper steps…
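I haven’t nailed down the exact sequence, but these are the kinds of commands involved; a sketch using my pool name:
# see which of the synced datasets are mounted, and where
zfs list -r -o name,mounted,mountpoint backup/encrypted/nasdisk/k3s
# unmount a dataset so the CSI driver can mount it where kubernetes expects it
zfs umount backup/encrypted/nasdisk/k3s/pvc-73f9fc83-2188-40ec-817d-c13604f3616a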
So, at this stage, I can bring up workloads on my DR node, but there’s still no way for the world to reach them properly.
My loadbalancers with their pools are intact, but they are wrong for the environment they now run in. The IPv6 addresses belong at my home, and there is no Unifi gateway here doing port forwarding for IPv4.
Rather than redoing all my loadbalancers and networking, I decide to create a VPN connection to the Unifi gateway and run with my BGP setup as before, with minimal changes.
In my earlier blog posts My Unifi Gateway just learned to do BGP!, BGP part two – A VPN connection to the cloud and BGP part three – eBGP between a VPS and on-prem, I have all the research done already. I basically want the iBGP from part one, over a VPN connection to the cloud as in part two.
I set up a VPN connection where remote has 192.168.228.2 and the Unifi gateway has 192.168.228.1. I also decide to set the IPv4 default gateway to 192.168.228.1, while adding static routes so that the VPN endpoint, the DNS server and a few other destinations (like the API endpoint for Letsencrypt DNS) go out directly.
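On the routing side, that boils down to something like the following; a sketch where 203.0.113.10 and 203.0.113.53 are made-up placeholders for the VPN endpoint and the DNS server, and 10.0.0.1 is an assumed local uplink gateway:
# keep the VPN endpoint and DNS reachable via the local uplink
ip route add 203.0.113.10/32 via 10.0.0.1
ip route add 203.0.113.53/32 via 10.0.0.1
# send everything else through the tunnel to the Unifi gateway
ip route replace default via 192.168.228.1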
I have left out IPv6 at the time of writing, prioritizing getting a working solution up, which is probably what I’d have done in an enterprise setting at this point too: focus on core functionality. (It pains me, but IPv6 isn’t that core functionality – most people still have IPv4.)
So, I need to update the BGP peer:
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  creationTimestamp: "2025-04-25T09:44:03Z"
  name: unifi
  resourceVersion: "27822736"
  uid: 0a771e2c-8f51-48ae-9236-bd76a2134f62
spec:
  asNumber: 64512
  peerIP: 192.168.228.1
  sourceAddress: None
I specified sourceAddress because otherwise it put the node interface’s IP as the next hop, and not the VPN endpoint.
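Once the peer is applied, the session state can be checked on the node itself; a sketch, assuming calicoctl is installed there:
# shows the BIRD BGP sessions and whether the unifi peer is Established
sudo calicoctl node status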
The Unifi end of it:
ip prefix-list LIST-REMOTE-OUTGOING seq 5 deny X.X.X.X/21 le 32
ip prefix-list LIST-REMOTE-OUTGOING seq 6 deny 192.168.228.0/24 le 32
ip prefix-list LIST-REMOTE-OUTGOING seq 8 permit 0.0.0.0/0 le 24
.......
router bgp 64512
 bgp router-id 192.168.1.5
 neighbor linode peer-group
 neighbor linode remote-as 64513
 neighbor metallb peer-group
 neighbor metallb remote-as 64512
 neighbor remote peer-group
 neighbor remote remote-as 64512
 neighbor remote update-source 192.168.228.1
 neighbor 192.168.229.2 peer-group linode
 neighbor fd46:c709:32c6:3::1 peer-group linode
 neighbor 192.168.1.153 peer-group metallb
 neighbor fd46:c709:32c6:0:1e69:7aff:fe64:12e1 peer-group metallb
 neighbor 192.168.228.2 peer-group remote
 !
 address-family ipv4 unicast
  redistribute connected
  neighbor linode next-hop-self
  neighbor linode soft-reconfiguration inbound
  neighbor linode route-map ALLOW-ALL in
  neighbor linode route-map LINODE-OUTGOING out
  neighbor metallb next-hop-self
  neighbor metallb soft-reconfiguration inbound
  neighbor metallb route-map ALLOW-ALL in
  neighbor metallb route-map ALLOW-NONE out
  neighbor remote next-hop-self
  neighbor remote soft-reconfiguration inbound
  neighbor remote route-map ALLOW-ALL in
  neighbor remote route-map REMOTE-OUTGOING out
  maximum-paths 2
 exit-address-family
 !
 address-family ipv6 unicast
  redistribute connected
  neighbor linode activate
  neighbor linode next-hop-self
  neighbor linode soft-reconfiguration inbound
  neighbor linode route-map LINODE-INCOMING-IPV6 in
  neighbor linode route-map LINODE-OUTGOING-IPV6 out
  neighbor metallb activate
  neighbor metallb next-hop-self
  neighbor metallb soft-reconfiguration inbound
  neighbor metallb route-map ALLOW-ALL in
  neighbor metallb route-map ALLOW-ALL out
 exit-address-family
exit
!
route-map REMOTE-OUTGOING permit 10
 match ip address prefix-list LIST-REMOTE-OUTGOING
exit
!
route-map ALLOW-ALL permit 10
exit
!
route-map LINODE-OUTGOING permit 10
 match ip address prefix-list LIST-LINODE-OUTGOING
exit
!
route-map LINODE-OUTGOING-IPV6 permit 10
 match ipv6 address prefix-list LINODE-OUTGOING-IPV6
exit
!
route-map LINODE-INCOMING-IPV6 permit 10
 match ipv6 address prefix-list LINODE-INCOMING-IPV6
exit
!
end
As you can see, I am still running my other existing BGP configuration, just adding to it. I am using the same BGP AS as on my Unifi gateway and my primary node, making it iBGP.
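To verify on the Unifi side, FRR’s vtysh can show what arrives from the new peer; a sketch (the received-routes view works because of soft-reconfiguration inbound):
# routes received from the DR node over the VPN
vtysh -c "show ip bgp neighbors 192.168.228.2 received-routes"
# routes actually accepted into the BGP table
vtysh -c "show ip bgp neighbors 192.168.228.2 routes"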
This was basically it, and once configured, prefixes started flowing:
*> 10.151.24.0/26       192.168.228.2       100      0 i
*> 10.151.24.0/26       192.168.228.2       100      0 i
*> 192.168.250.0/24     192.168.228.2       100      0 i
*> 192.168.250.129/32   192.168.228.2       100      0 i
*> 192.168.250.151/32   192.168.228.2       100      0 i
*> 192.168.250.153/32   192.168.228.2       100      0 i
*> 192.168.250.155/32   192.168.228.2       100      0 i
*> 192.168.251.0/24     192.168.228.2       100      0 i
*> 192.168.251.0/32     192.168.228.2       100      0 i
*> 192.168.251.1/32     192.168.228.2       100      0 i
*> 192.168.251.5/32     192.168.228.2       100      0 i
*> 192.168.251.6/32     192.168.228.2       100      0 i
*> 192.168.251.8/32     192.168.228.2       100      0 i
*> 192.168.251.9/32     192.168.228.2       100      0 i
*> 192.168.251.10/32    192.168.228.2       100      0 i
*> 192.168.251.11/32    192.168.228.2       100      0 i
*> 192.168.251.12/32    192.168.228.2       100      0 i
*> 192.168.251.13/32    192.168.228.2       100      0 i
*> 192.168.251.14/32    192.168.228.2       100      0 i
*> 192.168.251.16/32    192.168.228.2       100      0 i
*> 192.168.251.17/32    192.168.228.2       100      0 i
*> 192.168.251.18/32    192.168.228.2       100      0 i
*> 192.168.251.19/32    192.168.228.2       100      0 i
Eureka! My workloads are accessible again.
Now, there’s a whole lot of stuff still not done. Traefik decided it had lost all its certificates, probably because of the storage issues, but it happily recreates them.
But I managed to bring up this blog, and that’s something. The rest is just work and replication, which I’ll probably not do, as my other node will be up some time tomorrow.
Careful planning, better naming of things, and maybe renaming my node and file systems so they match, would probably have made this more trivial.
I did, however, achieve my goal: verify that I can bring up stuff from my backup. The rest is just work, which I’ll probably skip this time!