Over the last few weeks, I have been tinkering with Harvester upgrades in my homelab. Harvester v1.4.1 just came out, and I was eager to upgrade my cluster running v1.4.0, but the upgrade got stuck.
Almost since the beginning of my Harvester journey, my upgrades have gotten stuck with a node sitting in a pre-drained state.
The issue is addressed here, and is related to Longhorn.
It’s odd to me that this issue continues to pop up. I have seen it on the following paths:
v1.2.0 → v1.2.1
v1.2.1 → v1.2.2
v1.2.2 → v1.3.1
v1.4.0 → v1.4.1
It really is a simple fix:
Identify the stuck node
Log into the management node and run the following commands:
# Set this to the node the upgrade is stuck on
export TARGET_NODE="name-of-node-that-is-stuck"
# Find the Longhorn instance-manager pods on that node and delete the PDBs with the same names
kubectl get pods \
  --namespace longhorn-system \
  --field-selector "spec.nodeName=${TARGET_NODE}" \
  -o custom-columns=":metadata.name" \
  --no-headers \
  | grep "instance-manager" \
  | while read -r pod; do
      kubectl delete poddisruptionbudget "$pod" -n longhorn-system
    done
Repeat this for each node that gets stuck.
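If you would rather see what is about to be removed before deleting anything, a dry-run version of the same loop might look like the sketch below. The function name preview_instance_manager_pdbs is made up for illustration, and it assumes, as the commands above do, that each PodDisruptionBudget has the same name as its instance-manager pod.

```shell
# Hypothetical helper (not part of Harvester or Longhorn): print the PDBs
# that the fix above would delete for one node, without deleting anything.
preview_instance_manager_pdbs() {
  node="$1"
  kubectl get pods \
    --namespace longhorn-system \
    --field-selector "spec.nodeName=${node}" \
    -o custom-columns=":metadata.name" \
    --no-headers \
    | grep "instance-manager" \
    | while read -r pod; do
        # Assumption: the PDB name matches the pod name.
        echo "would delete: poddisruptionbudget/${pod}"
      done
}
```

Calling preview_instance_manager_pdbs "name-of-node-that-is-stuck" prints one line per matching pod, which makes it easy to sanity-check the node name before running the real delete.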
Anyway, maybe this will help someone.
Cheers,
Joe