Since I started running Kubernetes with MicroK8s on my Raspberry Pis, and later on my HP Microserver, there has been one challenge I kept putting off: storage.
That is, persistent storage for my containers. Take this site running Ghost. It needs persistent storage to keep the configuration and the articles being posted.
Until recently I solved this with local storage, binding the container to a specific node. That is quite the hassle, for fairly obvious reasons, but let me outline the major pains (a sketch of that node-bound setup follows the list):
- The node cannot be drained unless downtime is acceptable
- Nodes need local disk space
- Backups need to happen on the nodes
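For reference, the node-bound approach looked roughly like this: a local PersistentVolume pinned to one node through node affinity. This is only a sketch; the path here is made up and not my actual manifest:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: skov-codes-content-local
spec:
  capacity:
    storage: 100M
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  local:
    path: /mnt/ghost-content              # directory that has to exist on that one node
  nodeAffinity:                           # required for local volumes; ties the PV (and its pod) to kube03
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - kube03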

NFS to the rescue
Looking for solutions, I had been considering NFS for a while and wanted to try it out, starting with an MVP (minimum viable product). It would be set up on a virtual server I have in my Proxmox Virtual Environment. I installed the NFS server:
sudo apt install nfs-kernel-server
Then I created the NFS export directory structure:
mkdir -p /data/nfs/ghost-skov-codes
And added it to the export definitions in /etc/exports:
/data/nfs/ghost-skov-codes *(rw,sync,no_subtree_check,insecure)
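Edits to /etc/exports are not picked up automatically; re-exporting applies them (assuming a standard nfs-kernel-server setup):
sudo exportfs -ra   # re-read /etc/exports and re-export everything
sudo exportfs -v    # show what is currently exported and with which options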
Finally, I listed the exports on the server with showmount:
showmount -e 10.0.0.100
Export list for 10.0.0.100:
/data/nfs/ghost-skov-codes *
showmount lists exported NFS directories available on a server
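Before pointing Kubernetes at the share, it is worth a quick manual test from one of the cluster nodes. The nodes also need an NFS client installed for the kubelet to be able to mount the share; assuming Debian/Ubuntu based nodes, that looks roughly like this:
sudo apt install nfs-common                                    # NFS client utilities, needed on every node
sudo mount -t nfs 10.0.0.100:/data/nfs/ghost-skov-codes /mnt   # manual test mount
touch /mnt/test && ls -l /mnt                                  # confirm read/write access works
sudo umount /mnt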
Persistent volume
Then came the part of getting the persistent volume created:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: skov-codes-content-nfs1
  labels:
    directory: skov-codes-content-nfs1
spec:
  capacity:
    storage: 100M
  accessModes:
    - ReadWriteMany
  storageClassName: manual
  nfs:
    server: 10.0.0.100
    path: /data/nfs/ghost-skov-codes/
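The PersistentVolume needs a matching PersistentVolumeClaim before a pod can use it. My claim comes out of the Helm chart, but judging from the describe output below (which shows it bound to default/skovcodes-local-content-volume-nfs1), it looks roughly like this sketch:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: skovcodes-local-content-volume-nfs1
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: manual
  resources:
    requests:
      storage: 100M
  selector:
    matchLabels:
      directory: skov-codes-content-nfs1   # matches the label on the PV above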
Finally, after applying, describing the persistent volume with kubectl:
kubectl describe pv skov-codes-content-nfs1
gives the following result:
Name:            skov-codes-content-nfs1
Labels:          app.kubernetes.io/managed-by=Helm
                 directory=skov-codes-content-nfs1
Annotations:     meta.helm.sh/release-name: skov-codes
                 meta.helm.sh/release-namespace: default
                 pv.kubernetes.io/bound-by-controller: yes
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    manual
Status:          Bound
Claim:           default/skovcodes-local-content-volume-nfs1
Reclaim Policy:  Retain
Access Modes:    RWX
VolumeMode:      Filesystem
Capacity:        100M
Node Affinity:   <none>
Message:
Source:
    Type:      NFS (an NFS mount that lasts the lifetime of a pod)
    Server:    10.0.0.100
    Path:      /data/nfs/ghost-skov-codes/
    ReadOnly:  false
Events:          <none>
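For completeness, the pod reaches this storage by referencing the claim from the Deployment's pod template. A minimal sketch of that fragment; the container image and mountPath here assume the official Ghost image, which keeps its content under /var/lib/ghost/content:
  template:
    spec:
      containers:
        - name: ghost
          image: ghost                            # illustrative; my deployment uses its own chart values
          volumeMounts:
            - name: content
              mountPath: /var/lib/ghost/content   # Ghost's content directory in the official image
      volumes:
        - name: content
          persistentVolumeClaim:
            claimName: skovcodes-local-content-volume-nfs1   # the claim bound to the NFS PV above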
With this new persistent volume in place, it's possible to drain a node and have the pods continue to serve.
Listing the pods on the nodes with:
kubectl get pods --output 'jsonpath={range .items[*]}{.spec.nodeName}{" "}{.metadata.namespace}{" "}{.metadata.name}{"\n"}{end}'
shows all the pods to be on the node "kube03":
kube03 default skov-codes-skov-run-c77684b5d-s5v6k
kube03 default skov-codes-skov-run-c77684b5d-fqggt
kube03 default skov-run-594f65775b-b44tc
Draining the node with kubectl drain --ignore-daemonsets now shows the pods being evicted:
➜ ~ kubectl drain kube01 --ignore-daemonsets
node/kube01 cordoned
WARNING: ignoring DaemonSet-managed Pods: ingress/nginx-ingress-microk8s-controller-754gx, kube-system/calico-node-cjbjk
evicting pod kube-system/calico-kube-controllers-dc86ccb69-5ggc4
evicting pod default/skov-codes-skov-run-c77684b5d-vx9tt
evicting pod default/skov-codes-skov-run-c77684b5d-g64j2
error when evicting pods/"skov-codes-skov-run-c77684b5d-g64j2" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod default/skov-codes-skov-run-c77684b5d-g64j2
error when evicting pods/"skov-codes-skov-run-c77684b5d-g64j2" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
pod/skov-codes-skov-run-c77684b5d-vx9tt evicted
pod/skov-codes-skov-run-c77684b5d-g64j2 evicted
node/kube01 drained
After the pods are all evicted, the node is drained and ready for maintenance.
Writing this article taught me quite a few things. Having storage sorted with NFS gives a lot of freedom in terms of shuffling pods around the cluster; however, it does not by itself guarantee zero downtime.
I had to implement a PodDisruptionBudget to make sure at least one pod was always up, as well as run more than one replica. I wasn't sure that would work with the Ghost blog image, but it apparently does, thanks to the NFS storage being "ReadWriteMany".
The PodDisruptionBudget ended up like this:
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: skov.codes-skov-run-pdb
  labels:
    app.kubernetes.io/name: skov-run
    app.kubernetes.io/instance: skov.codes
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: skov-run
      app.kubernetes.io/instance: skov.codes
  minAvailable: 1
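Once applied, the budget can be verified with kubectl; with two healthy replicas and minAvailable set to 1, the allowed disruptions column should show 1:
kubectl get poddisruptionbudget skov.codes-skov-run-pdb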
At the same time I made sure to up the number of replicas to 2:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: skov.codes-skov-run
  labels:
    helm.sh/chart: skov-run-0.1.0
    app.kubernetes.io/name: skov-run
    app.kubernetes.io/instance: skov.codes
    app.kubernetes.io/version: "1.16.0"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: skov-run
      app.kubernetes.io/instance: skov.codes
  ...
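After bumping the replica count, a quick check that the rollout finished and where the pods landed:
kubectl rollout status deployment/skov.codes-skov-run   # wait until both replicas are ready
kubectl get pods -o wide                                # the NODE column shows where each replica runs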

One final thing I had to get in place was my NGINX configuration. Previously it pointed only to the master node, so it wasn't taking advantage of having four nodes in the cluster.
I changed it from a simple proxy_pass setup to a load-balanced configuration instead.
The new configuration is split into two parts. First, the upstream definition:
upstream app {
    server 10.0.0.150:80;
    server 10.0.0.151:80;
    server 10.0.0.152:80;
    server 10.0.0.153:80;
}
cluster_upstream.conf
And then the change to the proxy_pass directive in my virtual host configuration.
proxy_pass http://app;
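For context, this is roughly how the directive sits in the virtual host; the server_name and proxy headers here are illustrative rather than my exact config:
server {
    listen 80;
    server_name skov.codes;

    location / {
        proxy_pass http://app;                                        # the upstream block defined above
        proxy_set_header Host $host;                                  # preserve the original Host header
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;  # pass the client IP along
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}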
Conclusions
I came from a shaky implementation of my Ghost blog site, where a node upgrade would most definitely lead to downtime, and moved to a far more resilient setup. The web server now has load balancing, and my application deployment on the Kubernetes cluster now has a PodDisruptionBudget making sure a pod is always up. Both of these allow me to drain nodes and patch and upgrade my cluster without downtime. Very satisfying for a good nerdy weekend!