UpTimeRobot and Kubernetes nodes
I recently moved my home Kubernetes cluster to Talos Linux. It takes a new approach to managing nodes: you consume them as immutable resources, which avoids the cumbersome work of maintaining a general-purpose OS.
Previously I ran my cluster with MicroK8s on Ubuntu Cloud Init images. I found myself spending too much time on OS patching, and when I was introduced to Talos Linux through work, I knew I just had to try it out.
Setting up the cluster was incredibly easy, and there is even a great article on how to set it up on Proxmox - my hypervisor of choice :-) That article can be found here
There is a whole list of guides for installing it on different platforms - virtual, cloud and bare metal.
Back to that monitoring thing....
Back on the Ubuntu servers I had an Ansible playbook that added a cronjob, which would hit a heartbeat URL on UptimeRobot - so I would know if a node was down. After moving to Talos Linux, that was no longer possible. So I got to thinking: why not run a cronjob inside my Kubernetes cluster? That way, it would also tell me if scheduling workloads was broken.
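For reference, the old per-node setup amounted to a single crontab entry along these lines (the heartbeat UID is a placeholder, and the exact entry is a sketch of what the Ansible playbook deployed):

```shell
# Ping the UptimeRobot heartbeat every minute; a missed ping marks the node as down
* * * * * wget --spider -q https://heartbeat.uptimerobot.com/<uptime-robot-heartbeat-random-uid>
```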
Updating the heartbeat is a simple HTTPS request, and I chose to execute it with wget. So after doing some thinking, I decided to create my own Docker image based on Alpine with wget installed.
So just a short Dockerfile:
FROM alpine:3.15.4
RUN apk add wget
So build and push it:
docker build . -t simcax/alpine-wget:3.15.4
docker push simcax/alpine-wget:3.15.4
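Before wiring the image into the cluster, it can be sanity-checked locally; this is just a quick smoke test against an arbitrary URL, not part of the actual setup:

```shell
# Verify that wget inside the image can make an HTTPS request
docker run --rm simcax/alpine-wget:3.15.4 wget --spider -q https://example.com && echo "wget OK"
```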
Now I was ready to use it in a cronjob. So I created my cronjob.yaml for the first master:
apiVersion: batch/v1
kind: CronJob
metadata:
  namespace: crons
  name: uptime-robot-master-01
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 30
      template:
        spec:
          nodeName: talos01
          containers:
          - name: uptime-robot-heartbeat-talos-master-01
            image: simcax/alpine-wget:3.15.4
            imagePullPolicy: IfNotPresent
            command:
            - wget
            - --spider
            - https://heartbeat.uptimerobot.com/<uptime-robot-heartbeat-random-uid>
          restartPolicy: OnFailure
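Assuming the manifest is saved as cronjob.yaml, deploying it is the usual kubectl routine - the crons namespace just has to exist first:

```shell
# Create the namespace once, then apply the CronJob
kubectl create namespace crons
kubectl apply -f cronjob.yaml

# Watch the jobs fire every minute
kubectl get jobs -n crons --watch
```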
A couple of things of note:
- I stuck with best practices and had the jobs created in a namespace of their own - crons
- The ttlSecondsAfterFinished makes sure to get the jobs cleaned up once they are done
And that was it - now I have a monitor that tells me not only that the nodes are up, but also that they are schedulable!
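Since each node needs its own pinned CronJob, one way to avoid copy-pasting manifests is to generate them from a template. A minimal sketch with sed, assuming hypothetical node names talos01 through talos03 (in practice each node would also get its own heartbeat UID, left as a placeholder here):

```shell
#!/bin/sh
# Generate one CronJob manifest per node; __NODE__ is substituted
# into the metadata name, container name and nodeName fields.
for NODE in talos01 talos02 talos03; do
  sed "s/__NODE__/$NODE/g" > "cronjob-$NODE.yaml" <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  namespace: crons
  name: uptime-robot-__NODE__
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 30
      template:
        spec:
          nodeName: __NODE__
          containers:
          - name: uptime-robot-heartbeat-__NODE__
            image: simcax/alpine-wget:3.15.4
            imagePullPolicy: IfNotPresent
            command:
            - wget
            - --spider
            - https://heartbeat.uptimerobot.com/<uptime-robot-heartbeat-random-uid>
          restartPolicy: OnFailure
EOF
done
```

The generated files can then be applied in one go with kubectl apply -f on the directory.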