Run-once Kubernetes DaemonSet pods

When you need to run something just once, but on each node

Sometimes there’s a need to run some code just once, but on each node in your Kubernetes cluster: collecting some immutable node information is a prime use case. For example, for my Floating IP controller I need to collect the Anchor IP address on every node. See my previous post for context.

Kubernetes does not support this out of the box: a DaemonSet only supports a RestartPolicy of Always, so any pod that stops is restarted immediately. There are documented workarounds, but they all revolve around keeping something lightweight running on the node all the time to satisfy the Kubernetes restart policy. This strikes me as inelegant.

Here I present an alternative idea:

Use node affinity to simulate “RunOnce” restart policy

The idea is simple: if a Pod adds a label to the node it runs on, and the node affinity of the DaemonSet is configured to exclude nodes with that label, then once the pod exits, it won’t be scheduled on this node again.
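
To make the mechanism concrete, here is a minimal, dependency-free sketch of the rule the scheduler ends up applying. It is only a toy model, not Kubernetes code: FakeNode, is_eligible and mark_done are hypothetical names made up for this illustration, and the only real thing in it is the label key.

```rust
use std::collections::BTreeMap;

/// Label key that marks a node as already processed.
const DONE_LABEL: &str = "k8s.haim.dev/digital-ocean-anchor-ip";

/// Hypothetical stand-in for a Kubernetes node: only its labels matter here.
#[derive(Default)]
struct FakeNode {
    labels: BTreeMap<String, String>,
}

/// Mirrors a `DoesNotExist` match expression: the node is eligible
/// only while the label key is absent, whatever its value would be.
fn is_eligible(node: &FakeNode) -> bool {
    !node.labels.contains_key(DONE_LABEL)
}

/// What the pod does before it exits: mark its own node as processed.
fn mark_done(node: &mut FakeNode) {
    node.labels
        .insert(DONE_LABEL.to_string(), "saved".to_string());
}
```

A fresh node is eligible, stops being eligible after mark_done, and becomes eligible again as soon as the label is removed from its labels map.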

See the source code of my “Anchor IP annotator” for the full details, but it’s pretty easy in Rust (and probably in other languages as well 😀):

use anyhow::Result;
use k8s_openapi::api::core::v1::Node;
use kube::api::{Api, Patch, PatchParams};
use kube::Client;
use serde_json::json;
use tracing::{debug, info};

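/// Saves the anchor IP as a node annotation and adds a marker label to the same node.
pub async fn annotate_node(node_name: &str, anchor_ip: &str) -> Result<()> {
    // Inside the cluster this picks up the pod's service account credentials.
    let client = Client::try_default().await?;
    // Nodes are cluster-scoped, hence `Api::all`.
    let nodes: Api<Node> = Api::all(client);

    let patch = json!({
        "apiVersion": "v1",
        "kind": "Node",
        "metadata": {
            // The annotation carries the actual data.
            "annotations": {
                "k8s.haim.dev/digital-ocean-anchor-ip": anchor_ip
            },
            // The label only marks the node as done; the DaemonSet's node affinity checks for it.
            "labels": {
                "k8s.haim.dev/digital-ocean-anchor-ip": "saved"
            },
        },
    });
    debug!(node = node_name, ip = anchor_ip, "Applying annotation");
    let patch_params = PatchParams::default();
    // A single patch request sets both the annotation and the label.
    nodes.patch(node_name, &patch_params, &Patch::Strategic(&patch)).await?;
    info!(node = node_name, ip = anchor_ip, "Annotation applied successfully");
    Ok(())
}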

Kubernetes setup

Besides the above code, we need to:

  1. Configure RBAC to give the DaemonSet’s service account write access to nodes. In my particular case this is not a problem, because I need it for the core functionality of this app anyway, but otherwise I consider it the biggest downside of this method.

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: do-floating-ip
    rules:
      - apiGroups:
          - ""
        resources:
          - nodes
        verbs:
          - get
          - list
          - watch
          - patch
          - update
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: do-floating-ip
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: do-floating-ip
    subjects:
      - kind: ServiceAccount
        name: do-floating-ip
        namespace: default
    
  2. Create the DaemonSet:

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: anchor-ip-annotator
      labels:
        app.kubernetes.io/name: do-floating-ip
        app.kubernetes.io/component: annotator
      namespace: default
    spec:
      selector:
        matchLabels:
          app.kubernetes.io/name: do-floating-ip
          app.kubernetes.io/component: annotator
      template:
        metadata:
          name: anchor-ip-annotator
          labels:
            app.kubernetes.io/name: do-floating-ip
            app.kubernetes.io/component: annotator
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                    - key: k8s.haim.dev/digital-ocean-anchor-ip
                      operator: DoesNotExist
          containers:
            - name: anchor-ip-annotator
              image: ghcr.io/haimgel/do-floating-ip:0.2.0
              command: ["/app/anchor-ip-annotator"]
          serviceAccountName: do-floating-ip
    

    That nodeAffinity in the spec above should do the trick!

  3. Let’s watch the events and verify:

    kubectl get events -w --sort-by='.metadata.creationTimestamp'
    
    SuccessfulCreate    daemonset/anchor-ip-annotator           Created pod: anchor-ip-annotator-kwjct
    Scheduled           pod/anchor-ip-annotator-kwjct           Successfully assigned default/anchor-ip-annotator-kwjct to do-2gb-1cpu-utcur
    Pulling             pod/anchor-ip-annotator-kwjct           Pulling image "ghcr.io/haimgel/do-floating-ip:main"
    Pulled              pod/anchor-ip-annotator-kwjct           Successfully pulled image "ghcr.io/haimgel/do-floating-ip:main" in 9.817472304s
    Created             pod/anchor-ip-annotator-kwjct           Created container anchor-ip-annotator
    Started             pod/anchor-ip-annotator-kwjct           Started container anchor-ip-annotator
    SuccessfulDelete    daemonset/anchor-ip-annotator           Deleted pod: anchor-ip-annotator-kwjct
    

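    Works as intended! 🎉 If we need to re-run the job, it’s equally easy: just remove the label from the node (for example, with kubectl label node <node-name> k8s.haim.dev/digital-ocean-anchor-ip-, where the trailing dash means “remove”), and the pod will start there immediately.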
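
If you would rather remove the label from code, it comes down to sending the node a JSON merge patch in which the label is set to null, since a null value in a merge patch deletes the key. Below is a minimal sketch that only builds the patch body, using nothing but the standard library. The remove_label_patch helper is a hypothetical name and not part of my annotator; in a real program you would more likely build the body with serde_json and send it with kube’s Patch::Merge.

```rust
/// Label key checked by the DaemonSet's node affinity rule.
const LABEL_KEY: &str = "k8s.haim.dev/digital-ocean-anchor-ip";

/// Hypothetical helper: builds a JSON merge patch body that deletes one
/// label from a Node. Setting a key to null in a merge patch removes it.
/// Assumes `key` is a valid Kubernetes label key, so it needs no JSON escaping.
fn remove_label_patch(key: &str) -> String {
    format!(r#"{{"metadata":{{"labels":{{"{}":null}}}}}}"#, key)
}
```

Calling remove_label_patch(LABEL_KEY) yields a body that leaves the annotation with the saved IP untouched and removes only the marker label, so the DaemonSet sees the node as eligible again.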