Sometimes there’s a need to run some code just once, but on each node in your Kubernetes cluster: collecting some immutable node information is a prime use case. For example, for my Floating IP controller I need to collect the Anchor IP address on every node. See my previous post for context.
Kubernetes does not support this out of the box: a DaemonSet supports only a RestartPolicy of Always, so any pod that stops is immediately started again. There are documented workarounds, but they all revolve around the idea of keeping something lightweight running on the node all the time to satisfy the Kubernetes restart policy. This strikes me as inelegant.
Here I present an alternative idea:
Use node affinity to simulate “RunOnce” restart policy
Simple idea: if a Pod changes the label of the node where it runs, and node affinity of the DaemonSet is configured to exclude nodes with such a label, then once the pod exits, it won’t be scheduled again on this node.
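To make the mechanism concrete, here is a minimal, standard-library-only model of the scheduling decision. It does not talk to a cluster; it only mimics how a `DoesNotExist` rule treats a node before and after the pod has labelled it. The label key `example.com/anchor-ip` and the function names are placeholders made up for this sketch.

```rust
use std::collections::BTreeMap;

/// Hypothetical marker label key; the real key is whatever the pod writes to its node.
const MARKER_LABEL: &str = "example.com/anchor-ip";

/// Models a `DoesNotExist` match expression: the node is eligible only
/// while the marker label key is absent (the label value is irrelevant).
fn is_eligible(node_labels: &BTreeMap<String, String>) -> bool {
    !node_labels.contains_key(MARKER_LABEL)
}

/// Models the pod's single run: it does its work, then labels its own node.
fn run_once(node_labels: &mut BTreeMap<String, String>) {
    node_labels.insert(MARKER_LABEL.to_string(), "saved".to_string());
}
```

A fresh node is eligible, stops being eligible after `run_once`, and becomes eligible again if the label is removed.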
See the full source code of my “Anchor IP annotator” for the full details, but it’s pretty easy in Rust (and probably other languages as well 😀):
use anyhow::Result;
use k8s_openapi::api::core::v1::Node;
use kube::api::{Api, Patch, PatchParams};
use kube::Client;
use serde_json::json;
use tracing::{debug, info};

pub async fn annotate_node(node_name: &str, anchor_ip: &str) -> Result<()> {
    // Connect using the in-cluster configuration or the local kubeconfig.
    let client = Client::try_default().await?;
    let nodes: Api<Node> = Api::all(client);

    // Set the annotation and the marker label in a single patch.
    let patch = json!({
        "apiVersion": "v1",
        "kind": "Node",
        "metadata": {
            "annotations": {
                "": anchor_ip
            },
            "labels": {
                "": "saved"
            }
        }
    });

    debug!(node = node_name, ip = anchor_ip, "Applying annotation");
    let patch_params = PatchParams::default();
    nodes.patch(node_name, &patch_params, &Patch::Strategic(&patch)).await?;
    info!(node = node_name, ip = anchor_ip, "Annotation applied successfully");
    Ok(())
}
Kubernetes setup
Besides the above code, we need to:
Configure RBAC to give the DaemonSet's service account write access to nodes. In my particular case this is not a problem, because the app needs that access for its core functionality anyway, but otherwise I consider it the single biggest downside of this method.
apiVersion:
kind: ClusterRole
metadata:
  name: do-floating-ip
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - get
      - list
      - watch
      - patch
      - update

apiVersion:
kind: ClusterRoleBinding
metadata:
  name: do-floating-ip
roleRef:
  apiGroup:
  kind: ClusterRole
  name: do-floating-ip
subjects:
  - kind: ServiceAccount
    name: do-floating-ip
    namespace: default
Create the DaemonSet:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: anchor-ip-annotator
  labels: do-floating-ip annotator
  namespace: default
spec:
  selector:
    matchLabels: do-floating-ip annotator
  template:
    metadata:
      name: anchor-ip-annotator
      labels: do-floating-ip annotator
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key:
                    operator: DoesNotExist
      containers:
        - name: anchor-ip-annotator
          image:
          command: ["/app/anchor-ip-annotator"]
      serviceAccount: do-floating-ip
The node affinity rule in the spec above should do the trick! Let’s watch the events and verify:
kubectl get events -w --sort-by='.metadata.creationTimestamp'
SuccessfulCreate   daemonset/anchor-ip-annotator   Created pod: anchor-ip-annotator-kwjct
Scheduled          pod/anchor-ip-annotator-kwjct   Successfully assigned default/anchor-ip-annotator-kwjct to do-2gb-1cpu-utcur
Pulling            pod/anchor-ip-annotator-kwjct   Pulling image ""
Pulled             pod/anchor-ip-annotator-kwjct   Successfully pulled image "" in 9.817472304s
Created            pod/anchor-ip-annotator-kwjct   Created container anchor-ip-annotator
Started            pod/anchor-ip-annotator-kwjct   Started container anchor-ip-annotator
SuccessfulDelete   daemonset/anchor-ip-annotator   Deleted pod: anchor-ip-annotator-kwjct
Works as intended! 🎉 If we need to re-run the job, it’s equally easy: just remove the label from the node, and the pod will start there immediately.
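If you would rather script the re-run than type the command by hand, one possible sketch is to shell out to `kubectl label`, whose trailing `-` after a key removes that label. The function names here are made up for illustration, the label key is passed in because it depends on what your pod writes, and the code assumes `kubectl` is on `PATH` and pointed at the right cluster.

```rust
use std::process::Command;

/// Builds the `kubectl label` arguments that remove `label_key` from a node.
/// A trailing `-` after the key is kubectl's syntax for deleting a label.
fn remove_label_args(node_name: &str, label_key: &str) -> Vec<String> {
    vec![
        "label".to_string(),
        "node".to_string(),
        node_name.to_string(),
        format!("{}-", label_key),
    ]
}

/// Runs kubectl with those arguments; returns true if it exited successfully.
/// Assumes `kubectl` is installed and configured for the target cluster.
fn remove_label(node_name: &str, label_key: &str) -> std::io::Result<bool> {
    let status = Command::new("kubectl")
        .args(remove_label_args(node_name, label_key))
        .status()?;
    Ok(status.success())
}
```

Calling `remove_label("do-2gb-1cpu-utcur", "example.com/anchor-ip")` would run `kubectl label node do-2gb-1cpu-utcur example.com/anchor-ip-`, where the label key is again only a placeholder.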