The problem
The DigitalOcean Kubernetes platform is great. Its Load Balancer is well-integrated and works great, too, unless you need to expose UDP services in your cluster: UDP load balancing is not supported! So it’s not possible to load-balance DNS, HTTP/3, or any other UDP service.
How are we going to solve this?
DigitalOcean supports floating IPs: publicly-accessible static IP addresses that you can assign to Droplets and instantly remap between other Droplets in the same datacenter.
Even though the documentation says that floating IPs are not supported on DigitalOcean Kubernetes worker nodes, I verified that floating IP traffic is, in fact, routed to the worker nodes. But getting that traffic to the ingress service is challenging!
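This is easy to check by hand: assign the floating IP to one of the worker nodes’ droplets (here using this article’s example IP, and a droplet ID picked from doctl compute droplet list) and send some traffic its way:

doctl compute floating-ip-action assign 104.248.100.100 123456789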
Challenge #1: Assigning the floating IP to the right node
I looked at this challenge as an opportunity to write a Kubernetes controller in Rust. It turned out to be fairly straightforward and quite ergonomic. The main idea is:
- Label the pods that need a floating IP routed to them. I’m using Traefik, and this is quite easy if you’re using Helm. Note that the IP itself goes into an annotation, not the label: hopefully one day DigitalOcean will support IPv6, and IPv6 addresses contain characters that are illegal in K8S label values.
deployment:
  podAnnotations:
    k8s.haim.dev/floating-ip: "104.248.100.100"
  podLabels:
    k8s.haim.dev/floating-ip: "true"
- Watch these pods, and when we see a pod that is running and ready:
- Query the node that this pod runs on, to get the DigitalOcean droplet ID for this node.
- Call the DigitalOcean API to assign the floating IP to this droplet.
- Take care to retry in case of API failures and such. (A sketch of this loop follows below.)
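Here is a minimal sketch of what that loop can look like, assuming the kube, k8s-openapi, tokio, and anyhow crates (this is not the actual controller code, and the watcher API changes slightly between kube versions). One handy detail: on DigitalOcean Kubernetes, a node’s spec.providerID has the form digitalocean://&lt;droplet-id&gt;, so no extra lookup table is needed.

use futures::{pin_mut, TryStreamExt};
use k8s_openapi::api::core::v1::{Node, Pod};
use kube::api::{Api, ListParams};
use kube::runtime::watcher;
use kube::{Client, ResourceExt};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::try_default().await?;
    // Watch only the pods labeled for floating IP routing, across all namespaces.
    let pods: Api<Pod> = Api::all(client.clone());
    let params = ListParams::default().labels("k8s.haim.dev/floating-ip=true");
    let stream = watcher(pods, params);
    pin_mut!(stream);
    while let Some(event) = stream.try_next().await? {
        let pod = match event {
            watcher::Event::Applied(pod) => pod,
            _ => continue,
        };
        // Only act on pods whose Ready condition is True.
        let ready = pod
            .status
            .as_ref()
            .and_then(|s| s.conditions.as_ref())
            .map(|conds| conds.iter().any(|c| c.type_ == "Ready" && c.status == "True"))
            .unwrap_or(false);
        let node_name = match pod.spec.as_ref().and_then(|s| s.node_name.clone()) {
            Some(name) if ready => name,
            _ => continue,
        };
        // On DOKS, spec.providerID is "digitalocean://<droplet-id>".
        let nodes: Api<Node> = Api::all(client.clone());
        let node = nodes.get(&node_name).await?;
        let droplet_id = node
            .spec
            .and_then(|s| s.provider_id)
            .and_then(|id| id.strip_prefix("digitalocean://").map(String::from));
        // The target IP lives in the pod's annotation; assigning it means calling
        // POST /v2/floating_ips/<ip>/actions with {"type":"assign","droplet_id":...},
        // retrying on transient API failures.
        if let (Some(ip), Some(droplet)) =
            (pod.annotations().get("k8s.haim.dev/floating-ip"), droplet_id)
        {
            println!("assign floating IP {} to droplet {}", ip, droplet);
        }
    }
    Ok(())
}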
Challenge #2: DigitalOcean firewall rules
DigitalOcean manages firewall rules for the Kubernetes cluster automatically, and blocks any direct incoming traffic. If you add a rule to allow traffic to the floating IP to come through, it will be removed automatically within a few minutes.
However, firewall rules are additive! All we need to do is create a new firewall, target all the nodes in the cluster using tags (just like the automatic firewall does), and open the needed ports there.
It would be possible to query the K8S service spec and manage the firewall automatically, but for now a manual config will do.
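For example, with doctl (the firewall name and port here are just examples: UDP 443 for HTTP/3; DOKS tags every worker node with k8s:&lt;cluster-uuid&gt;, which you can look up with doctl kubernetes cluster list):

doctl compute firewall create \
  --name allow-udp-ingress \
  --tag-names "k8s:<your-cluster-uuid>" \
  --inbound-rules "protocol:udp,ports:443,address:0.0.0.0/0"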
Challenge #3: Listening address
The floating IP address is not assigned to the worker nodes directly. Instead, worker nodes are assigned anchor IP addresses (in the 10.20.x.x range), and traffic to the floating IP is directed to the anchor IP. Therefore, if we want a Kubernetes service to receive traffic sent to a floating IP address, we can just add these anchor IPs as externalIPs:
apiVersion: v1
kind: Service
metadata:
  name: traefik
spec:
  type: ClusterIP
  externalIPs:
    - 10.20.0.1
    - 10.20.0.2
    - 10.20.0.3
    - 10.20.0.4
    - 10.20.0.5
    - 10.20.0.6
    - 10.20.0.7
    - 10.20.0.8
...
For now, I just added a whole lot of these addresses there! I manage my cluster with Terraform, so I used a template to generate the configuration for the Traefik Helm chart:
...
service:
  spec:
    type: ClusterIP
  externalIPs:
%{ for addr in range(1,50) ~}
    - 10.20.0.${addr}
%{ endfor ~}
...
My cluster is just a few nodes, and these addresses are allocated sequentially, so this will do for now. In the future, I want to make this more robust and detect the actual IPs in use in the cluster.
Putting it all together
Let’s create a service account for the controller first:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: do-floating-ip
  namespace: default
automountServiceAccountToken: true
Create an RBAC role and role binding to allow this service account read-only access to pods and nodes:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: do-floating-ip
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - pods
    verbs:
      - get
      - list
      - watch
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: do-floating-ip
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: do-floating-ip
subjects:
  - kind: ServiceAccount
    name: do-floating-ip
    namespace: default
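To sanity-check the binding, you can ask the API server whether the new service account is allowed to do what the controller needs (the same check works for nodes and the other verbs):

kubectl auth can-i watch pods --as=system:serviceaccount:default:do-floating-ip

It should answer yes.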
Generate a DigitalOcean API token and store it as a secret in the cluster:
kubectl create secret generic digital-ocean-token --from-literal=token=$DIGITALOCEAN_TOKEN
Create a deployment for the controller:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: floating-ip-controller
  namespace: default
  labels:
    app.kubernetes.io/name: do-floating-ip
    app.kubernetes.io/component: controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: do-floating-ip
      app.kubernetes.io/component: controller
  template:
    metadata:
      name: floating-ip-controller
      labels:
        app.kubernetes.io/name: do-floating-ip
        app.kubernetes.io/component: controller
    spec:
      containers:
        - name: floating-ip-controller
          image: ghcr.io/haimgel/do-floating-ip:0.2.0
          command: ["/app/floating-ip-controller"]
          env:
            - name: DIGITALOCEAN_TOKEN
              valueFrom:
                secretKeyRef:
                  key: token
                  name: digital-ocean-token
      serviceAccount: do-floating-ip
      automountServiceAccountToken: true
If you have your service pods annotated and labeled (see Challenge #1 above), it’s time to check the logs to confirm everything works as expected:
kubectl logs -l app.kubernetes.io/name=do-floating-ip
You should see something like this:
{"timestamp":"2022-01-03T20:01:29.260370Z","level":"DEBUG","message":"Pod is either not running or not ready, ignoring it","pod":"traefik-648cff6549-7xzbx","target":"do_floating_ip_k8s::controller","span":{"object.reason":"object updated","object.ref":"Pod.v1./traefik-648cff6549-7xzbx.kube-system","name":"reconciling object"},"spans":[{"object.reason":"object updated","object.ref":"Pod.v1./traefik-648cff6549-7xzbx.kube-system","name":"reconciling object"}]} {"timestamp":"2022-01-03T20:01:40.851022Z","level":"DEBUG","message":"Pod is running and ready","pod":"traefik-79d44c5894-kcm8f","target":"do_floating_ip_k8s::controller","span":{"object.reason":"object updated","object.ref":"Pod.v1./traefik-79d44c5894-kcm8f.kube-system","name":"reconciling object"},"spans":[{"object.reason":"object updated","object.ref":"Pod.v1./traefik-79d44c5894-kcm8f.kube-system","name":"reconciling object"}]} {"timestamp":"2022-01-03T20:01:42.123086Z","level":"DEBUG","message":"Floating IP attach in progress","ip":"104.248.100.100","droplet":123456789,"target":"do_floating_ip_k8s::floating_ip","span":{"object.reason":"object updated","object.ref":"Pod.v1./traefik-79d44c5894-kcm8f.kube-system","name":"reconciling object"},"spans":[{"object.reason":"object updated","object.ref":"Pod.v1./traefik-79d44c5894-kcm8f.kube-system","name":"reconciling object"}]} {"timestamp":"2022-01-03T20:01:47.433541Z","level":"INFO","message":"Floating IP attached to droplet","ip":"104.248.100.100","droplet":123456789,"target":"do_floating_ip_k8s::floating_ip","span":{"object.reason":"object updated","object.ref":"Pod.v1./traefik-79d44c5894-kcm8f.kube-system","name":"reconciling object"},"spans":[{"object.reason":"object updated","object.ref":"Pod.v1./traefik-79d44c5894-kcm8f.kube-system","name":"reconciling object"}]}
It’s time to test that the traffic is received as expected, and then to test the failover: delete the node where the pod that receives the traffic is running, and watch the controller logs: it should reassign the floating IP to the node where the new service pod has started! 🎉
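For a UDP service like DNS, for example, the end-to-end check can be as simple as a query against the floating IP (assuming your ingress exposes port 53 over UDP):

dig @104.248.100.100 example.com

Running it before and after deleting the node confirms that traffic keeps flowing once the IP is reassigned.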