When running in EKS, Availability Zone is Unset or Wrong
Symptom
In the WarpStream UI for the cluster, you see warpstream-unset-az set as the availability zone of the Agent, and/or errors in the Agent logs similar to the following:
{"time":"2025-04-02T22:23:46.467567362Z","level":"ERROR","msg":"failed to determine availability zone","git_commit":"32d51900b2423718b692a0edd29b08b11b7dd74e","git_time":"2025-04-02T18:53:04Z","git_modified":false,"go_os":"linux","go_arch":"arm64","process_generation":"081c0596-25c3-4147-88d5-d4416cb6a998","hostname_fqdn":"warp-agent-default-67d9795854-wrwh8","hostname_short":"warp-agent-default-67d9795854-wrwh8","private_ips":["10.0.115.97"],"num_vcpus":3,"kafka_enabled":true,"virtual_cluster_id":"vci_bc62be92_d3ba_4b0c_90e8_4e7bc621a693","module":"agent_azloader","error":{"message":"awsECSErr: missing metadata uri in environment (ECS_CONTAINER_METADATA_URI_V4), likely not running in ECS\nawsEC2Err: error getting metadata: operation error ec2imds: GetMetadata, canceled, context deadline exceeded\ngcpErr: error getting availablity zone: \nazureErr: error getting location: \nk8sErr: unable to get node information: nodes \"i-025487767185742f1\" is forbidden: User \"system:serviceaccount:warpstream:warpstream0-agent\" cannot get resource \"nodes\" in API group \"\" at the cluster scope"}}
Context
The WarpStream Agents use several methods to try to determine which availability zone they are running in.
When an Agent can't determine its availability zone, it falls back to warpstream-unset-az and logs error messages.
Problem
By default, AWS prevents EKS pods from contacting the EC2 instance metadata service in order to avoid leaking instance metadata to workloads. While this is good security practice, it also prevents services running in EKS from querying information about the instance they are running on, such as its availability zone.
Solution
Option A
Use our Helm chart to deploy WarpStream. With its default configuration, the Helm chart creates a Kubernetes ClusterRole and ClusterRoleBinding that allow the WarpStream pods to look up the node they are running on via the Kubernetes API and read the availability zone from the node's labels.
Option B
Create the appropriate ClusterRole, ClusterRoleBinding, and ServiceAccount so the WarpStream Agent can read availability zone information from the Kubernetes API (a sketch is shown below).
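The manifests below are a minimal sketch of that RBAC setup, roughly what the Helm chart in Option A creates by default. The resource names, the warpstream namespace, and the warpstream-agent ServiceAccount name are illustrative assumptions; the ServiceAccount must match the serviceAccountName your Agent pods actually use (the error log above shows warpstream:warpstream0-agent, for example).

apiVersion: v1
kind: ServiceAccount
metadata:
  name: warpstream-agent   # illustrative; must match the Agent pods' serviceAccountName
  namespace: warpstream    # illustrative; use the namespace your Agents run in
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: warpstream-agent-node-reader
rules:
  # Nodes are cluster-scoped, so a ClusterRole (not a namespaced Role) is needed
  # to read the node object and its topology.kubernetes.io/zone label.
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: warpstream-agent-node-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: warpstream-agent-node-reader
subjects:
  - kind: ServiceAccount
    name: warpstream-agent
    namespace: warpstream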
Alternatively, modify your EKS node Launch Template configuration to set http-put-response-hop-limit to 2.
This allows pods running on an EKS instance to connect to the AWS metadata service and look up the availability zone.
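As a sketch, if the Launch Template is managed with CloudFormation, the hop limit lives under MetadataOptions in the LaunchTemplateData; the resource name below is an illustrative assumption, and the same setting can be applied through the EC2 console, CLI, or Terraform.

Resources:
  EKSNodeLaunchTemplate:             # illustrative resource name
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateData:
        MetadataOptions:
          HttpEndpoint: enabled            # keep the metadata service reachable
          HttpTokens: required             # require IMDSv2 session tokens
          HttpPutResponseHopLimit: 2       # allow the extra network hop from pods

Note that Launch Template changes only apply to newly launched nodes; existing instances keep their old metadata options until they are replaced (or updated in place with aws ec2 modify-instance-metadata-options).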
When running in Kubernetes, WarpStream pods end up in the same zone or node
Symptom
Some or all of your WarpStream pods end up running in the same availability zone or on the same Kubernetes node instead of being evenly spread out.
Context
When running workloads, Kubernetes tries its best to spread pods from the same Deployment evenly across all nodes and availability zones; however, this isn't always possible.
Problem
Depending on your cluster's configuration and the other workloads running on it, Kubernetes may not spread WarpStream pods evenly across zones or nodes. Some Kubernetes schedulers prioritize bin-packing rather than high availability of workloads. This varies by Kubernetes distribution and is not always configurable.
Solution
Use Kubernetes topologySpreadConstraints and podAntiAffinity to force Kubernetes to spread WarpStream pods evenly across zones and nodes. If your WarpStream pods are deployed using our Helm chart, you can set the following in your Helm values:

topologySpreadConstraints:
  # Try to spread pods across multiple zones
  - maxSkew: 1 # +/- one pod per zone
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    # minDomains is only available in Kubernetes 1.30+.
    # Remove this field if you are on an older Kubernetes version.
    # When possible, set it to the number of availability zones
    # available to your cluster.
    minDomains: 3
    # Label selector to select the WarpStream deployment
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: warpstream-agent
        app.kubernetes.io/instance: warpstream-agent # Set to your helm release name
affinity:
  # Make sure pods are not scheduled on the same node to prevent bin-packing
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      # Label selector to select the WarpStream deployment
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: warpstream-agent
            app.kubernetes.io/instance: warpstream-agent # Set to your helm release name
        topologyKey: kubernetes.io/hostname