Kubernetes Known Issues

When running in EKS, Availability Zone is Unset or Wrong

Symptom

In the WarpStream UI for the cluster, you see warpstream-unset-az listed as the agent's availability zone, and/or the agent logs contain errors similar to the following:

{"time":"2025-04-02T22:23:46.467567362Z","level":"ERROR","msg":"failed to determine availability zone","git_commit":"32d51900b2423718b692a0edd29b08b11b7dd74e","git_time":"2025-04-02T18:53:04Z","git_modified":false,"go_os":"linux","go_arch":"arm64","process_generation":"081c0596-25c3-4147-88d5-d4416cb6a998","hostname_fqdn":"warp-agent-default-67d9795854-wrwh8","hostname_short":"warp-agent-default-67d9795854-wrwh8","private_ips":["10.0.115.97"],"num_vcpus":3,"kafka_enabled":true,"virtual_cluster_id":"vci_bc62be92_d3ba_4b0c_90e8_4e7bc621a693","module":"agent_azloader","error":{"message":"awsECSErr: missing metadata uri in environment (ECS_CONTAINER_METADATA_URI_V4), likely not running in ECS\nawsEC2Err: error getting metadata: operation error ec2imds: GetMetadata, canceled, context deadline exceeded\ngcpErr: error getting availablity zone: \nazureErr: error getting location: \nk8sErr: unable to get node information: nodes \"i-025487767185742f1\" is forbidden: User \"system:serviceaccount:warpstream:warpstream0-agent\" cannot get resource \"nodes\" in API group \"\" at the cluster scope"}}

Context

The WarpStream Agents use several methods to determine which availability zone they are running in.

When an Agent can't determine its availability zone, it falls back to warpstream-unset-az and logs error messages.

Problem

By default, AWS blocks EKS pods from contacting the instance metadata service in order to avoid leaking instance metadata. While this is good security practice in general, it also stops services running within EKS from querying information about the instance they run on, such as its availability zone.

Solution

Option A

Use our Helm chart to deploy WarpStream. With its default configuration, the Helm chart creates a Kubernetes ClusterRole and ClusterRoleBinding that allow the WarpStream pods to look up the node they are running on via the Kubernetes API and read the availability zone from the node's labels.
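
For example, installing the Agents via Helm might look like the sketch below. The repository URL and chart name here are illustrative placeholders rather than confirmed values; use the ones from the WarpStream Helm chart documentation.

helm repo add warpstream https://warpstreamlabs.github.io/charts # placeholder repo URL
helm upgrade --install warpstream-agent warpstream/warpstream-agent \
  --namespace warpstream --create-namespace \
  -f values.yaml # your Agent configuration (bucket URL, Agent key, etc.)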

Option B

Create the appropriate ServiceAccount, ClusterRole, and ClusterRoleBinding yourself so the WarpStream Agent can read availability zone information from the Kubernetes API:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: warpstream-agent
  namespace: ${your-namespace}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: warpstream-agent
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - watch
  - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: warpstream-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: warpstream-agent
subjects:
- kind: ServiceAccount
  name: warpstream-agent
  namespace: ${your-namespace}
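
After applying these manifests, you can sanity-check the permissions with kubectl (the namespace below is a placeholder):

kubectl auth can-i get nodes \
  --as=system:serviceaccount:your-namespace:warpstream-agent
# Should print "yes"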

Then, on your WarpStream Deployment, set the pod's service account to warpstream-agent:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: warpstream-agent
  namespace: ${your-namespace}
spec:
  selector:
    matchLabels:
      app.kubernetes.io/app: warpstream-agent
  template:
    metadata:
      labels:
        app.kubernetes.io/app: warpstream-agent
    spec:
      containers:
      - args:
        - agent
        ...
        image: public.ecr.aws/warpstream-labs/warpstream_agent:latest
      ...
      serviceAccountName: warpstream-agent
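
Kubernetes exposes the zone through the well-known topology.kubernetes.io/zone node label, so a quick way to confirm your nodes actually carry that label is:

kubectl get nodes -L topology.kubernetes.io/zone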

Option C

Modify your EKS node Launch Template configuration to set http-put-response-hop-limit to 2.

This allows the pods running on an EKS instance to reach the AWS metadata service and determine the availability zone.
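
For example, with the AWS CLI you can create a new launch template version that raises the hop limit, then update your node group to use it (the template ID below is a placeholder):

aws ec2 create-launch-template-version \
  --launch-template-id lt-0123456789abcdef0 \
  --source-version '$Latest' \
  --launch-template-data '{"MetadataOptions":{"HttpPutResponseHopLimit":2}}'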

When running in Kubernetes, WarpStream pods end up in the same zone or node

Symptom

Some or all of your WarpStream pods end up running in the same availability zone or on the same Kubernetes node instead of being evenly spread out.

Context

When running workloads, Kubernetes tries its best to spread pods from the same Deployment evenly across nodes and availability zones; however, this isn't always possible.

Problem

Depending on your cluster's configuration and the other workloads running on it, Kubernetes may not spread WarpStream pods evenly across zones or nodes. Some Kubernetes distributions prioritize bin-packing rather than high availability of workloads, and this behavior is not always configurable.

Solution

Use Kubernetes topologySpreadConstraints and podAntiAffinity to force Kubernetes to spread WarpStream pods evenly across zones and nodes. If you deploy WarpStream with our Helm chart, you can set the following in your Helm values:

topologySpreadConstraints:
  # Try to spread pods across multiple zones
  - maxSkew: 1 # +/- one pod per zone
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    # minDomains is only available in Kubernetes 1.30+
    # Remove this field if you are on an older Kubernetes
    # version.
    # When possible, set this to the number of
    # availability zones in your cluster.
    minDomains: 3
    # Label Selector to select the warpstream deployment
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: warpstream-agent
        app.kubernetes.io/instance: warpstream-agent # Set to your helm release name

affinity:
  # Make sure pods are not scheduled on the same node to prevent bin packing
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    # Label Selector to select the warpstream deployment
    - labelSelector:
        matchLabels:
          app.kubernetes.io/name: warpstream-agent
          app.kubernetes.io/instance: warpstream-agent # Set to your helm release name
      topologyKey: kubernetes.io/hostname
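
After rolling these settings out, you can verify that the pods landed on distinct nodes and zones (the namespace below is a placeholder):

kubectl get pods -n warpstream -o wide
kubectl get nodes -L topology.kubernetes.io/zone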
