LogoLogo
WarpStream.comSlackDiscordContact UsCreate Account
  • Overview
    • Introduction
    • Architecture
      • Service Discovery
      • Write Path
      • Read Path
      • Life of a Request (Simplified)
    • Change Log
  • Getting Started
    • Install the WarpStream Agent / CLI
    • Run the Demo
    • "Hello World" for Apache Kafka
  • BYOC
    • Run the Agents Locally
    • Deploy the Agents
      • Object Storage Configuration
      • Kubernetes Known Issues
      • Rolling Restarts and Upgrades
    • Infrastructure as Code
      • Terraform Provider
      • Helm charts
      • Terraform Modules
    • Monitoring
      • Pre-made Datadog Dashboard
      • Pre-made Grafana Dashboard
      • Important Metrics and Logs
      • Recommended List of Alerts
      • Monitoring Consumer Groups
      • Hosted Prometheus Endpoint
    • Client Configuration
      • Tuning for Performance
      • Configure Clients to Eliminate AZ Networking Costs
        • Force Interzone Load Balancing
      • Configuring Kafka Client ID Features
      • Known Issues
    • Authentication
      • SASL Authentication
      • Mutual TLS (mTLS)
      • Basic Authentication
    • Advanced Agent Deployment Options
      • Agent Roles
      • Agent Groups
      • Protect Data in Motion with TLS Encryption
      • Low Latency Clusters
      • Network Architecture Considerations
      • Agent Configuration Reference
      • Reducing Infrastructure Costs
      • Client Configuration Auto-tuning
    • Hosted Metadata Endpoint
    • Managed Data Pipelines
      • Cookbooks
    • Schema Registry
      • WarpStream BYOC Schema Registry
      • Schema Validation
      • WarpStream Schema Linking
    • Port Forwarding (K8s)
    • Orbit
    • Enable SAML Single Sign-on (SSO)
    • Trusted Domains
    • Diagnostics
      • GoMaxProcs
      • Small Files
  • Reference
    • ACLs
    • Billing
      • Direct billing
      • AWS Marketplace
    • Benchmarking
    • Compression
    • Protocol and Feature Support
      • Kafka vs WarpStream Configuration Reference
      • Compacted topics
    • Secrets Overview
    • Security and Privacy Considerations
    • API Reference
      • API Keys
        • Create
        • Delete
        • List
      • Virtual Clusters
        • Create
        • Delete
        • Describe
        • List
        • DescribeConfiguration
        • UpdateConfiguration
      • Virtual Clusters Credentials
        • Create
        • Delete
        • List
      • Monitoring
        • Describe All Consumer Groups
      • Pipelines
        • List Pipelines
        • Create Pipeline
        • Delete Pipeline
        • Describe Pipeline
        • Create Pipeline Configuration
        • Change Pipeline State
      • Invoices
        • Get Pending Invoice
        • Get Past Invoice
    • CLI Reference
      • warpstream agent
      • warpstream demo
      • warpstream cli
      • warpstream playground
    • Integrations
      • Arroyo
      • AWS Lambda Triggers
      • ClickHouse
      • Debezium
      • Decodable
      • DeltaStream
      • docker-compose
      • DuckDB
      • ElastiFlow
      • Estuary
      • Fly.io
      • Imply
      • InfluxDB
      • Kestra
      • Materialize
      • MinIO
      • MirrorMaker
      • MotherDuck
      • Ockam
      • OpenTelemetry Collector
      • ParadeDB
      • Parquet
      • Quix Streams
      • Railway
      • Redpanda Console
      • RisingWave
      • Rockset
      • ShadowTraffic
      • SQLite
      • Streambased
      • Streamlit
      • Timeplus
      • Tinybird
      • Upsolver
    • Partitions Auto-Scaler (beta)
    • Serverless Clusters
Powered by GitBook
On this page
  • Available Features
  • warpstream_az
  • warpstream_proxy_target
  • warpstream_agent_group
  • warpstream_partition_assignment_strategy
  • ws_host_override
  • Configuring Multiple Features

Was this helpful?

  1. BYOC
  2. Client Configuration

Configuring Kafka Client ID Features

WarpStream uses the client id setting that you set on Apache Kafka clients to control how certain features are activated

Note that configuring the client id is optional, and that if you set the client id to a string that WarpStream does not recognize, no optional feature will be enabled.

WarpStream parses the Apache Kafka client id for key value pairs that are separated with commas. Each key value pair is defined as key=value

This means that WarpStream parses the following Apache Kafka client id:

a=b,c=d,rest_of_the_client_id

into two key/value pairs: key: a, value: b and key:c, value: d

These keys/values pair are referred to as "client id features" and are used to implement WarpStream-specific functionality while still remaining within the confines of the Kafka protocol.

Available Features

warpstream_az

kgo.NewClient(...,
    kgo.ClientID("warpstream_az=us-east-1a"),
)

The warpstream_az feature enables zone-aware routing to eliminate inter-zone networking costs. The example above creates a new Apache Kafka client that will tell the WarpStream service discovery system that it's running in the us-east-1a availability zone.

The service discovery system will use this information to route the client to Agents in the same zone to eliminate inter-zone networking fees.

Note however that the service discovery system favors availability, so if it cannot find Agents in specified availability zone, it will direct the clients to Agents that are in other zones.

Accepted aliases:

  • warpstream_az

  • ws_az

warpstream_proxy_target

kgo.NewClient(...,
    kgo.ClientID("warpstream_proxy_target=proxy-consume"),
)

Accepted aliases:

  • warpstream_proxy_target

  • ws_proxy_target

  • ws_pt

warpstream_agent_group

kgo.NewClient(...,
    kgo.ClientID("warpstream_agent_group=internal"),
)

Accepted aliases:

  • warpstream_agent_group

  • ws_agent_group

  • ws_ag

warpstream_partition_assignment_strategy

kgo.NewClient(...,
    kgo.ClientID("warpstream_partition_assignment_strategy=single_agent"),
)

WarpStream Agents are stateless, therefore it is possible to direct your Kafka clients writes or reads to any Agent in the cluster. That said, the WarpStream service discovery system still does have to pick which Agents to route individual clients to. WarpStream supports a number of different strategies for handling this routing.

Accepted aliases:

  • warpstream_partition_assignment_strategy

  • ws_partition_assignment_strategy

  • ws_pas

The possible values for this configuration are:

single_agent (default strategy)

ws_pas=single_agent

The default strategy is called single_agent, and highly recommended for the vast majority of scenarios. This strategy is called single agent because it selects a single WarpStream Agent for each Kafka client connection to act as the "leader" for all partitions in the cluster. In other words, each client perceives a view of the cluster where one of the Agents is responsible for all of the data in the cluster, and therefore the client will issue all of its Produce and Fetch requests to that one Agent.

This is best general purpose strategy because it achieves the lowest Produce latency (Produce requests for all partitions are routed to a single Agent, so clients are only exposed to the outlier latency of a single Agent instead of multiple), and also has the most even load-balancing because it's much easier to balance load when the system has full freedom to route any client to any Agent.

equal_spread

ws_pas=equal_spread

Another partition assignment strategy that WarpStream supports is called equal_spread. This strategy assigns partitions evenly amongst all the Agents such that each Agent is responsible for roughly an equal number of partitions.

This strategy is not recommended for general usage as it has the highest possible Produce latency of all strategies (because Producers producing to multiple partitions will be exposed to the outlier latency of multiple Agents), and because it is the most susceptible to load imbalances if partitions have skewed throughput.

However, it can be useful in some scenarios, particularly when data volumes are very high and traffic across partitions is very balanced. In that scenario, this strategy can improve performance, reduce the number of required Agents, and generally make the cluster more scaleable.

range_spread

ws_pas=range_spread

range_spread is the same as equal_spread except that if you have more partitions than Agents, it will assign a contiguous range of partitions to the same Agent. For each partition i, it assigns the partition to a broker based on i / len(agents).

ws_host_override

kgo.NewClient(...,
    kgo.ClientID("ws_host_override=agent-lb.yourcompany.com"),
)

ws_host_override instructs the Kafka client to connect to the specified hostname instead of the Agent's advertised address. This is particularly useful when agents are:

  • Running behind load balancers or

  • Deployed within containers where the advertised address is not routable from the outside

For example, if your Warpstream agent is accessible via a load balancer with the DNS name agent-lb.yourcompany.com, you would set ws_host_override in your Kafka client configuration to this value.

Accepted aliases:

  • warpstream_partition_assignment_strategy

  • warpstream_hostname_override

  • ws_host_override

  • ws_ho

  • ws_ioskh

Configuring Multiple Features

It is possible to combine these, for example

kgo.NewClient(...,
    kgo.ClientID("warpstream_proxy_target=proxy-produce,warpstream_partition_assignment_strategy=equal_spread,warpstream_az=us-east-1a,my_client_id"),
)

will configure an Apache Kafka client that

  • indicates it is in the us-east-1a zone.

  • will write different partitions to different agents to spread its load to multiple Agents

  • will connect only to the proxy-produce agents

  • also has a custom string of your choosing, unused by WarpStream, in its id.

PreviousForce Interzone Load BalancingNextKnown Issues

Last updated 2 months ago

Was this helpful?

The warpstream_proxy_target feature is used in conjunction with WarpStream's functionality to route individual Kafka clients to Agents that are running specific roles.

If you have split your Agents into distinct roles (as described ), you should configure your Apache Kafka Client by setting the warpstream_proxy_target feature to either proxy-produce (this client will connect to the nodes with the proxy-produce role) or to proxy-consume (and this client will connect to the nodes with the proxy-consume role).

The warpstream_agent_group feature is used in conjunction with WarpStream's functionality to route individual Kafka clients to Agents that belong to a specific group.

If you have split your Agents into multiple groups (as described ), you can configure your Apache Kafka client by setting the warpstream_agent_group feature to the name of the Agent group you want target.

(default)

For example when for local development.

Agent Roles
here
Agent Groups
here
port-forwarding
single_agent
equal_spread
range_spread