
Agent Configuration

Required Command Line Flags and Environment Variables

All WarpStream Agent configuration can be set via either command line flags or environment variables. When both are set, command line flags take precedence over environment variables.
| Flag | Environment Variable | Description |
| --- | --- | --- |
| bucketURL | WARPSTREAM_BUCKET_URL | |
| apiKey | WARPSTREAM_API_KEY | WarpStream API key obtained from the WarpStream admin console. |
| defaultVirtualClusterID | WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID | WarpStream Virtual Cluster ID obtained from the WarpStream admin console. |
| agentPoolName | WARPSTREAM_AGENT_POOL_NAME | (Optional) WarpStream Agent Pool name obtained from the WarpStream admin console. This argument is optional if you're using the "default" Virtual Cluster that was automatically created when you signed up for WarpStream. However, it must be set if you're using a new Virtual Cluster that you created yourself. You can read more about this in the Agent Pools and Virtual Clusters Reference Documentation. |
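
The precedence rule and the required settings above can be sketched as follows. This is a minimal illustration, not the Agent's actual implementation: the function names `resolve_setting` and `resolve_required` are hypothetical, and the values in the usage note are placeholders. Only the flag and environment variable names come from the table.

```python
import os
from typing import Dict, Mapping, Optional

# Flag name -> environment variable name, as listed in the table above.
REQUIRED_SETTINGS = {
    "bucketURL": "WARPSTREAM_BUCKET_URL",
    "apiKey": "WARPSTREAM_API_KEY",
    "defaultVirtualClusterID": "WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID",
}


def resolve_setting(
    flag_value: Optional[str],
    env_name: str,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the flag value if given, else the environment variable, else None."""
    if flag_value is not None:
        return flag_value  # command line flags take precedence
    return environ.get(env_name)


def resolve_required(
    flags: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Resolve every required setting; raise if any is missing (hypothetical check)."""
    resolved: Dict[str, str] = {}
    missing = []
    for flag_name, env_name in REQUIRED_SETTINGS.items():
        value = resolve_setting(flags.get(flag_name), env_name, environ)
        if value is None:
            missing.append(f"{flag_name} / {env_name}")
        else:
            resolved[flag_name] = value
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    return resolved
```

For example, calling `resolve_required({"bucketURL": "flag-bucket"}, env)` with an `env` mapping that defines all three variables returns the flag's value for `bucketURL` and the environment's values for the other two.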

Optional Command Line Flags and Environment Variables

All WarpStream Agent configuration can be set via either command line flags or environment variables. When both are set, command line flags take precedence over environment variables.
| Flag | Environment Variable | Description |
| --- | --- | --- |
| requireAuthentication | WARPSTREAM_REQUIRE_AUTHENTICATION | If set to true, the Agents will require that all Kafka clients authenticate themselves with proper SASL credentials. |
| kafkaPort | WARPSTREAM_KAFKA_PORT | The port the Agent will listen on for Kafka client TCP connections. |
| httpPort | WARPSTREAM_HTTP_PORT | The port the Agent will use for serving HTTP requests (Kinesis API requests, distributed file cache requests, exposing Prometheus metrics, etc.). |
| N/A | WARPSTREAM_AVAILABILITY_ZONE | Overrides the Availability Zone name that the WarpStream Agent discovers automatically using Cloud Instance Metadata (see section below). We do not recommend overriding this in the general case. |
| N/A | WARPSTREAM_LOG_LEVEL | Overrides the log level of the WarpStream Agent. Acceptable values are debug, info, warn, and error. Defaults to info. |
| batchTimeout | WARPSTREAM_BATCH_TIMEOUT | Controls the maximum amount of time the WarpStream Agents will allow a produced record to remain buffered in a batch before flushing it to object storage. Increasing this value reduces object storage API costs but increases latency, and vice versa. Note that the WarpStream Agents never acknowledge data until it has been flushed to object storage, so this value has no impact on correctness or durability guarantees, only latency. Defaults to 250ms; the minimum is 50ms. |
| fileCacheSizeBytes | WARPSTREAM_FILE_CACHE_SIZE_BYTES | Size of the Agent file cache in bytes. This cache is used to reduce the number of object storage GET requests that are required to serve consumers. Defaults to 0.5GiB/vCPU if omitted. |
| reportDiscoveryIP6 | WARPSTREAM_REPORT_DISCOVERY_IP6 | If set to true, the WarpStream Agents will report their IPv6 address instead of their IPv4 address to the WarpStream discovery system. This is useful when running in VPCs that only support IPv6, like fly.io. |
| N/A | WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE | Overrides the hostname that the WarpStream Agents will report to the WarpStream discovery system (instead of the default of reporting their private IPv4 address). This is useful when running the Agents behind a network load balancer, which requires that the Agents report the load balancer's hostname instead of their private IP. |
| advertiseHostnameStrategy | WARPSTREAM_ADVERTISE_HOSTNAME_STRATEGY | Which hostname strategy the Agent should use to advertise itself. Accepted values: auto-ip4, auto-ip6, local, custom. auto-ip4 tries to automatically find a suitable IPv4 address; auto-ip6 does the same for IPv6; local uses localhost. If you select custom, then you must also define advertiseHostnameCustom. |
| advertiseHostnameCustom | WARPSTREAM_ADVERTISE_HOSTNAME_CUSTOM | Custom hostname value to advertise to service discovery for clustering purposes if the custom advertise strategy is selected. |
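
Several of the rows above state defaults and constraints that are easy to check before deploying. The sketch below illustrates three of them: the batch timeout default and minimum, the file cache default of 0.5GiB per vCPU, and the rule that the custom strategy needs a custom hostname. All function names are hypothetical. Two things are assumptions rather than documented behavior: that a timeout below the minimum is rejected (the Agent might clamp it instead), and that `os.cpu_count()` matches how the Agent counts vCPUs.

```python
import os
from typing import Optional

DEFAULT_BATCH_TIMEOUT_MS = 250  # documented default
MIN_BATCH_TIMEOUT_MS = 50       # documented minimum
GIB = 1024 ** 3
ADVERTISE_STRATEGIES = ("auto-ip4", "auto-ip6", "local", "custom")


def batch_timeout_ms(requested_ms: Optional[int] = None) -> int:
    """Return the effective batch timeout in milliseconds."""
    if requested_ms is None:
        return DEFAULT_BATCH_TIMEOUT_MS
    if requested_ms < MIN_BATCH_TIMEOUT_MS:
        # Assumption: reject the value. The Agent may behave differently.
        raise ValueError(f"batchTimeout must be at least {MIN_BATCH_TIMEOUT_MS}ms")
    return requested_ms


def default_file_cache_size_bytes(vcpus: Optional[int] = None) -> int:
    """0.5 GiB per vCPU, the documented default when the setting is omitted."""
    if vcpus is None:
        # Assumption: os.cpu_count() approximates the Agent's vCPU count.
        vcpus = os.cpu_count() or 1
    return vcpus * GIB // 2


def check_advertise_config(strategy: str, custom_hostname: Optional[str] = None) -> None:
    """Raise ValueError if the advertise settings are inconsistent."""
    if strategy not in ADVERTISE_STRATEGIES:
        raise ValueError(f"unknown advertiseHostnameStrategy: {strategy!r}")
    if strategy == "custom" and not custom_hostname:
        raise ValueError("advertiseHostnameCustom is required with the custom strategy")
```

On a 4 vCPU machine, `default_file_cache_size_bytes(4)` gives 2 GiB, which is a quick way to sanity-check how much memory the cache will take if `fileCacheSizeBytes` is left unset.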

Running a WarpStream Agent with a Non-Default Agent Pool or Virtual Cluster

Every WarpStream account is created with a default Virtual Cluster and Agent Pool called apn_default. If you omit the agentPoolName flag and WARPSTREAM_AGENT_POOL_NAME environment variable from your Agent configuration, then it will default to using the apn_default pool.
However, if you create additional Virtual Clusters in the WarpStream admin console and want a pool of Agents to use one of them, you must also configure those Agents with the name of the Agent Pool associated with the new Virtual Cluster. Do this by adding either the agentPoolName flag or the WARPSTREAM_AGENT_POOL_NAME environment variable to your Agent configuration. If you fail to do this, your WarpStream Agent will refuse to start and will emit an error log to inform you that it is misconfigured.
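
As an illustration of the rule above, the following sketch builds the environment for an Agent and shows which pool it would end up using. The helper names `agent_environment` and `effective_agent_pool` are hypothetical, and any values passed in are placeholders. The variable names and the apn_default fallback come from the text above.

```python
from typing import Dict, Mapping, Optional

DEFAULT_AGENT_POOL = "apn_default"  # pool used when no name is configured


def agent_environment(
    bucket_url: str,
    api_key: str,
    virtual_cluster_id: str,
    agent_pool_name: Optional[str] = None,
) -> Dict[str, str]:
    """Build the environment variables for one Agent."""
    env = {
        "WARPSTREAM_BUCKET_URL": bucket_url,
        "WARPSTREAM_API_KEY": api_key,
        "WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID": virtual_cluster_id,
    }
    # Only needed for a Virtual Cluster you created yourself.
    if agent_pool_name is not None:
        env["WARPSTREAM_AGENT_POOL_NAME"] = agent_pool_name
    return env


def effective_agent_pool(env: Mapping[str, str]) -> str:
    """Return the pool an Agent with this environment would use."""
    return env.get("WARPSTREAM_AGENT_POOL_NAME", DEFAULT_AGENT_POOL)
```

Omitting `agent_pool_name` leaves `WARPSTREAM_AGENT_POOL_NAME` out of the environment entirely, so `effective_agent_pool` reports apn_default. This is the configuration that fails to start when the Virtual Cluster ID points at a cluster you created yourself.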

Automatic Availability Zone Detection

By default, the Agent queries the cloud provider's instance metadata to detect which availability zone it is running in, and advertises that zone to the WarpStream control plane. Detection currently supports AWS, GCP, and Azure.
If the Agent appears with a warpstream-unset-az availability zone when you look at your Virtual Cluster in the WarpStream console, detection failed. The Agent logs should contain an "error determining availability zone" message, possibly with an explanation of what went wrong.
For instance, a known issue on AWS EKS is that the hop limit on old EKS node groups is 1, which causes the call to the AWS metadata service to fail. Raising it to 2 should fix the issue (see AWS doc).
As a last resort, you can use the WARPSTREAM_AVAILABILITY_ZONE environment variable described in the table above to declare the availability zone in which your Agent is running.
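
To check by hand what the metadata service returns from inside a pod or instance, a sketch like the following can help. It is not the Agent's detection code. It covers AWS only, using the standard EC2 instance metadata (IMDSv2) endpoints, and the fallback order in `availability_zone` (override first, then metadata, then warpstream-unset-az) is an assumption based on the description above. Both function names are hypothetical.

```python
import os
import urllib.request
from typing import Callable, Mapping

UNSET_AZ = "warpstream-unset-az"
IMDS = "http://169.254.169.254/latest"


def fetch_aws_availability_zone(timeout: float = 2.0) -> str:
    """Query EC2 instance metadata (IMDSv2) for the availability zone."""
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()
    az_req = urllib.request.Request(
        f"{IMDS}/meta-data/placement/availability-zone",
        headers={"X-aws-ec2-metadata-token": token},
    )
    with urllib.request.urlopen(az_req, timeout=timeout) as resp:
        return resp.read().decode().strip()


def availability_zone(
    environ: Mapping[str, str] = os.environ,
    fetch: Callable[[], str] = fetch_aws_availability_zone,
) -> str:
    """Return the override if set, else the detected zone, else the unset marker."""
    override = environ.get("WARPSTREAM_AVAILABILITY_ZONE")
    if override:
        return override
    try:
        return fetch()
    except OSError:
        # URLError and socket timeouts are OSError subclasses. A hop limit
        # of 1 on EKS typically shows up here as a timeout.
        return UNSET_AZ
```

If `fetch_aws_availability_zone()` times out from inside a container but works on the host, the hop limit described above is a likely cause.
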
Apache, Apache Kafka, Kafka, and associated open source project names are trademarks of the Apache Software Foundation. Kinesis is a trademark of Amazon Web Services.