Agent Configuration Reference
Reference documentation for WarpStream Agent flags.
Required Command Line Flags and Environment Variables
All WarpStream Agent configurations can be set via command-line flags or environment variables. Command-line flags take precedence over environment variables.
Flag | Environment Variable | Description |
---|---|---|
|
| See the dedicated documentation section |
|
| WarpStream Agent Key obtained from the WarpStream admin console. |
|
| WarpStream Virtual Cluster ID obtained from the WarpStream admin console. |
|
| WarpStream virtual cluster's control plane region. Can be obtained from the WarpStream admin console. |
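The precedence rule stated above (a command-line flag wins over an environment variable) can be illustrated with a minimal sketch. This is not the Agent's implementation. The function `resolve_setting` and the variable name `EXAMPLE_AGENT_SETTING` are hypothetical and exist only for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    flag_value: Optional[str],
    env_name: str,
    env: Mapping[str, str] = os.environ,
    default: Optional[str] = None,
) -> Optional[str]:
    """Resolve one setting using the documented precedence.

    Hypothetical helper: a value passed as a command-line flag wins,
    then the environment variable, then the default.
    """
    if flag_value is not None:
        # The flag was given explicitly, so it takes precedence.
        return flag_value
    # Fall back to the environment variable, then to the default.
    return env.get(env_name, default)


# EXAMPLE_AGENT_SETTING is a made-up name, not a real Agent variable.
example_env = {"EXAMPLE_AGENT_SETTING": "from-env"}
print(resolve_setting("from-flag", "EXAMPLE_AGENT_SETTING", example_env))
print(resolve_setting(None, "EXAMPLE_AGENT_SETTING", example_env))
```

The first call prints the flag value because both sources are set. The second falls back to the environment variable because no flag was passed.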
Optional Command Line Flags and Environment Variables
All WarpStream Agent configurations can be set via command-line flags or environment variables. Command-line flags take precedence over environment variables.
Flag | Environment Variable | Description |
---|---|---|
|
| Backward-compatible alias of agentKey |
|
| Name of the 'group' that the Agent belongs to. This feature is used to isolate groups of Agents that belong to the same logical cluster but should not communicate with each other because they're deployed in separate cloud accounts, VPCs, or regions. Leave blank to indicate that the Agent belongs to the default group. |
|
| How often the Agent should send a heartbeat to the WarpStream backend. We recommend not modifying this. |
|
| Enable the Kinesis server (default true). |
|
| The port the Agent will use for serving HTTP requests (Kinesis API requests, distributed file cache requests, exposing Prometheus metrics, etc.) (default 8080). |
|
| Enable the Kafka server (default true). |
|
| The port the Agent will listen on for Kafka client TCP connections (default 9092). |
|
| Compression type to use for Fetch responses: none, gzip, snappy, lz4 (default), or zstd. This is only used if no compression is set explicitly, or if the compression type is set to 'agent'. |
|
| Period of time at which topic metadata is refreshed. Unlike Kafka, this metadata cache refresh also affects the timestamp type associated with a stream (default 1m0s). |
|
| Handle consumer group 'JoinGroup' and 'SyncGroup' requests in the backend instead of in the agent. When handled in the backend, the 'Rebalance Timeout' is always set to 10 seconds, whereas in the agent, it will be determined by client specifications. Enabling this option offers the advantage of reduced error potential and seamless integration of backend improvements and bug fixes. However, exercise caution when enabling it for large consumer groups, as a 10-second rebalance timeout may lead to extended rebalancing times and consequently, prolonged consumption pauses. Warning: Ensure uniformity within your agent pool regarding this setting. Having a mix of enabled and disabled settings may lead to rebalancing issues and potential disruptions. |
|
| Whether to emit metrics with high cardinality tags. When set to true, it enables detailed tracking at a granular level, such as metrics for individual fetch and produce operations on a per-topic basis. Use with caution as high cardinality can significantly increase the amount of data collected, potentially impacting performance. |
|
| Close idle connections after the duration specified by this config (default 10m0s). |
|
| Maximum number of uncompressed bytes that can be fetched in a single fetch request (default 128MiB). |
|
| Maximum number of uncompressed bytes that can be fetched in a single fetch request for a single topic-partition (default 128MiB). |
|
| Size of the Agent file cache in bytes. This cache is used to reduce the number of object storage GET requests that are required to serve consumers. Defaults to 0.5GiB/vCPU if omitted. |
|
| Number of extra replicas for the distributed file cache. Helps improve availability and reduce errors when Agents shut down ungracefully. You can override this to 0, but do not increase this value above 1 unless you know what you're doing. |
|
| Amount of time to wait after receiving SIGTERM before exiting to allow graceful removal from service discovery (default 1m0s). |
|
| Maximum number of concurrent requests (per CPU) allowed by the Kafka server. |
|
| Whether the cluster wide environment should be enabled. |
|
| The default port to use for the cluster wide environment (default 9999). |
|
| Object storage URL to use for data ingestion (produce requests). |
|
| Object storage URL to use for files created by compactions. |
|
| Controls the maximum amount of time the WarpStream Agents will allow a produced record to remain buffered in a batch before flushing it to object storage. Increasing this value reduces object storage API costs but increases latency, and vice versa. Note that the WarpStream Agents never acknowledge data until it has been flushed to object storage, so this value has no impact on correctness or durability guarantees, only on latency. Defaults to 250ms, minimum is 50ms. |
|
| Controls the maximum number of bytes that will be buffered by the WarpStream Agents before flushing them to object storage. Increasing this value reduces object storage API costs for workloads that write more than 16MiB/s of uncompressed data per Agent, but increases latency, and vice versa. Note that the WarpStream Agents never acknowledge data until it has been flushed to object storage, so this value has no impact on correctness or durability guarantees, only on latency. Defaults to 4MiB, minimum is 1MiB, maximum is 8MiB. |
|
| Maximum number of inflight bytes per CPU from Produce requests that have not yet been flushed to object storage that can be in memory at once before the Agent will begin backpressuring. |
|
| Maximum number of inflight files per CPU from Produce requests that have not yet been flushed to object storage that can be in memory at once before the Agent will begin backpressuring. |
|
| Address for WarpStream metadata backend (default "https://api.prod.us-east-1.warpstream.com"). |
|
| Address for WarpStream schema registry backend. |
|
| Enable schema registry server. |
|
| Port to run the schema registry server on (default 9094). |
|
| Region that the WarpStream control plane is running in. Value for your cluster can be obtained from the WarpStream console. Optional if your control plane is in |
|
| Time after which the Kafka connection will be closed. This mechanism helps load balance the clients by forcing them to query the magic URL again. By resetting the connection periodically, clients are evenly distributed across available Kafka connections. (default 8760h0m0s). |
|
| Interval at which the Kafka connection checks whether the client and the Agent reside in the same Availability Zone (AZ). If they are not in the same AZ and there are Agents available within the client's AZ, the connection is terminated. This approach encourages load balancing by prompting clients to re-query the magic URL and, consequently, connect to Agents within their respective AZ. For this mechanism to function, clients should include 'warpstream_az=X' and 'warpstream_interzone_lb=true' in their clientID (default 1m0s). |
|
| Time given to gracefully close the Kafka connection after the reconnect interval is reached. |
|
| Which hostname strategy the Agent should use to advertise itself. Accepted values:
If you select |
|
| Custom hostname value to advertise to service discovery for clustering purposes if the |
|
| If set to true, the Agents will require that all Kafka clients authenticate themselves with proper SASL credentials. |
|
| Interval for logging service status (default 15s). |
|
| Enable Datadog profiling (default false). |
|
| Enable Datadog tracing (default false). |
|
| Enable Prometheus metrics (default true). |
|
| Enable Datadog metrics (default false). |
|
| Disable the consumer group offset metrics automatically published by default ( |
|
| Comma-separated list of the tags to not expose in the consumer group offset metrics ( |
|
| Disable the collection of logs sent to the WarpStream backend (default false). |
|
| Roles that the agent should start (comma-separated) (default "proxy, jobs"). |
|
| Bucket URL to use when fetching the bento configuration. |
|
| Path in the bucket to fetch the bento configuration. |
|
| Whether data pipelines can be managed by the control plane. |
|
| When enabled, the agent will synchronously try to resolve its az during startup for 1min, and will not start serving its |
|
| Enable TLS for Kafka client/Agent connections. Must also specify |
|
| Path to the X.509 certificate file in PEM format for the server. |
|
| Path to the X.509 private key file in PEM format for the server. |
|
| Path to the X.509 certificate file in PEM format for the client certificate authority. If not specified, the host's root certificate pool will be used for client certificate verification. |
|
| If set to true, the Agents will require that all Kafka clients authenticate themselves with mTLS. |
|
| Regular expression to extract the ACL principal from the client certificate distinguished name. |
|
| Compression used in object store, either zstd or lz4. |
|
| Username for the external schema registry. |
|
| Password for the external schema registry. |
|
| Path to the X.509 certificate file in PEM format for the schema registry server's certificate authority. If not specified, the host's root certificate pool will be used for client certificate verification. |
|
| Path to the X.509 certificate file in PEM format for the schema registry client. |
|
| Path to the X.509 private key file in PEM format for the schema registry client. |
|
| Enable this flag to call |
|
| Tune the value passed to call |
|
| Enable this flag to call |
|
| Tune the value passed to call |
|
| Maximum size of a record that can be produced. Value needs to be between 4MiB and 64 MiB (default 32 MB). |
|
| Disable profile forwarding to the WarpStream backend. Note that if both Datadog profiling and profile forwarding are on, profile forwarding will automatically be turned off (default false). |
|
| Maximum number of bytes for buffering profiles in memory. Value needs to be smaller than 1 MiB (default 500 KiB). This is only used if |
|
| A mapping of availability zones to Kafka client IPs. The mapping should be a |
|
| Allow consumer fetch limits to be auto-adjusted (default false). |
N/A |
| Override the Availability Zone name which is discovered by the WarpStream Agent automatically using Cloud Instance Metadata (see section below). We do not recommend overriding this in the general case. |
N/A |
| Override the log level of the WarpStream Agent. Defaults to
N/A |
| Overrides the hostname that the WarpStream Agents will report to the WarpStream discovery system (instead of the default of reporting their private IPv4 address). This is useful when running the Agents behind a network load balancer which requires that the Agents report their hostname as the hostname of the network load balancer instead of their private IP. |
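Several descriptions in the table above state numeric defaults and bounds: the flush timeout (default 250ms, minimum 50ms), the flush buffer size (default 4MiB, between 1MiB and 8MiB), the maximum record size (between 4MiB and 64MiB), and the file cache default of 0.5GiB per vCPU. The sketch below shows how such values might be sanity-checked before deploying. All function names are hypothetical, the bounds are assumed to be inclusive, and the Agent performs its own validation independently of this code.

```python
from typing import Optional

MIB = 1024 * 1024
GIB = 1024 * MIB


def default_file_cache_bytes(vcpus: int) -> int:
    """File cache size used when none is configured: 0.5GiB per vCPU."""
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    return vcpus * GIB // 2


def check_range(name: str, value: int, minimum: int,
                maximum: Optional[int] = None) -> int:
    """Return value unchanged if it lies within [minimum, maximum].

    Assumption: the documented bounds are inclusive.
    """
    if value < minimum:
        raise ValueError(f"{name}: {value} is below the minimum {minimum}")
    if maximum is not None and value > maximum:
        raise ValueError(f"{name}: {value} is above the maximum {maximum}")
    return value


def check_flush_timeout_ms(ms: int = 250) -> int:
    # Documented: defaults to 250ms, minimum is 50ms.
    return check_range("flush timeout (ms)", ms, 50)


def check_flush_bytes(n: int = 4 * MIB) -> int:
    # Documented: defaults to 4MiB, minimum 1MiB, maximum 8MiB.
    return check_range("flush buffer (bytes)", n, 1 * MIB, 8 * MIB)


def check_max_record_bytes(n: int) -> int:
    # Documented: value needs to be between 4MiB and 64MiB.
    return check_range("max record size (bytes)", n, 4 * MIB, 64 * MIB)
```

For example, `default_file_cache_bytes(8)` yields 4GiB for an 8-vCPU Agent, and `check_flush_bytes(16 * MIB)` raises a `ValueError` because 16MiB exceeds the documented 8MiB maximum.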