Agent Configuration

Required Command Line Flags and Environment Variables

All WarpStream Agent configurations can be set via command-line flags or environment variables. Command-line flags take precedence over environment variables.
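
The precedence rule above can be sketched in a few lines of Python. This is only an illustration of "flag first, then environment variable", not the Agent's actual implementation; the `resolve_setting` helper and the port values are hypothetical.

```python
import os


def resolve_setting(flag_value, env_name, default=None, env=None):
    """Return the flag value if one was given, else the environment variable, else the default.

    Hypothetical helper: it only models the documented precedence rule.
    """
    env = os.environ if env is None else env
    if flag_value is not None:
        return flag_value
    return env.get(env_name, default)


# Illustrative values only; an explicit mapping stands in for the process environment.
example_env = {"WARPSTREAM_KAFKA_PORT": "9092"}
print(resolve_setting("9093", "WARPSTREAM_KAFKA_PORT", env=example_env))  # flag wins: 9093
print(resolve_setting(None, "WARPSTREAM_KAFKA_PORT", env=example_env))    # falls back to env: 9092
```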

Flag | Environment Variable | Description

bucketURL

WARPSTREAM_BUCKET_URL

agentKey

WARPSTREAM_AGENT_KEY

WarpStream Agent Key obtained from the WarpStream admin console.

defaultVirtualClusterID

WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID

WarpStream Virtual Cluster ID obtained from the WarpStream admin console.
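
For deployments configured purely through environment variables, a pre-flight check along the following lines can catch a missing required setting before the Agent is launched. This is a minimal sketch: `missing_required` is a hypothetical helper, the placeholder values are not real credentials, and the check does not account for settings supplied as command-line flags instead.

```python
# Environment variable names are taken from the required-settings table above.
REQUIRED_ENV_VARS = (
    "WARPSTREAM_BUCKET_URL",
    "WARPSTREAM_AGENT_KEY",
    "WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID",
)


def missing_required(env):
    """Return the required variable names that are unset or empty in the given mapping."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


# Placeholder values only; pass os.environ to check the real process environment.
example_env = {
    "WARPSTREAM_BUCKET_URL": "<bucket-url>",
    "WARPSTREAM_AGENT_KEY": "<agent-key>",
}
print(missing_required(example_env))  # ['WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID']
```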

Optional Command Line Flags and Environment Variables

All WarpStream Agent configuration can be set via command-line flags or environment variables. Command-line flags take precedence over environment variables.

Flag | Environment Variable | Description

kafkaPort

WARPSTREAM_KAFKA_PORT

The port the Agent will listen on for Kafka client TCP connections.

requireSASLAuthentication

WARPSTREAM_REQUIRE_SASL_AUTHENTICATION

If set to true, the Agents will require that all Kafka clients authenticate themselves with proper SASL credentials.

requireMTLSAuthentication

WARPSTREAM_REQUIRE_MTLS_AUTHENTICATION

If set to true, the Agents will require that all Kafka clients authenticate themselves with mTLS. enableTLS must be set to true.

tlsPrincipalMappingRule

WARPSTREAM_TLS_PRINCIPAL_MAPPING_RULE

Regular expression to extract the ACL principal from the client certificate distinguished name. requireMTLSAuthentication must be set to true.

enableTLS

WARPSTREAM_TLS_ENABLED

Enable TLS for Kafka client/Agent connections. Must also specify tlsServerCertFile and tlsServerPrivateKeyFile.

tlsServerCertFile

WARPSTREAM_TLS_SERVER_CERT_FILE

Path to the X.509 certificate file in PEM format for the server.

tlsServerPrivateKeyFile

WARPSTREAM_TLS_SERVER_PRIVATE_KEY_FILE

Path to the X.509 private key file in PEM format for the server.

tlsClientCACertFile

WARPSTREAM_TLS_CLIENT_CA_CERT_FILE

Path to the X.509 certificate file in PEM format for the client certificate authority. If not specified, the host's root certificate pool will be used for client certificate verification.

httpPort

WARPSTREAM_HTTP_PORT

The port the Agent will use for serving HTTP requests (Kinesis API requests, distributed file cache requests, exposing Prometheus metrics, etc.).

N/A

WARPSTREAM_AVAILABILITY_ZONE

Override the Availability Zone name which is discovered by the WarpStream Agent automatically using Cloud Instance Metadata (see section below).

We do not recommend overriding this in the general case.

region

WARPSTREAM_REGION

Region that the WarpStream control plane is running in. The value for your cluster can be obtained from the WarpStream console. Optional if your control plane is in us-east-1; otherwise it must be provided.

N/A

WARPSTREAM_LOG_LEVEL

Log level of the WarpStream Agent. Acceptable values are debug, info, warn, and error. Defaults to info.

batchTimeout

WARPSTREAM_BATCH_TIMEOUT

Controls the maximum amount of time the WarpStream Agents will allow a produced record to remain buffered in a batch before flushing it to object storage. Increasing this value reduces object storage API costs but increases latency, and vice versa. Note that the WarpStream Agents never acknowledge data until it has been flushed to object storage, so this value has no impact on correctness or durability guarantees, only on latency. Defaults to 250ms; the minimum is 50ms.

fileCacheSizeBytes

WARPSTREAM_FILE_CACHE_SIZE_BYTES

Size of the Agent file cache in bytes. This cache is used to reduce the number of object storage GET requests that are required to serve consumers. Defaults to 0.5 GiB per vCPU if omitted.

reportDiscoveryIP6

WARPSTREAM_REPORT_DISCOVERY_IP6

If set to true, the WarpStream Agents will report their IPv6 address instead of their IPv4 address to the WarpStream discovery system. This is useful when running in networks that only support IPv6, such as fly.io.

N/A

WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE

Overrides the hostname that the WarpStream Agents will report to the WarpStream discovery system (instead of the default of reporting their private IPv4 address). This is useful when running the Agents behind a network load balancer, which requires the Agents to report the load balancer's hostname instead of their private IP.

advertiseHostnameStrategy

WARPSTREAM_ADVERTISE_HOSTNAME_STRATEGY

The strategy the Agent uses to choose the hostname it advertises itself on. Accepted values: auto-ip4, auto-ip6, local, custom.

auto-ip4 tries to automatically find a suitable IPv4 address.

auto-ip6 does the same for IPv6.

local uses localhost.

custom requires that you also set advertiseHostnameCustom.

advertiseHostnameCustom

WARPSTREAM_ADVERTISE_HOSTNAME_CUSTOM

Custom hostname value to advertise to service discovery for clustering purposes if the custom advertise strategy is selected.

agentGroup

WARPSTREAM_AGENT_GROUP

Name of the 'group' that the Agent belongs to. This feature is used to isolate groups of Agents that belong to the same logical cluster but should not communicate with each other because they're deployed in separate cloud accounts, VPCs, or regions. Leave blank to indicate the Agent belongs to the default group.

availabilityZoneRequired

WARPSTREAM_AVAILABILITY_ZONE_REQUIRED

When enabled, the Agent synchronously tries to resolve its availability zone during startup for up to 1 minute, and does not start serving its /v1/status health check until it succeeds. The process exits early if it fails to resolve the availability zone. Only used in agent mode.
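
Several settings in the table above depend on one another (TLS needs a certificate and key, mTLS needs TLS, the principal mapping rule needs mTLS, and the custom advertise strategy needs a custom hostname). The sketch below checks those documented dependencies against a set of environment variables. It is an illustration, not the Agent's own validation: `check_dependencies` and `is_true` are hypothetical helpers, and treating only the literal string "true" as enabled is an assumption about how boolean values are parsed.

```python
def is_true(env, name):
    """Assumption: a boolean setting counts as enabled only when its value is 'true'."""
    return env.get(name, "").strip().lower() == "true"


def check_dependencies(env):
    """Return a list of human-readable problems for the documented setting dependencies."""
    problems = []
    tls = is_true(env, "WARPSTREAM_TLS_ENABLED")
    mtls = is_true(env, "WARPSTREAM_REQUIRE_MTLS_AUTHENTICATION")

    # enableTLS must be accompanied by a server certificate and private key.
    if tls:
        for name in ("WARPSTREAM_TLS_SERVER_CERT_FILE",
                     "WARPSTREAM_TLS_SERVER_PRIVATE_KEY_FILE"):
            if not env.get(name):
                problems.append(f"{name} must be set when TLS is enabled")

    # requireMTLSAuthentication requires enableTLS.
    if mtls and not tls:
        problems.append("mTLS authentication requires WARPSTREAM_TLS_ENABLED=true")

    # tlsPrincipalMappingRule requires requireMTLSAuthentication.
    if env.get("WARPSTREAM_TLS_PRINCIPAL_MAPPING_RULE") and not mtls:
        problems.append("the TLS principal mapping rule requires mTLS authentication")

    # The custom advertise strategy requires advertiseHostnameCustom.
    if (env.get("WARPSTREAM_ADVERTISE_HOSTNAME_STRATEGY") == "custom"
            and not env.get("WARPSTREAM_ADVERTISE_HOSTNAME_CUSTOM")):
        problems.append("the custom strategy requires WARPSTREAM_ADVERTISE_HOSTNAME_CUSTOM")

    return problems


# Example: mTLS requested without TLS yields exactly one problem.
for problem in check_dependencies({"WARPSTREAM_REQUIRE_MTLS_AUTHENTICATION": "true"}):
    print(problem)
```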

Availability zone automatic detection

By default, the Agent queries the cloud provider's instance metadata to detect which availability zone it is running in, and advertises this to the WarpStream control plane. It currently supports AWS, GCP, and Azure.

If the Agent appears with a warpstream-unset-az availability zone when you look at your virtual cluster in the WarpStream console, detection failed. The Agent logs should contain an "error determining availability zone" entry, possibly with an explanation of what went wrong.

For instance, a known issue on AWS EKS is that the metadata hop limit on older EKS node groups is 1, which causes the call to the AWS metadata service to fail. Raising it to 2 should fix the issue (see AWS doc).

As a last resort, you can use the WARPSTREAM_AVAILABILITY_ZONE environment variable described in the table above to declare the availability zone in which your Agent is running.
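
The behaviour described in this section can be summarised as a small decision function. This is a sketch of the documented behaviour only, not the Agent's code: `effective_availability_zone` is a hypothetical helper, `detected_az` stands in for the result of the cloud metadata lookup (None on failure), and the zone names are illustrative.

```python
# Placeholder shown in the WarpStream console when detection fails (from the text above).
UNSET_AZ = "warpstream-unset-az"


def effective_availability_zone(env, detected_az):
    """Model which availability zone ends up advertised.

    The explicit override wins, then the auto-detected zone, then the placeholder.
    """
    override = env.get("WARPSTREAM_AVAILABILITY_ZONE", "").strip()
    if override:
        return override
    if detected_az:
        return detected_az
    return UNSET_AZ


print(effective_availability_zone({}, "us-east-1a"))  # us-east-1a
print(effective_availability_zone({}, None))          # warpstream-unset-az
print(effective_availability_zone({"WARPSTREAM_AVAILABILITY_ZONE": "us-east-1b"}, None))  # us-east-1b
```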

Apache, Apache Kafka, Kafka, and associated open source project names are trademarks of the Apache Software Foundation. Kinesis is a trademark of Amazon Web Services.