Deploy the Agents
Don't forget to review our documentation on how to configure your Kafka client for WarpStream, as well as our instructions on tuning for performance once you're done. A few small changes in client configuration can result in 10-20x higher throughput when using WarpStream, and proper client configuration is required to leverage WarpStream's zone-aware discovery system.
Required Arguments
The WarpStream Agent is completely stateless and thus can be deployed however you prefer to deploy stateless containers. For example, you could use AWS ECS or a Kubernetes Deployment. The WarpStream Docker containers can be found in the "Install the WarpStream Agent" reference.
The Agent has three required arguments that must be passed as command line flags:
bucketURL
apiKey
defaultVirtualClusterID
For example:
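The sketch below prints a docker run invocation that passes the three required flags to the agent subcommand. The image reference and all three values are hypothetical placeholders, and the helper name agent_command is illustrative only. The single-dash flag style follows the -kafkaPort flag shown elsewhere on this page.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own image reference and values.
IMAGE="${IMAGE:-warpstream-agent-image:tag}"
BUCKET_URL="${BUCKET_URL:-your-bucket-url}"
API_KEY="${API_KEY:-your-api-key}"
VIRTUAL_CLUSTER_ID="${VIRTUAL_CLUSTER_ID:-your-virtual-cluster-id}"

# Print the full command so it can be reviewed before it is run.
agent_command() {
  echo docker run "$IMAGE" agent \
    -bucketURL "$BUCKET_URL" \
    -apiKey "$API_KEY" \
    -defaultVirtualClusterID "$VIRTUAL_CLUSTER_ID"
}

agent_command
```

Remove the echo inside agent_command to start the container instead of printing the command.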
The values of apiKey and defaultVirtualClusterID can both be obtained from the WarpStream Admin Console.
Note that the entrypoint for the WarpStream Docker image is a multi-command binary. For production usage, the subcommand to run is simply agent.
Depending on the tool you're using to deploy and run containers, it can be cumbersome to provide additional arguments beyond the agent subcommand.
In that case, all of the required arguments can be passed as environment variables instead: WARPSTREAM_BUCKET_URL, WARPSTREAM_API_KEY, and WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID.
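As a sketch of the environment-variable approach, the snippet below exports the three variables and prints a docker run command that forwards them to the container. It relies on Docker's behavior of propagating a variable from the current environment when -e is given a name without a value. The image reference, the values, and the helper name agent_command_env are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own image reference and values.
IMAGE="${IMAGE:-warpstream-agent-image:tag}"
export WARPSTREAM_BUCKET_URL="${WARPSTREAM_BUCKET_URL:-your-bucket-url}"
export WARPSTREAM_API_KEY="${WARPSTREAM_API_KEY:-your-api-key}"
export WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID="${WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID:-your-virtual-cluster-id}"

# "-e NAME" with no value forwards NAME from the current environment,
# so only the agent subcommand has to be passed as an argument.
agent_command_env() {
  echo docker run \
    -e WARPSTREAM_BUCKET_URL \
    -e WARPSTREAM_API_KEY \
    -e WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID \
    "$IMAGE" agent
}

agent_command_env
```

A side benefit of this form is that the API key does not appear in the command line itself.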
Object Storage
bucketURL is the URL of the object storage bucket that the WarpStream Agent should write to. See our dedicated reference page on how to construct a proper URL for the specific object store implementation that you're using.
In addition to constructing a well-formed bucketURL, you'll also need to create and configure a dedicated object storage bucket for the Agents, and ensure that the Agents have the appropriate permissions to access that bucket. See our documentation for more details about configuring buckets and permissions.
Permissions and Ports
The WarpStream Agents need permission to perform various different operations against the object storage bucket. Review our object storage permissions documentation for more details.
In addition to object storage access, the WarpStream Agent also needs permission to communicate with https://api.prod.us-east-1.warpstream.com in order to read and write Virtual Cluster metadata. Raw data flowing through WarpStream never leaves your cloud account; only the metadata required to order batches of data and perform remote consensus does. You can read more about what metadata leaves your cloud account in our security and privacy considerations documentation.
Finally, the WarpStream Agent requires 2 ports to be exposed. For simplicity, we recommend ensuring that the Agent can listen on the default ports, 9092 and 8080. The details below describe how each port is used and how to override it if necessary.
Default: 9092
Override: -kafkaPort $PORT
Disable: -enableKafka false
This is the port that exposes the Kafka TCP protocol to Kafka clients. Only disable it if you don't intend to use the Kafka protocol at all.
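As an illustration of overriding the Kafka port, the sketch below prints a docker run command that moves the Kafka listener to a non-default port with -kafkaPort and publishes both ports. The port 19092, the image reference, and the helper name agent_command_with_ports are hypothetical, and the three required arguments are omitted for brevity.

```shell
#!/bin/sh
IMAGE="${IMAGE:-warpstream-agent-image:tag}"   # hypothetical placeholder
KAFKA_PORT="${KAFKA_PORT:-19092}"              # example non-default Kafka port

# Print a docker command that moves the Kafka listener to KAFKA_PORT and
# publishes it, while leaving the second port on its default of 8080.
# The three required arguments are omitted here for brevity.
agent_command_with_ports() {
  echo docker run \
    -p "$KAFKA_PORT:$KAFKA_PORT" \
    -p 8080:8080 \
    "$IMAGE" agent -kafkaPort "$KAFKA_PORT"
}

agent_command_with_ports
```

Kafka clients would then need to connect on the overridden port rather than 9092.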
Service discovery
The advertiseHostnameStrategy flag lets you choose how the Agent advertises itself in WarpStream service discovery (see our service discovery documentation for more details). The default, auto-ip4, is a good choice for most production deployments.
GOMAXPROCS
The WarpStream Agent uses heuristics to automatically configure itself based on the resources available. The most important way this happens is adjusting concurrency and cache sizes based on the number of available cores.
The Agent uses standard operating system APIs to determine how many cores are available, and it prints this value at startup.
This number is usually correct, but it can be wrong depending on how the Agent is deployed. For example, the Agent may determine the wrong value when running in AWS ECS.
In general, we recommend that you manually set the GOMAXPROCS environment variable to the number of cores that you've made available to the Agent in your environment. For example, if you've allocated 3 cores to the Agent's container, then we recommend adding GOMAXPROCS=3 as an environment variable.
The value of GOMAXPROCS must be a whole number, not a fraction. We also recommend that you always assign the Agent whole-number CPU quotas. Fractional CPU quotas can result in throttling and increased latency, since the value of GOMAXPROCS and the number of whole cores available to the Agent won't match.
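One way to keep the CPU quota and GOMAXPROCS in sync is to derive both from a single value. The sketch below does this for Docker, using the --cpus option for the quota, and rejects anything that is not a positive whole number. The image reference and the helper name agent_command_with_cores are hypothetical, and the required arguments are omitted for brevity.

```shell
#!/bin/sh
IMAGE="${IMAGE:-warpstream-agent-image:tag}"   # hypothetical placeholder

# Print a docker command that gives the Agent N whole cores and sets
# GOMAXPROCS to the same N. Fractional or non-numeric values are rejected.
agent_command_with_cores() {
  case "$1" in
    ''|*[!0-9]*|0)
      echo "cores must be a positive whole number, got: '$1'" >&2
      return 1
      ;;
  esac
  echo docker run --cpus "$1" -e "GOMAXPROCS=$1" "$IMAGE" agent
}

agent_command_with_cores 3
```

Calling agent_command_with_cores 2.5 fails instead of producing a quota that GOMAXPROCS cannot match.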
Graceful Shutdown
The WarpStream Agents perform a graceful shutdown routine that strives to minimize disruption to the cluster. By default, this graceful shutdown process takes 1 minute to complete. However, many container orchestration frameworks will not wait 1 minute for a container to shut down gracefully. For example, in Kubernetes the default graceful termination window is 30s. We recommend increasing this value to 5 minutes. In Kubernetes, the configuration value for this is called terminationGracePeriodSeconds.
In addition, for this graceful shutdown to work smoothly with your application, we recommend setting the metadata refresh interval on your client to 20s. See our client tuning documentation for more details.
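For Kubernetes, the recommended 5 minute window corresponds to terminationGracePeriodSeconds: 300 in the pod spec. The sketch below prints a kubectl patch command that sets it on an existing Deployment. The Deployment name warpstream-agent and the helper name grace_period_patch_command are hypothetical.

```shell
#!/bin/sh
# Hypothetical Deployment name: replace with your own.
DEPLOYMENT="${DEPLOYMENT:-warpstream-agent}"
GRACE_SECONDS=300   # 5 minutes, the value recommended above

# Print a kubectl command that sets terminationGracePeriodSeconds on the
# Deployment's pod template.
grace_period_patch_command() {
  patch="{\"spec\":{\"template\":{\"spec\":{\"terminationGracePeriodSeconds\":$GRACE_SECONDS}}}}"
  echo kubectl patch deployment "$DEPLOYMENT" --type merge -p "'$patch'"
}

grace_period_patch_command
```

Because the patch changes the pod template, applying it triggers a rollout of the Deployment. On the client side, the Java Kafka client exposes the metadata refresh interval as metadata.max.age.ms (20000 for 20s); other client libraries use different property names, so check the client tuning documentation for yours.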