Benchmarking
How to Benchmark WarpStream.
If you prefer to skip benchmarking WarpStream yourself, you can read our public benchmarking blog post where we provide detailed WarpStream benchmarks and TCO analysis.
The most important thing to consider when benchmarking WarpStream is that it is a higher-latency system than Apache Kafka, so your Kafka client settings must be tuned appropriately to achieve high throughput. Start by reading our "Tuning Kafka Clients for Performance" documentation.
Ideally, benchmarking is performed with a real application running in a pre-prod environment, or by teeing traffic from a production workload to WarpStream. However, we also understand that many people like to begin the evaluation process with simple synthetic benchmarks, so the rest of this document focuses on how to do that correctly.
kafka-producer-perf-test.sh
One of the most common utilities for performing synthetic benchmarks of a Kafka cluster is the kafka-producer-perf-test.sh utility. This utility embeds a native Java Kafka client, so it should be tuned according to our recommended settings. For example:
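As a rough sketch of what such an invocation can look like: the flags (--topic, --num-records, --record-size, --throughput, --producer-props) and the producer properties are standard Kafka options, but the topic name, bootstrap address, and every numeric value below are placeholder assumptions for illustration, not WarpStream's recommended settings. Take the actual values from the tuning documentation.

```shell
#!/bin/sh
# Sketch only: topic, address, and all numeric values are illustrative
# assumptions, not WarpStream's recommended settings.
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"              # assumed Agent address
TOPIC="${TOPIC:-benchmark-test}"                      # hypothetical topic name
PERF_BIN="${PERF_BIN:-kafka-producer-perf-test.sh}"   # path to the utility

run_perf_test() {
  "$PERF_BIN" \
    --topic "$TOPIC" \
    --num-records 1000000 \
    --record-size 1024 \
    --throughput 10000 \
    --producer-props \
      "bootstrap.servers=$BOOTSTRAP" \
      linger.ms=100 \
      batch.size=100000 \
      buffer.memory=128000000 \
      max.request.size=64000000 \
      compression.type=lz4
}

# Only run when the utility is actually available on PATH.
if command -v "$PERF_BIN" >/dev/null 2>&1; then
  run_perf_test
else
  echo "$PERF_BIN not found on PATH; nothing was run" >&2
fi
```

Setting PERF_BIN=echo prints the full argument list without contacting a cluster, which is a convenient way to check the command before running it for real.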
The settings above are just a starting point; you'll want to slowly increase the values of throughput and num-records as you perform your testing. More importantly, you'll have to consider how many partitions the test topic you're producing to has.
If the topic you're producing to has many partitions, you may need to reduce the value of batch.size to prevent the producer utility from OOMing. If the topic you're producing to has fewer partitions, then you may need to increase the value of batch.size instead to achieve higher throughput.
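To make this trade-off concrete, here is a back-of-the-envelope estimate. It assumes the Java producer holds roughly one open batch of batch.size bytes per partition, which is a simplification: real usage is also capped by buffer.memory and depends on the JVM heap available to the utility. The function name and the example values are hypothetical.

```shell
#!/bin/sh
# Rough estimate of batch memory in MiB, assuming about one open batch
# of batch.size bytes per partition (a simplification).
estimate_batch_mib() {
  partitions="$1"
  batch_size_bytes="$2"
  echo $(( partitions * batch_size_bytes / 1024 / 1024 ))
}

estimate_batch_mib 1024 100000    # many partitions, smaller batches
estimate_batch_mib 16 1000000     # few partitions, larger batches
```

Under that assumption, a topic with many partitions can consume far more batch memory at the same batch.size, which is why lowering batch.size helps there, while a topic with few partitions leaves room to raise it.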
Due to the nature of how the Java Kafka client implements the protocol, you'll most likely struggle to achieve more than 60-100 MiB/s of producer traffic from a single instance of this utility. However, once you've found a configuration that you're happy with, you can increase the total throughput of the benchmark by running multiple instances of kafka-producer-perf-test.sh concurrently.
Running multiple instances of kafka-producer-perf-test.sh is highly recommended because load balancing in WarpStream works differently than it does in Apache Kafka. Specifically, Apache Kafka balances partitions across Brokers, whereas WarpStream (due to its stateless nature) balances client connections across Agents.
As a result, a single instance of kafka-producer-perf-test.sh will generally route all of its traffic to a single WarpStream Agent. However, if you run multiple instances of the benchmarking utility concurrently, you'll see the traffic begin to spread evenly amongst all your deployed Agents.
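One way to run several instances concurrently is to start them as background processes and wait for all of them to finish. The sketch below does this from a single host; the instance count, topic, bootstrap address, log location, and producer settings are all illustrative assumptions, and the same idea applies if the instances run on separate machines.

```shell
#!/bin/sh
# Sketch only: instance count, topic, address, and settings are
# illustrative assumptions.
INSTANCES="${INSTANCES:-4}"                           # assumed instance count
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"              # assumed Agent address
TOPIC="${TOPIC:-benchmark-test}"                      # hypothetical topic name
LOG_DIR="${LOG_DIR:-.}"                               # where per-instance logs go
PERF_BIN="${PERF_BIN:-kafka-producer-perf-test.sh}"   # path to the utility

run_instances() {
  i=1
  while [ "$i" -le "$INSTANCES" ]; do
    # Each instance is a separate process with its own client connections.
    "$PERF_BIN" \
      --topic "$TOPIC" \
      --num-records 1000000 \
      --record-size 1024 \
      --throughput 10000 \
      --producer-props \
        "bootstrap.servers=$BOOTSTRAP" \
        "client.id=perf-$i" \
        linger.ms=100 \
        batch.size=100000 \
      > "$LOG_DIR/perf-$i.log" 2>&1 &
    i=$((i + 1))
  done
  wait   # block until every background instance has exited
}

# Only run when the utility is actually available on PATH.
if command -v "$PERF_BIN" >/dev/null 2>&1; then
  run_instances
else
  echo "$PERF_BIN not found on PATH; nothing was run" >&2
fi
```

Each instance writes its results to its own log file, so the total throughput of the benchmark is the sum of the per-instance figures.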