Benchmarking

How to Benchmark WarpStream.

Last updated 1 month ago

If you prefer to skip benchmarking WarpStream yourself, you can read our public benchmarking blog post, where we provide detailed WarpStream benchmarks and TCO analysis.

The most important thing to consider when benchmarking WarpStream is that WarpStream is a higher-latency system than Apache Kafka, so your Kafka client settings must be tuned appropriately to achieve high throughput. Start by reading our "Tuning Kafka Clients for Performance" documentation.

Ideally, benchmarking is performed with a real application running in a pre-prod environment, or by teeing traffic from a production workload to WarpStream. However, we also understand that many people like to begin the evaluation process with simple synthetic benchmarks, so the rest of this document focuses on how to do that correctly.

kafka-producer-perf-test.sh

One of the most common utilities for performing synthetic benchmarks of a Kafka cluster is kafka-producer-perf-test.sh. This utility embeds a native Java Kafka client, so it should be tuned according to our recommended settings. For example:

kafka-producer-perf-test.sh --print-metrics --producer-props bootstrap.servers=$BROKERS enable.idempotence=false compression.type=lz4 linger.ms=25 batch.size=10000000 buffer.memory=128000000 max.request.size=64000000 metadata.max.age.ms=60000 --record-size 10000 --topic "test" --throughput 1000 --num-records 1000000

The settings above are just a starting point; you'll want to slowly increase the values of --throughput and --num-records as you perform your testing. More importantly, you'll have to consider how many partitions the test topic you're producing to has.

If the topic you're producing to has many partitions, you may need to reduce the value of batch.size to prevent the producer utility from OOMing. If the topic you're producing to has fewer partitions, you may instead need to increase the value of batch.size to achieve higher throughput.
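
As a rough way to reason about this trade-off, the sketch below does the arithmetic for a given partition count. It assumes, as a simplification, that the producer may hold one full batch per partition at a time; the function names are hypothetical helpers, and the result is only a starting value to test from, not a WarpStream recommendation.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for sizing batch.size against buffer.memory.
# Simplifying assumption: at most one full batch is buffered per partition.

# How many full batches fit in the producer buffer.
# Usage: max_full_batches <buffer.memory bytes> <batch.size bytes>
max_full_batches() {
  echo $(( $1 / $2 ))
}

# Largest batch.size that still lets every partition hold one full batch.
# Usage: suggest_batch_size <partition count> <buffer.memory bytes>
suggest_batch_size() {
  echo $(( $2 / $1 ))
}
```

With the example settings above (buffer.memory=128000000, batch.size=10000000), `max_full_batches 128000000 10000000` shows how many full batches fit in the buffer, and `suggest_batch_size 64 128000000` gives a candidate batch.size for a 64-partition topic.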

Due to the way the Kafka protocol is implemented in the Java client, you'll most likely struggle to achieve more than 60-100 MiB/s of producer traffic from a single instance of this utility. However, once you've found a configuration that you're happy with, you can increase the total throughput of the benchmark by running multiple instances of kafka-producer-perf-test.sh concurrently.

Running multiple instances of kafka-producer-perf-test.sh is highly recommended because load balancing in WarpStream works differently than it does in Apache Kafka. Specifically, Apache Kafka balances partitions across Brokers, whereas WarpStream (due to its stateless nature) balances client connections across Agents.

As a result, a single instance of kafka-producer-perf-test.sh will generally route all of its traffic to a single WarpStream Agent. However, if you run multiple instances of the benchmarking utility concurrently, you'll see the traffic begin to spread evenly amongst all your deployed Agents.
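
One way to run several instances at once is a small wrapper script. The following is a minimal sketch that reuses the producer settings from the example above. INSTANCES, PERF_TEST_BIN, RUN_BENCHMARK, and the log file names are assumptions introduced for illustration, and the default BROKERS value is a placeholder.

```shell
#!/usr/bin/env bash
# Sketch only: INSTANCES, PERF_TEST_BIN, RUN_BENCHMARK, and the log file
# names are illustrative assumptions, not WarpStream-provided settings.

INSTANCES="${INSTANCES:-4}"                 # number of concurrent producers
BROKERS="${BROKERS:-localhost:9092}"        # placeholder bootstrap address
PERF_TEST_BIN="${PERF_TEST_BIN:-kafka-producer-perf-test.sh}"

# Run one producer instance with the settings from the example above.
run_instance() {
  "$PERF_TEST_BIN" --print-metrics \
    --producer-props "bootstrap.servers=$BROKERS" \
      enable.idempotence=false compression.type=lz4 linger.ms=25 \
      batch.size=10000000 buffer.memory=128000000 \
      max.request.size=64000000 metadata.max.age.ms=60000 \
    --record-size 10000 --topic "test" \
    --throughput 1000 --num-records 1000000
}

# Start INSTANCES producers in the background, one log file each,
# then wait for all of them to finish.
run_all() {
  local i
  for i in $(seq 1 "$INSTANCES"); do
    run_instance > "perf-test-$i.log" 2>&1 &
  done
  wait
}

# Only launch when explicitly requested, so the file can be sourced safely.
if [ "${RUN_BENCHMARK:-0}" = "1" ]; then
  run_all
fi
```

For example, `RUN_BENCHMARK=1 INSTANCES=8 BROKERS=$BROKERS ./run-perf-tests.sh` (the script name is hypothetical) would start eight producers, each of which opens its own client connections that WarpStream can balance across Agents.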
