Reducing Infrastructure Costs

How to reduce infrastructure costs for WarpStream BYOC clusters.

WarpStream infrastructure costs can originate from four different sources:

  1. Networking

  2. Storage

  3. Compute

  4. Object Storage API Fees

Networking

With WarpStream, you can avoid 100% of inter-AZ networking fees by properly configuring your Kafka clients.

Unlike Apache Kafka, WarpStream Agents never replicate data across availability zones themselves, but Kafka producer/consumer clients can still connect to Agents in a different zone, resulting in inter-zone networking fees.

[Figure: Kafka producer/consumer clients incurring inter-zone networking fees.]

This happens because, by default, WarpStream has no way of knowing which availability zone a client is connecting from. To avoid this, configure your Kafka clients to announce which availability zone they're running in using the client ID feature, and WarpStream will take care of zonally aligning your Kafka clients (for both Produce and Fetch requests), resulting in almost zero inter-zone networking fees.

[Figure: Kafka producer/consumer clients using WarpStream's zonal-alignment functionality to eliminate inter-zone networking fees entirely.]
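
For example, with a Go client built on franz-go, the zone can be embedded in the client ID when the client is constructed. The broker address, the AVAILABILITY_ZONE environment variable, and the warpstream_az key shown below are illustrative assumptions; see Configuring Kafka Client ID Features for the exact supported syntax.

```go
package main

import (
	"log"
	"os"

	"github.com/twmb/franz-go/pkg/kgo"
)

func main() {
	// Assumption: the zone is exposed to the workload via an environment
	// variable, e.g. populated from cloud instance metadata or the
	// Kubernetes downward API.
	az := os.Getenv("AVAILABILITY_ZONE")

	// Announce the zone to WarpStream by embedding it in the Kafka client ID.
	// The "warpstream_az=<zone>" key/value is an assumed example of the
	// client ID feature; check the WarpStream docs for the exact format.
	client, err := kgo.NewClient(
		kgo.SeedBrokers("warpstream-agent:9092"), // placeholder bootstrap address
		kgo.ClientID("my-service,warpstream_az="+az),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// ... produce and consume as usual; WarpStream routes this client to
	// Agents in the same availability zone whenever possible.
}
```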

Storage

WarpStream uses object storage as the primary and only storage in the system. As a result, storage costs in WarpStream tend to be more than an order of magnitude lower than they are in Apache Kafka. Storage costs can be reduced even further by configuring the WarpStream Agents to store data compressed using ZSTD instead of LZ4. Check out our compression documentation for more details.

In addition, just like with Apache Kafka, storage costs can be reduced by lowering the retention of your largest topics.

Compute

The easiest way to reduce WarpStream Agent compute costs is to auto-scale the Agents based on CPU usage. This feature is built in to our Helm Chart for Kubernetes.

Object Storage API Fees

WarpStream's entire storage engine is designed around minimizing object storage API fees as much as possible. This is accomplished with a file format that can store data for many different topic-partitions, as well as heavy usage of buffering, batching, and caching in the Agents.

The most expensive source of object storage API fees in WarpStream is the PUT requests required to create files as a result of Produce requests. By default, the WarpStream Agents will buffer data in memory until one of the following two events occurs:

  • The batch timeout elapses

    1. Default value: 250ms

    2. Agent flag: -batchTimeout

    3. Agent environment variable: WARPSTREAM_BATCH_TIMEOUT

  • A sufficient amount of uncompressed bytes is accumulated

    1. Default value: 4MiB

    2. Agent flag: -batchMaxSizeBytes

    3. Agent environment variable: WARPSTREAM_BATCH_MAX_SIZE_BYTES

at which point the Agent will flush a file to the object store and then acknowledge the Produce request as a success back to the client.
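
The batching behavior can be pictured with a minimal sketch; this models the policy described above for intuition only and is not the Agent's actual implementation:

```go
package main

import (
	"fmt"
	"time"
)

const (
	batchTimeout      = 250 * time.Millisecond // default, WARPSTREAM_BATCH_TIMEOUT
	batchMaxSizeBytes = 4 << 20                // default 4MiB, WARPSTREAM_BATCH_MAX_SIZE_BYTES
)

// batcher buffers incoming record payloads and calls flush (one object storage
// PUT in the real Agent) as soon as either the batch timeout elapses or the
// uncompressed size threshold is reached, whichever happens first.
func batcher(records <-chan []byte, flush func([][]byte)) {
	var buf [][]byte
	var size int
	timer := time.NewTimer(batchTimeout)
	for {
		select {
		case rec := <-records:
			buf = append(buf, rec)
			size += len(rec)
			if size < batchMaxSizeBytes {
				continue // keep buffering until a flush condition is met
			}
		case <-timer.C:
		}
		if len(buf) > 0 {
			flush(buf) // one file written, then the Produce requests are acknowledged
			buf, size = nil, 0
		}
		if !timer.Stop() {
			select {
			case <-timer.C:
			default:
			}
		}
		timer.Reset(batchTimeout)
	}
}

func main() {
	records := make(chan []byte)
	go batcher(records, func(batch [][]byte) {
		fmt.Printf("flushed %d record(s) in a single PUT\n", len(batch))
	})
	records <- []byte("example record")
	time.Sleep(time.Second)
}
```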

To determine how much uncompressed data is stored in the files your Agents are creating for Produce requests, check the average value of the metric: warpstream_agent_segment_batcher_flush_file_size_uncompressed_bytes

If this value is close to (or larger than) the value of batchMaxSizeBytes, then your PUT request costs can be reduced by increasing the amount of uncompressed data that is written to each file, resulting in fewer total files being created. There are two ways to accomplish this:

  1. Increase the value of batchTimeout. For example, if the average uncompressed size of files created by your Agents is 2MiB, then doubling the batch timeout from 250ms to 500ms should double the uncompressed file size to 4MiB and cut the number of PUT requests in half. The downside of this approach is that it will increase the latency of Produce requests.

  2. Reduce the number of Agents that are handling Produce requests. This can be accomplished by running a smaller number of Agents, but using larger instance types. Alternatively, you can use the Agent Roles feature to split Producer / Consumer Agents.

Once the average size of your uncompressed files approaches the value of batchMaxSizeBytes, you can increase batchMaxSizeBytes and repeat the steps above to further reduce PUT request costs.
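
To see why these two levers work, here is some back-of-the-envelope math: each Agent performs roughly one PUT per flush, and it flushes whenever batchTimeout elapses or batchMaxSizeBytes of uncompressed data accumulates, whichever comes first. The throughput and Agent count below are illustrative assumptions, not measurements.

```go
package main

import (
	"fmt"
	"time"
)

// putsPerSecond estimates cluster-wide PUT requests per second given an
// aggregate uncompressed write throughput spread evenly across the Agents.
func putsPerSecond(throughputBytesPerSec float64, batchTimeout time.Duration, batchMaxSizeBytes float64, numAgents int) float64 {
	perAgent := throughputBytesPerSec / float64(numAgents)
	timeToFill := batchMaxSizeBytes / perAgent // seconds to reach the size threshold
	flushInterval := batchTimeout.Seconds()
	if timeToFill < flushInterval {
		flushInterval = timeToFill // size threshold is hit before the timeout
	}
	return float64(numAgents) / flushInterval
}

func main() {
	const (
		throughput = 8 << 20 // assume 8MiB/s of uncompressed Produce traffic
		maxSize    = 4 << 20 // default batchMaxSizeBytes (4MiB)
		agents     = 3
	)
	for _, timeout := range []time.Duration{250 * time.Millisecond, 500 * time.Millisecond} {
		fmt.Printf("batchTimeout=%v -> ~%.0f PUTs/sec\n",
			timeout, putsPerSecond(throughput, timeout, maxSize, agents))
	}
	// At ~2.7MiB/s per Agent the timeout fires long before 4MiB accumulates,
	// so doubling batchTimeout roughly halves the PUT rate (12 -> 6 per second)
	// and doubles the average file size. Running fewer, larger Agents has the
	// same effect: fewer concurrent buffers means fewer files per second.
}
```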
