Low Latency Clusters

Configure the WarpStream Agent with lightning topics and an S3 Express, DynamoDB, or Spanner storage layer to reduce Produce latency.

Overview

By default, WarpStream is tuned for maximum throughput and minimal costs at the expense of higher latency. However, WarpStream clusters can be tuned to provide much lower Produce and End-to-End latency.

The rest of this document outlines all of the different approaches that can be taken to reduce latency. Note that all of these approaches are cumulative and the lowest possible latency is achieved by combining all of them. The table below summarizes the different approaches and their trade-offs.

| Approach | Reduces Produce Latency | Reduces E2E Latency | Full Consistency | Increases Costs |
| --- | --- | --- | --- | --- |
| Reduce client linger | | | | |
| Reduce Agent batch timeout | | | | |
| Control Plane Cluster Tier | | | | |
| S3 Express | | | | (~20% on average) |
| Lightning Topics | | | | |

The table below shows achievable produce and E2E latencies for a variety of different setups.

| Setup | Produce Latency | E2E Latency |
| --- | --- | --- |
| 25ms linger, 250ms batch timeout (default), S3 Standard, Fundamentals cluster tier | p50: 250ms, p99: 500ms | p50: 500ms, p99: 900ms |
| 10ms linger, 50ms batch timeout, S3 Express, Fundamentals cluster tier | p50: < 80ms, p99: < 150ms | p50: < 200ms, p99: < 400ms |
| 10ms linger, 25ms batch timeout, S3 Express, Fundamentals cluster tier, lightning topics | p50: < 35ms, p99: < 50ms | p50: < 200ms, p99: < 400ms |

Client Linger

Before tuning WarpStream itself, first check your client configuration. The WarpStream documentation has recommendations on how to tune various Kafka clients for maximum performance with WarpStream. You should still follow all of those recommendations; however, if you want to minimize cluster latency, consider reducing the linger value in your Kafka client from our default recommendation of 100ms to 25ms or 10ms.
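As a sketch, a reduced linger might look like the following producer configuration. The keys use the librdkafka/confluent-kafka naming convention, and the bootstrap address is a placeholder, not a real endpoint:

```python
# Illustrative low-latency producer configuration (confluent-kafka style keys).
conf = {
    "bootstrap.servers": "warpstream-agent:9092",  # placeholder address
    "linger.ms": 10,  # down from WarpStream's default recommendation of 100
}

# With the confluent-kafka client, this dict would be passed to the producer:
#   from confluent_kafka import Producer
#   producer = Producer(conf)
```

Lower linger means the client batches less before sending, so expect slightly more requests per second in exchange for lower Produce latency.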

Agent Batch Timeout

The WarpStream Agents accept a -batchTimeout flag (WARPSTREAM_BATCH_TIMEOUT environment variable) that controls how long the Agents buffer data in memory before flushing it to object storage. Produce requests are never acknowledged to the client before data is durably persisted in object storage, so this option has no impact on durability or correctness, but it does directly affect the latency of Produce requests.

The default batch timeout in the Agents is 250ms, but it can be decreased to as low as 25ms to reduce Produce latency. Lowering this value results in higher cloud infrastructure costs because the Agents create more files in object storage and therefore incur higher PUT request API fees.

Note that S3 Express PUTs cost ~1/5 as much as regular S3 PUTs, so reducing your batch timeout from 250ms to 50ms while also switching to S3 Express One Zone would only increase your ingestion PUT request costs by 2x instead of 5x:

250ms / 50ms × 1/5 × 2 (AZs) = 2x
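The arithmetic behind the 2x figure can be checked directly; all values here are relative prices taken from the paragraph above, not absolute dollar amounts:

```python
# Relative ingestion PUT cost of moving from a 250ms batch timeout on
# S3 Standard to a 50ms batch timeout on S3 Express One Zone.
file_rate_increase = 250 / 50  # 5x more files created per second
express_put_price = 1 / 5      # an Express PUT costs ~1/5 of a Standard PUT
azs = 2                        # data is written to an Express bucket in each of 2 AZs
relative_cost = file_rate_increase * express_put_price * azs
assert relative_cost == 2.0    # 2x the original PUT spend, not 5x
```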

Control Plane Cluster Tier

Similar to the Agents, the WarpStream control plane batches some virtual cluster operations, trading higher latency for reduced control plane costs. Higher cluster tiers like Fundamentals and Pro batch less and thus have lower control plane latency. Switching cluster tiers is a one-click operation in the WarpStream UI or Terraform.

Lightning Topics

Lightning topics are a special topic type in WarpStream where the Agents skip committing data to the control plane in the critical path of a produce request. Instead, they journal produce requests to object storage, and commit them to the control plane asynchronously.

As a result, lightning topics have significantly lower Produce request latency than regular topics, especially if you have already lowered your batch timeout and switched to a low latency storage backend like S3 Express.

Lightning topics provide the exact same durability guarantees as regular topics (acknowledged data is guaranteed not to be lost), but they do have a few caveats and relaxed consistency guarantees that you can learn more about in our dedicated lightning topics documentation.
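The write path described above can be sketched conceptually: the produce request is acknowledged as soon as the batch is durably journaled, and the control-plane commit happens on a background path. This is a toy illustration of the flow, not WarpStream's actual code; the names and data structures are invented for the sketch:

```python
# Conceptual sketch of the lightning-topic write path.
import queue
import threading

journal = []            # stands in for the object-storage journal
commit_log = []         # stands in for the control-plane metadata store
pending = queue.Queue() # batches journaled but not yet committed

def async_committer():
    # Commits happen outside the produce critical path.
    while True:
        batch = pending.get()
        commit_log.append(batch)
        pending.task_done()

threading.Thread(target=async_committer, daemon=True).start()

def produce(batch):
    journal.append(batch)  # durable journal write: the only blocking step
    pending.put(batch)     # the commit is queued, not awaited
    return "ack"           # the client is acknowledged here

assert produce("batch-1") == "ack"  # ack without waiting for the commit
pending.join()                      # eventually, the commit lands too
assert commit_log == ["batch-1"]
```

Because the commit is asynchronous, readers that go through the control plane may briefly lag acknowledged writes, which is the relaxed-consistency caveat the dedicated documentation covers.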

S3 Express

S3 Express One Zone is a tier of AWS S3 that provides much lower latency for writes and reads, but only stores data in a single availability zone. The WarpStream Agents have native support for S3 Express and can use it to store newly written data. Combined with a reduced batch timeout, S3 Express can reduce the p99 latency of Produce requests to less than 150ms.

Learn how to configure WarpStream Agents to write to S3 Express One Zone here.
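One detail worth knowing up front: S3 Express One Zone "directory buckets" embed an Availability Zone ID in their name and must end with the `--x-s3` suffix. The sketch below is illustrative only; the `WARPSTREAM_BUCKET_URL` variable name is an assumption based on the Agent's `WARPSTREAM_*` naming convention, so follow the linked configuration docs for the exact settings:

```python
# Illustrative only: constructing an S3 Express One Zone bucket reference.
az_id = "use1-az5"  # example AZ ID, not a recommendation
bucket = f"my-warpstream-data--{az_id}--x-s3"  # AWS-required naming format

# Assumed env-var name, following the WARPSTREAM_* convention; verify
# against the linked S3 Express configuration documentation.
env = {"WARPSTREAM_BUCKET_URL": f"s3://{bucket}"}
```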

Alternative Storage Backends

In addition to S3 Express, we offer a few additional lower-latency storage backends like AWS DynamoDB and Google Spanner. While useful for some applications, keep in mind that these alternative storage backends are much more expensive than traditional object storage or S3 Express and are not suitable for high-volume applications.

AWS DynamoDB

In addition to S3 Express One Zone, AWS developers have the option to deploy their WarpStream Agents using DynamoDB as the storage layer. Using DynamoDB yields latencies similar to S3 Express One Zone and generally costs less if the workload's throughput is low enough. Higher-volume workloads should always prefer S3 Express One Zone over DynamoDB for cost reasons. See the Cost Estimates section below for more details.

Learn how to configure WarpStream Agents to use AWS DynamoDB as the storage layer here.

Google Spanner (beta)


On GCP deployments, developers can choose to use Spanner as the storage layer. This is the only low-latency ingestion alternative on GCP, and it offers similar trade-offs to the DynamoDB option described above. It's also only recommended for low-throughput clusters for cost reasons. See the Cost Estimates section below for more details.

Learn how to configure WarpStream Agents to use Google Spanner as the storage layer here.
