Low Latency Clusters
Configure the WarpStream Agent with lightning topics and an S3 Express, DynamoDB, or Spanner storage layer to reduce Produce latency.
Overview
By default, WarpStream is tuned for maximum throughput and minimal costs at the expense of higher latency. However, WarpStream clusters can be tuned to provide much lower Produce and End-to-End latency.
The rest of this document outlines all of the different approaches that can be taken to reduce latency. Note that all of these approaches are cumulative and the lowest possible latency is achieved by combining all of them. The table below summarizes the different approaches and their trade-offs.
| Approach | Reduces Produce latency | Reduces E2E latency | Standard consistency guarantees | Increases costs |
| --- | --- | --- | --- | --- |
| Reduce client linger | ✅ | ✅ | ✅ | ❌ |
| Reduce Agent batch timeout | ✅ | ✅ | ✅ | ✅ |
| Control Plane Cluster Tier | ✅ | ✅ | ✅ | ✅ |
| S3 Express | ✅ | ✅ | ✅ | ✅ (~20% on average) |
| Lightning Topics | ✅ | ❌ | ❌ | ❌ |
The table below shows achievable produce and E2E latencies for a variety of different setups.
| Configuration | Produce latency | E2E latency |
| --- | --- | --- |
| 25ms linger, 250ms batch timeout (default), S3 Standard, Fundamentals cluster tier | p50: 250ms, p99: 500ms | p50: 500ms, p99: 900ms |
| 10ms linger, 50ms batch timeout, S3 Express, Fundamentals cluster tier | p50: < 80ms, p99: < 150ms | p50: < 200ms, p99: < 400ms |
| 10ms linger, 25ms batch timeout, S3 Express, Fundamentals cluster tier, lightning topics | p50: < 35ms, p99: < 50ms | p50: < 200ms, p99: < 400ms |
Client Linger
Before tuning WarpStream itself, first check your client configuration. The WarpStream documentation has recommendations on how to tune various Kafka clients for maximum performance with WarpStream. You should still follow those recommendations; however, if you want to minimize cluster latency, consider reducing the linger value in your Kafka client from our default recommendation of 100ms to 25ms or 10ms.
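As a sketch, for a librdkafka-based client such as confluent-kafka, lowering linger looks like the following. The `linger.ms` key is the standard librdkafka/Kafka producer setting; the bootstrap address is a placeholder, not a real endpoint.

```python
# Producer tuning sketch for a low-latency WarpStream cluster.
# The bootstrap address is a placeholder; substitute your cluster's.
producer_config = {
    "bootstrap.servers": "my-cluster.example.com:9092",  # placeholder
    # WarpStream's general tuning guidance recommends linger.ms=100 for
    # throughput; drop it to 25 or 10 when minimizing latency.
    "linger.ms": 10,
}

# With confluent-kafka installed, the dict is passed straight through:
# from confluent_kafka import Producer
# producer = Producer(producer_config)
```

Lower linger means the client flushes smaller batches sooner, trading some batching efficiency for latency.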
Agent Batch Timeout
The WarpStream Agents accept a -batchTimeout flag (WARPSTREAM_BATCH_TIMEOUT environment variable) that controls how long the Agents will buffer data in-memory before flushing it to object storage. Produce requests are never acknowledged back to the client before data is durably persisted in object storage, so this option has no impact on durability or correctness, but it does directly impact the latency of Produce requests.
The default batchTimeout in the Agents is 250ms, but the value can be decreased as low as 25ms to reduce Produce latency. Lowering this value will result in higher cloud infrastructure costs because the Agents will have to create more files in object storage and will incur higher PUT request API fees as a result.
Note that S3 Express PUTs are ~1/5th the cost of a regular S3 PUT, so reducing your batch timeout from 250ms to 50ms while also switching to S3 Express One Zone would only increase your ingestion PUT request costs by 2x instead of 5x.
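For illustration, the flag and environment variable named above can be set as follows. This is a fragment, not a complete Agent invocation; the duration format is assumed to follow the Go-style `250ms`/`50ms` values used in this document.

```shell
# Lower the Agent batch timeout from the 250ms default to 50ms.
# Either pass the flag when starting the Agent (other flags omitted):
#   warpstream agent -batchTimeout 50ms ...
# or set the equivalent environment variable before starting it:
export WARPSTREAM_BATCH_TIMEOUT=50ms
```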
Control Plane Cluster Tier
Similar to the Agents, the WarpStream control plane batches some virtual cluster operations, resulting in higher latency in exchange for reduced control plane costs. Higher cluster tiers like Fundamentals and Pro batch less and thus have lower control plane latency. Switching cluster tiers is a one-click operation in the WarpStream UI or via Terraform.
Lightning Topics
Lightning topics are a special topic type in WarpStream where the Agents skip committing data to the control plane in the critical path of a produce request. Instead, they journal produce requests to object storage, and commit them to the control plane asynchronously.
As a result, lightning topics have significantly lower Produce request latency than regular topics, especially if you have already lowered your batch timeout and switched to a low latency storage backend like S3 Express.
Lightning topics provide the exact same durability guarantees as regular topics (acknowledged data is guaranteed not to be lost), but they have a few caveats and relaxed consistency guarantees, which you can learn more about in our dedicated lightning topics documentation.
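The journal-then-commit-asynchronously flow described above can be sketched conceptually. This is an illustration of the pattern only, not WarpStream's actual implementation; all names here are made up, and plain lists stand in for object storage and the control plane.

```python
# Conceptual sketch of a lightning-topic produce path: the record is
# durably journaled (object storage) before the ack, while the
# control-plane commit happens asynchronously, off the critical path.
import queue
import threading

journal = []              # stands in for the object-storage journal
commit_log = []           # stands in for the control plane's metadata log
pending = queue.Queue()   # hand-off between produce path and committer


def produce(record):
    journal.append(record)   # durable journal write (critical path)
    pending.put(record)      # enqueue for asynchronous commit
    return "ack"             # ack immediately after the journal write


def committer():
    # Background loop: drains the queue and commits to the control plane.
    while True:
        rec = pending.get()
        if rec is None:      # sentinel: shut down
            break
        commit_log.append(rec)


t = threading.Thread(target=committer)
t.start()
acks = [produce(r) for r in ("a", "b", "c")]
pending.put(None)
t.join()
```

The produce path never waits on `commit_log`, which is why Produce latency drops; the caveats come from readers observing the control plane before the asynchronous commit lands.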
S3 Express
S3 Express One Zone is a tier of AWS S3 that provides much lower latency for writes and reads, but only stores the data in a single availability zone. The WarpStream Agents have native support for S3 Express and can use it to store newly written data. Combined with a reduced batch timeout, S3 Express can reduce the P99 latency of Produce requests to less than 150ms.
Learn how to configure WarpStream Agents to write to S3 Express One Zone here.
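As a hedged sketch, pointing the Agents at an S3 Express directory bucket might look like the fragment below. The `-bucketURL` flag mirrors the Agents' usual bucket setting, but treat the exact flag and query parameters as assumptions and follow the linked guide for the authoritative configuration; the bucket name and region are placeholders.

```shell
# Illustrative only: S3 Express One Zone directory buckets follow AWS's
# `<name>--<az-id>--x-s3` naming convention. Bucket name, AZ ID, and
# region below are placeholders; see the linked WarpStream guide for
# the exact Agent flags.
warpstream agent \
  -bucketURL "s3://my-warpstream-bucket--usw2-az1--x-s3?region=us-west-2" \
  -batchTimeout 50ms
```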
Alternative Storage Backends
In addition to S3 Express, we offer a few additional lower-latency storage backends, like AWS DynamoDB and Google Spanner. While useful for some applications, keep in mind that these alternative storage backends are much more expensive than traditional object storage or S3 Express and are not suitable for high-volume applications.
AWS DynamoDB
In addition to S3 Express One Zone, AWS developers have the option to deploy their WarpStream agents using DynamoDB as the storage layer. Using DynamoDB yields latencies similar to S3 Express One Zone and generally costs less if the workload's throughput is low enough. Higher volume workloads should always prefer S3 Express One Zone over DynamoDB for cost reasons. See the Cost Estimates section below for more details.
Learn how to configure WarpStream Agents to use AWS DynamoDB as the storage layer here.
Google Spanner (beta)
Google Spanner support for the data plane is available only for Agents running version v709 and above.
On GCP deployments, developers can choose to use Spanner as the storage layer. This is the only low-latency ingestion alternative in GCP, and offers similar tradeoffs to the DynamoDB option described above. It's also only recommended for low-throughput clusters for cost reasons. See the Cost Estimates section below for more details.
Learn how to configure WarpStream Agents to use Google Spanner as the storage layer here.