Partitions Auto-Scaler (beta)
This page describes the functionality of WarpStream's Partitions Auto-Scaler.
The partitions auto-scaler is a convenience feature that automatically scales the number of partitions in a topic to ensure that the average write throughput (uncompressed bytes per second) stays below a configured threshold. This enables the partition count to increase automatically with organic traffic growth so that operators don't have to take manual actions or perform capacity planning.
For example, consider a topic called called "logs" with the following configuration values:
warpstream.partitions_auto_scaler.enabled
true
warpstream.partitions_auto_scaler.per_partition_throughput_uncompressed_bytes_per_second
2500000
warpstream.partitions_auto_scaler.max_partitions
1024
The logs topic has 50 partitions and an average total throughput of 100 uncompressed MB/s. If we do the math then we'll see that:
which is below the limit of 2.5MB/s/partition. In this scenario, the partitions auto-scaler will take no action. However, if the total throughput increases to 250 uncompressed MB/s, then the throughput per partition will increase to:
which is above the limit of 2.5MB/s. As a result, the partitions auto-scaler will detect this and begin adding additional partitions to the topic until the average throughput per partition falls back below 2.5MB/s (I.E once the topic reaches roughly 100 partitions).
The partitions auto-scaler can add partitions to a topic, but it can never remove them. As a result, it does not take action on every temporary spike in write throughput, and instead waits for throughput to exceeded the configured target consistently for a contiguous window of time (currently 15 minutes) before taking action.
The partitions auto-scaler is particularly convenient for workloads where WarpStream is being used as a highly scalable and cost-effective "pipe" where records with the same key don't always need to be written to the same partition.
Caveats
This feature is not suitable for any workloads where records with a specific key must always be written and consumed from the same partition. The reason for this is that in the Kafka protocol, record partitioning happens client side, so when the number of partitions in a topic increases, the destination partition for each record key will change.
The partitions auto-scaler can increase the number of partitions in a topic, but it cannot decrease the number of partitions in a topic. The reason for this is that the Kafka protocol (and the domain model of Kafka itself) are such that it's impossible to delete partitions from a topic with active producers/consumers in a safe manner. However, in the future we will solve this problem with WarpStream by creating "Virtual Topics" that abstract over multiple individual topics.
Configuration Options
All of the configuration options below are topic-level configuration values.
warpstream.partitions_auto_scaler.enabled
Boolean value.
Whether the partitions auto-scaler is enabled for this topic. Defaults to false.
warpstream.partitions_auto_scaler.per_partition_throughput_uncompressed_bytes_per_second
Integer value.
Target throughput (in uncompressed bytes per second) for each partition in the topic. If average throughput per partition is below this value, the partitions auto-scaler will continue to add partitions to the topic until throughput per partition falls below this value or the max partitions limit is reached.
For example, if this was set to 2500000
(2.5 MB/s/partition) then the partitions auto-scaler would begin adding additional partitions once the total throughput of the topic exceeded 10MB/s (10MB / 4 partitions = 2.5MB/s/partition).
warpstream.partitions_auto_scaler.max_partitions
Maximum number of partitions beyond which the partitions auto-scaler will not add any more partitions to the topic, regardless of how high the throughput is.
How to Edit Configuration
There are two ways to enable / configure partitions auto-scaling on a topic:
The WarpStream UI.
The Kafka protocol's AlterConfigs API.
WarpStream UI
Click "Edit Configuration" in the Topics view for your cluster, and then navigate to the "Partitions Auto scaler" tab.
AlterConfigs API
Use a Kafka client / tool / UI of your choice that allows editing topic-level configuration values to update the configuration values described in the configuration section for your topic.
Last updated