Change Log

Contains a history of changes made to the Agent and Control Plane.

Release v523

Agent

Bug Fixes and Performance Improvements

  1. Fixed a bug in the Fetch() code path where error responses did not carry the correct topic ID, which caused some Kafka clients to emit warning logs.
  2. Fixed a bug in the "roles" feature that caused Agents with the "produce" role to still participate in the distributed file cache. Now only Agents with the "consume" role participate in the file cache, as expected.

Release v522

Agent

Bug Fixes and Performance Improvements

  1. Circuit breakers now return example errors for clarity.
  2. The Fetch() code path now handles failures more gracefully by returning incremental results in more scenarios, which improves the system's ability to recover under load.
  3. Fixed a memory leak in the in-memory file cache implementation.
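
As a rough illustration of what "returning incremental results" means, the sketch below reads partitions in order and, when a retryable error occurs, returns whatever was gathered so far instead of failing the whole request. It is not the Agent's implementation; `RetryableError`, `fetch_incremental`, and `flaky_reader` are hypothetical names used only for this example.

```python
class RetryableError(Exception):
    """Hypothetical marker for errors a client can safely retry."""


def fetch_incremental(partitions, read_partition):
    """Read each partition in order; on a retryable error, return what was
    gathered so far instead of failing the whole request."""
    results = {}
    for partition in partitions:
        try:
            results[partition] = read_partition(partition)
        except RetryableError:
            # Stop early: the caller receives partial results and can
            # request the remaining partitions on its next fetch.
            break
    return results


def flaky_reader(partition):
    # Illustrative reader: partition 2 fails with a retryable error.
    if partition == 2:
        raise RetryableError("simulated object storage timeout")
    return [f"record-{partition}-{i}" for i in range(2)]


print(fetch_incremental([0, 1, 2, 3], flaky_reader))
```

The consumer still makes progress on partitions 0 and 1, and simply asks again for the rest.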

Release v521

Agent

New Features

  1. Docker images are now multi-arch; our documentation and official Kubernetes charts have been updated accordingly.
  2. Introduced circuit breakers around object storage access.
  3. Finer control over Agent roles: it is now possible to split between the proxy-consume and proxy-produce roles. Our documentation has been updated as well.
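
For readers unfamiliar with the pattern, here is a minimal, generic circuit breaker sketch. It only illustrates the general technique of wrapping calls to a flaky dependency such as object storage; it is not how the Agent implements it, and the class name, thresholds, and timings are assumptions chosen for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=5.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping call")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


breaker = CircuitBreaker()
print(breaker.call(lambda: "object bytes"))
```

Failing fast while the circuit is open keeps callers from piling more requests onto a dependency that is already struggling.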

Release v520

Agent

This release is the first phase of a two-phase upgrade to WarpStream's internal file format. This release adds support for reading the upgraded file format. You MUST upgrade all Agents to this version before moving from any version < v520 to any version > v520.
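
The rule above can be expressed as a small pre-flight check. This is only a sketch of the stated constraint; `parse_version` and `upgrade_is_safe` are hypothetical helpers, not part of any WarpStream tooling, and the check covers the upgrade direction only.

```python
FORMAT_BRIDGE_VERSION = 520  # the release that can read the upgraded file format


def parse_version(tag):
    """Turn a tag such as 'v523' into the integer 523."""
    return int(tag.lstrip("v"))


def upgrade_is_safe(current, target):
    """Return True when Agents may move directly from current to target.

    A direct jump is unsafe only when it skips over the bridge release,
    i.e. current < v520 and target > v520.
    """
    cur, tgt = parse_version(current), parse_version(target)
    crosses_bridge = cur < FORMAT_BRIDGE_VERSION < tgt
    return not crosses_bridge


print(upgrade_is_safe("v518", "v523"))  # must stop at v520 first
print(upgrade_is_safe("v518", "v520"))
print(upgrade_is_safe("v520", "v523"))
```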

Release v518

Agent

New Features

  1. Support for Kafka headers: if you produce messages containing Kafka headers, they are now automatically persisted to your cloud object storage and returned when fetching.
  2. Revisited the flags and configuration knobs that control how the Agents advertise themselves in WarpStream service discovery. Our documentation has been updated accordingly.
  3. Agent nodes can now be configured to run dedicated roles; see the splitting roles documentation.

Control Plane

New Features

  1. Full support for the Kafka ListOffsets protocol: you can now look up partition offsets by timestamp.

Release v517

Agent

Bug Fixes and Performance Improvements

  1. Fixed a bug related to the handling of empty (but not null) values in records in the Fetch implementation.

Release v516

Agent

New Features

  1. The Agent now reports a sample of its error logs back to the WarpStream control plane. This should ease troubleshooting and help us identify issues earlier. It can be disabled with the flag disableLogsCollection or the environment variable WARPSTREAM_DISABLE_LOGS_COLLECTION.

Bug Fixes and Performance Improvements

  1. Added batching in the metadata calls made during Kafka Fetch, improving memory usage along the way.

Control Plane

New Features

  1. Added support for Kafka's InitProducerID protocol message, and the idempotent producer functionality in general. Requires upgrading to a version of the Agents that is >= v515.
  2. Added support for Kafka ListOffsets with positive timestamp values (until now, only the negative special-case values were supported).
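
To make the ListOffsets semantics concrete, the sketch below models the lookup over an in-memory list of records. In the Kafka protocol, the timestamp -1 asks for the latest offset and -2 for the earliest, while a positive timestamp asks for the earliest offset whose record timestamp is at or after it. The `list_offset` function and the sample records are hypothetical and only illustrate the semantics, not the control plane's implementation.

```python
LATEST = -1    # Kafka's sentinel for "log end offset"
EARLIEST = -2  # Kafka's sentinel for "first available offset"


def list_offset(records, timestamp):
    """records: list of (offset, timestamp_ms) pairs sorted by offset."""
    if timestamp == EARLIEST:
        return records[0][0] if records else 0
    if timestamp == LATEST:
        # The log end offset is one past the last stored record.
        return records[-1][0] + 1 if records else 0
    for offset, ts in records:
        if ts >= timestamp:
            return offset
    return None  # no record at or after the requested timestamp


records = [(10, 1000), (11, 1500), (12, 2000)]  # illustrative data
print(list_offset(records, 1200))
print(list_offset(records, EARLIEST), list_offset(records, LATEST))
```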

Release v515

Agent

New Features

  1. The Agents now report the lag / max offsets for every active consumer group as standard metrics. The metrics can be found as warpstream_consumer_group_lag and warpstream_consumer_group_max_offset.
  2. The Agents now report the number of files at each compaction level so that users can monitor whether they are experiencing compaction lag. These metrics can be found as warpstream_files_count, and the level is tagged with the name compaction_level.
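
As one possible way to consume the file-count metric, the sketch below parses Prometheus text exposition output and totals warpstream_files_count per compaction_level. The metric and tag names come from the note above; the sample values, the label value format, and the `files_per_level` helper are assumptions made for illustration.

```python
import re

# Illustrative scrape output; real values and label formats may differ.
SAMPLE = '''\
# illustrative scrape output
warpstream_files_count{compaction_level="0"} 42
warpstream_files_count{compaction_level="1"} 7
warpstream_files_count{compaction_level="2"} 3
'''

LINE = re.compile(r'^warpstream_files_count\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LEVEL = re.compile(r'compaction_level="([^"]*)"')


def files_per_level(exposition_text):
    """Sum warpstream_files_count samples, grouped by compaction_level."""
    counts = {}
    for line in exposition_text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # comments, blank lines, and other metrics
        level = LEVEL.search(match.group("labels"))
        if level:
            key = level.group(1)
            counts[key] = counts.get(key, 0.0) + float(match.group("value"))
    return counts


print(files_per_level(SAMPLE))
```

A file count that keeps growing at one level is the kind of signal that may indicate compaction lag.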

Bug Fixes and Performance Improvements

  1. File cache is now partitioned by <file_id, 16MiB extent> instead of just <file_id>. This spreads the load for fetching data for large files more evenly amongst all the Agents.
  2. Added some logic in the file cache to detect when certain parts of the cache are experiencing high churn and reduce the default IO size for paging in data from object storage. This helps avoid filling the cache with data that won't be read.
  3. Fixed a bug in the file cache that was causing it to significantly over-fetch data in some scenarios. This did not cause any correctness problems, but it wasted network bandwidth and CPU cycles.
  4. Modified the implementation of the Kafka Fetch method to return incremental results when it experiences a retryable error mid-fetch. This makes the Agents much better at recovering from disruption and catching up on consumer lag incrementally.
  5. Added some pre-fetching logic into the Kafka Fetch method so that, when data for a single partition is spread amongst many files, the Agent doesn't get bottlenecked making many single-threaded RPCs. This mostly helps increase the speed at which individual partitions can be "caught up" when lagging.
  6. Increased the default maximum file size created at ingestion time from 4MiB to 8MiB. This improves performance for extremely high volume workloads.
  7. Added replication to the Agent file cache so that if an error is experienced trying to load data from the file cache on the Agent node that is "responsible" for a chunk of data, the client can retry on a different node. This helps minimize disruption when Agents shut down ungracefully.
  8. Agents now report their CPU utilization to the control plane. We will use this information in the future to improve load balancing decisions. CPU utilization can now be viewed in the WarpStream Admin console as well.
  9. Improved the performance of deserializing file footers.
  10. Standardized Prometheus metric name prefixes.
  11. Added a lot more metrics and instrumentation, especially around the blob storage library and file cache.
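
To illustrate the <file_id, 16MiB extent> partitioning and the replica fallback described in the list above, the sketch below maps a byte range to extent keys and ranks candidate Agents for each key. The 16MiB extent size comes from the release note; the use of rendezvous hashing, the replica count, and the function names are assumptions for illustration, since the notes do not describe how keys are assigned to Agents.

```python
import hashlib

EXTENT_SIZE = 16 * 1024 * 1024  # 16 MiB, per the release note


def extent_keys(file_id, start, length):
    """Return the (file_id, extent_index) keys covering [start, start + length)."""
    if length <= 0:
        return []
    first = start // EXTENT_SIZE
    last = (start + length - 1) // EXTENT_SIZE
    return [(file_id, index) for index in range(first, last + 1)]


def owners(key, agents, replicas=2):
    """Rank agents for a key with rendezvous hashing (an assumed scheme).

    The first entry is the primary; later entries are fallbacks a client
    could retry against if the primary fails.
    """
    def score(agent):
        digest = hashlib.sha256(f"{agent}|{key[0]}|{key[1]}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return sorted(agents, key=score, reverse=True)[:replicas]


agents = ["agent-a", "agent-b", "agent-c"]  # hypothetical node names
for key in extent_keys("file-123", start=0, length=40 * 1024 * 1024):
    print(key, owners(key, agents))
```

Because each extent of a large file hashes independently, reads of that file can land on different Agents instead of a single one.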

Control Plane

New Features

  1. Added support for the Kafka protocol message DeleteTopics.

Bug Fixes and Performance Improvements

  1. We added intelligent throttling / scheduling to the deadscanner scheduler. This scheduler is responsible for scheduling jobs that run in the Agent to scan for "dead files" in object storage and delete them. Previously, these jobs could run at a high frequency and rate, which would interfere with live workloads. They could also result in very high object storage API request costs due to excessive LIST requests. The new implementation automatically tunes the frequency to avoid disrupting the live workload and incurring high API request fees.
Apache, Apache Kafka, Kafka, and associated open source project names are trademarks of the Apache Software Foundation. Kinesis is a trademark of Amazon Web Services.