LogoLogo
WarpStream.comSlackDiscordContact UsCreate Account
  • Overview
    • Introduction
    • Architecture
      • Service Discovery
      • Write Path
      • Read Path
      • Life of a Request (Simplified)
    • Change Log
  • Getting Started
    • Install the WarpStream Agent / CLI
    • Run the Demo
    • "Hello World" for Apache Kafka
  • BYOC
    • Run the Agents Locally
    • Deploy the Agents
      • Object Storage Configuration
      • Kubernetes Known Issues
      • Rolling Restarts and Upgrades
    • Infrastructure as Code
      • Terraform Provider
      • Helm charts
      • Terraform Modules
    • Monitoring
      • Pre-made Datadog Dashboard
      • Pre-made Grafana Dashboard
      • Important Metrics and Logs
      • Recommended List of Alerts
      • Monitoring Consumer Groups
      • Hosted Prometheus Endpoint
    • Client Configuration
      • Tuning for Performance
      • Configure Clients to Eliminate AZ Networking Costs
        • Force Interzone Load Balancing
      • Configuring Kafka Client ID Features
      • Known Issues
    • Authentication
      • SASL Authentication
      • Mutual TLS (mTLS)
      • Basic Authentication
    • Advanced Agent Deployment Options
      • Agent Roles
      • Agent Groups
      • Protect Data in Motion with TLS Encryption
      • Low Latency Clusters
      • Network Architecture Considerations
      • Agent Configuration Reference
      • Reducing Infrastructure Costs
      • Client Configuration Auto-tuning
    • Hosted Metadata Endpoint
    • Managed Data Pipelines
      • Cookbooks
    • Schema Registry
      • WarpStream BYOC Schema Registry
      • Schema Validation
      • WarpStream Schema Linking
    • Port Forwarding (K8s)
    • Orbit
    • Enable SAML Single Sign-on (SSO)
    • Trusted Domains
    • Diagnostics
      • GoMaxProcs
      • Small Files
  • Reference
    • ACLs
    • Billing
      • Direct billing
      • AWS Marketplace
    • Benchmarking
    • Compression
    • Protocol and Feature Support
      • Kafka vs WarpStream Configuration Reference
      • Compacted topics
    • Secrets Overview
    • Security and Privacy Considerations
    • API Reference
      • API Keys
        • Create
        • Delete
        • List
      • Virtual Clusters
        • Create
        • Delete
        • Describe
        • List
        • DescribeConfiguration
        • UpdateConfiguration
      • Virtual Clusters Credentials
        • Create
        • Delete
        • List
      • Monitoring
        • Describe All Consumer Groups
      • Pipelines
        • List Pipelines
        • Create Pipeline
        • Delete Pipeline
        • Describe Pipeline
        • Create Pipeline Configuration
        • Change Pipeline State
      • Invoices
        • Get Pending Invoice
        • Get Past Invoice
    • CLI Reference
      • warpstream agent
      • warpstream demo
      • warpstream cli
      • warpstream cli-beta
        • benchmark-consumer
        • benchmark-producer
        • console-consumer
        • console-producer
        • consumer-group-lag
        • diagnose-record
        • file-reader
        • file-scrubber
      • warpstream playground
    • Integrations
      • Arroyo
      • AWS Lambda Triggers
      • ClickHouse
      • Debezium
      • Decodable
      • DeltaStream
      • docker-compose
      • DuckDB
      • ElastiFlow
      • Estuary
      • Fly.io
      • Imply
      • InfluxDB
      • Kestra
      • Materialize
      • MinIO
      • MirrorMaker
      • MotherDuck
      • Ockam
      • OpenTelemetry Collector
      • ParadeDB
      • Parquet
      • Quix Streams
      • Railway
      • Redpanda Console
      • RisingWave
      • Rockset
      • ShadowTraffic
      • SQLite
      • Streambased
      • Streamlit
      • Timeplus
      • Tinybird
      • Upsolver
    • Partitions Auto-Scaler (beta)
    • Serverless Clusters
Powered by GitBook
On this page
  • Release v657
  • Release v656
  • Release v655
  • Release v654
  • Release v653
  • Release v652
  • Release v651
  • Release v650
  • Release v649
  • Release v648
  • Release v647
  • Release v646 (Bad version, please upgrade to v647)
  • Release v645 (Bad version, please upgrade to v647)
  • Release v644 (Bad version, please upgrade to v647)
  • Release v643 (Bad version, please upgrade to v647)
  • Release v642
  • Release v641
  • Release v640
  • Release v639
  • Release v638
  • Release v637
  • Release v636
  • Release v635
  • Release v634
  • Release v633
  • Release v632
  • Release v631
  • Release v630
  • Release v629
  • Release v628
  • Release v627
  • Release v626
  • Release v625
  • Release v623
  • Release v622
  • Release v621
  • Release v620
  • Release v619
  • Release v618
  • Release v617
  • Release v616
  • Release v615
  • Release v614
  • Release v613
  • Release v612
  • Release v611
  • Release v610
  • Release v609
  • Release v608
  • Release v607
  • Release v606
  • Release v605
  • Release v604
  • Release v603
  • Release v602
  • Release v601
  • Release v600
  • Release v599
  • Release v598
  • Release v597
  • Release v596
  • Release v595
  • Release v594
  • Release v593
  • Release v592
  • Release v591
  • Release v590
  • Release v589
  • Release v588
  • Release v587
  • Release v586
  • Release v585
  • Release v584
  • Release v583
  • Release v582
  • Release v581
  • Release v580
  • Release v579
  • Release v578
  • Release v577
  • Release v576
  • Release v575
  • Release v574
  • Release v573
  • Release v572
  • Release v571
  • Release v570
  • Release v569
  • Release v568
  • Release v567
  • Release v566
  • Release v565
  • Release v564
  • Release v563
  • Release v562
  • Agent
  • Release v559
  • Agent
  • Release v558
  • Agent
  • Release v557
  • Agent
  • Release v556
  • Agent
  • Release v555
  • Agent
  • Release v554
  • Agent
  • Release v553
  • Agent
  • Release v552
  • Agent
  • Release v551
  • Agent
  • Release v550
  • Agent
  • Release v549
  • Agent
  • Release v548
  • Agent
  • Release v547
  • Agent
  • Release v545
  • Agent
  • Release v544
  • Agent
  • Release v543
  • Agent
  • Release v542
  • Agent
  • Release v541
  • Agent
  • Release v539
  • Agent
  • Release v538
  • Agent
  • Release v537
  • Agent
  • Release v536
  • Agent
  • Release v535
  • Agent
  • Release v534
  • Agent
  • Release v533
  • Agent
  • Release v532
  • Agent
  • Release v531
  • Agent
  • Release v530
  • Agent
  • Release v529
  • Agent
  • Release v526
  • Agent
  • Release v525
  • Agent
  • Release v524
  • Agent
  • Release v523
  • Agent
  • Release v522
  • Agent
  • Release v521
  • Agent
  • Release v520
  • Agent
  • Release v518
  • Agent
  • Control Plane
  • Release v517
  • Agent
  • Release v516
  • Agent
  • Control Plane
  • Release v515
  • Agent
  • Control Plane

Was this helpful?

  1. Overview

Change Log

Contains a history of changes made to the Agent.

PreviousLife of a Request (Simplified)NextInstall the WarpStream Agent / CLI

Last updated 3 days ago

Was this helpful?

Release v657

May 22, 2025

  • Diagnostic to detect when Kafka Clients issue Fetch requests with very low timeout.

  • Add flag httpsProxyCACertFile and environment variable WARPSTREAM_HTTPS_PROXY_CA_CERT_FILE to support enterprise MITM HTTPS Proxies.

    • If using a MITM HTTPS Proxy users may see this error Post "https://metadata.default.${region}$.${cloud}$.warpstream.com/api/v1/agent/agentpool": tls: failed to verify certificate: x509: certificate signed by unknown authority because the agent doesn't trust the HTTPS Proxy's CA. This flag allows the user to set the CA to the internal WarpStream HTTPS client as a trusted certificate authority.

  • Add flag tlsProfile and environment variable WARPSTREAM_TLS_PROFILE to change TLS MinVersion, Curves and Ciphers.

    • See the documentation for exact details.

Release v656

May 20, 2025

  • Fix the diagnostic for unavailable storage bucket: it triggered on some errors that were not bucket errors in AWS.

  • Modified the record size limit resolution logic to prioritize in the following order: 1) topic-level max.message.bytes, 2) cluster-level message.max.bytes, 3) agent-level default. This replaces the previous behavior which used the minimum value between topic and cluster limits.

  • Allow Orbit to fetch more than 100MiB per kafka fetch request.

Release v655

May 20, 2025

  • Increase default value for WARPSTREAM_KAFKA_MAX_FETCH_REQUEST_BYTES_UNCOMPRESSED_OVERRIDE from 128MiB to 256MiB.

  • Increase default value for WARPSTREAM_KAFKA_MAX_FETCH_PARTITION_BYTES_UNCOMPRESSED_OVERRIDE from 128MiB to 256MiB.

  • Adds a diagnostic that triggers when there are errors talking to the cloud storage bucket (either because the storage bucket is rate limiting reads/writes or because of a temporary problem connecting to it).

  • Significantly improve the performance of consumer workloads that are fetching large amounts of compacted data.

Release v654

May 15, 2025

  • Cleanly NACK jobs that failed to run during agents shutdown.

  • Improve the logic for long-polling fetch requests after detecting that there is no new data for the set of topic-partitions that are being fetched.

Release v653

May 9, 2025

  • Added cluster-level (message.max.bytes) and topic-level (max.message.bytes) limits. In case of Warpstream these limits apply to individual records, instead of record batches. This is because OSS Kafka doesn't process individual records, it processes batches, but WarpStream operates on records directly. So there is no relationship between the batch size sent by producers and the size of batches received by consumers, and therefore, limiting record sizes will prevent consumers from choking on any individual records that are too large.

    • The effective limit is the minimum value across topic-level, cluster-level, and agent-level configurations.

Release v652

May 8, 2025

  • Added optional prometheus metrics server for benchmark-producer and benchmark-consumer commands.

Release v651

May 7, 2025

  • Added a producer benchmark tool benchmark-producer to the cli-beta commands.

  • Added a consumer benchmark tool benchmark-consumer to the cli-beta commands.

  • Add a 15s socket deadline for every socket read/write to ensure we detect when goroutines are stuck reading from a bad connection.

  • Tune HTTP connection pools to be smaller and dont allow idle connections to remain in the pool for as long.

  • Report how long each job has run to the control plane.

Release v650

May 2, 2025

  • Add new -tlsBlobURL flag and WARPSTREAM_TLS_BLOB_URL environment variable to load TLS files from a blob store bucket.

  • Increase default connection limit per vCPU from 4096 to 8192.

  • Add metric warpstream_agent_active_pipeline_instances (tags: pipeline_id) that tracks the number of active pipeline instances run by a given agent.

Release v649

Apr 29, 2025

  • Add a cache for api versions. This will reduce excessive rpcs to the control plane.

  • When an agent has the "pipelines" role AND any other role, Bento pipelines are rate-limited (default: 5MB/vCPU). When the agent has only the "pipelines" role, the rate limit is higher (50 MB/vCPU).

  • Add new -overridePipelinesSharedRateLimitPerVCPU flag and WARPSTREAM_OVERRIDE_PIPELINES_SHARED_RATE_LIMIT_PER_VCPU environment variable to override the default rate limit shared across all pipelines.

Release v648

April 18, 2025

  • Fixes an invariant violation error that could be emitted if you tried to fetch transactional records before the beginning of a partition.

  • Add support for ratelimiting the number of bytes that can be read from input streams during compactions to reduce networking microbursts.

  • Tune the maximum number of concurrent input streams that can be opened during compactions based on the overall size of the compaction to reduce networking microbursts.

  • Add a new flag for a preview feature, -multiregion. This tells the agent that the control plane runs across two regions, which will have the agent talk to the leader region while keeping track of leadership.

  • Enable the Agent file cache to begin backpressuring sooner than before. This prevents the Agents from OOMing some scenarios when they're completely overloaded and will now backpressure traffic more appropriately.

  • Allow reducing batch timeout to 25ms now that S3OZ is so much cheaper.

  • Add a new warpstream_control_plane_utilization metric (tagged by cluster). A value of 1.0 (100%) means the control plane is fully saturated and can no longer keep up with incoming requests.

  • Fixes a race condition in schema validation when there are concurrent produces.

  • Orbit now has an estimated time lag. The time lag is defined as time.Since(X) where X is the timestamp of the source record that is most recently copied by Orbit. The time lag will be emitted with the warpstream_consumer_group_estimated_lag_very_coarse_do_not_use_to_measure_e2e_seconds metric for the Orbit consumer group.

Release v647

April 10, 2025

Release v646 (Bad version, please upgrade to v647)

April 9, 2025

  • Orbit ignores topic id in fetch responses if kafka version < 3.1.

Release v645 (Bad version, please upgrade to v647)

April 8, 2025

  • Orbit makes metadata requests using topic name when max supported request version is < 12.

Release v644 (Bad version, please upgrade to v647)

April 8, 2025

  • Add flags batch-size and sleep-between-batch for the delete-all-topics-unsafe command to allow configuring the batch size and sleep duration between batch deletions.

  • Allow Agents to advertise IPs in range 240.0.0.0/8–255.0.0.0/8 automatically for customers using those in their private subnets.

Release v643 (Bad version, please upgrade to v647)

April 8, 2025

  • Make the delete-all-topics-unsafe command iterative.

Release v642

April 7, 2025

  • Add tls_insecure_skip_verify to orbit and schema registry migrator config

Release v641

April 7, 2025

  • Treat "connection timed out" error as resumable for object storage GET requests to prevent long running compactions from failing.

  • Fixes a bug where the agent could get stuck in some circumstances when reading data after a gap of more than 4B records created either by orbit or compacted topics compaction.

Release v640

April 2, 2025

  • Tune down the interval for batching internal API calls for describing the topics in a cluster.

  • Reduce memory usage for S3 clients when performing large numbers of concurrent DELETE operations.

  • Double number of LOW priority jobs that can run concurrently.

  • Create file-reader and file-scrubber cli-beta commands to read and scrub warpstream agent files.

  • Add new -additionalDeadscannerBucketURLs flag and WARPSTREAM_ADDITIONAL_DEADSCANNER_BUCKET_URLS environment variable to allow specifying additional bucket URLs that should be scanned for dead objected to be deleted from the object store. This enables more seamless migrations from one object storage bucket to another without having to manually cleanup the dead files in the old object storage bucket.

  • Add new mtls_server_ca_cert_env field to Orbit configuration to allow pointing to a file containing PEM encoded public keys of the certificate authorities that sign your server certificates.

Release v639

  • Add support for WarpStream Schema Linking.

  • Fix a rare panic when fetch auto-tuning is disabled explicitly and the kafka clients issue very small requests.

  • Bump github.com/golang-jwt/jwt/v5 dependency for CVE-2025-30204 vulnerability.

  • Return LeaderNotAvailable instead of BrokerNotAvailable for some transient / retriable Metadata errors to improve client compatibility.

  • Change the verbosity of deadscanner "deadscanner_progress" logs when they hit a "not found" error.

  • Allow Agents to run more L0/L1 compactions in parallel which helps prevent temporary spikes of L0/L1 lag when large L2 compactions are running.

Release v638

March 21, 2025

  • Add metric agent_kafka_source_cluster_connections_counter that counts the number of connections made by agents to the source cluster (via Orbit)

Release v637

March 21, 2025

  • Add consumer-group-lag command to cli-beta.

  • Increase the default value of GOMEMLIMIT to 3GiB/vCPU from 2GiB/vCPU. For properly scaled clusters this should have no impact on memory usage, but it should prevent excessive garbage collection from occurring when there are a lot of inflight fetch requests for overloaded clusters.

  • Ignore context.DeadlineExceeded errors for circuit breaking purposes when streaming GET requests from object store to avoid opening the circuit breaker when the Agents are overloaded, but the underlying object store is fine.

  • Reduce log spam for context.Canceled errors.

  • Use 4MiB pages by default instead of 16MiB pages.

  • Agent handles sasl handshakes in the agent itself. This reduces excessive rpcs to the control plane.

  • Orbit will prioritize fetching partitions by the order defined by the control plane.

Release v636

March 18, 2025

  • Release analyzers in the Agent to feed Warpstream Diagnostics: Diagnostics continuously analyzes your clusters to identify potential problems, cost inefficiencies, and ways to make things better. It looks at the health and cost of your cluster and gives detailed explanations on how to fix and improve them.

  • Add flag enableACLLogs and env var WARPSTREAM_ENABLE_ACL_LOGS to enable ACL logging.

    • As of this release Produce, Fetch, Metadata, JoinGroup, SyncGroup, DeleteRecords, InitProducerID Kafka API calls are logged if they are denied. More APIs will be added over time as we update the Control Plane.

  • Reduce allocations in the fetch code path to improve performance.

  • Fixes a very rare case where the agent would improperly categorize a single record as committed when it was aborted, with isolation_level=read_committed.

Release v635

March 12, 2025

  • Enable a new mechanism in the Agents for cleaning up deleted files from object storage. This new mechanism makes it much easier for high volume clusters to keep up with object deletion, and also reduces costs by dramatically reducing the amount of LIST and HEAD requests that the Agent make.

  • Reduces amount of data Orbit sends to the control plane to create topics.

Release v634

March 12, 2025

  • Add support for using IMDSv2 to resolve instance types for Agents running in AWS.

Release v633

March 11, 2025

  • Fixed a bug where setting an agent flag to an empty string by explicitly setting an environment variable to an empty string (i.e. ENV_VAR="") doesn't work as expected as it uses the default value instead. This affects the following flags:

  • If you set WARPSTREAM_DISABLE_CONSUMER_GROUP_METRICS_TAGS to an empty string, it used to default to disabling the partition tag. Now, if you set the env variable to an empty string, it doesn't disable any tags for the consumer group offset metrics.

  • If you set WARPSTREAM_AGENT_ROLES to an empty string, it used to default to proxy, jobs. Now, if you set WARPSTREAM_AGENT_ROLES to an empty string it means no roles are selected. Note that unless a role is provided via flags like enableManagedPipelines, empty roles is not allowed.

  • Add begin_fetch_at_latest_offset to Orbit topic mappings which forces Orbit to fetch from the latest topic partition offset in the source cluster, for the first fetch of each topic partition.

Release v632

March 10th, 2025

  • If the partition tag is disabled via the disableConsumerGroupsMetricsTags flag (or the WARPSTREAM_DISABLE_CONSUMER_GROUP_METRICS_TAGS environment variable), the warpstream_max_offset metric will be the max of the topic's partitions' max offsets.

  • Increase WARPSTREAM_KAFKA_CLOSE_IDLE_CONN_AFTER to 1 hour to decrease the frequency of broken pipe errors on some kafka clients.

  • Reduce log spam for some context canceled errors.

Release v631

March 7th, 2025

  • Implement support for the v0 JoinGroup API in consumer group management, allowing older clients to join groups.

Release v630

March 6th, 2025

  • If the partition tag is disabled via the disableConsumerGroupsMetricsTags flag (or the WARPSTREAM_DISABLE_CONSUMER_GROUP_METRICS_TAGS environment variable):

    • The consumer_group_lag metric will be the sum of the consumer group lag across the topic's partitions.

    • The warpstream_consumer_group_estimated_lag_very_coarse_do_not_use_to_measure_e2e_seconds metric will be the max of the estimated lag across the topic's partitions.

  • Increase default connection idle timeout from 10m to 15m.

  • Improve Agent logging (remove spurious logs, emit sampled logs for idle connections being closed for improved observability)

Release v629

Feb 28th, 2025

  • The agent now accepts to create batches that span more than 4B offsets. This is useful for heavily compacted topics.

  • Include warpstream proxy target in client ID for kafka_franz_warpstream blocks in managed data pipelines so that managed data pipelines works when Agents are split into proxy-consume and proxy-produce roles with no Agents running both roles simultaneously.

  • Add MTLS support orbit.

Release v628

Feb 19th, 2025

  • Improve visibility of ACL authorization failures.

  • Return ErrKafkaStorageError instead of ErrThrottlingQuotaExceeded for backpressure errors from Produce requests because librdkafka doesn't retry ErrThrottlingQuotaExceeded requests.

  • Fix bug in PREFIXED ACLs, where sometimes were not correctly applied.

Release v627

Feb 18th, 2025

  • Change the default fetch size for kafka_franz in managed data pipelines from 100MiB to 50MiB and the per-fetch fetch size from 50MiB to 25MiB.

  • If sasl/mTLS auth is performed using a cluster credentials name, with or without the ccn_ prefix, allow both the credentials name and the credentials username to be used as the kafka principal for acls. Previously, if a client performed auth using a credentials name, the credentials username was used as the kafka principal, which prevented valid acls matches.

Release v626

February 18th, 2025

  • Upgrade Agents to go 1.23.6.

  • Bump our Alpine base image to 3.21.3 to fix some vulnerabilities

  • Fix a rare panic happening on slow nodes when the publishing metric job would take too long to be handled

Release v625

February 12th, 2025

  • Add batching of internal API calls for describing the topics in a cluster.

Release v623

February 10th, 2025

  • Add diagnose-record to cli-beta, this command prints out diagnostic information about a specific record

  • Reduce default values for fetch auto-tuning to be less aggressive so it causes less memory pressure on consumer clients if they embed multiple Kafka consumers within the same application.

  • Prevent a tombstone from being deleted if it is the last record in a topic partition. Some workload use the offsets of the records consumed to determine if a consumer is caught up with the high watermark offset, and not deleting the tombstone allows the tombstone record to be consumed.

Release v622

February 6th, 2025

  • Change the default value of autoTuneFetchLimits from false to true. This makes it so the WarpStream Agents automatically auto-tune the fetch limits of Kafka consumer client applications. This dramatically improves the consumer performance of consumer applications that have not explicitly tuned their application for performance with WarpStream.

  • Add a new ws_dfat client ID feature that allows consumer clients to disable fetch auto-tuning if needed.

  • Add flag --consume-from-beginning to the CLI console-consumer to allow consuming from the beginning of the topic partition.

  • Enforce that agent group names only contain lowercase letters, numbers, and dashes.

  • Add cli-beta subcommand that changes how cli flags work to have per cli subcommand flags instead of adding them all to every cli command.

  • Add console-consumer to cli-beta which adds new features for printing out the offset of the message and the message key. Also add max-messages flag to limit the number of messages to consume.

  • Fix a bug where the agent would return an empty response that the sarama kafka client would not know how to parse in some cases in transactional or compacted topics.

  • New metrics:

    • warpstream_num_records: This is a gauge that indicates the number of records being stored per topic

      • tags: topic

Release v621

February 5th, 2025

  • Add support for the min.compaction.lag.ms topic-level configuration. This configuration allows users to specify a minimum time that must pass before a segment is eligible for compaction. This configuration is useful for ensuring that data is not compacted too soon after being written in a topic that has compact or compact,delete as its cleanup policy.

  • Fix a rare bug where incorrect data could be returned when using transactions in a compacted topic.

Release v620

February 3rd, 2025

  • Allow Orbit to work with older kafka clusters(pre 3.1 which was when topic ids were added to kafka fetch request responses). If Orbit is used without topic ids, then it is possible that a topic was deleted and recreated in the source cluster with the exact same name which might lead to inconsistencies in Orbit. This is easily solved by deleting the topic in the target cluster, and it will be automatically recreated by Orbit.

Release v619

February 2nd, 2025

  • Only retry Metadata errors that have retryable error codes.

Release v618

Jan 31st, 2025

  • Allow blob storage retries to retry GET and multi-part upload requests in compaction code to prevent long-running compactions from failing due to transient errors.

  • Improve error handling of the Metadata RPC to convert 500s and transient retryable errors to Kafka error LeaderNotAvailable instead of KafkaStorageError. This improves compatibility with librdkafka and prevents issues where librdkafka would refuse to accept additional records into its producer queue after receiving a KafkaStorageError in a Metadata response.

Release v617

Jan 28th, 2025

  • Improve the consistent hashing ring used in the file cache by increasing the number of virtual shards and switching hashing algorithms. Agents will automatically wait until all of the other Agents in the cluster are upgraded to the latest version and then switch hash ring implementations at approximately the same time to minimize disruption during the rollout.

  • Removed the allowHighCardinalityConsumerGroupTags flag and changed the default values of disableConsumerGroupsMetricTags to partition . This is a potentially breaking change for observability, but was done to prevent cluster with large numbers of topic-partitions from overloading customer's metrics systems and bills. The partition tag for consumer group metrics can be re-enabled by setting the disableConsumerGroupsMetricTags value to an empty string.

  • Add cli certificate-to-dn helper command to convert a client certificate to a DN.

Release v616

Jan 24th, 2025

  • Fix JKS certificate loading to be according to the JKS spec, specifically load certificates from DER instead of PEM formatting.

Release v615

Jan 24th, 2025

  • Fixed a bug where the BYOC schema registry incorrectly returns a schema incompatibility error when removing a field whose default value is null.

Release v614

Jan 23nd, 2025

  • Fix naming of tlsJavaKeystorePasword and tlsJavaTruststorePasword to be tlsJavaKeystorePassword and tlsJavaTruststorePassword .

Release v613

Jan 22nd, 2025

  • Orbit now copies internal kafka topics if these topics match the regex specified in the Orbit config.

Release v612

Jan 16th, 2025

  • Added flags to support Java Keystores and Truststores to make migration from Kafka easier

    • tlsJavaKeystoreFile, tlsJavaKeystorePasword, tlsJavaKeystoreKeyPasword, tlsJavaTruststoreFile, tlsJavaTruststorePasword.

    • Java Keystore passwords are not used to encrypt the keystore, only to confirm it's integrity. In most Java implementations the keystore password and private key password must be the same which negates any encryption advantages. We recommend using the same security controls when using Java Keystores that you would use with unencrypted certificate keys.

  • Added support for using IMDSv2 to configure DD_AGENT_HOST for Agents running in AWS. If v2 is not available we automatically fall back to v1.

  • Improved schema validation error message to return more detailed reason for why invalid record is rejected.

  • Added the -schemaValidationURL flag for schema registry to replace the deprecated -schemaRegistryURL flag (still supported).

Release v611

Jan 10th, 2025

  • Add tlsServerPrivateKeyPasswordFile flag to enable using encrypted TLS private keys.

    • The encrypted key given via tlsServerPrivateKeyFile must be in PKCS#8 format.

    • The password file must contain a single line which is the private key's password.

  • Honor DD_AGENT_HOST environment variable if its already set.

  • Allow TLS to be used with Orbit even if SASL is not configured.

Release v610

Jan 8th, 2025

  • Basic Authentication Support for BYOC Schema Registry:

    • Add schemaRegistryBasicAuth flag to enable basic authentication for schema registry.

  • Support the GET /subjects/(string: subject)/versions endpoint in schema registry to allow getting a list of versions registered under the specified subject.

  • Bump golang.org/x/net dependency for CVE-2024-45338 vulnerability.

Release v609

Dec 19th, 2024

  • Add -requireSASLAuthentication and -requireMTLSAuthentication to playground mode. They where missed when these flags where added to the agent in v549.

Release v608

Dec 18th, 2024

  • Add support for using DynamoDB as the agent's backing store alongside S3 and S3 Express One Zone.

Release v607

Dec 18th, 2024

  • Fix an issue where tombstones could be incorrectly handled, and showed as messages with empty values.

  • Fix a rare issue where the agent would fail to connect to a S3 bucket on startup in some configurations.

Release v606

Dec 11th, 2024

  • Make Orbit much more efficient and able to achieve higher throughput more easily.

  • Double the ratelimit for the maximum inflight fetch compressed bytes per core (now that fetch is significantly more efficient).

  • Make bento logging less noisy by limiting each managed pipeline to one bento log per second or less.

Release v605

Dec 10th, 2024

  • Avoid noisy error logs when instance type discovery fails.

  • Fix macOS Agent binaries to not segfault about a missing library

Release v604

Dec 9nd, 2024

  • Add the ability to load the agent key from a file using the -agentKeyPath flag or WARPSTREAM_AGENT_KEY_PATH environment variable. The agent will reload the file when the file is updated so the key can be changed without restarting the agent.

  • Upgrade to latest Bento which adds ability to ratelimit pipelines based on bytes instead of messages, and improving GCP bigquery output logging.

  • Switch to C++ library for lz4 compression instead of pure Go.

Release v603

Dec 2nd, 2024

  • Preserves the original DN from the TLS cert to be used as the ACL principal. Previously, while the DN was semantically correct, the order of its elements wasn't necessarily preserved. This leads to the principal set during auth to not match the principal set when the ACL is created.

  • Fix a log in the managed data pipelines feature that was logging a cluster-specific credential.

  • Add support for SASL SCRAM 256/512 as an authentication mechanism for source clusters in Orbit.

Release v602

Nov 25th, 2024

  • Retry object storage permissions check up to 3 times on startup to avoid false positives related to dial timeouts

  • Added BYOC Schema Registry Support:

    • The agent now supports hosting a BYOC Schema Registry.

    • Use the -schemaRegistryPort flag (or the WARPSTREAM_SCHEMA_REGISTRY_PORT env variable) to specify the port (default 9094) to run the schema registry server on.

    • Use the -schemaRegistryTLS flag (or the WARPSTREAM_SCHEMA_REGISTRY_TLS_ENABLED env variable) to enable tls over the schema registry server.

    • New metrics:

      • warpstream_agent_schema_registry_inflight_connections: number of currently inflight / active connections to schema registry server.

        • tags: schema_registry_operation, outcome

      • warpstream_agent_schema_registry_request_latency: latency (seconds) for processing each Schema registry request.

        • tags: schema_registry_operation, outcome

      • warpstream_agent_schema_registry_outcome: outcome (success, error, etc) for each Schema Registry request.

        • tags: schema_registry_operation, outcome

      • warpstream_agent_schema_registry_request_bytes_counter: number of bytes of incoming requests.

        • tags: schema_registry_operation

      • warpstream_agent_schema_registry_response_bytes_counter: number of bytes of outgoing response.

        • tags: schema_registry_operation, outcome

      • warpstream_schema_versions_count: Total number of schema versions in the schema registry cluster.

      • warpstream_schema_versions_limit: Maximum number of schema versions allowed in the cluster.

Release v601

Nov 21st, 2024

  • Limit maximum number of inflight bytes for fetch as compressed instead of uncompressed, and do it before issuing the fetch requests. This prevents excessive read amplifications when the Agents are highly loaded.

  • Switch from offheap to onheap cache for file cache for better eviction policy.

  • Fix a bug that was preventing the Agents from loading large IOs directly from object storage leading to excessive cache churn.

  • Remove some prefetcher restrictions to improve the performance of large fetch requests with many topic-partitions.

Release v600

Nov 20th, 2024

  • Fixed DescribeCluster API to correctly override hostname

  • Fixed a bug where the demo/playground agent was incorrectly using the environment variable.

Release v599

Nov 20th, 2024

  • Only restart Bento pipelines when their deployed configurations are modified. Previously, modifying one configuration restarted all deployed pipelines.

  • Increase default timeout for loading data into the file cache from 5s to 15s. This has no impact on tail latencies for healthy clusters due to the fast retry mechanism, but dramatically improves the ability of overloaded clusters to make progress.

  • Cap number of speculative/fast retries to 2% at most. This prevents excessive read amplification in some scenarios where the Agents are overloaded.

  • Upgrade Agents to go 1.23.3 to eliminate contention in HTTP client.

Release v598

Nov 6th, 2024

  • Removed noisy error log triggered when producing with ACKS=0. Previously, clients and agents generated a fake invariant error: 'Previous sent message is not current sequence -1' due to an unordered sequence number.

  • Allow Agents to create ingestion files with up to 64MiB of uncompressed data, increased from 16 MiB.

  • Orbit encrypts communication with source clusters using TLS.

Release v597

Nov 6th, 2024

  • Default Agent Group: When the -agentGroup parameter is not explicitly set, the agent will now default to a group named 'default'. Previously, if a Kafka client connects to an agent without a group specified, it could end up connecting to agents in other groups. With this change, clients connected to an agent with no group specified will now only see agents in the 'default' group.

    • Note: This change is backwards compatible; during rollout, agents with no group set and those in the 'default' group will be treated as a single group.

Release v596

Nov 1st, 2024

  • Add delay + jitter + randomization to the order/timing in which Bento pipelines are started to avoid pipelines synchronizing with each other and doing all of their work at the same time.

  • Add support for Orbit to the Agents. See: https://docs.warpstream.com/warpstream/byoc/orbit

Release v595

October 30th, 2024

  • Return success + offsets/timestamps instead of DuplicateSequenceError + offsets/timestamps when a duplicate batch is detected via the idempotent producer functionality to mirror Kafka's behavior.

  • Upgrade to latest version of Bento with more improvements for parquet encoder (bug fixes + support for int8/int16).

Release v594

October 28th, 2024

  • Upgrade to latest version of Bento with improved handling of decimals and floats in parquet encoder.

Release v593

October 27th, 2024

  • Fixes metrics called "warpstream_agent_segment_batcher_flush_xxx" who were reporting their successes with "flush_cause:buffer_full" as errors.

  • Emit Bento metrics in Agents when managed data pipelines is enabled. All Bento metrics will be prefixed with: warpstream_bento.

  • Upgrade to latest version of Bento with ability to encode maps/lists in the parquet encoder.

Release v592

October 23nd, 2024

  • Upgrade to latest Bento version, which improves error logging for the GCP BigQuery output.

Release v591

October 22nd, 2024

  • Make manage data pipelines product work automatically, even when the WarpStream Agents are advertising the hostname of a load balancer to the Kafka protocol by making the Kafka metadata handler always return the Agents actual IP addresses to the managed data pipelines product.

Release v590

October 17th, 2024

  • Agents will now automatically tune their GC and backpressure settings automatically based on which roles are configured. proxy-consume Agents will have larger file caches, proxy-produce Agents will buffer more data before backpressuring, pipelines Agents will GC less aggressively, etc

Release v589

October 16th, 2024

  • Add fetch limits auto-tuning to the Agents, which automatically adjusts fetch request size limits when the Agents deem necessary. To enable this feature, use the flag -autoTuneFetchLimits or the env variable WARPSTREAM_AUTO_TUNE_FETCH_LIMITS. By default this is disabled.

Release v588

October 11th, 2024

  • Agents configured with specific roles will now reject requests that they're not supposed to handle. For example, proxy-consume Agents will reject Fetch requests and proxy-produce Agents will reject Produce requests.

  • WarpStream Agents will now automatically avoid communicating with each other if multiple clusters are deployed in the same VPC, even if I.P addresses are quickly recycled between the two clusters (due to high container churn).

  • Increased default batching interval for Metadata requests to 250ms.

  • Upgrade managed data pipelines to latest version of Bento, and automatically tune franz_kafka blocks to fetch data from source Kafka/WarpStream clusters faster.

Release v587

September 27th, 2024

  • Increase the default value of GOMEMLIMIT from 1GiB/vCPU to 1.5GiB/vCPU. This should result in less GC overhead for highly loaded Agents.

  • Automatically disable profile forwarding if Datadog profiling is turned on.

Release v586

September 25th, 2024

  • Change the default value of disableProfileForwarding/WARPSTREAM_DISABLE_PROFILE_FORWARDING to false, which means that the Agents will start forwarding profiles to WarpStream's control plane.

Release v585

September 24th, 2024

Release v584

September 23rd, 2024

  • Add support for AWS Glue Schema Registry schema validation

    • Users can now use schemas stored in their AWS Glue Schema Registry to validate records.

    • Add new topic-level configuration warpstream.schema.registry.type to specify what type of schema registry the agent should fetch the remote schemas from. Supported values include "STANDARD" and "AWS_GLUE". Defaults to "STANDARD"

  • Add -zonedCIDRBlocks flag and WARPSTREAM_ZONED_CIDR_BLOCKS env variable as an alternative way to provide information on the Kafka client's availability zone. The value is a mapping of availability zones to Kafka client IPs and should be a <> delimited list of AZ to CIDR range pairs, where each pair starts with an AZ, a @, and a comma separated list of CIDR blocks for that given AZ. For example, us-east-1a@10.0.0.0/19,10.0.32.0/19<>us-east-1b@10.0.64.0/19<>us-east-1c@10.0.96.0/19.

Release v583

September 18th, 2024

  • Bump the otel dependency to fix spurious error logs introduced in v582

Release v582

September 18th, 2024

  • Increase default clean shutdown interval from 60s to 80s.

  • Increase the max number of connections per CPU (from 2048 to 4096)

Release v581

September 12th, 2024

  • Bump build to go 1.22.7 to fix some stdlib vulnerabilities

Release v580

September 11th, 2024

  • Add agent profile forwarding to WarpStream's control plane. Note that this feature cannot be enabled together with Datadog profiling and is not supported for self-hosted control planes. The following flags are added for this feature.

    • disableProfileForwarding/WARPSTREAM_DISABLE_PROFILE_FORWARDING: this flag can be used to disable profile forwarding. By default this is set to true and no profiles will be forwarded.

    • maxProfileSize/WARPSTREAM_MAX_PROFILE_SIZE: max number of bytes allowed for buffering profiles in memory.

  • Bump our Alpine base image to 3.20.3 to fix some vulnerabilities

Release v579

September 9th, 2024

  • Add -maxProduceRecordSizeBytes flag and WARPSTREAM_MAX_PRODUCE_RECORD_SIZE_BYTES env variable to override the maximum uncompressed size of a record that can be produced.

Release v578

August 30th, 2024

  • Add "ratelimited" outcome for S3-backed blob storage metrics warpstream_blob_store_*

  • Increased maximum allowed ingestion file size from 8MiB to 16MiB

  • Improve buckets for distribution metrics for warpstream_agent_segment_batcher_flush_file_size_uncompressed_bytes and warpstream_agent_segment_batcher_flush_file_size_compressed_bytes

  • Fixed a bug where we returned the wrong error code when a produced record was larger than the maximum allowed size (32MiB)

Release v577

August 25th, 2024

  • Fix bug in backpressure system that would cause Agents to get stuck in backpressure state.

Release v576

August 22nd, 2024

  • Fix bug affecting librdkafka library's cooperative rebalance, when adding more clients to a consumer group.

  • Add new startup flags to enable mutex and block profiling:

    • -enableSetMutexProfileFraction/WARPSTREAM_ENABLE_SET_MUTEX_PROFILE_FRACTION: enable this flag to call 'runtime.SetMutexProfileFraction' with the value passed along -mutexProfileFraction.

    • -mutexProfileFraction/WARPSTREAM_MUTEX_PROFILE_FRACTION: tune the value passed to call 'runtime.SetMutexProfileFraction

    • -enableSetBlockProfileRate/WARPSTREAM_ENABLE_SET_BLOCK_PROFILE_RATE: enable this flag to call 'runtime.SetBlockProfileRate' with the value passed along -blockProfileRate

    • -blockProfileRate/WARPSTREAM_BLOCK_PROFILE_RATE: tune the value passed to call 'runtime.SetBlockProfileRate'

Release v575

August 18th, 2024

  • Report CPU usage using a moving average instead of point-in-time to improve accuracy.

  • Improve performance when a cluster has many clients polling partitions that are written to infrequently.

Release v574

August 15th, 2024

  • Increase default limits for backpressuring produce requests by 400%

Release v573

August 14th, 2024

  • Disable automatic retries in underlying blob storage clients (AWS / GCP) so we can control blob storage retries at the application layer.

  • Switch to C++ library for ZSTD instead of pure Go (2-3x faster).

Release v572

August 6th, 2024

  • Improved the Agents ability to backpressure Produce requests.

  • Added back-pressure support for Fetch requests in addition to Produce requests.

Release v571

August 2nd, 2024

  • Improved Agent backpressuring for Produce requests so that Agents will begin throttling Produce request and individual connections when they have too much producer data buffered in memory instead of OOMing.

  • Agents now return KafkaStorageError instead of RequestTimedOut for some timeout-related errors in the Fetch path. This improves behavior with Java consumer clients which treat KafkaStorageError as retryable, but not RequestedTimedOut.

  • Fixed a bug in demo / playground mode where the requested hostname strategy was not being used.

  • Tagged all Agent metrics with the agent_roles , agent_group , and virtual_cluster_id tags to make debugging advanced deloyments easier.

Release v570

July 17th, 2024

  • Fixed a bug that would cause idempotent producer out of sequence errors when idempotent producer was enabled on some clients. Data was never committed out of order, but the previous implementation would force the client to retry more than necessary, resulting in high latency or reduced throughput

  • Added Schema Validation Support:

Release v569

July 5th, 2024

  • Add a new warpstream_max_offset metric tagged by topic/partition (that can be controlled with the existing disableConsumerGroupsMetricsTags flag).

  • The existing warpstream_consumer_group_max_offset metric is deprecated as it shares the same value across consumer groups and it can be replaced with the new metric mentioned above.

  • Add the topic label/tag to warpstream_agent_kafka_produce_uncompressed_bytes_counter that was missing it.

Release v568

July 3rd, 2024

  • Various small performance improvements (batching, allocations, etc).

  • Batch Metadata and FindCoordinator request RPCs between the Agents and the control plane. Helpful for workloads with a large number of consumer/producer clients.

Release v567

June 19th, 2024

  • Allow individual fetch requests to fetch up to 128MiB of uncompressed data per topic-partition, per fetch request, increased from 32 MiB.

  • Add -kafkaMaxFetchRequestBytesUncompressedOverride flag and WARPSTREAM_KAFKA_MAX_FETCH_REQUEST_BYTES_UNCOMPRESSED_OVERRIDE env variable to override maximum number of uncompressed bytes that can be fetched in a single fetch request.

  • Add -kafkaMaxFetchPartitionBytesUncompressedOverride and WARPSTREAM_KAFKA_MAX_FETCH_PARTITION_BYTES_UNCOMPRESSED_OVERRIDE env variable to override maximum number of uncompressed bytes that can be fetched for a single topic-partition in a single fetch request.

  • Introduced the metric warpstream_consumer_group_generation_id with the tags consumer_group. This metric indicates the generation number of the consumer group, incrementing by one with each rebalance. It serves as an effective indicator for detecting occurrences of rebalances.

  • Added metric warpstream_consumer_group_estimated_lag_very_coarse_do_not_use_to_measure_e2e_seconds (tags: consumer_group, topic, partition). Provides an estimated lag in seconds (equivalent to warpstream_consumer_group_lag but in time units instead of offsets), calculated via interpolation/extrapolation. Caution: Very coarse; unsuitable for precise end-to-end latency measurement.

  • Renamed Flags and Environment Variables:

    • benthosBucketURL (WARPSTREAM_BENTHOS_BUCKET_URL) is now bentoBucketURL (WARPSTREAM_BENTO_BUCKET_URL): Bucket URL to use when fetching the Bento configuration.

    • benthosConfigPath (WARPSTREAM_BENTHOS_CONFIG_PATH) is now bentoConfigPath (WARPSTREAM_BENTO_CONFIG_PATH): Path in the bucket to fetch the Bento configuration.

Release v566

June 5th, 2024

  • Fix broken formatting of logs that were sometimes logfmt when they should've been JSON.

Release v565

June 5th, 2024

  • Improve the Agent's ability to handle certain pathological compactions where an input stream is not read from for so long that the object store closes the connection. The Agent will now detect this scenario and issue a new GET request to resume the compaction instead of failing it entirely and never making progress.

Release v564

May 26th, 2024

  • Improve performance of file cache for high throughput workloads by improving how batching is handled.

  • Add support for more Benthos components: [awk, jsonpath, lang, msgpack, parquet, protobuf, xml, zstd].

  • Fix rare memory leak.

  • Fix rare bug that would cause circuit breakeres to get stuck in open / half-open state forever.

Release v563

May 22nd, 2024

  • Fixed a bug in the Agent that prevented users from starting playgrounds or demos.

Release v562

May 19th, 2024

Agent

  • Make Agent batch size configurable using -batchMaxSizeBytes flag and WARPSTREAM_BATCH_MAX_SIZE_BYTES environment variable.

  • Metrics

    • Add new Agent metric (gauge) that indicates state of consumer group: warpstream_consumer_group_state. Metric has two tags: consumer_group and group_state.

    • Add new Agent metric (gauge) that indicates the number of members in each consumer group: warpstream_consumer_group_num_members. Metric has one primary tag: consumer_group.

    • Add new Agent metric (gauge) that indicates the number of topics in each consumer group: warpstream_consumer_group_num_topics. Metric has one primary tag: consumer_group.

    • Add new Agent metric (gauge) that indicates the number of partitions in each consumer group: warpstream_consumer_group_num_partitions. Metric has one primary tag: consumer_group.

    • Add new Agent metric (gauge) that indicates the total number of topics in the cluster: warpstream_topic_count.

    • Add new Agent metric (gauge) that indicates the limit for the total number of topics in the cluster: warpstream_topic_count_limit.

    • Add new Agent metric (gauge) that indicates the total number of partitions in the cluster: warpstream_partition_count.

    • Add new Agent metric (gauge) that indicates the limit for the total number of partitions in the cluster: warpstream_partition_count_limit.

Release v559

May 13th, 2024

Agent

  • Add support for WarpStream Managed Data Pipelines

  • Change the default number of uncompressed bytes that will be be buffered before flushing a file to object storage from 8MiB to 4MiB

Release v558

May 10th, 2024

Agent

  • Support adopting AWS roles for providing access to S3 via environment variables instead of a query argument in the bucket URL.

  • Tuned circuit breakers to be less aggressive by default, and configured the retry mechanisms to avoid retrying circuit breaker errors.

  • New Metrics

    • warpstream_circuit_breaker_open: Records the number of times the circuit breaker is opened, tagged by the circuit breaker name to identify the associated operations. - tags: name

    • warpstream_circuit_breaker_close: Records the number of times the circuit breaker is closed, tagged by the circuit breaker name to identify the associated operations.

      • tags: name

    • warpstream_circuit_breaker_halfopen: Records the number of times the circuit breaker is half-opened, tagged by the circuit breaker name to identify the associated operations.

      • tags: name

    • warpstream_circuit_breaker_error: Records the number of times the circuit breaker blocks an operation, tagged by the circuit breaker name to identify the associated operations.

      • tags: name

Release v557

May 9th, 2024

Agent

  • Enhanced Buckets: Refined the buckets for warpstream_agent_kafka_fetch_uncompressed_bytes, warpstream_agent_kafka_fetch_compressed_bytes, warpstream_agent_kafka_produce_compressed_bytes and warpstream_agent_kafka_produce_uncompressed_bytes to ensure they are more precise. The previous bucket configuration was too sparse.

  • Fixed Prometheus Counters: Resolved an issue with Prometheus counters previously were not appropriately incrementing the value.

Release v556

May 9th, 2024

Agent

  • Tune circuit breakers to be less aggressive, and don't retry requests that failed due to a circuit breaker error.

  • New metrics:

    • warpstream_circuit_breaker_open: Records the number of times the circuit breaker is opened, tagged by the circuit breaker name to identify the associated operations.

      • tags: name

    • warpstream_circuit_breaker_close: Records the number of times the circuit breaker is closed, tagged by the circuit breaker name to identify the associated operations.

      • tags: name

    • warpstream_circuit_breaker_halfopen: Records the number of times the circuit breaker is half-opened, tagged by the circuit breaker name to identify the associated operations.

      • tags: name

    • warpstream_circuit_breaker_error: Records the number of times the circuit breaker blocks an operation, tagged by the circuit breaker name to identify the associated operations.

      • tags: name

Release v555

May 8th, 2024

Agent

  • New metrics:

    • warpstream_agent_kafka_fetch_uncompressed_bytes_counter: Tracks the count of uncompressed bytes fetched. Although a histogram version (warpstream_agent_kafka_fetch_uncompressed_bytes) already exists, it may not provide an accurate count in some metrics systems. That's why we've made this data directly available as a counter.

      • tags: topic (requires enabling high-cardinality metrics)

    • agent_kafka_produce_uncompressed_bytes_counter: Tracks the count of uncompressed bytes produced. While the histogram version (warpstream_agent_kafka_produce_uncompressed_bytes) is available, some metrics systems might not accurately count it. So, we're also offering this data directly as a counter.

      • tags: topic (requires enabling high-cardinality metrics)

Release v554

April 30th, 2024

Agent

  • Replaces all instances of Ristretto cache with a simple LRU cache, dramatically reduces the number of caches misses in topic metadata related caches.

Release v553

April 30th, 2024

Agent

  • Rename metrics:

    • warpstream_agent_kafka_inflight_conn has been renamed to warpstream_agent_kafka_inflight_connections

    • warpstream_agent_kafka_inflight_request has been renamed to warpstream_agent_kafka_inflight_requests

Release v552

April 29th, 2024

Agent

  • Dramatically improve the performance of fetch requests that query 100s or 1000s of partitions in a single fetch request.

  • Add speculative retries for blob storage reads in the fetch path (in addition to existing speculative retries for blob storage writes in the write path).

Release v551

April 26th, 2024

Agent

  • Convert an error log (that did not actually represent an error) to an info log to reduce error spam.

Release v550

April 23nd, 2024

Agent

  • High-Cardinality Metrics: Introduced a new agent configuration for enabling high-cardinality metrics. To activate this feature, use the command-line flag -kafkaHighCardinalityMetrics or set the environment variable WARPSTREAM_KAFKA_HIGH_CARDINALITY_METRICS=true.

  • Deprecated metrics:

    • warpstream_agent_kafka_fetch_bytes_sent

    • warpstream_agent_segment_batcher_flush_file_size_counter_type

    • warpstream_agent_segment_batcher_flush_file_size

    • blob_store_list_latency

    • blob_store_list_count

    • blob_store_delete_latency

    • blob_store_delete_count

    • blob_store_put_bytes_latency

    • blob_store_put_bytes_count

    • blob_store_put_stream_latency

    • blob_store_put_stream_count

    • blob_store_get_bytes_latency

    • blob_store_get_bytes_count

    • blob_store_get_stream_latency

    • blob_store_get_stream_count

    • blob_store_get_bytes_range_latency

    • blob_store_get_bytes_range_count

    • blob_store_get_stream_range_latency

    • blob_store_get_stream_range_count

  • New metrics:

    • warpstream_agent_kafka_fetch_uncompressed_bytes: Tracks the total uncompressed bytes fetched, replacing warpstream_agent_kafka_fetch_bytes_sent.

      • tags: topic (requires enabling high-cardinality metrics)

    • warpstream_agent_kafka_produce_uncompressed_bytes: Tracks the number of uncompressed bytes produced.

      • tags: topic (requires enabling high-cardinality metrics)

    • warpstream_agent_segment_batcher_flush_file_size_uncompressed_bytes: Tracks the uncompressed size of files stored after batching, serving as a replacement for warpstream_agent_segment_batcher_flush_file_size.

    • warpstream_blob_store_operation_latency: Tracks the latency and count of individual object storage operations. It's a replacement for all deprecated blob_store_ metrics. Now all the different operations are tagged within the same metric.

      • tags: operation,outcome

Release v549

April 22nd, 2024

Agent

  1. Adds support for authenticating Kafka clients using mTLS. A -requireMTLSAuthentication flag is added, and the previous -tlsVerifyClientCert flag has been deprecated. A new -requireSASLAuthentication flag is added, and the previous -requireAuthentication flag is deprecated. When authenticating using mTLS, the Distinguished Name(DN) from the client certificate is used as the Kafka Principal. The -tlsPrincipalMappingRule flag can be used to specify a Regex to extract a principal from the DN. For example, the rule CN=([^,]+) will extract the Common Name(CN) from the DN, and use that as the ACL principal.

Release v548

April 17th, 2024

Agent

  1. Fix a rare bug where some compactions could fail if you had tombstone expiration enabled and very little data to compact.

Release v547

April 11nd, 2024

Agent

  1. Fix playground mode against latest control plane signup constraints

Release v545

April 4nd, 2024

Agent

  1. Added support to new regions. There is a new "-region" (or WARPSTREAM_REGION env variable) that you can leverage to pick among the supported regions - you can use the console to see the current list.

Release v544

April 2nd, 2024

Agent

  1. Added support for TLS/mTLS for Warpstream Agent and kafka client connections. The -kafkaTLS(env WARPSTREAM_TLS_ENABLED=true), -tlsServerCertFile(env WARPSTREAM_TLS_SERVER_CERT_FILE=<filepath>), tlsServerPrivateKeyFile(env WARPSTREAM_TLS_SERVER_PRIVATE_KEY_FILE=<filepath>), -tlsVerifyClientCert(env WARPSTREAM_TLS_VERIFY_CLIENT_CERT=true), -tlsClientCACertFile(env WARPSTREAM_TLS_CLIENT_CA_CERT_FILE=<filepath>) flags were added which respectively are used to enable tls, pass the TLS server certificate to the Agent, pass the TLS server private key to the Agent, optionally enable client certificate verification, and to optionally pass the TLS root certificate authority certificate file to the server.

Release v543

March 29th, 2024

Agent

  1. Adding a new "availabilityZoneRequired" (or env variable "WARPSTREAM_AVAILABILITY_ZONE_REQUIRED") flag. When enabled, the agent will synchronously try to resolve its availability zone during startup for 1 min, and will not start serving its /v1/status health check until it succeeds. The process will exit early if it did not manage to resolve the availability zone.

Release v542

March 26th, 2024

Agent

  1. Fixed a race condition during startup that could make some agents advertise their availability zone as "WARPSTREAM_UNSET_AZ"

Release v541

March 25th, 2024

Agent

Release v539

March 15th, 2024

Agent

  1. Adding a new "agentGroup" parameter to define the name of the 'group' that the Agent belongs to. This feature is used to isolate groups of Agents that belong to the same logical cluster, but should not communicate with each other because they're deployed in separate cloud accounts, vpcs, or regions. By default the agent belongs to the default group.

Release v538

March 6th, 2024

Agent

  1. Fixed a rare panic in the Agents caught by Antithesis

Release v537

March 1st, 2024

Agent

  1. Kafka "compacted topics" are generally available.

Release v536

February 28th, 2024

Agent

  1. Fix a bug with the warpstream playground command, introduced in v535.

Release v535

February 26th, 2024

Agent

  1. Make agent pool name argument completely optional, even when using non-default clusters.

  2. Fixed a memory leak that would cause some workloads to use an excessive amount of memory over time.

  3. Add support for S3 express and separating the "ingestion" bucket from the "compaction" bucket so data can be landed into low-latency storage and then immediately compacted into lower cost storage.

  4. Add speculative retries to file flushing which dramatically reduces outlier latency for Produce requests.

Release v534

Agent

Bug fixes

  1. Fixed another bug in the fetch logic that would result in the Agent returning empty batches in some scenarios when transient network errors occurred. This did not cause any correctness issues, but would make librdkafka refuse to proceed in some scenarios and block consumption of some partitions.

Release v533

Agent

Bug fixes

  1. Fixed a bug in the fetch logic that would result in the Agent returning empty batches in some scenarios depending on the client configuration. This did not cause any correctness issues, but would make librdkafka refuse to proceed in some scenarios and block consumption of some partitions.

Release v532

Agent

New Features

  1. Improves roles reporting to Warpstream control plane so that they can be properly rendered in our console.

Release v531

Agent

New Features

Release v530

Agent

New Features

  1. Compatibility with CreatePartitions Kafka API for updating partitions count.

Release v529

Agent

New Features

  1. Treat grade nat as private IP addresses, and allow advertising it.

Performance improvements

  1. Improve caching in the agent.

  2. Improve batching of requests to the warpstream metadata backend.

  3. Other general performance improvements.

Release v526

Agent

New Features

  1. Full-support for ACLs using SASL credentials.

Release v525

Agent

Bug Fixes and performance Improvements

  1. Improved object pooling in RPC layer to reduce memory usage.

  2. Add transparent batching to the file cache RPCs to dramatically reduce the number of inter-agent RPCs for high partition workloads.

  3. Enhanced Apache Kafka compatibility by refining the handling of watermarks in Fetch responses.

Release v524

Agent

Dynamic Consumer Group Rebalance Timeout

  1. Partially handle consumer group requests (JoinGroup and SyncGroup) in the agent, to use the clients' rebalance timeout, instead of the previous 10s default. This enhancement will minimize unnecessary rebalance attempts in larger consumer groups with numerous members, due to short timeouts.

Release v523

Agent

Bug Fixes and performance Improvements

  1. Fixed a bug in the Fetch() code that was not setting the correct topic ID in error responses which made some Kafka clients emit warning logs when this happened.

  2. Fixed a bug in the "roles" feature that was causing Agents with the "produce" role to still participate in the distributed file cache. Now only Agents with the "consume" role will participate in the file cache, as expected.

Release v522

Agent

Bug Fixes and performance Improvements

  1. Circuit breakers will now return example errors for clarity.

  2. Fetch() code path will now handle failures more gracefully by returning incremental results in more scenarios which improves the system's ability to recover under load.

  3. Fix a memory leak in the in-memory file cache implementation.

Release v521

Agent

New Features

  1. Docker images are now multi-arch, our documentation and official kubernetes charts has been updated accordingly.

  2. Introduced circuit breakers around object storage access.

  3. Finer control over agent roles: it is now possible to split between the proxy-consume and proxy-produce roles, our documentation has been updated as well.

Release v520

Agent

This release is the first phase of a two-phase upgrade to WarpStream's internal file format. This release adds support for reading the upgraded file format. You MUST upgrade all Agents to this version before moving from any version < v520 to any version > than v520.

Release v518

Agent

New Features

  1. Support kafka Headers: if you produce messages containing Kafka headers, they will now be automatically persisted to your cloud object storage, and will be read when fetching.

  2. Revisit the flags and configuration knobs to choose how the agents advertise themselves in Warpstream service discovery. Our documentation has been updated accordingly.

Control Plane

New Features

  1. Fully support kafka ListOffsets protocol: you can now look for partition offsets based on timestamps.

Release v517

Agent

Bug Fixes and performance Improvements

  1. Fixed a bug related to the handling of empty (but not null) values in records in the Fetch implementation.

Release v516

Agent

New Features

  1. The agent will now report a sample of its error logs back to Warpstream control plane. It should ease troubleshooting and help us identify issues earlier. This can be disabled with the flag disableLogsCollection or the environment variable WARPSTREAM_DISABLE_LOGS_COLLECTION.

Bug Fixes and performance Improvements

  1. Added batching in the metadata calls made during Kafka Fetch, improving memory usage along the way.

Control Plane

New Features

  1. Added support for Kafka's InitProducerID protocol message, and the idempotent producer functionality in general. Requires upgrading to a version of the Agents that is >= v515.

  2. Added support for Kafka ListOffsets with positive timestamps value (until now only negative values for special cases were supported)

Release v515

Agent

New Features

  1. The Agents will now report the lag / max offsets for every active consumer group as standard metrics. The metrics can be found as warpstream_consumer_group_lag and warpstream_consumer_group_max_offset .

  2. The Agents will now report the number of files at each compaction level so that user's can monitor whether they are experiencing compaction lag. These metrics can be found as warpstream_files_count and the level is tagged with the name compaction_level.

Bug Fixes and performance Improvements

  1. File cache is now partitioned by <file_id, 16MiB extent> instead of just <file_id>. This spreads the load for fetching data for large files more evenly amongst all the Agents.

  2. Added some logic in the file cache to detect when certain parts of the cache are experiencing high churn and reduce the default IO size for paging in data from object storage. This helps avoid filling the cache with data that won't be read.

  3. Fixed a bug in the file cache that was causing it to significantly *over* fetch data in some scenarios. This did not cause any correctness problems, but it wasted network bandwidth and CPU cycles.

  4. Modified the implementation of the Kafka Fetch method to return incremental results when it experiences a retryable error mid-fetch. This makes the Agents much better at recovering from disruption and catching consumer lag incrementally.

  5. Added some pre-fetching logic into the Kafka Fetch method so that when data for a single partition is spread amongst many files the Agent doesn't get bottlenecked making many single-threaded RPCs. This mostly helps increase the speed at which individual partitions can be "caught up" when lagging.

  6. Increased the default maximum file size created at ingestion time from 4MiB to 8MiB. This improves performance for extremely high volume workloads.

  7. Added replication to the Agent file cache so that if an error is experienced trying to load data from the file cache on the Agent node that is "responsible" for a chunk of data, the client can retry on a different node. This helps minimize disruption when Agents shutdown ungracefully.

  8. Agents now report their CPU utilization to the control plane. We will use this information in the future to improve load balancing decisions. CPU utilization can be view in the WarpStream Admin console now as well.

  9. Improved the performance of deserializing file footers.

  10. Standardized prometheus metric names prefixes.

  11. Added a lot more metrics and instrumentation, especially around the blob storage library and file cache.

Control Plane

New Features

  1. Added support for the Kafka protocol message DeleteTopics .

Bug Fixes and Performance Improvements

  1. We added intelligent throttling / scheduling to the deadscanner scheduler. This scheduler is responsible for scheduling jobs that run in the Agent to scan for "dead files" in object storage and delete them. Previously these jobs could run with high frequency and rates which would interfere with live workloads. In addition, they could also result in very high object storage API requests costs due to excessive amounts of LIST requests. The new implementation is much more intelligent and automatically tunes the frequency to avoid disrupting the live workload and incurring high API request fees.

Fix a bug where S3 multi-part uploads from the produce path and the schema migrator would fail with "InvalidPart: One or more of the specified parts could not be found. The part may not have been uploaded, or the specified entity tag may not match the part's entity tag". The github.com/aws/aws-sdk-go-v2/config version was upgraded to 1.29.9 from 1.27.43 in Agent v643 but the upgrade is backwards-incompatible. Specifically, multi-part uploads that don't specify a checksum algorithm will fail (see reports and ).

Add support for Apache Kafka transactions. Read our announcement at .

Add support for .

Add support for isolating managed data pipelines to different groups of pipelines Agents using the managedPipelinesGroupName flag and WARPSTREAM_MANAGED_PIPELINES_GROUP_NAME environment variable.

Upgrade to latest Bento version which adds support for interpolating table name in GCP BigQuery output, as well as support for GCP's new . Also adds support for writing to .

Make the Agents use the pure GO DNS resolver to work around a concurrency with setenv and getenv.

Add support for in managed data pipelines

Check out for more details

Added beta support for in the WarpStream Agents

Supports automatic availability zone detection in kubernetes reading the node zone label (requires version 0.10.0 of our ).

Agent nodes can now be configured to run dedicated roles - see .

Protect Data in Motion with TLS Encryption
here
here
https://www.warpstream.com/blog/kafka-transactions-explained-twice
SASL handshake v0
Storage Write API for BigQuery
Redshift clusters in AWS
issue
documentation
benthos
helm charts
splitting roles documentation
Read the docs
scheduling blocks