Known Issues

Leader epoch handling in librdkafka

Symptom

The client does not open a connection to a new agent, especially during rolling restarts.

Context

A leader epoch is a monotonically increasing number that represents a continuous period of leadership for a single partition in Kafka. A change in leader epoch signals a leader transition. WarpStream has no concept of a partition leader, since any WarpStream agent can handle produce and consume requests for any topic and partition. As such, WarpStream returns a leader epoch of 0 in every response that requires one.

Problem

librdkafka 2.4 introduced a stricter check: a metadata update is applied only if its leader epoch is strictly greater than the one the client already holds, or if the client has not recorded a leader epoch yet (PR).

if (rktp->rktp_leader_epoch == -1 || leader_epoch > rktp->rktp_leader_epoch)

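Because WarpStream always returns a leader epoch of 0, this condition is never true again after the first metadata update. As a result, if an agent with IP ip1 is replaced by an agent with IP ip2, librdkafka ignores the metadata that advertises ip2 and does not open a connection to it.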
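The effect of the check can be illustrated with a minimal sketch. This is not librdkafka's code: the `partition_state` struct and the `apply_metadata_update` function are hypothetical names made up for illustration, and only the `if` condition is taken from the line quoted above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-partition state a client keeps. */
typedef struct {
    int32_t leader_epoch;    /* -1 means no leader epoch has been recorded yet */
    char    leader_addr[64]; /* address the client currently connects to */
} partition_state;

/*
 * Applies a metadata update only when the condition quoted above holds:
 * either no epoch is known, or the new epoch is strictly greater.
 * Returns true if the update was applied, false if it was ignored.
 */
static bool apply_metadata_update(partition_state *p, int32_t leader_epoch,
                                  const char *addr) {
    if (p->leader_epoch == -1 || leader_epoch > p->leader_epoch) {
        p->leader_epoch = leader_epoch;
        snprintf(p->leader_addr, sizeof(p->leader_addr), "%s", addr);
        return true;
    }
    return false; /* same or lower epoch: the update is dropped */
}
```

Starting from a state with `leader_epoch` set to -1, a first update with epoch 0 and address ip1 is applied. A second update with epoch 0 and address ip2 is ignored, so the state keeps pointing at ip1. Only an update with an epoch greater than 0 would move it to ip2.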

Solution

There are several options:

  • Upgrade to librdkafka 2.8

  • Downgrade to librdkafka 2.3

  • Set warpstream_strict_leader_epoch=true or ws_sle=true in your Kafka client ID

  • Contact us to enable a patch in the WarpStream control plane

Relevant material

Related librdkafka issues

librdkafka fix

Leader epoch KIPs
