Known Issues

Leader epoch handling in librdkafka

Symptom

The client does not open a connection to a new agent, most noticeably during rolling restarts.

Context

A leader epoch is a monotonically increasing number that represents a continuous period of leadership for a single partition in Kafka. A change in the leader epoch signals a leader transition. In WarpStream, the concept of a partition leader does not exist, since any WarpStream agent can handle produce and consume requests for any topic and partition. As such, WarpStream returns a leader epoch of 0 in all responses that require a leader epoch.

Problem

librdkafka 2.4 introduced a stricter check: a metadata update is only applied if the client has no leader epoch for the partition yet, or if the new leader epoch is strictly greater than the one it already holds (PR).

if (rktp->rktp_leader_epoch == -1 || leader_epoch > rktp->rktp_leader_epoch)

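Because WarpStream always returns a leader epoch of 0, this condition is false for every metadata response after the first one, so later updates are ignored. This means that if there is an agent with IP ip1 and we would like to replace it with an agent with IP ip2, librdkafka will not open a connection to ip2.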
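To make the effect of the condition concrete, here is a minimal sketch in C that models only the quoted check. The `partition_state` type and the `accept_leader_update` function are hypothetical names introduced for illustration, and storing the new epoch on acceptance is an assumption of this sketch. It is a simplified model, not librdkafka's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-partition state a client keeps.
 * Only the leader epoch is modelled here. */
typedef struct {
    int32_t leader_epoch; /* -1 means no epoch has been seen yet */
} partition_state;

/* Mirrors the quoted condition: accept the update only when no epoch is
 * known yet, or when the new epoch is strictly greater than the stored one.
 * Returns true if the update was accepted (assumption: the stored epoch is
 * replaced on acceptance). */
static bool accept_leader_update(partition_state *p, int32_t new_epoch) {
    if (p->leader_epoch == -1 || new_epoch > p->leader_epoch) {
        p->leader_epoch = new_epoch;
        return true;
    }
    return false;
}
```

Starting from `leader_epoch == -1`, the first call with epoch 0 is accepted. Every later call with epoch 0 is rejected, which corresponds to the client ignoring metadata that points at the replacement agent.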

Solution

There are a few options:

  • Downgrade to librdkafka 2.3

  • Set warpstream_strict_leader_epoch=true or ws_sle=true in your Kafka client ID.

  • Contact us to enable a patch in the WarpStream control plane.

Additionally, the librdkafka fix is expected to land in version 2.7 (to be released in mid-December).

Relevant material

Related librdkafka issues

librdkafka fix

Leader epoch KIPs
