Known Issues
Leader epoch handling in librdkafka
Symptom
The client does not try to open a connection to a new agent. This is most noticeable during rolling restarts.
Context
Leader epoch is a monotonically increasing number representing a continuous period of leadership for a single partition in Kafka. A change in leader epoch signals a leader transition. In WarpStream, the concept of a partition leader does not exist, since any WarpStream agent can handle produce and consume requests for any topic and partition. As such, WarpStream returns a leader epoch of 0 in all responses that require a leader epoch.
Problem
librdkafka 2.4 introduced a stricter check: a metadata update is only considered if no leader epoch has been recorded yet or if the new leader epoch is strictly greater than the recorded one (PR).
if (rktp->rktp_leader_epoch == -1 || leader_epoch > rktp->rktp_leader_epoch)
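This means that if there is an agent with IP ip1 and we would like to replace it with an agent with IP ip2, librdkafka will not open a connection to ip2. Because WarpStream always returns a leader epoch of 0, the metadata that advertises ip2 never carries an epoch greater than the one already recorded, so the update is ignored.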
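The effect of the condition can be reproduced in isolation. The following is a minimal sketch, not librdkafka's actual code: the `partition_state_t` struct and the `accept_metadata_update` function are hypothetical names introduced for illustration, and only the comparison itself is taken from the check quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-partition state a client tracks.
 * Only the field used by the quoted condition is modelled. */
typedef struct {
    int32_t leader_epoch; /* -1 means no epoch has been recorded yet */
} partition_state_t;

/* Returns true when a metadata update would be considered under the
 * quoted condition, and records the new epoch in that case. */
bool accept_metadata_update(partition_state_t *p, int32_t leader_epoch) {
    assert(p != NULL);
    if (p->leader_epoch == -1 || leader_epoch > p->leader_epoch) {
        p->leader_epoch = leader_epoch;
        return true;
    }
    /* Same or lower epoch: the update is ignored. */
    return false;
}
```

Starting from a recorded epoch of -1, the first update with epoch 0 is accepted. Every later update that also carries epoch 0 is rejected, which matches the case where ip1 is replaced by ip2 while the epoch stays at 0.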
Solution
There are a few options:
Downgrade to librdkafka 2.3.
Set warpstream_strict_leader_epoch=true or ws_sle=true in your Kafka client ID.
Contact us to enable a patch in the WarpStream control plane.
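For the client ID option, a minimal sketch of building the ID string is shown here. The helper name `build_client_id` is hypothetical, and the comma used to separate the flag from the rest of the client ID is an assumption, not something stated on this page. Check the WarpStream client configuration documentation for the exact format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: appends the ws_sle=true flag to a base client ID.
 * ASSUMPTION: the flag is separated from the base ID by a comma.
 * Returns 0 on success, -1 if the result does not fit in `out`. */
int build_client_id(const char *base, char *out, size_t out_len) {
    assert(base != NULL && out != NULL);
    int n;
    if (base[0] == '\0') {
        /* No base ID: the flag alone becomes the client ID. */
        n = snprintf(out, out_len, "ws_sle=true");
    } else {
        n = snprintf(out, out_len, "%s,ws_sle=true", base);
    }
    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}
```

With librdkafka, the resulting string would then be set as the `client.id` configuration property, for example through `rd_kafka_conf_set(conf, "client.id", client_id, errstr, sizeof(errstr))`.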
Additionally, the librdkafka fix is expected to go into version 2.7 (to be released in mid-December).
Relevant material
Related librdkafka issues
librdkafka fix
Leader epoch KIPs