Problem Statement
When a Kafka Connect connector (for example, Debezium) sends data to Redpanda, you may observe lower write throughput than you would expect given the infrastructure Redpanda is running on.
Detail
If Redpanda is running on infrastructure that is adequate for your workload (see Redpanda Sizing Guidelines), the apparent low write throughput may be caused by the way producer message batching is configured on the Kafka Connect side.
In other words, Redpanda and the underlying infrastructure are handling the data sent to them without difficulty, but because of the way Kafka Connect batches events, it is not sending them to Redpanda at your desired rate.
Kafka Connect source connectors write to Redpanda through producer clients, so they can benefit from tuning producer message batching parameters such as linger.ms, batch.size, and buffer.memory.
These parameters are discussed in more detail here:
Configure Producers - Message Batching
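To make the interaction between linger.ms and batch.size more concrete, the following is a rough sketch rather than a description of the Kafka client's actual algorithm. It assumes records are spread evenly across partitions, ignores compression and protocol overhead, and assumes a batch is sent either when it reaches batch.size or when linger.ms expires. The function name and the workload figures are hypothetical and are not taken from any Redpanda documentation.

```python
# Simplified model (an assumption, not the Kafka client's real algorithm):
# records are spread evenly across partitions, compression is ignored, and
# a batch is sent when it reaches batch.size or when linger.ms expires.

def estimate_batch(msgs_per_sec, avg_msg_bytes, partitions, linger_ms, batch_size):
    """Estimate the size of one partition's batch at the moment it is sent."""
    bytes_per_partition_per_sec = msgs_per_sec * avg_msg_bytes / partitions
    bytes_within_linger = bytes_per_partition_per_sec * linger_ms / 1000.0

    if bytes_within_linger >= batch_size:
        # The batch fills up before the linger timer expires.
        return {"trigger": "batch.size",
                "batch_bytes": float(batch_size),
                "fill_ratio": 1.0}

    # The linger timer expires first, so a partially filled batch is sent.
    return {"trigger": "linger.ms",
            "batch_bytes": bytes_within_linger,
            "fill_ratio": bytes_within_linger / batch_size}


# Hypothetical workload: 50,000 messages/s of 1 KiB each over 10 partitions.
for linger_ms in (1, 50):
    result = estimate_batch(50_000, 1024, 10, linger_ms, 131072)
    print(f"linger.ms={linger_ms}: {result}")
```

Under these assumptions, a low fill ratio suggests that batches are being sent mostly on the linger timer, while a ratio of 1.0 suggests batch.size is the limiting factor. Treat the numbers as a starting point for experimentation, not as a prediction of real throughput.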
Solution
- Rule out infrastructure as the cause. For example, run fio for disk and iperf3 for network, per Redpanda Sizing Guidelines, to confirm they can meet your required bandwidth and throughput.
- Adjust the Kafka Connect worker configuration to tune the producer message batching parameters discussed in Configure Producers - Message Batching.
An example configuration that has given good results in internal environments at Redpanda:
producer.linger.ms=1
producer.batch.size=131072
---
If the above does not resolve the issue, please log a support ticket and provide the following information:
- Outputs from the fio + iperf3 performance tests
- Output of rpk debug bundle taken during the time Kafka Connect was sending events to Redpanda.
  Note: rpk debug bundle accepts the following parameters to limit the Redpanda log entries it collects to a given time range:
  --logs-since
  --logs-until
- Screenshot(s) of the Redpanda Monitoring dashboard covering the time range when Kafka Connect was sending events to Redpanda
- Kafka Connect worker configuration and Kafka Connect connector configuration
- Kafka Connect log files (covering the period when the load was running)
- Expected throughput, messages per second, and average message size