Why does my S3 Sink Connector / GCS Sink Connector appear to have a lag? – Redpanda

Issue

There may be times when your S3 Sink Connector / GCS Sink Connector appears to have a 'lag'.

In this context, 'lag' can mean either of the following:

*) How frequently files are uploaded to S3/GCS compared to the most recent timestamps on events

You may observe a delay between the most recent event timestamps and the time the corresponding files are uploaded to S3/GCS.

*) The connector's consumer group reports a consistent lag

Each Managed Sink Connector has a corresponding consumer group which reads events from the topics being monitored by the connector.

For example, if we have an S3 Sink Connector / GCS Sink Connector named:
my-cloud-sink-storage-connector

The corresponding Consumer Group name is:
connect-my-cloud-sink-storage-connector

The lag in this case is the difference between the most recent event offset in the topic and the latest committed offset for the consumer group.
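The definition above can be sketched as simple arithmetic. This is a minimal illustration, not how rpk or Redpanda Console compute lag internally; the function name and the offsets are hypothetical, and treating a partition with no committed offset as 0 is an assumption.

```python
def consumer_group_lag(latest_offsets, committed_offsets):
    """Return (per-partition lag, total lag) for one topic.

    latest_offsets:    {partition: most recent event offset in the topic}
    committed_offsets: {partition: latest committed offset for the group}
    """
    per_partition = {}
    for partition, latest in latest_offsets.items():
        # Assumption: a partition with no committed offset counts from 0.
        committed = committed_offsets.get(partition, 0)
        per_partition[partition] = max(latest - committed, 0)
    return per_partition, sum(per_partition.values())


# Hypothetical offsets for a three-partition topic.
latest = {0: 1500, 1: 980, 2: 1210}
committed = {0: 1400, 1: 980, 2: 1100}

per_partition, total = consumer_group_lag(latest, committed)
print(per_partition)  # {0: 100, 1: 0, 2: 110}
print(total)          # 210
```

A non-zero total here only means that offsets have not been committed yet; it does not by itself indicate a fault.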

The amount of lag for the consumer group is displayed in either:

rpk group describe <group-name>

Redpanda Console > Consumer Groups

Additionally, you may have a metrics-based query that displays consumer group lag.

Solution

For the S3/GCS Sink Connectors, two parameters can influence the 'lag' behaviour.

Max records per file / file.max.records

The maximum number of records to put in a single file. Must be a non-negative number. 0 is interpreted as "unlimited", which is the default. In this case, files are only flushed after file.flush.interval.ms has elapsed.

File flush interval milliseconds / file.flush.interval.ms

The time interval at which to periodically flush files and commit offsets. The value must be a non-negative number. The default is 60 seconds (60000 ms). 0 indicates that it is disabled. In this case, files are only flushed after reaching file.max.records records.

Connector Documentation:
#advanced-aws-s3-sink-connector-configuration / #advanced-gcs-sink-connector-configuration
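As an illustration, the two properties might be assembled into a JSON configuration fragment like this. This is a sketch only: the helper name and the chosen values are hypothetical, the fragment omits every other property a real connector configuration needs, and whether your deployment expects numbers or strings for these values should be checked against the connector documentation.

```python
import json


def flush_settings(max_records=0, flush_interval_ms=60_000):
    """Build the two flush-related properties, rejecting negative values.

    Defaults mirror the article: 0 (unlimited) records and 60 seconds.
    """
    settings = {
        "file.max.records": max_records,
        "file.flush.interval.ms": flush_interval_ms,
    }
    for name, value in settings.items():
        if value < 0:
            raise ValueError(f"{name} must be a non-negative number, got {value}")
    return settings


# Hypothetical tuning: flush after 1000 records or 10 seconds.
fragment = flush_settings(max_records=1000, flush_interval_ms=10_000)
print(json.dumps(fragment, indent=2))
```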

For example, if you have a low-volume topic, the connector could be waiting until either:

- there is enough data in the file before it is sent to S3/GCS (based on file.max.records)
- the file.flush.interval.ms interval has elapsed

Only once the files have been successfully uploaded to S3/GCS will the connector's consumer group offset be committed, thus reducing the consumer group lag.
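To get a rough feel for how long offsets can stay uncommitted, the two triggers can be modelled as below. This is a simplified sketch, not the connector's actual implementation: it assumes a steady record rate and a single open file, ignores upload time, and the function name and rates are hypothetical.

```python
import math


def seconds_until_flush(records_per_second, max_records=0, flush_interval_ms=60_000):
    """Estimate seconds until the first flush trigger fires.

    A value of 0 disables the corresponding trigger, as described in the
    article. Returns math.inf when neither trigger can fire in this model.
    """
    waits = []
    if max_records > 0 and records_per_second > 0:
        # Time needed to accumulate file.max.records records.
        waits.append(max_records / records_per_second)
    if flush_interval_ms > 0:
        # Time until file.flush.interval.ms elapses.
        waits.append(flush_interval_ms / 1000)
    return min(waits) if waits else math.inf


# Low-volume topic (2 records/s), 10000 records per file, default interval.
print(seconds_until_flush(2, max_records=10_000))                       # 60.0
# Same topic with the interval disabled: only the record count triggers.
print(seconds_until_flush(2, max_records=10_000, flush_interval_ms=0))  # 5000.0
# High-volume topic (1000 records/s): the record count triggers first.
print(seconds_until_flush(1000, max_records=10_000))                    # 10.0
```

In this model, a low-volume topic with the time-based flush disabled would show consumer group lag for well over an hour before the first commit, which matches the behaviour described above.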

That did not work

Please log a ticket with Redpanda Support and provide the following information:

- Connector Configuration (JSON format)

- If you are running self-hosted, the output of:
-- rpk group describe <connector consumer group>
-- rpk debug bundle

- If you are using Redpanda Cloud:
-- Confirmation of the name of the impacted Redpanda Cluster
-- Screenshot of Console > Consumer Groups > $connector_group_name