Problem Statement
Redpanda pods can be evicted when writes to the emptyDir volume used for the Cloud Storage cache exceed the SizeLimit set on that volume.
The pods are restarted or evicted, and the Kubernetes event log contains messages similar to:
Usage of EmptyDir volume "tiered-storage-dir" exceeds the limit "nnnn".
Detail
In earlier releases of the Redpanda Helm chart, the mountType for the tiered storage volume was emptyDir:
storage:
  tiered:
    mountType: emptyDir
If the cloud_storage_cache_size parameter was set to a nonzero value, that value was used as the SizeLimit for the volume.
For example, with:
"cloud_storage_cache_size": 107374182400,
the tiered storage volume was an EmptyDir with a SizeLimit of 107374182400 bytes (100 GiB):
tiered-storage-dir:
  Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
  Medium:
  SizeLimit:  107374182400
Cache usage could at times slightly exceed cloud_storage_cache_size. When it did, usage also exceeded the SizeLimit of tiered-storage-dir, which triggered the pod eviction:
8m27s Warning Evicted pod/redpanda-7 Usage of EmptyDir volume "tiered-storage-dir" exceeds the limit "107374182400".
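To confirm that an eviction matches this issue, the event text can be parsed for the volume name and limit. The following is a minimal sketch, not an official tool: the function name parse_eviction is hypothetical, the pattern is based only on the event text quoted above, and it assumes the limit is reported as a plain byte count as in that example.

```python
import re

# Pattern derived from the event message quoted in this article.
# Assumption: the limit appears as a plain integer number of bytes.
EVICTION_RE = re.compile(
    r'Usage of EmptyDir volume "(?P<volume>[^"]+)" '
    r'exceeds the limit "(?P<limit>\d+)"'
)


def parse_eviction(message):
    """Return (volume_name, limit_bytes, limit_gib), or None if no match."""
    match = EVICTION_RE.search(message)
    if match is None:
        return None
    limit_bytes = int(match.group("limit"))
    # Convert bytes to GiB (1 GiB = 1024**3 bytes) for readability.
    return match.group("volume"), limit_bytes, limit_bytes / 1024**3


event = (
    '8m27s Warning Evicted pod/redpanda-7 Usage of EmptyDir volume '
    '"tiered-storage-dir" exceeds the limit "107374182400".'
)
print(parse_eviction(event))
```

A volume name of tiered-storage-dir together with a limit equal to the configured cloud_storage_cache_size suggests the eviction was caused by the behavior described here.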
Workaround/Resolution
- Workaround (1): update the configuration.
Set cloud_storage_cache_size to 0 (this removes the SizeLimit from the volume)
and set cloud_storage_cache_size_percent to an appropriate value.
Example:
tiered:
  config:
    cloud_storage_cache_size: "0"
    cloud_storage_cache_size_percent: "5"
Then deploy the Helm chart with the updated values.
- Other workarounds: use an alternate mountType for the cloud storage cache volume.
- In the latest releases of the Helm chart this is not an issue, because tiered.mountType is set to none,
which does not define a SizeLimit for the volume mount:
storage:
  tiered:
    # mountType can be one of:
    # - none: does not mount a volume. Tiered storage will use the data directory.
    # - hostPath: will allow you to choose a path on the Node the pod is running on
    # - emptyDir: will mount a fresh empty directory every time the pod starts
    # - persistentVolume: creates and mounts a PersistentVolumeClaim
    mountType: none
Via this change (Reference Documentation)
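Before deploying, the condition described in this article (mountType emptyDir combined with a nonzero cloud_storage_cache_size) can be checked against the chart values. The sketch below is illustrative only: the function name and the default_mount_type parameter are hypothetical, the values are assumed to be already loaded into a Python dict, the config block is assumed to sit under storage.tiered, and the cache size is assumed to be a plain integer or numeric string.

```python
def emptydir_size_limit_risk(values, default_mount_type="emptyDir"):
    """Return True if the values would give the cache volume a SizeLimit.

    default_mount_type is an assumption: it stands in for the chart's own
    default when mountType is not set. Per this article, earlier chart
    releases used emptyDir and the latest releases use none.
    """
    tiered = values.get("storage", {}).get("tiered", {})
    mount_type = tiered.get("mountType", default_mount_type)
    cache_size = tiered.get("config", {}).get("cloud_storage_cache_size", 0)
    # Values may be given as strings (for example "0"), so normalize to int.
    return mount_type == "emptyDir" and int(cache_size) != 0


affected = {
    "storage": {
        "tiered": {
            "mountType": "emptyDir",
            "config": {"cloud_storage_cache_size": 107374182400},
        }
    }
}
workaround = {
    "storage": {
        "tiered": {
            "mountType": "emptyDir",
            "config": {
                "cloud_storage_cache_size": "0",
                "cloud_storage_cache_size_percent": "5",
            },
        }
    }
}
print(emptydir_size_limit_risk(affected))
print(emptydir_size_limit_risk(workaround))
```

The first mapping mirrors the affected configuration and the second mirrors workaround (1), so the check is expected to flag only the first.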