Restore an ElasticSearch Cluster from its Partial Volume

Zhimin Wen
4 min read · Sep 2, 2024
Image by 🌼Christel🌼 from Pixabay

I have an Elasticsearch cluster running on OpenShift, deployed with the Elastic Helm chart. After some incorrect operations, the cluster ended up in an unknown state and is stuck in an applychanges status.

A careful check shows that, of the 3 nodes, only one node’s PVC volume still holds the real data. The other volumes have no data because they were recreated. Too bad, no snapshot was taken of the Elasticsearch cluster.

Can we restore the cluster from the partially remaining volumes, even if some data loss is unavoidable?

Replicate the Problem with PVC Volume Snapshot

Instead of working on the actual cluster directly, let's copy the data out of the single remaining PVC.

Take a volume snapshot of the data PVC, using the volume snapshot class that matches the PVC's storage class.
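A minimal sketch of this step, assuming the storage backend is a CSI driver with snapshot support. The helper name `render_snapshot` and every name passed to it are hypothetical placeholders, not values from the actual cluster.

```shell
# Render a VolumeSnapshot manifest for an existing PVC.
# Arguments (all placeholders to be replaced with real names):
#   $1 snapshot name, $2 namespace, $3 VolumeSnapshotClass, $4 source PVC
render_snapshot() {
  cat <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: $1
  namespace: $2
spec:
  volumeSnapshotClassName: $3
  source:
    persistentVolumeClaimName: $4
EOF
}
```

To apply it, pipe the output to `oc`, for example `render_snapshot es-data-snap elastic my-snapclass my-es-data-pvc | oc apply -f -`, then check with `oc get volumesnapshot` that the snapshot is ready to use before moving on.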

Now create a new namespace and deploy a fresh Elasticsearch cluster in it with the operator. Note down the names of the PVCs created by the operator.

Scale the operator down to 0 replicas so that no operator reconciliation interferes with the testing. Then scale the StatefulSet of the Elasticsearch cluster down to 0 replicas as well.
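The two scale-down operations could look roughly like the sketch below. The namespaces and resource names are assumptions for illustration only; look up the real ones with `oc get statefulset,deployment -n <namespace>`. The hypothetical helper `scale_to_zero` only prints the command unless `DRY_RUN=0` is set, so it can be reviewed before anything is changed.

```shell
# Scale a workload to zero replicas.
#   $1 namespace, $2 resource in kind/name form (e.g. statefulset/<name>)
# Prints the command by default; set DRY_RUN=0 to actually run it.
scale_to_zero() {
  cmd="oc scale $2 --replicas=0 -n $1"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    $cmd
  else
    echo "$cmd"
  fi
}

# Hypothetical names. The operator goes first so that it cannot
# reconcile the Elasticsearch StatefulSet back to its original size.
scale_to_zero elastic-system statefulset/elastic-operator
scale_to_zero es-restore statefulset/es-restore-es-default
```

Once both show 0 ready replicas, no pod holds the data PVCs any more.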

Now delete the first node’s PVC. Then create a PVC with the same name as the Elasticsearch cluster’s first node PVC by using the volume…
