MirrorMaker2 for Data Recovery

Zhimin Wen
Jul 24, 2022
Image by Sabine Zierer from Pixabay

A volume snapshot is one way to back up and restore Kafka data, but it works at the storage layer, where the process has no understanding of the application.

With Kafka MirrorMaker2, we can back up and restore in a more flexible way. Let’s check it out.

Test Environment

I have a Kafka cluster with 3 brokers running on IBM Event Streams on top of OpenShift (OCP). I also have a single-node K3s cluster where Kafka is set up using the Strimzi Operator.

Using MirrorMaker2, we will first back up the topic data and the offsets of its consumer groups to the K3s Kafka cluster.

To simulate disaster recovery, we will delete the topic and then restore the data from K3s back to the OCP cluster using MirrorMaker2. Lastly, we will resume the MirrorMaker2 data flow from the OCP cluster to K3s.
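The backup leg of this plan, mirroring topic data and consumer group offsets from OCP to K3s, could be expressed as a Strimzi KafkaMirrorMaker2 resource along the following lines. This is a minimal sketch, not the configuration used in this setup: the resource name, cluster aliases, bootstrap addresses, secret names, user name, and topic pattern are all hypothetical placeholders, and the plain listener on port 9092 for the K3s cluster is an assumption.

```yaml
# Sketch only: names, addresses, secrets, and the topic pattern are hypothetical.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaMirrorMaker2
metadata:
  name: backup-mm2                       # hypothetical name
spec:
  version: 3.2.0                         # assumes the operator supports this Kafka version
  replicas: 1
  connectCluster: k3s                    # MM2 runs against the target (backup) cluster
  clusters:
    - alias: ocp                         # source: Event Streams on OpenShift
      bootstrapServers: ocp-bootstrap.example.com:443   # placeholder route address
      tls:
        trustedCertificates:
          - secretName: ocp-cluster-ca   # hypothetical secret holding the source CA
            certificate: ca.crt
      authentication:
        type: scram-sha-512
        username: mm2-user               # hypothetical SCRAM user
        passwordSecret:
          secretName: ocp-mm2-user       # hypothetical secret
          password: password
    - alias: k3s                         # target: Strimzi on K3s
      bootstrapServers: my-cluster-kafka-bootstrap:9092  # assumes a plain internal listener
      config:
        # Single broker on K3s, so internal Connect topics use factor 1
        config.storage.replication.factor: 1
        offset.storage.replication.factor: 1
        status.storage.replication.factor: 1
  mirrors:
    - sourceCluster: ocp
      targetCluster: k3s
      topicsPattern: "my-topic"          # hypothetical topic to back up
      groupsPattern: ".*"                # consider all consumer groups
      sourceConnector:
        config:
          replication.factor: 1
          offset-syncs.topic.replication.factor: 1
          sync.topic.acls.enabled: "false"
      checkpointConnector:
        config:
          checkpoints.topic.replication.factor: 1
          emit.checkpoints.interval.seconds: 60
          sync.group.offsets.enabled: "true"   # translate and write group offsets on the target
```

The source connector copies the topic records, while the checkpoint connector handles the consumer group offsets. For the restore leg, the same shape would apply with the source and target roles swapped.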

Kafka Clusters

Kafka on OCP is set up with the IBM Event Streams operator. The Kafka portion of the YAML is shown below:

kafka:
  replicas: 3
  config:
    inter.broker.protocol.version: '3.2'
    log.cleaner.threads: 6
    log.message.format.version: '3.2'
    num.io.threads: 24
    num.network.threads: 9
    num.replica.fetchers: 3
    offsets.topic.replication.factor: 3
    default.replication.factor: 3
    min.insync.replicas: 2
    auto.create.topics.enable: "false"
  listeners:
    - name: external
      port: 9094
      type: route
      authentication:
        type: scram-sha-512
      tls: true
    - name: tls
      port: 9093
      type: internal
      tls: true
      authentication:
        type: tls
  storage:
    type: persistent-claim
    size: 8Gi
    class: rook-ceph-block
zookeeper:
  replicas: 3
  storage:
    class: rook-ceph-block
    size: 2Gi
    type: persistent-claim
entityOperator:
  topicOperator: {}
  userOperator: {}

Automatic topic creation is turned off. The external listener for Kafka uses SASL_SSL, that is, SCRAM-SHA-512 authentication over TLS.
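With automatic topic creation disabled, a topic has to exist before clients can produce to it. Since the topic operator is enabled, one way to create it is declaratively through a KafkaTopic resource. The sketch below is written in upstream Strimzi form; the topic name, cluster name, and partition count are hypothetical. IBM Event Streams is understood to publish these resources under its own `eventstreams.ibm.com` API group, so the `apiVersion` and the label prefix may need adjusting to match the installed CRDs.

```yaml
# Sketch only: topic name, cluster name, and partition count are hypothetical.
apiVersion: kafka.strimzi.io/v1beta2     # may be eventstreams.ibm.com/... on Event Streams
kind: KafkaTopic
metadata:
  name: my-topic                         # hypothetical topic name
  labels:
    strimzi.io/cluster: my-es-cluster    # hypothetical; must match the Kafka cluster name
spec:
  partitions: 3                          # hypothetical partition count
  replicas: 3                            # matches default.replication.factor above
  config:
    min.insync.replicas: 2               # matches the broker-level setting above
```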

The Kafka cluster on K3s is created with the native Strimzi operator. The YAML custom resource is listed below:

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
name: my-cluster
replicas: 1…