
Should not have ZK migrations enabled on a cluster that was created in KRaft mode.

Hi,

I am trying out the ZooKeeper to KRaft migration in our k8s environment. For the
initial step, which is to deploy the KRaft controllers with migration enabled, I
have deployed 3 KRaft controllers via this StatefulSet:
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: kraft-controller
  namespace: "cgbu-ums-dev"
  labels:
    application: osdmc
    component: kraft-controller
spec:
  updateStrategy:
    type: RollingUpdate
  serviceName: kraft-controller
  podManagementPolicy: "Parallel"
  replicas: 3
  selector:
    matchLabels:
      application: osdmc
      component: kraft-controller
  template:
    metadata:
      labels:
        application: osdmc
        component: kraft-controller
        version: '3.9'
        occloud.oracle.com/open-network-policy: allow
    spec:
      nodeSelector:
        occloud.oracle.com/nodeGroup: cgbu-ums-dev
      terminationGracePeriodSeconds: 300
      containers:
        - name: kraft-controller
          imagePullPolicy: Always
          image: 'phx.ocir.io/oraclegbudevcorp/cn-cgbu-ums/kafka-kraft-controller:latest'
          resources:
            limits:
              memory: 8Gi
              cpu: 2
            requests:
              memory: 1Gi
              cpu: 250m
          ports:
            - containerPort: 9092
              name: controller
            # - containerPort: 19093
            #   name: controller
          command:
            - /ocss/entrypoint.sh
          volumeMounts:
            - name: kraft-controller-claim
              mountPath: /var/lib/kafka/data
              subPath: cgbu-ums-dev_kafka
            # - name: kraft-controller-config-volume
            #   mountPath: /ocss/kafka/config/kraft/controller.properties
            #   subPath: controller.properties
            # - name: kraft-controller-config-volume
            #   mountPath: /ocss/kafka/config/kraft
          envFrom:
            - configMapRef:
                name: kraft-controller-config
          env:
            - name: KAFKA_HEAP_OPTS
              value: "-Xms512m -Xmx1536m -Dsun.net.inetaddr.ttl=45"
            - name: KAFKA_OPTS
              value: "-Dlogging.level=INFO"
            - name: KAFKA_LOG4J_OPTS
              value: "-Dlog4j.configuration=file:/ocss/log4j.properties"
            - name: ZK_CLUSTER
              value: "zookeeper:2181"
            - name: ADVERTISED_HOST
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: CONTROLLER_PORT
              value: "9092"
            - name: CLUSTER_ID
              value: "y3lzxs3MS3Si8YKLLQYJww"
            - name: IS_DEV_MODE
              value: "false"
            - name: NODE_TYPE
              value: "controller"
            - name: REPLICATION_FACTOR
              value: "3"
            - name: OFFSETS_RETENTION_MINUTES
              value: "60"
            - name: LOG4J_FORMAT_MSG_NO_LOOKUPS
              value: "true"
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 120
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 15
            exec:
              command:
                - sh
                - -c
                - "/ocss/livenessKraftController.sh"
          lifecycle:
            preStop:
              exec:
                command:
                  - sh
                  - -c
                  - "/ocss/kafka/bin/kafka-server-stop.sh /ocss/kafka/config/kraft/controller.properties"
      # volumes:
      #   - name: kraft-controller-config-volume
      #     configMap:
      #       name: kraft-controller-config
      securityContext:
        runAsUser: 1000
        fsGroup: 5000
      dnsPolicy: ClusterFirst
      restartPolicy: Always
  volumeClaimTemplates:
    - metadata:
        name: kraft-controller-claim
        annotations:
          path: "cgbu_cgbu-ums_cndevcorp_3_1"
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        storageClassName: nfs

To disable migration, we have added a flag in the setup script (called by
entrypoint.sh) that removes the migration-related properties. Setup script:
update_kraft_configuration(){
    # Name of the Kafka broker, example: kafka-kraft-0
    broker_name=$1
    # Log dir where logs are stored for the broker, example: /var/lib/kafka/data/kafka-kraft-0
    log_dir=$2
    # Port number at which the broker will accept client connections, example: 19092
    # broker_Port=$3
    # Port number used by the controller for inter-broker communication, example: 19093
    controller_port=$3
    # Comma-separated list of controllers (node_id@host:port) taking part in the controller election process
    # example: 0@kafka-kraft-0:9093,1@kafka-kraft-1:9093,2@kafka-kraft-2:9093
    controller_quorum_voters=$4
    # Unique cluster id, which must be the same for all the brokers running in the KRaft cluster
    # example: 0b7KcoQKSWKf7IddcNHGuw
    cluster_id=$5
    # Advertised host name of the broker, example: 100.77.43.231
    advertised_host=$6
    # Unique broker node id derived from the broker name, example: 0,1,2
    node_id=${broker_name##*-}

    # Address used by clients to connect to the Kafka broker
    # example: PLAINTEXT://kafka-kraft-0.kafka-kraft:19092 -- kafka-kraft is the service name configured in the svc yaml
    # advertised_listeners="PLAINTEXT://"${broker_name}.kafka-kraft-controller":"${broker_Port}
    # advertised_listeners="CONTROLLER://localhost:"${controller_port}
    # advertised_listeners="CONTROLLER://"${advertised_host}":"${controller_port} -- this is working in 418
    advertised_listeners=""
    if [[ $IS_DEV_SETUP == true ]]; then
        advertised_listeners="PLAINTEXT://:"${controller_port}
    else
        advertised_listeners="CONTROLLER://"${broker_name}":"${controller_port}
    fi
    # advertised_listeners="CONTROLLER://"${broker_name}":"${controller_port} -- working
    # advertised_listeners="PLAINTEXT://"${broker_name}":"${broker_Port} # Not needed for a controller-only node -- check!!!

    # Protocols used for the broker and controller communication
    # listeners="PLAINTEXT://:"${broker_Port}",CONTROLLER://:"${controller_port}
    # For error - The listeners config must only contain KRaft controller listeners from controller.listener.names when process.roles=controller
    # listeners="CONTROLLER://"${advertised_host}":"${controller_port} -- this is working in 418
    listeners=""
    if [[ $IS_DEV_SETUP == true ]]; then
        listeners="CONTROLLER://:"${controller_port}
    else
        listeners="CONTROLLER://"${broker_name}":"${controller_port}
    fi
    # listeners="CONTROLLER://"${broker_name}":"${controller_port} -- working
    # listeners="CONTROLLER://localhost:"${controller_port}

    # Print each variable value
    echo "broker_name: $broker_name"
    echo "log_dir: $log_dir"
    echo "controller_port: $controller_port"
    echo "controller_quorum_voters: $controller_quorum_voters"
    echo "cluster_id: $cluster_id"
    echo "advertised_host: $advertised_host"
    echo "node_id: $node_id"
    echo "advertised_listeners: $advertised_listeners"
    echo "listeners: $listeners"

    # Kafka directory
    KAFKA_DIR="/ocss/kafka"

    # Replace node.id in controller.properties
    sed -i -e 's\^node.id=.*\node.id='"${node_id}"'\' ${KAFKA_DIR}/config/kraft/controller.properties
    # Replace controller.quorum.voters in controller.properties
    # sed -i -e 's\^controller.quorum.voters=.*\controller.quorum.voters='"${controller_quorum_voters}"'\' ${KAFKA_DIR}/config/kraft/controller.properties
    # Replace advertised.listeners in controller.properties
    sed -i -e 's\^advertised.listeners=.*\advertised.listeners='"${advertised_listeners}"'\' ${KAFKA_DIR}/config/kraft/controller.properties
    # Replace listeners in controller.properties
    sed -i -e 's\^listeners=.*\listeners='"${listeners}"'\' ${KAFKA_DIR}/config/kraft/controller.properties
    # Replace log.dirs in controller.properties
    sed -i -e 's\^log.dirs=.*\log.dirs='"${log_dir}"'\' ${KAFKA_DIR}/config/kraft/controller.properties

    # Additional changes required for using the controller.properties file to start the controller node
    sed -i -e 's/^#controller.quorum.voters=.*/controller.quorum.voters='${controller_quorum_voters}'/' ${KAFKA_DIR}/config/kraft/controller.properties
    sed -i -e "s/^controller.quorum.bootstrap.servers=.*/#controller.quorum.bootstrap.servers=kraft-controller:${controller_port}/" "${KAFKA_DIR}/config/kraft/controller.properties"

    # Replace process.roles in server.properties
    # sed -i -e 's\^process.roles=.*\process.roles=controller\' ${KAFKA_DIR}/config/server.properties

    # If a different value for cluster.id is found in the meta.properties file, replace it with the unique cluster_id.
    # cluster_id must be the same across all brokers running in a cluster.
    if [[ -f "${log_dir}/meta.properties" ]]; then
        echo "Updating cluster.id in meta.properties file"
        sed -i -e 's\^cluster.id=.*\cluster.id='"${cluster_id}"'\' ${log_dir}/meta.properties
    fi

    # Flag to enable migration
    MIGRATION_ENABLE=${MIGRATION_ENABLE:-false}
    # Additional migration properties to be added if the migration flag is enabled
    if [[ "$MIGRATION_ENABLE" = true ]]; then
        echo "Enabling migration process for Kraft controller"
        cat >> ${KAFKA_DIR}/config/kraft/controller.properties << EOF
# ZooKeeper to Kraft migration related properties

# Enable the migration
zookeeper.metadata.migration.enable=true

# ZooKeeper client configuration
zookeeper.connect=${ZK_CLUSTER}
EOF
    elif [[ "$MIGRATION_ENABLE" = false ]]; then
        # Remove the migration properties if they exist
        echo "Disabling migration process for Kraft controller"
        sed -i '/^zookeeper.metadata.migration.enable=.*/d' ${KAFKA_DIR}/config/kraft/controller.properties
        sed -i '/^zookeeper.connect=.*/d' ${KAFKA_DIR}/config/kraft/controller.properties
    fi

    # Upgrade the metadata version to support KIP-919 controller registration.
    # This line can be removed in the future when Kafka is upgraded and the metadata version supports KIP-919 controller registration by default.
    # ${KAFKA_DIR}/bin/kafka-storage.sh format -c ${KAFKA_DIR}/config/controller.properties --cluster-id $cluster_id --upgrade
    # Format the log_dir of each broker running in the cluster if not already formatted
    ${KAFKA_DIR}/bin/kafka-storage.sh format -g -t $cluster_id -c ${KAFKA_DIR}/config/kraft/controller.properties
}
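
For reference, here is a quick way to sanity-check what the script ends up writing. This is only an illustrative snippet: the file path and key names are taken from the script above, and the zookeeper.* keys only show up when MIGRATION_ENABLE is true.

# Run inside a controller pod; prints the keys rewritten by update_kraft_configuration
grep -E '^(node\.id|controller\.quorum\.voters|advertised\.listeners|listeners|log\.dirs|zookeeper\.)' \
  /ocss/kafka/config/kraft/controller.properties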

To pass that flag, we supply it through a ConfigMap so we can change it to false
later when needed. We went with a ConfigMap because when we exec'd into the pods,
changed the property, and restarted the pods, everything reverted to the original
settings and nothing changed.

Our configmap:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kraft-controller-config
  namespace: cgbu-ums-dev
data:
  MIGRATION_ENABLE: "true"
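
The idea is that we can later flip the flag and restart the controllers so the entrypoint re-runs with the new value, roughly like this (sketch only; assumes kubectl access to the namespace):

# Sketch: switch MIGRATION_ENABLE off and roll the controllers
kubectl -n cgbu-ums-dev patch configmap kraft-controller-config \
  --type merge -p '{"data":{"MIGRATION_ENABLE":"false"}}'
kubectl -n cgbu-ums-dev rollout restart statefulset kraft-controller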

Now the main issue: when I apply the KRaft controllers, I get the following error:
Should not have ZK migrations enabled on a cluster that was created in KRaft mode.

I am not sure why this happens, but my guess is that the cluster was first started
as a pure KRaft cluster and was only later told to do the migration, so when the
controller checked its log.dirs a conflict arose and it failed completely. Please
correct me if I am wrong.
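
For what it is worth, the controllers' data dirs can be inspected with something like the following. The exact path is an assumption based on the volumeMounts and the log_dir example in the script, i.e. /var/lib/kafka/data/<pod-name>:

# Assumed path: /var/lib/kafka/data/<pod-name> (per the script's log_dir example)
kubectl -n cgbu-ums-dev exec kraft-controller-0 -- \
  cat /var/lib/kafka/data/kraft-controller-0/meta.properties
kubectl -n cgbu-ums-dev exec kraft-controller-0 -- \
  ls /var/lib/kafka/data/kraft-controller-0/__cluster_metadata-0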

Now that I have deployed the cluster with migration enabled by default, I believe
this error should not occur, since I am starting with migration from the beginning.
Please help me understand and fix this issue.

Regards,
Priyanshu
