Kafka Connect is a stateless component by design. It relies on external
Kafka topics to persist its state, including connector configurations,
offsets, and status updates. In a distributed Kafka Connect cluster, this
state is managed through the following configurable topics:
- config.storage.topic – stores connector configurations
- offset.storage.topic – stores source connector offsets
- status.storage.topic – stores the status of connectors and tasks
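
As a rough illustration of the settings above, here is a minimal sketch that renders a distributed-mode worker configuration as .properties text. The three storage-topic keys are the ones named above. The topic names, group id, bootstrap address and converter choice are placeholder assumptions, not values from our setup.

```python
# Sketch of a distributed-mode Kafka Connect worker configuration.
# All values below are hypothetical placeholders; adjust them to your cluster.
worker_config = {
    "bootstrap.servers": "kafka:9092",          # assumed broker address
    "group.id": "connect-cluster",              # workers sharing this id form one Connect cluster
    "config.storage.topic": "connect-configs",  # connector configurations
    "offset.storage.topic": "connect-offsets",  # source connector offsets
    "status.storage.topic": "connect-status",   # connector and task status
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
}


def render_properties(config: dict[str, str]) -> str:
    """Render a dict as Java-style .properties text, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in config.items()) + "\n"


if __name__ == "__main__":
    print(render_properties(worker_config), end="")
```

Since every worker that shares the same group.id and storage topics reads its state from Kafka, a replacement pod can start from the same file without any local data.
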
Because Kafka Connect does not maintain any state locally, it is not
dependent on a specific IP address or hostname. As a result, it is best to
deploy Kafka Connect using a *Kubernetes Deployment* rather than a
*StatefulSet*, since Deployments are better suited for stateless
applications and provide more flexibility with scaling and rolling updates.
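
To make the Deployment suggestion concrete, here is a minimal sketch that builds a Deployment manifest as a Python dict and prints it as JSON, which kubectl also accepts. The resource name, image reference and replica count are hypothetical, and how your custom image receives its worker configuration is left out. Port 8083 is the default Connect REST port.

```python
import json


def connect_deployment(name: str, image: str, replicas: int) -> dict:
    """Build a minimal apps/v1 Deployment manifest for Kafka Connect workers."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # 8083 is the default Kafka Connect REST port.
                            "ports": [{"containerPort": 8083}],
                            # The REST root endpoint answers once the worker is up.
                            "readinessProbe": {
                                "httpGet": {"path": "/", "port": 8083}
                            },
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image; replace with your own.
    manifest = connect_deployment(
        "kafka-connect", "registry.example.com/kafka-connect:latest", 3
    )
    print(json.dumps(manifest, indent=2))
```

The output can be saved to a file and applied with kubectl apply -f. No volume claims or stable pod names are needed, which is the main reason a StatefulSet adds little here.
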
Additionally, it is common practice to expose the Kafka Connect REST API
via an *Ingress*, allowing external systems to submit and manage connectors.
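
For example, an external system could create a connector by sending a POST to the /connectors endpoint behind the Ingress. The sketch below only builds the request. The base URL, connector name and connector class are hypothetical placeholders, and sending it is shown as a comment so the snippet runs without a live cluster.

```python
import json
import urllib.request


def build_create_connector_request(
    base_url: str, name: str, config: dict
) -> urllib.request.Request:
    """Build a POST /connectors request for the Kafka Connect REST API."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical Ingress host and connector settings.
    request = build_create_connector_request(
        "http://connect.example.com",
        "demo-source",
        {"connector.class": "com.example.MySourceConnector", "tasks.max": "1"},
    )
    print(request.get_method(), request.full_url)
    # To actually submit it against a running cluster:
    # with urllib.request.urlopen(request) as response:
    #     print(response.status, response.read().decode("utf-8"))
```

Any worker behind the Ingress can receive the call, since the resulting configuration is written to the config storage topic rather than to the pod that handled it.
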
We have deployed several Kafka Connect instances as Deployments for our use
case, using the repo below (for your reference):
https://github.com/ibm-messaging/kafka-connect-mq-source
Thanks,
Vignesh
On Sun, Jun 15, 2025 at 12:12 AM Prateek Kohli <prateekkohli2112@gmail.com>
wrote:
> Hi All,
>
> I'm building a custom Docker image for Kafka Connect and planning to run it
> on Kubernetes. I'm a bit stuck on whether I should use a Deployment or a
> StatefulSet.
>
> From what I understand, the main difference that could affect Kafka Connect
> is the hostname/IP behaviour. With a Deployment, pod IPs and hostnames can
> change after restarts. With a StatefulSet, each pod gets a stable hostname
> (like connect-0, connect-1, etc.)
>
> My question is: Does it really matter for Kafka Connect if the pod
> IPs/hostname change, considering it's a stateless application?
>
> Thanks
>