Hey Lehar,
I don't think there's a way to control this during topic creation. I just
took a look through
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/AdminUtils.scala
and it appears that partition assignment does not account for each broker's
different log directories. I also took a look at the kafka-topics.sh script;
it has a --replica-assignment argument, but that looks to only allow
specifying brokers. During topic creation, once a replica has been chosen, I
think we then choose the directory with the fewest partitions -
see
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogManager.scala#L1192
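To illustrate the "fewest partitions" rule, here is a rough Python sketch. It is not Kafka's code: the function name pick_log_dir, the directory paths and the partition names are all made up for the example. The point it shows is that a choice based on partition count, rather than bytes on disk, can leave directories unevenly filled when partitions differ a lot in size.

```python
from collections import Counter

def pick_log_dir(log_dirs, partition_dirs):
    """Return the directory that currently holds the fewest partitions.

    log_dirs:       list of directory paths configured on one broker
    partition_dirs: mapping of partition name -> directory it lives in
    Ties go to the directory listed first in log_dirs.
    """
    # Start every configured directory at zero so empty ones are considered.
    counts = Counter({d: 0 for d in log_dirs})
    counts.update(d for d in partition_dirs.values() if d in counts)
    return min(log_dirs, key=lambda d: counts[d])

# Hypothetical layout: two partitions in /data/a, one in /data/b.
dirs = ["/data/a", "/data/b"]
existing = {"foo-0": "/data/a", "foo-1": "/data/a", "bar-0": "/data/b"}
print(pick_log_dir(dirs, existing))
```

Note that the sketch never looks at partition sizes, only at how many partitions each directory holds.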
What I think you can do is move existing partitions around with the
kafka-reassign-partitions.sh script. From running the command locally:
--reassignment-json-file <String: manual assignment json file path>
    The JSON file with the partition reassignment configuration. The
    format to use is -
    {"partitions":
      [{"topic": "foo",
        "partition": 1,
        "replicas": [1,2,3],
        "log_dirs": ["dir1","dir2","dir3"]
      }],
     "version":1
    }
    Note that "log_dirs" is optional. When it is specified, its length
    must equal the length of the replicas list. The value in this list
    can be either "any" or the absolute path of the log directory on the
    broker. If absolute log directory path is specified, the replica will
    be moved to the specified log directory on the broker.
So the log_dirs field in the JSON file is what you can use to move partitions
between directories on a broker.
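As a minimal sketch of producing such a file, the Python below builds the JSON in the format from the help text and enforces the rule that log_dirs must be the same length as replicas. The helper name build_reassignment and the path /data/kafka-logs-2 are made up for illustration; substitute the real absolute log directory paths from your brokers.

```python
import json

def build_reassignment(topic, partition, replicas, log_dirs=None):
    """Build a reassignment document for a single partition.

    log_dirs is optional. When given, it needs one entry per replica,
    each either "any" or an absolute log directory path on that broker.
    """
    entry = {"topic": topic, "partition": partition, "replicas": list(replicas)}
    if log_dirs is not None:
        if len(log_dirs) != len(replicas):
            raise ValueError("log_dirs must have the same length as replicas")
        entry["log_dirs"] = list(log_dirs)
    return {"partitions": [entry], "version": 1}

# Hypothetical example: keep replicas on brokers 1, 2, 3 and pin only the
# replica on broker 2 to a specific directory.
plan = build_reassignment("foo", 1, [1, 2, 3], ["any", "/data/kafka-logs-2", "any"])
print(json.dumps(plan, indent=2))
```

The printed JSON can be saved to a file and passed to kafka-reassign-partitions.sh with --reassignment-json-file and --execute. Using "any" for the other replicas leaves their directory choice to the broker.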
Hope that helps a bit.
Andrew
On Tue, Oct 25, 2022 at 6:56 AM Lehar Jain <lehar.j@media.net.invalid>
wrote:
> Hey,
>
> We run Kafka brokers with multiple log directories. I wanted to know how
> Kafka balances traffic between various directories. Can we have our own
> strategy to distribute different partitions to different directories. As
> currently, we are facing an imbalance in sizes of the aforementioned
> directories, some directories have a lot of empty space whereas others are
> getting filled quickly.
>
>
> Regards
>