kafka_topics
Synopsis
Provides a list of Kafka topics to be managed by HADeploy.
Attributes
Each item of the list has the following attributes:
Name | req? | Description |
---|---|---|
name | yes | The name of the topic |
properties | no | A map of properties associated with the topic. Refer to the Kafka documentation for a complete list of available properties. |
partition_factor | yes if assignments is not defined | The number of partitions of the topic. |
replication_factor | yes if assignments is not defined | The number of replicas for each partition. |
assignments | yes if partition_factor and replication_factor are not defined | A map where the key is the partition number and the value is a list of broker_id values. This allows the distribution of each partition's replicas to be defined manually, with strict placement rules. |
no_remove | no | Boolean. Prevents this topic from being removed when HADeploy is used in REMOVE mode. Default: no |
ranger_policy | no | Definition of an Apache Ranger policy bound to this topic. Parameters are the same as for kafka_ranger_policy, except that topics should not be defined. The policy name can be explicitly defined; otherwise, a name will be generated as "_<topic>_". See the example below for more information. |
when | no | Boolean. Allows conditional deployment of this item. Default: True |
Example
Simple case: we let Kafka decide which brokers will host our replicas:
kafka_topics:
- name: broadapp_t1
  partition_factor: 3
  replication_factor: 2
  properties:
    retention.ms: 630720000000
    retention.bytes: 858993459200
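The two property values above are raw millisecond and byte counts, which are hard to read at a glance. As a quick sanity check, the following sketch (plain Python, not part of HADeploy; the helper names are hypothetical) shows one way such values can be derived. It assumes 365-day years and binary gigabytes (GiB), under which the figures in the example work out to 20 years and 800 GiB.

```python
# Hypothetical helpers to derive raw values for retention.ms and retention.bytes.
MS_PER_DAY = 24 * 60 * 60 * 1000


def days_to_ms(days):
    """Convert a number of days to milliseconds."""
    return days * MS_PER_DAY


def gib_to_bytes(gib):
    """Convert a number of GiB (2**30 bytes) to bytes."""
    return gib * 1024 ** 3


retention_ms = days_to_ms(20 * 365)   # 20 years of 365 days
retention_bytes = gib_to_bytes(800)   # 800 GiB

print(retention_ms)      # 630720000000
print(retention_bytes)   # 858993459200
```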
The same topic, but with the placement of replicas specified explicitly:
kafka_topics:
- name: broadapp_t1
  assignments:
    0: [ 1, 2 ]
    1: [ 2, 3 ]
    2: [ 3, 1 ]
  properties:
    retention.ms: 630720000000
    retention.bytes: 858993459200
If kafka_relay hosts a broker_id_map such as the following:
kafka_relay:
  ...
  broker_id_map:
    1: 1001
    2: 1002
    3: 1003
Then the first partition (#0) will have two replicas, placed on the brokers with id 1001 and 1002.
If kafka_relay does not contain a broker_id_map, then the first partition (#0) will have two replicas, placed on the brokers with id 1 and 2.
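The translation described above can be illustrated with a short sketch. This is not HADeploy's implementation, only a model of the documented behavior: the function name `resolve_assignments` is hypothetical, and raising an error for an id missing from the map is an assumption of this sketch, not documented behavior.

```python
def resolve_assignments(assignments, broker_id_map=None):
    """Translate the broker ids used in 'assignments' into real broker ids.

    When no broker_id_map is given, ids are used as-is.
    A missing id in the map raises KeyError (assumption of this sketch).
    """
    if not broker_id_map:
        return {part: list(replicas) for part, replicas in assignments.items()}
    return {
        part: [broker_id_map[broker] for broker in replicas]
        for part, replicas in assignments.items()
    }


# Values taken from the examples above.
assignments = {0: [1, 2], 1: [2, 3], 2: [3, 1]}
broker_id_map = {1: 1001, 2: 1002, 3: 1003}

mapped = resolve_assignments(assignments, broker_id_map)
unmapped = resolve_assignments(assignments)

print(mapped[0])    # [1001, 1002]
print(unmapped[0])  # [1, 2]
```

Keeping the map in one place means the topic definitions can stay unchanged if the real broker ids differ between clusters.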
NB: Recent versions of Kafka introduced a 'rack awareness' capability, which ensures a good distribution of replicas among several racks. Explicit partition assignment should now be needed only in very specific cases.
NB: Partition reassignment on topic modification is not supported. One may use the partition reassignment tool provided with Kafka (kafka-reassign-partitions.sh) for this.
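For reference, kafka-reassign-partitions.sh reads its plan from a JSON file passed with `--reassignment-json-file`. The sketch below (plain Python, independent of HADeploy; the function name `build_reassignment` is hypothetical) shows one way to generate such a file content from a partition-to-brokers map. Note that the broker ids here are the real Kafka broker ids, so the values assume the broker_id_map example shown earlier has already been applied. Connection options for the tool depend on the Kafka version and are not shown.

```python
import json


def build_reassignment(topic, assignments):
    """Build the JSON structure expected by kafka-reassign-partitions.sh."""
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": part, "replicas": list(replicas)}
            for part, replicas in sorted(assignments.items())
        ],
    }


# Real broker ids, assumed to match the broker_id_map example above.
plan = build_reassignment(
    "broadapp_t1",
    {0: [1001, 1002], 1: [1002, 1003], 2: [1003, 1001]},
)

# Write this output to a file and pass it with --reassignment-json-file.
print(json.dumps(plan, indent=2))
```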
Another example, with an Apache Ranger policy granting Publish and Consume rights to the members of the group users:
kafka_topics:
- name: broadapp_t1
  partition_factor: 3
  replication_factor: 2
  ranger_policy:
    audit: no
    permissions:
    - groups:
      - users
      accesses:
      - Consume
      - Publish
Trick
To find the broker_id values, one may use the kdescribe tool: