Question

MySQL Reader + KafkaWriter: How to specify individual topics for the individual MySQL tables?

  • 6 February 2024
  • 2 replies
  • 18 views


Hey!
 

I have the following setup:

  • MySQL → Striim → Kafka

This all works as expected, but if I have multiple MySQL tables, all changes are streamed into a single Kafka topic.

Is there a way to configure the KafkaWriter to stream the messages to dedicated topics for each table?

For example:

  • Events from table `users` should go to a topic `users`
  • Events from table `items` should go to a topic `items`
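
In other words, the routing I have in mind looks roughly like the plain-Python sketch below. This is not Striim syntax; the event shape (`table`, `op`, `data`), the `mydb` prefix, and the function names are all made up for illustration.

```python
from collections import defaultdict


def topic_for_table(table_name: str) -> str:
    """Map a source table to a topic name.

    Assumed convention: drop the database prefix, so "mydb.users" -> "users".
    """
    return table_name.split(".")[-1].lower()


def route(events):
    """Group change events by the topic they should be written to."""
    by_topic = defaultdict(list)
    for event in events:
        by_topic[topic_for_table(event["table"])].append(event)
    return dict(by_topic)


# Hypothetical change events from two tables
events = [
    {"table": "mydb.users", "op": "INSERT", "data": {"id": 1}},
    {"table": "mydb.items", "op": "UPDATE", "data": {"id": 7}},
    {"table": "mydb.users", "op": "DELETE", "data": {"id": 2}},
]
routed = route(events)
```

Here `routed` has one key per table, so each table's events would end up in their own topic.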

As far as I can see in the docs, you can only specify one topic per KafkaWriter.


2 replies


Hi! Thank you for the feedback. This is not supported automatically in Striim today, but it is on our roadmap. Can you share why you’d like a topic per table rather than one change stream for all tables in Kafka, with routing/filtering done on the Kafka side?


Hey!

Thanks for the reply and happy to hear that this is on your roadmap!

There are several reasons why having a dedicated Kafka topic for each MySQL table would be beneficial:

  1. If you have 10 MySQL tables, directing their events to separate topics allows consumers to subscribe to just the data they need. This simplifies the consumer logic, as there's no need to filter messages based on the source table, reducing the complexity and improving the efficiency of data processing.

  2. By segregating data into different topics, we can more easily scale our data processing. Different teams and clients can consume from their respective topics without being affected by the load or changes in other tables. This isolation helps in managing performance and scalability more effectively.

  3. Having a topic per table allows for finer-grained control over access to data. It's easier to implement security policies and comply with data governance standards when each table's data is isolated in its own topic, as access can be managed at the topic level in Kafka.

  4. Utilizing one large topic for all 10 tables with different schemas introduces complexities, especially with Avro encoding. Avro relies on schemas to serialize data. When multiple tables with different schemas send data to the same topic, managing schema evolution becomes challenging. Consumers need to be aware of all possible schemas and their versions, increasing the complexity of deserialization. Separate topics per table significantly simplify schema management by isolating each table’s schema evolution.
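
To make points 1 and 4 a bit more concrete, here is a rough Python sketch comparing a consumer on a shared topic with one on a dedicated topic. It is only an illustration: JSON stands in for Avro, a "schema" is reduced to a set of field names, and the message shape and function names are assumptions, not anything Striim or Kafka defines.

```python
import json

# Hypothetical per-table schemas, reduced to the expected field names.
SCHEMAS = {"users": {"id", "email"}, "items": {"id", "price"}}


def check(data, schema):
    """Reject a row whose fields do not match the expected schema."""
    if set(data) != schema:
        raise ValueError(f"unexpected fields: {sorted(data)}")
    return data


def consume_shared(messages, wanted_table):
    """Shared topic: decode every message, filter by table, look up its schema."""
    rows = []
    for raw in messages:
        msg = json.loads(raw)
        if msg["table"] != wanted_table:
            continue  # decoded, then thrown away
        rows.append(check(msg["data"], SCHEMAS[msg["table"]]))
    return rows, len(messages)  # (rows kept, messages decoded)


def consume_dedicated(messages, schema):
    """Dedicated topic: every message is relevant and shares one schema."""
    rows = [check(json.loads(raw)["data"], schema) for raw in messages]
    return rows, len(messages)


shared_topic = [
    json.dumps({"table": "users", "data": {"id": 1, "email": "a@example.com"}}),
    json.dumps({"table": "items", "data": {"id": 7, "price": 9.5}}),
    json.dumps({"table": "users", "data": {"id": 2, "email": "b@example.com"}}),
]
# What a dedicated "users" topic would contain
users_topic = [m for m in shared_topic if json.loads(m)["table"] == "users"]

shared_rows, shared_decoded = consume_shared(shared_topic, "users")
dedicated_rows, dedicated_decoded = consume_dedicated(users_topic, SCHEMAS["users"])
```

Both consumers end up with the same rows, but the shared-topic consumer has to decode every message and know every table's schema, while the dedicated one only touches its own data.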

One workaround I can think of for now is to spin up a separate MySQL reader for each MySQL table, each paired with its own KafkaWriter.
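
Sketched in plain Python, that workaround means one reader/writer pair per table. The dictionary keys and component names below are hypothetical placeholders of mine, not actual Striim property names.

```python
def build_pipelines(database, tables):
    """Plan one reader/writer pair per table (hypothetical structure)."""
    return [
        {
            "reader": {"name": f"{table}_reader", "table": f"{database}.{table}"},
            "writer": {"name": f"{table}_writer", "topic": table},
        }
        for table in tables
    ]


# "mydb" is an assumed database name
pipelines = build_pipelines("mydb", ["users", "items"])
```

With 10 tables this becomes 10 readers and 10 writers to maintain, which is why built-in support would be much nicer.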

Looking forward to seeing this feature implemented and happy to provide further feedback or participate in a beta test if that helps!

Best,

Bobby
