Kafka Message Ordering Part 2

Welcome back to Kafka Message Ordering Part 2, the second blog in the message ordering series. If you haven’t read the first blog, Kafka Message Ordering Part 1, I recommend going through it before proceeding with this one. In Part 1 I discussed how various factors can disrupt the order of Kafka messages at the producer level itself. In this blog we will see which factors at the infrastructure layer and on the consumer side can alter the order of the sent messages.

Infrastructure Layer Factors: In the OSI model, the physical, data link, network and transport layers are responsible for transferring packets between hosts and reassembling them into messages. Almost every message transfer middleware recommends asynchronous message processing, where the sender doesn’t wait for an acknowledgement of one message before sending the next, because synchronous mode would be very slow and would not help in building real-time message transfer systems. When messages are sent continuously one after the other, the inherent nature of data networks means their packets can follow different routes to the destination, each taking whichever path is least congested, shortest or quickest at that moment. Within a single TCP connection the transport layer puts packets back in order, but messages that travel over different connections (for example from several producer instances or to different brokers) can still arrive in a different order. This is a perfect recipe for out-of-order messages: even if the producers send all the messages in perfect order, they may well be received in a different order.

Messages Over Infra Layer
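
To make the effect concrete, here is a toy model of asynchronous sends. It is only a sketch: the message names, send times and route latencies are invented for illustration, and it does not model any real network stack or Kafka client. Each message is sent one millisecond after the previous one, yet arrival order is decided by send time plus the latency of the route it happened to take.

```python
def arrival_order(messages):
    """Return message names in arrival order.

    messages: list of (name, send_time_ms, route_latency_ms) tuples.
    The latency values are hypothetical; a real network decides them.
    """
    arrivals = [(send + latency, name) for name, send, latency in messages]
    return [name for _, name in sorted(arrivals)]


# Sent in order m1, m2, m3 without waiting for acknowledgements.
sent = [("m1", 0, 30), ("m2", 1, 5), ("m3", 2, 12)]
print(arrival_order(sent))  # ['m2', 'm3', 'm1']
```

The first message sent is the last to arrive simply because its route was the slowest.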

Consumer Side Factors: Like producers, message consumers also try to make the best use of the system by parallelizing message processing. Because of this very intent, even messages that reach a consumer in order might get jumbled. Here are a few factors that alter message ordering:

  1. Kafka Partitions: For increased throughput and redundancy, Kafka allows multiple partitions in each topic. A topic with a single partition can be created and used, but this is usually not done for the above reasons. Producers publish their messages to partitions either in a round-robin manner or by key-based hashing, so that a message with a given key is always sent to the same partition. Put simply, more partitions mean more parallelism, and more parallelism means a nightmare in managing message order. Messages in each partition are picked up by individual consumer threads. Pictorially it can be represented as:
    Single Consumer Thread

    In this representation all the messages are processed by only one consumer thread in a given consumer group, and Kafka ensures that it hands over the messages in exactly the order in which they were received within each partition. In other words, Kafka ensures intra-partition ordering, not inter-partition ordering. Here the consumer has to iterate over the partitions to retrieve the messages and process them. As a result, messages are processed out of order: e4 (timestamp t4) is processed immediately after e1, whereas e2, which arrived at t2, should have been processed next.

  2. Parallel Processing: Extending the above point a little further, in most architectures the consumers try to increase message processing throughput by using more threads to process messages in parallel, which is a sure way of shuffling the message order. Looking at the picture below: each consumer thread is allocated one partition and processes the messages in it, completely unaware of what the other threads are doing.
  3. Re-partitioning: This is the extreme of parallel processing, in which the messages from all the partitions are re-divided to achieve the desired parallelism. With three partitions, the previous approach of one consumer thread per partition yields a maximum parallelism of three. Assume each of the three partitions holds 100 messages, 300 messages in total. Some implementations might repartition these 300 messages into, say, 15 partitions of 20 messages each to speed up processing:
    Repartitioning At Consumer Side

    Each vcore of the CPU now gets one partition to process, and hence all partitions can be processed in parallel. The repartitioning process, however, again alters whatever ordering we had earlier.
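
The three consumer-side factors above can be sketched with a small in-memory simulation. This is a hedged illustration only: it uses plain Python lists instead of a real Kafka client, all function names are hypothetical, and events e1..e6 are represented by the integers 1..6. It assumes keyless round-robin partitioning, as in the single consumer example.

```python
import threading


def round_robin(events, num_partitions):
    """Spread events over partitions in arrival order (keyless round robin)."""
    partitions = [[] for _ in range(num_partitions)]
    for i, event in enumerate(events):
        partitions[i % num_partitions].append(event)
    return partitions


def single_consumer(partitions):
    """One consumer thread drains the partitions one after another."""
    return [event for partition in partitions for event in partition]


def thread_per_partition(partitions):
    """One thread per partition; the threads share only a completion log."""
    log, lock = [], threading.Lock()

    def worker(partition):
        for event in partition:  # order inside one partition is preserved
            with lock:
                log.append(event)

    threads = [threading.Thread(target=worker, args=(p,)) for p in partitions]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


def repartition(partitions, chunk_size):
    """Re-divide all messages into smaller chunks for more parallelism."""
    flat = [event for partition in partitions for event in partition]
    return [flat[i:i + chunk_size] for i in range(0, len(flat), chunk_size)]


# Events e1..e6 arrive at timestamps t1..t6; event n is represented by n.
events = list(range(1, 7))
partitions = round_robin(events, 3)    # [[1, 4], [2, 5], [3, 6]]
print(single_consumer(partitions))     # [1, 4, 2, 5, 3, 6]: e4 right after e1

parallel_log = thread_per_partition(partitions)
print(parallel_log)                    # global order depends on thread scheduling

# Three partitions of 100 messages re-divided into 15 chunks of 20.
big = round_robin(list(range(300)), 3)
chunks = repartition(big, 20)
print(len(chunks), len(chunks[0]))     # 15 20
```

With one thread per partition, only the relative order inside each partition survives. After repartitioning, the 100 messages of a single original partition are spread over five chunks that may run concurrently, so even that per-partition order is no longer guaranteed.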

Essentially, no matter how you plan to ensure ordering in one layer, it can be (and mostly is) altered by subsequent layers. So architects avoid spending time on these costly message ordering options at the various levels (producer, infrastructure and consumer). Instead, ordering is best handled at the last layer, the view layer, which enables the end user to retrieve the latest and correct snapshot of the data. I will discuss one of the two solutions in the next blog in the series, Kafka Message Ordering Part 3.
