September 19, 2024

Choosing Between Message Queues and Event Streams


Implementing an event-driven architecture (EDA) is a road riddled with challenges. Among them is choosing the right tooling for the job. Many event-driven tools seem quite similar, at least at first glance, and you’d expect they could be used equally well for the same purposes. But that’s often not the case and choosing the solution best suited to your needs can be tricky.

Take, for instance, message broker technologies. Choosing a message broker might seem straightforward at first. However, “message broker” is an umbrella term often used to describe different types of components, such as event buses, pub/sub messaging services, message queueing systems and event streaming platforms.

While there’s some overlap in terms of capabilities and use cases between all these types of components, there are also plenty of significant differences. It’s critical to understand these distinctions before embracing one message broker technology or the other (or a mixture of them).

I’ll focus specifically on message queueing and event streaming, highlighting their differences, common denominators and suitability for various use cases. As a follow-up read, I recommend the “Guide to the Event-Driven, Event Streaming Stack,” which talks about all the components of EDA and walks you through a reference use case and decision tree to help you understand where each component fits in.

Understanding Message Queues and Event Streams

Before discussing message queues and event streams, let’s first clarify what “message” and “event” mean. A message is a generic term used to describe a packet of data sent from one component to another. There are different types of messages, including:

Command message. It carries instructions for the receiver to perform a specific action.

Query message. A request made to obtain information from a component.

Reply message. Sent by a server or recipient in reply to a request/query message.

Transactional message. Used in systems where messages are part of a transaction and must be processed in a reliable, often atomic way.

Here’s a basic example of a command message that instructs a banking system to initiate a funds transfer:

{
  "commandId": "cmd-987654321",
  "commandType": "ProcessBankTransfer",
  "data": {
    "transactionId": "abc123",
    "fromAccountId": "45678",
    "toAccountId": "98765",
    "amount": 100.00,
    "currency": "USD"
  }
}

Meanwhile, an event is a significant occurrence or change in state within a system. A button click in a UI, a motion sensor recording movement and a successfully processed payment are all examples of events. When an event “travels” between the components of a system, it does so in the form of a message, so an event is a type of message.

Below is an example of an event message that records that the above command message was processed and funds were successfully transferred between accounts.

{
  "eventId": "evt-123456789",
  "eventType": "BankTransfer",
  "timestamp": "2023-12-13T09:45:30Z",
  "eventData": {
    "transactionId": "abc123",
    "fromAccountId": "45678",
    "toAccountId": "98765",
    "amount": 100.00,
    "currency": "USD"
  }
}

Since events are packaged as messages, you will often hear “message” and “event” being used interchangeably when discussing event-driven architectures. It’s still worth bearing in mind that while events are messages, not all messages are events.
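To make the relationship between the two message types concrete, here is a minimal Python sketch of a handler that takes a command shaped like the one above and produces the matching event. The function name `handle_transfer_command` is hypothetical, the actual funds movement is left out, and the event ID and timestamp are generated on the spot rather than copied from the example.

```python
import uuid
from datetime import datetime, timezone


def handle_transfer_command(command: dict) -> dict:
    """Process a ProcessBankTransfer command and return a BankTransfer event."""
    if command["commandType"] != "ProcessBankTransfer":
        raise ValueError(f"unsupported command: {command['commandType']}")

    # A real handler would move the funds here; this sketch only records
    # that the transfer happened.
    return {
        "eventId": f"evt-{uuid.uuid4()}",
        "eventType": "BankTransfer",
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        # The event carries the same business data as the command.
        "eventData": dict(command["data"]),
    }


command = {
    "commandId": "cmd-987654321",
    "commandType": "ProcessBankTransfer",
    "data": {
        "transactionId": "abc123",
        "fromAccountId": "45678",
        "toAccountId": "98765",
        "amount": 100.00,
        "currency": "USD",
    },
}

event = handle_transfer_command(command)
```

The command asks for something to happen and may be rejected; the event states that something has already happened, which is why it carries a timestamp and its own identifier.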

Now, on to message queues and event streams. Message queues operate on the principle of temporary storage for messages that are to be processed by consumers. Producers send messages to a message broker, which stores them in queues. Consumers retrieve messages from queues, typically in a first-in, first-out (FIFO) order. Once consumed from the queue (and acknowledged), messages are deleted. This setup decouples components, ensuring that messages are processed reliably and in order by consumers.
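The deliver, acknowledge, delete cycle described above can be sketched with a toy in-memory queue. This is an illustration only, not how any particular broker is implemented; the class `SimpleQueue` and its method names are made up for this example, and the "delivery tag" idea is loosely borrowed from AMQP.

```python
from collections import deque


class SimpleQueue:
    """Toy FIFO queue with acknowledgements (hypothetical, in-memory only)."""

    def __init__(self):
        self._ready = deque()  # messages waiting for delivery, oldest first
        self._unacked = {}     # delivery tag -> message delivered but not yet acked
        self._next_tag = 1

    def publish(self, message):
        self._ready.append(message)

    def deliver(self):
        """Hand the oldest ready message to a consumer as (tag, message), or None."""
        if not self._ready:
            return None
        message = self._ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self._unacked[tag] = message
        return tag, message

    def ack(self, tag):
        # Once acknowledged, the message is deleted for good.
        del self._unacked[tag]

    def nack(self, tag):
        # Processing failed: put the message back at the front of the queue.
        self._ready.appendleft(self._unacked.pop(tag))

    def __len__(self):
        return len(self._ready) + len(self._unacked)


queue = SimpleQueue()
queue.publish("order-1")
queue.publish("order-2")

tag, first = queue.deliver()      # oldest message is delivered first
queue.nack(tag)                   # the consumer failed, so the message is requeued
tag, retried = queue.deliver()    # the same message is delivered again
queue.ack(tag)                    # success: the message is removed

tag, second = queue.deliver()
queue.ack(tag)
remaining = len(queue)
```

Note that nothing is left in the queue once every message has been acknowledged, which is exactly the property that distinguishes queues from the retained logs discussed next.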

Similar to message queues, event streaming revolves around producers, consumers, message brokers and messages. However, event streaming differs from message queueing in some notable ways:

Event streaming involves a continuous flow of event messages. (You wouldn’t usually deal with such a high volume and velocity of data with message queues).

The broker usually stores event messages in topics (or channels). Unlike point-to-point queues where a single receiver consumes each message, topics use the pub/sub model, allowing multiple consumers to read the same message.

Messages can be stored in order for longer periods of time. (They are not discarded as soon as they are consumed).
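The last two points, multiple consumers reading the same messages and messages outliving consumption, can be sketched as an append-only log with one read position per consumer. This is a simplified in-memory model under my own assumptions; `TopicLog`, `poll` and `seek` are names chosen for the example and are not a real client API.

```python
class TopicLog:
    """Toy append-only topic where each consumer tracks its own offset."""

    def __init__(self):
        self._records = []  # records are only ever appended, never removed on read
        self._offsets = {}  # consumer name -> offset of the next record to read

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset assigned to the new record

    def poll(self, consumer, max_records=10):
        """Return the next batch for this consumer and advance its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch

    def seek(self, consumer, offset):
        # Rewinding the offset is what makes replay possible.
        self._offsets[consumer] = offset


log = TopicLog()
for event_type in ["PageViewed", "ItemAddedToCart", "OrderPlaced"]:
    log.append({"eventType": event_type})

analytics_first = log.poll("analytics")  # first consumer reads all three events
audit_first = log.poll("audit")          # second consumer reads the same three

log.seek("analytics", 0)                 # rewind and read everything again
analytics_replay = log.poll("analytics")
```

Reading a record changes only the consumer's offset, never the log itself, so adding another consumer later costs nothing on the producer side.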

While the main purpose of message queues is to deliver messages from point A to point B reliably, event streaming follows a different paradigm. Event streaming does that too, but aside from distribution, it typically also transforms event data in real time before delivering it to its destination (so, the high-level flow is A > data transformation > B). Data transformation usually involves using stream processing technologies such as Kafka Streams or Apache Flink.
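As a rough illustration of the A > data transformation > B flow, the sketch below chains plain Python generators: a source, a transformation step and a sink. It only conveys the shape of the pipeline. Engines like Kafka Streams or Apache Flink add state, windowing and fault tolerance, none of which is modeled here, and the function names and the threshold value are hypothetical.

```python
from typing import Iterable, Iterator


def source() -> Iterator[dict]:
    """Point A: stands in for a stream of incoming transfer events."""
    yield {"transactionId": "t1", "amount": 100.0, "currency": "USD"}
    yield {"transactionId": "t2", "amount": 25000.0, "currency": "USD"}
    yield {"transactionId": "t3", "amount": 480.0, "currency": "USD"}


def flag_large_transfers(events: Iterable[dict], threshold: float) -> Iterator[dict]:
    """Transformation: enrich each event as it flows through, one at a time."""
    for event in events:
        yield {**event, "large": event["amount"] >= threshold}


def sink(events: Iterable[dict]) -> list:
    """Point B: stands in for the destination (another topic, a database, ...)."""
    return list(events)


delivered = sink(flag_large_transfers(source(), threshold=10000.0))
large_ids = [e["transactionId"] for e in delivered if e["large"]]
```

Each event is transformed as it passes through rather than after the whole batch has arrived, which is the essential difference from a periodic batch job.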

Message Queueing vs. Event Streaming Tech: Comparing Capabilities

There are numerous distinctions between technologies that allow you to implement event streaming and those that you can use for message queueing. To highlight them, I will compare Apache Kafka (event streaming platform) and RabbitMQ (message broker offering message queues). I’ve chosen Kafka and RabbitMQ specifically because they are popular, widely used solutions providing rich capabilities that have been extensively battle-tested in production environments. They’re considered by many to be the gold standard.

| Attribute | Kafka (event streaming) | RabbitMQ (message queues) |
| --- | --- | --- |
| Supported protocols | Custom binary protocol over TCP. | Several protocols supported: AMQP (0-9-1 and 1.0), STOMP, MQTT. Additionally, RabbitMQ can be extended to support Java Message Service (JMS) applications (via a plugin and a JMS client). |
| Message ordering | Guaranteed at partition level (a partition is a segment of a topic). | Guaranteed at queue level. |
| Delivery semantics | Supports at least once, at-most-once and even exactly-once semantics (the latter being crucial for industries like banking, where data integrity is critical). | Supports at least once and at-most-once semantics. No exactly-once delivery. |
| Message priority | No native support. | Supports priority levels per message, delivering higher-priority messages first. |
| Message replay | Allows replaying messages multiple times, even if already read by consumers. | No message replay capability. |
| Dead letter queues | Kafka supports the concept of dead letter queues, which is useful for error handling (see this article for details). | RabbitMQ supports dead-letter queues, allowing you to diagnose and resend messages that were not successfully processed by consumers. |
| Routing | Advanced content-based routing is possible via the Kafka Connect and Kafka Streams components. | Advanced, flexible routing capabilities via routing keys and exchange types. |
| Built-in stream processing | Yes (Kafka Streams). | No built-in capabilities. |
| Message consumption | Consumers use a pull model (long polling) to read messages. | Consumers can pull messages, or the broker can push them (the push model is the recommended option). |
| Broker and consumer type | Dumb broker, smart consumer. | Smart broker, dumb consumer. |
| Data persistence | Offers long-term persistence, can retain data indefinitely. | Queues only retain messages until they are delivered and processed by consumers. |
| Scalability | Up to trillions of messages per day, thousands of topics split into tens of thousands of partitions and hundreds of brokers. | Scalable, but not designed for the same levels of scalability as Kafka. Better suited for small and medium-sized deployments and workloads. |
| Performance | Up to millions of messages and multiple gigabytes of data per second with consistently low latencies (in the single-digit milliseconds range). | Optimized to handle lower throughputs (thousands or tens of thousands of messages per second). Can offer very low latency (comparable to Kafka), but latency increases with high throughput workloads. |

The table above is just a condensed comparison between Kafka’s event streaming capabilities and RabbitMQ’s message queues, summarizing the essential differences. However, if you want a more in-depth look at how these two technologies compare (including additional criteria, like architecture, developer experience and ecosystem), check out this Kafka versus RabbitMQ blog post.
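One row that often causes confusion is message ordering "at partition level." The sketch below shows the idea: records with the same key always land in the same partition, so their relative order is preserved, while no order is promised across partitions. Note the assumptions: I use `zlib.crc32` as a stand-in hash (Kafka's default partitioner uses a different hash function), and the partition count and sample records are made up.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for one topic


def partition_for(key: str) -> int:
    """Map a record key to a partition; the same key always gets the same partition."""
    # crc32 is a stand-in here, chosen because it is stable across runs.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


partitions = defaultdict(list)  # partition number -> records in arrival order

records = [
    ("45678", "debit 100"),
    ("98765", "credit 100"),
    ("45678", "debit 25"),
    ("45678", "credit 10"),
]
for key, value in records:
    partitions[partition_for(key)].append((key, value))


def events_for(key: str) -> list:
    """Read one key's records back from its partition, in stored order."""
    return [value for k, value in partitions[partition_for(key)] if k == key]


account_history = events_for("45678")
```

In practice this means the choice of key (here, an account ID) decides which events are guaranteed to be seen in order by a consumer.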

Message Queueing and Event Streaming Use Cases

Message queueing and event streaming can both be used in scenarios requiring decoupled, asynchronous communication between different parts of a system. For instance, in microservices architectures, both can power low-latency messaging between various components. However, going beyond messaging, event streaming and message queueing have distinct strengths and are best suited to different use cases.

Message queueing technologies are generally a good choice for:

Communication between components written in different languages and “speaking” different protocols. Message queueing solutions like RabbitMQ and ActiveMQ make this possible by supporting multiple protocols and programming languages.

Use cases that require complex message routing (for example, a stock trading platform that routes buy and sell orders to different processing queues based on the type of stock and order size).

Distributing tasks among worker nodes, where each task is processed only once by a single consumer.

Handling consumers that frequently disconnect. Message queueing systems are a good choice in these cases due to their message ordering, temporary persistence and message redelivery capabilities.
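The stock trading example in the list above can be sketched as routing-key matching against a set of bindings, in the spirit of a topic-style exchange. Everything here is hypothetical: the queue names, the size threshold and the key layout are mine, and the matcher only supports `*` as a single-word wildcard, which is a small subset of what a real broker offers.

```python
from collections import defaultdict

LARGE_ORDER_THRESHOLD = 10_000  # hypothetical cut-off between small and large orders

# (binding pattern, target queue); "*" matches exactly one dot-separated word.
BINDINGS = [
    ("equity.*.large", "equity-block-trades"),
    ("equity.*.small", "equity-retail"),
    ("bond.*.*", "bond-desk"),
]


def routing_key(order: dict) -> str:
    """Build a key such as 'equity.buy.large' from the order's attributes."""
    size = "large" if order["quantity"] >= LARGE_ORDER_THRESHOLD else "small"
    return f"{order['stockType']}.{order['side']}.{size}"


def matches(pattern: str, key: str) -> bool:
    pattern_words, key_words = pattern.split("."), key.split(".")
    return len(pattern_words) == len(key_words) and all(
        p in ("*", k) for p, k in zip(pattern_words, key_words)
    )


def route(order: dict, queues: dict) -> list:
    """Append the order to every queue whose binding matches; return the queue names."""
    key = routing_key(order)
    targets = [queue for pattern, queue in BINDINGS if matches(pattern, key)]
    for queue in targets:
        queues[queue].append(order)
    return targets


queues = defaultdict(list)
big_buy = route({"stockType": "equity", "side": "buy", "quantity": 50_000}, queues)
small_sell = route({"stockType": "equity", "side": "sell", "quantity": 200}, queues)
bond_buy = route({"stockType": "bond", "side": "buy", "quantity": 1_000}, queues)
unroutable = route({"stockType": "fx", "side": "buy", "quantity": 1_000}, queues)
```

The producer only attaches a key; the routing rules live in the broker's bindings, so they can change without touching producers or consumers.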

Thousands of companies worldwide have included message queueing technologies in their tech stacks. See, for example, the RabbitMQ Summit website to learn how organizations of all shapes and sizes are using RabbitMQ message queues in production. There are plenty of talks on the website for you to browse (just click the “Past Events” dropdown and select the summit edition of your choice to view all the related talks).

Now, on to event streaming, which is well-suited for:

Collecting, persisting and transmitting large volumes of event streams, such as clickstream data, stock market tickers and high-frequency readings from IoT devices and sensors.

Continuously processing and analyzing data as it arrives to provide actionable insights and enable real-time decision-making (for instance, analyzing financial transactions as they happen to identify fraudulent ones and mitigate them as soon as possible).

Event sourcing. A technology like Kafka is ideal in such scenarios because its immutable and append-only log structure ensures a reliable, ordered and replayable record of events. This enables the full historical sequence of state changes to be stored and queried.

Log aggregation use cases. Event streaming solutions are a fitting choice because they generally offer good performance, strong durability guarantees and low latency. Additionally, event streaming technologies usually integrate with numerous other systems (or provide straightforward ways to do so), making it convenient to ingest log data from disparate components.
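To illustrate the event sourcing point, here is a minimal sketch that rebuilds account balances by replaying a list of `BankTransfer` events shaped like the earlier example. The events, the `replay` helper and its `upto` parameter are assumptions made for the example; amounts are floats only for brevity, and a real ledger would use an exact decimal type.

```python
from collections import defaultdict

# An ordered, append-only record of what happened (hypothetical sample data).
events = [
    {"eventType": "BankTransfer",
     "eventData": {"fromAccountId": "45678", "toAccountId": "98765", "amount": 100.0}},
    {"eventType": "BankTransfer",
     "eventData": {"fromAccountId": "98765", "toAccountId": "45678", "amount": 40.0}},
]


def apply(balances: dict, event: dict) -> None:
    """Apply one state change to the running balances."""
    data = event["eventData"]
    balances[data["fromAccountId"]] -= data["amount"]
    balances[data["toAccountId"]] += data["amount"]


def replay(history: list, upto=None) -> dict:
    """Rebuild state from scratch using the first `upto` events (all if None)."""
    balances = defaultdict(float)
    for event in history[:upto]:
        apply(balances, event)
    return dict(balances)


current_state = replay(events)             # state after every event
state_after_first = replay(events, upto=1)  # state as it was after the first event
```

Because the log is never rewritten, any past state can be recomputed by replaying a prefix of it, which a queue that deletes messages on acknowledgement cannot offer.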

There are countless examples of companies leveraging event streaming. For instance, some large organizations, including Uber, PayPal and Netflix, have shared how and why they’re using Kafka and the benefits they are gaining. It’s worth reading about their experiences. But it’s not only large enterprises that rely on event streaming. Check out my previous blog post to see how small and mid-sized companies harness Kafka’s event streaming capabilities.

Note that some event-driven architectures use both event streaming and message queueing. For example, in the case of an e-commerce platform, you could use event streaming to collect and analyze clickstream data from users in real time, so you can serve them relevant banners that offer discounts and product recommendations. Meanwhile, a message queueing solution could be useful to queue orders for payment and processing.

Message Queueing Is Sometimes a Stepping Stone to Event Streaming

Message queueing is a good choice for many messaging use cases. It’s also an appealing proposition if you’re early in your event-driven journey; that’s because message queueing technologies are generally easier to deploy and manage than event streaming solutions. However, organizations sometimes migrate from message queues to event streams despite the additional complexity. The main reasons? Scalability, reliability and performance.

This was very much the experience of DoorDash, AppDirect and a global payments provider. All of them initially used message queueing tech — RabbitMQ and ActiveMQ to be more exact. However, when faced with growing data volumes, these message queueing systems experienced crippling scalability, reliability and performance issues. All three companies ultimately had to reevaluate their tech stacks and ended up replacing RabbitMQ/ActiveMQ with Kafka’s event streaming capabilities. By moving to Kafka, the three organizations have significantly improved uptime, scalability, availability and performance (lower latencies and higher throughputs).

I’m curious to see if more businesses will continue switching from message queues to event streams in the future. Another possible trend is that companies will embrace event streaming platforms like Kafka from the get-go, especially since there’s a plan to introduce queues for Kafka. In other words, Kafka could ultimately become a technology equally suitable for event streaming use cases and traditional message queueing scenarios (right now, using Kafka as a conventional message queue is challenging — see this article for details).

Conclusion

If you’re dealing with small and medium workloads, you’re looking to reliably and flexibly route messages between components and your system is primarily interested in the current state, a message queueing technology is an adequate choice.

On the other hand, event streaming is the right way to go if you want to handle high-volume, high-frequency event streams in a scalable and reliable way, you need to do complex processing on data as it arrives to gain actionable real-time insights, and your system is concerned not only with the current state but also with a historical record of state changes.

As we’ve seen, sometimes companies start with message queues only to later migrate to event streaming tech. This migration is extremely difficult and time-consuming. So, if you’re early in your event-driven journey and you’re pondering whether event streaming or message queueing is the right choice for you, ask yourself this: Can my current requirements be fulfilled equally well by both message queueing tech and event streaming solutions? If the answer is yes, then I advise you to jump on the event streaming bandwagon. It’s a stronger, more reliable foundation for the future.

It’s true that event streaming tools are generally harder to learn and manage than message queues. But don’t let that discourage you. Managed platforms like Confluent Cloud and Redpanda massively simplify the effort of handling event streams. Plus, they work seamlessly with serverless stream processing solutions like Quix, enabling you to effortlessly build, deploy and monitor event streaming apps that extract value from real-time data.

Check out these interactive templates to understand what kind of event-driven applications you can create by combining Confluent Cloud/Kafka/Redpanda as the streaming transport and Quix as the stream processing engine.


Source: thenewstack.io
