The Publisher-Subscriber pattern, or Pub/Sub for short, is a way of decoupling communication within a distributed system. As such, it is frequently used in microservices and other service-based architectures. The pattern is similar to the Observer pattern in that it separates the sender and receiver of events. However, Pub/Sub operates at the component or service level, whereas the Observer pattern works at the lower class/object level.
The Problem
Distributed software systems often consist of separate logical units which, while functioning mainly autonomously, still need to exchange information. The data they exchange can take many forms: domain messages announcing state changes, the result of a performance-intensive calculation, or data supplied by external providers such as a payment service.
One way to transfer this information is by directly connecting the component requesting the information (the requestor) with the one providing the data (the replier).
Depending on the technology or protocol, this communication often happens synchronously through a direct HTTP(S) call or by invoking a remote procedure call (e.g., using gRPC).
While this approach may be sufficient for most scenarios, it has some disadvantages:
- The receiver of the information (the requestor) must know precisely who the sender (the replier) is and how to call it. This creates a strong coupling between the two.
- When using synchronous communication, both receiver and sender are blocked until the information is exchanged. For systems with high load, this can result in performance issues.
The Publisher-Subscriber pattern provides a solution to these problems by replacing the direct connection between the components with an asynchronous communication channel.
Publisher-Subscriber
General Structure
The idea behind this pattern is to add another component between sender and receiver: a message broker or event bus. Instead of communicating directly with its counterpart, the component that wants to receive information (the subscriber) registers with the message broker for every message type or topic it is interested in. The sender (the publisher), in turn, publishes its data to the message broker, which is now responsible for delivering the message to all subscribers.
Compared to the synchronous and direct method described earlier, the Publisher-Subscriber pattern has some advantages:
- There is no direct coupling between publisher and subscriber anymore. They are only connected through the structure of the message they are interested in. Compared to before, this reduces the coupling significantly. As long as both publisher and subscriber conform to the same message schema, they can now be developed independently, making the system easier to maintain.
- The system scales better. The communication is now asynchronous: a subscriber registers its interest in a specific topic but does not have to call the sender and wait for new information. Instead, it can continue with its tasks and is notified when new messages arrive. This asynchronous processing also makes it easier to scale both sides independently. When handling incoming messages takes too long, we can scale the system horizontally by adding more subscriber instances. The message broker also allows us to add filtering or queuing mechanisms to better manage higher workloads.
- Extending the system is easier. We can add new subscribers or publishers while leaving all existing components untouched. That way, we can add new functionality to the system without affecting existing implementations.
- Better separation of concerns. As the infrastructure logic required for encoding and sending messages is outsourced to the message broker, the publisher's and subscriber's implementation can focus on implementing domain rules and handling domain events. The technology used for communication is abstracted away by the underlying messaging system.
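As a rough illustration of this structure, the following Python sketch models a broker that keeps a list of handlers per topic. The class `InMemoryBroker`, its methods, and the topic names are hypothetical and exist only for this example. Delivery here is synchronous and in-process to keep the sketch short; a real message broker would typically deliver asynchronously over the network.

```python
from collections import defaultdict
from typing import Any, Callable

# A handler is any callable that accepts one message (here: a plain dict).
Handler = Callable[[dict[str, Any]], None]


class InMemoryBroker:
    """Toy broker for illustration only (hypothetical, not a real library)."""

    def __init__(self) -> None:
        # Maps a topic name to the handlers subscribed to it.
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler for all future messages on the given topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict[str, Any]) -> int:
        """Deliver the message to every subscriber of the topic.

        Returns the number of handlers that received the message.
        """
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = InMemoryBroker()
received_by_dashboard: list[dict[str, Any]] = []
received_by_alerting: list[dict[str, Any]] = []

# Two independent subscribers register for the same topic.
broker.subscribe("weather", received_by_dashboard.append)
broker.subscribe("weather", received_by_alerting.append)

# The publisher only knows the broker and the topic, not the subscribers.
delivered = broker.publish("weather", {"temperature": 22.5})
undelivered = broker.publish("traffic", {"congestion": "low"})
```

Note that the publisher never references a subscriber directly: adding a third subscriber would only require another `subscribe` call, leaving the publishing code untouched.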
Now that we have an overview of the general structure and purpose of the pattern, let's look at some details.
Topic- or Content-based Filtering
In most scenarios, a single subscriber is only interested in a subset of all messages sent by publishers and ignores all information it considers irrelevant. For this, message brokers often provide different filtering mechanisms. The two most common ones are by topic or by content.
When filtering by topic, each subscriber has to register upfront for the specific subject it is interested in. The message broker then ensures that only messages published under this particular topic are routed to the relevant subscribers. To improve performance, this can even be implemented with a dedicated message channel per topic. Since this is the most basic routing mechanism, nearly all platforms that support asynchronous messaging offer it.
Content-based filtering, in contrast, matches the publisher and the subscriber based on the content or structure of a message. A common approach here is registering the subscriber to a specific type of action (typically expressed through a message's metadata attribute), but more sophisticated filtering algorithms considering specific data attributes are also possible. Some available cloud-based commercial message brokers provide the ability to define quite complex subscription filters (see, for instance, this post about AWS AppSync).
The main difference between these two approaches is that, for topic-based filtering, it is the publisher's responsibility to classify the message correctly, while for content-based filtering, the filtering logic happens more on the subscription side.
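The difference between the two approaches can be sketched in a few lines of Python. The function names, subscriber names, and the temperature threshold below are assumptions made for this example and do not correspond to any particular broker's API. In the topic-based variant the broker only compares a label chosen by the publisher, while in the content-based variant each subscription carries its own predicate over the message.

```python
from typing import Any, Callable

Message = dict[str, Any]
Predicate = Callable[[Message], bool]


def route_by_topic(subscriptions: dict[str, set[str]], topic: str) -> list[str]:
    """Topic-based: return subscribers registered for the publisher's topic label."""
    return sorted(name for name, topics in subscriptions.items() if topic in topics)


def route_by_content(filters: dict[str, Predicate], message: Message) -> list[str]:
    """Content-based: return subscribers whose predicate accepts the message."""
    return sorted(name for name, matches in filters.items() if matches(message))


# Hypothetical subscriptions for the topic-based variant.
topic_subscriptions = {
    "dashboard": {"weather"},
    "billing": {"payments"},
}

# Hypothetical subscription filters for the content-based variant.
content_filters: dict[str, Predicate] = {
    "dashboard": lambda m: m.get("eventType") == "WeatherUpdate",
    "heat_alert": lambda m: (
        m.get("eventType") == "WeatherUpdate" and m["data"]["temperature"] > 30.0
    ),
}

mild = {"eventType": "WeatherUpdate", "data": {"temperature": 22.5}}
hot = {"eventType": "WeatherUpdate", "data": {"temperature": 35.0}}

topic_result = route_by_topic(topic_subscriptions, "weather")
mild_result = route_by_content(content_filters, mild)
hot_result = route_by_content(content_filters, hot)
```

With these inputs, the topic lookup and the mild reading both reach only `dashboard`, while the hot reading additionally reaches `heat_alert`, because that subscription inspects a data attribute rather than just the message type.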
Message Broker
Although the message broker is a core component of the Publisher-Subscriber pattern, the pattern does not make any assumptions about how it should be implemented. While it's possible to implement the broker from scratch using a technology like WebSockets, many commercial and open-source products are also available. RabbitMQ and ActiveMQ may be the best-known ones, but many commercial cloud service providers also offer their own messaging platforms.
Messages
The message itself is another core concept of the Publisher-Subscriber pattern: messages represent the actual information transferred between the different actors and thus define the minimal interface on which both subscriber and publisher must agree. While there are no restrictions regarding the concrete message format, text-based formats that are easy to produce and parse (like JSON or XML) have become established. Messages often contain additional metadata, like a timestamp or a type, that can be used for further filtering.
Remember that while the pattern reduces coupling between the publisher and the subscriber to a minimum, the message's data schema is the one thing both participants must conform to. Otherwise, subscribers would not be able to read the incoming data. Adding a version number field to the message's metadata for backward-compatibility reasons is, therefore, often a good idea.
An example message transferred between publisher and subscriber could have the following content and metadata (in JSON, but other formats are also possible):
{
  "eventType": "WeatherUpdate",
  "timestamp": "2023-10-23T14:30:00",
  "messageVersion": "1.2",
  "source": "WeatherStation123",
  "data": {
    "location": "CityXYZ",
    "temperature": 22.5,
    "humidity": 60.2
  }
}
Note that the message has a dedicated eventType attribute that the subscriber could use to identify the content of the message.
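On the subscriber side, handling such a message might look like the following Python sketch. The function `handle_raw_message`, the constant `SUPPORTED_MAJOR_VERSION`, and the reading of `messageVersion` as a "major.minor" string are assumptions made for illustration; the pattern itself does not prescribe any of them.

```python
import json
from typing import Any

# Assumption: the subscriber understands every 1.x schema version.
SUPPORTED_MAJOR_VERSION = 1


def handle_raw_message(raw: str) -> str:
    """Parse a JSON message and react only to supported WeatherUpdate events."""
    message: dict[str, Any] = json.loads(raw)

    # Use the eventType metadata to skip messages this subscriber does not handle.
    if message.get("eventType") != "WeatherUpdate":
        return "ignored"

    # Assumption: messageVersion is "major.minor"; only the major part must match.
    major = int(str(message.get("messageVersion", "0")).split(".")[0])
    if major != SUPPORTED_MAJOR_VERSION:
        return "unsupported version"

    data = message["data"]
    return f"{data['location']}: {data['temperature']}"


raw = json.dumps(
    {
        "eventType": "WeatherUpdate",
        "timestamp": "2023-10-23T14:30:00",
        "messageVersion": "1.2",
        "source": "WeatherStation123",
        "data": {"location": "CityXYZ", "temperature": 22.5, "humidity": 60.2},
    }
)
result = handle_raw_message(raw)
```

Checking the version before touching the payload lets a subscriber reject messages it cannot interpret instead of failing on a missing or renamed field.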
Some things to consider
The Publisher-Subscriber pattern is an elegant way of implementing asynchronous communication. However, there may be scenarios where applying it can have some drawbacks:
Single point of failure. One of the main drawbacks of the pattern is that the message broker is a critical component. When your messaging middleware goes down, all parts of your system relying on this communication structure are effectively offline. Since this would be a critical failure, you should always ensure that you have a redundant message broker to take over in that case.
Some workflows require synchronous communication. Asynchronous communication is a good choice for decoupling and scaling, but it does not fit every workflow. Paying for an order in an online shop may be a scenario where you have to wait until you receive the confirmation before continuing with the next step. Distributed transactions may be another use case where synchronous communication makes processing complex workflows significantly easier.
Dependencies become implicit. In synchronous communication, it's easy to see which components depend on each other: you just have to follow the direct calls. Finding these dependencies in an asynchronous message-based system is much more challenging. Especially in larger systems, identifying all subscribers and publishers that operate on the same topic (or message filters) can be hard, and changing the schema of an existing message can have unforeseen effects. That's why you should always deprecate messages or message fields before removing them or changing their content structure.