Category: Get.Rounds, Tech Talk
I used to take the bus to school every day. Thirty other kids and I would get on at different stops and sit more or less patiently until the bus arrived at school, where we would emerge en masse to face another day of education. At the end of a mind-numbing day of classes, we would get back on the bus as a group and be dropped off, one by one, somewhere near our respective houses.
Then, one lovely day, my uncle bought me an old, beat-up Volkswagen Beetle, and instead of taking the bus, I drove to school each morning and returned each evening on my own. Much more efficient for me, as I could arrive and leave whenever I felt like it (with the school's permission, of course).
When designing large systems, the first approach is usually like my old Beetle: we send messages directly between the servers. Modern networking systems are very robust, and we have a very high level of trust that our messages will arrive quickly and be handled by the receiver in a stable manner.
There are two inherent problems with direct messaging: where to send the message, and how to scale with load. With direct end-to-end messaging, the client needs to know the address of the receiver. If I only have one server, that’s not a problem. DNS, NIS and other naming services let me use a host name that is translated to the server’s address. Because we don’t want to waste time asking for the address on every message, we cache it for some period of time called a TTL, or time-to-live. If the server changes its address, we have to wait until the clients refresh their translations.
But what if, instead of one service that can handle these messages, we had ten? We could put all of the addresses into our naming service, and either have the naming service pick for us, or collect all the addresses and decide for ourselves which one we want to actually communicate with. Adding or removing a service is still a relatively static operation.
Enter the message bus.
When I took my own car to school, I paid (my parents paid) all the expenses. When I took the bus, the costs were amortized between all the passengers and it was overall more eco-friendly. The same thing can be done with messaging. Let’s create a bus, where we send messages. The bus is tasked with ensuring that messages get delivered to their correct destination. It may not be the fastest route, but it will provide reliability.
Here is where the analogy breaks down. My school bus delivered us to school. Modern message buses hold the messages until someone comes along and asks for them. This is what makes them so powerful in a scalable environment. In its minimal form, we have one producer who sends messages to the bus, and one consumer who reads and removes messages from the bus. Overall, the latency is higher and we haven’t gained anything.
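The minimal form is easy to sketch with Python's standard-library `queue.Queue` standing in for a real broker such as RabbitMQ; the message contents here are invented for illustration.

```python
from queue import Queue

# One producer, one consumer, and a bus in between.
bus = Queue()

# Producer: drop a message on the bus and move on.
bus.put({"request": "resize", "image_id": 42})

# Consumer: ask the bus for the next message and remove it.
# The bus held the message until we asked for it.
message = bus.get()
print(message["request"])  # prints "resize"
```

With one producer and one consumer this is just an extra hop, which is the point: the gain only appears once there are many of each.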
But if we have tens of producers and tens of consumers, the message bus now becomes a very helpful tool. First, we have decoupled the two sides. Producers don’t need to know which consumer is going to process the request, and consumers don’t need to wait for work. These features provide a form of automatic load balancing: consumers process only those messages for which they have resources.
Internet systems don’t have balanced access patterns. Sometimes the system is almost idle, whilst at peak times, there are never enough services to handle the load. A message bus mitigates these problems. When the system is idle, consumers can either go to sleep or be cancelled altogether. No notification process is required; the consumer simply stops asking for new messages. At the other end, when the system is overloaded, consumers will be operating at peak load. Any unprocessed messages will remain in the queue until the peak has passed. For many applications, it is more important to eventually handle all messages than to maintain a low latency while occasionally dropping requests.
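The pull-based load balancing and "just stop asking" behavior can be sketched with the same in-process `Queue`, again as a stand-in for a real broker. The worker names and the job count are made up for the example.

```python
from queue import Queue, Empty
from threading import Thread

bus = Queue()
for job in range(20):
    bus.put(job)  # in a real system, many producers submit concurrently

processed = {}  # worker name -> how many messages it handled

def consumer(name):
    handled = 0
    while True:
        try:
            bus.get_nowait()  # pull work only when free to do it
        except Empty:
            processed[name] = handled
            return  # idle: simply stop asking for new messages
        handled += 1

workers = [Thread(target=consumer, args=(f"worker-{i}",)) for i in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

No one assigns work to a consumer; each one takes only what it can handle, and when the queue runs dry it just exits. Nothing else in the system needs to be told.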
One final benefit of a messaging bus is that producers can tell if the system is overloaded. They can either query the size of the queue, or have an automated response system that denies new submissions. This type of activity falls under the label of “Fail Fast”. If it’s not going to work, it will just fail right now instead of waiting. Systems that “Fail Fast” are easier to understand and manage, because they expose problems very early.
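A bounded queue gives the automated-refusal variant of "Fail Fast" almost for free. In this sketch the capacity of 3 is an arbitrary assumption; a real system would size the bound from measured load.

```python
from queue import Queue, Full

# A bus that refuses new submissions once it is full,
# instead of letting producers wait.
bus = Queue(maxsize=3)

accepted, rejected = 0, 0
for request in range(5):
    try:
        bus.put_nowait(request)  # raises Full right away when overloaded
        accepted += 1
    except Full:
        rejected += 1  # tell the producer now, not after a long wait
```

The producer learns about the overload at submission time, which is what makes "Fail Fast" systems easier to understand: the problem surfaces where and when it happens.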
Rounds uses a few different message buses. Our main processing is handled by one message bus: these requests must eventually be handled, and the system must be able to handle increasingly heavy loads. Our second message bus is for less urgent messages. We keep them separate for now because it’s easier to manage and track.
Message buses are an important tool in large-scale design. Most young start-ups are using message buses because they are easy to manage and easy to understand. If you are interested in checking them out, look for RabbitMQ, ActiveMQ, NServiceBus, MassTransit and Rhino Service Bus. There are versions for every language and disposition.