Are you looking for a queuing system? Or maybe you’re seeking a better one? Here’s all the info you need!
Queuing systems are the best-kept secret of backend development.
Without trying to write a poem in praise of queuing systems, I’d say that a junior backend developer becomes a mid-level backend developer once they learn to integrate queues into a system. Queues improve customer experience (we’ll see how), reduce complexity, and improve reliability in a system.
Sure, for very simple web apps with near-zero traffic and brochure websites, queues can be overkill (or even impossible to install if you’re on a typical shared-hosting environment), but non-trivial apps will all gain from queuing systems, and large apps are impossible without queuing involved.
Before we begin, a disclaimer: if you’re already comfortable with queuing systems and want to compare the various options, the next few introductory sections will induce major sleep. 🙂 So feel free to jump right ahead. The introductory sections are meant for those who have only a hazy idea of queuing systems or just heard the name in passing.
What is a queuing system?
Let’s begin by understanding what a queue is.
A queue is a data structure in computer science that mimics, well, the real world queues we see around us. If you go to a ticket counter, for example, you’ll notice that you’ll have to stand at the end of the queue, while the person at the start of the queue will get the ticket first. This is what we also call the “first come, first served” phenomenon. In computer science, it’s possible to write programs that store their tasks like this in a queue, processing them one by one on the same first-come-first-served basis.
Do note that the queue doesn’t do any actual processing itself. It’s just temporary storage of sorts where tasks wait until they get picked up by something. If this all sounds a bit too abstract, don’t worry. It is an abstract concept, but we’ll see clear-cut examples in the next section. 🙂
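To make the ticket-counter idea concrete, here is a minimal sketch using Python's built-in `deque`. The task names are made up for illustration; the point is only that tasks leave the queue in the order they arrived, and that the queue itself does no processing.

```python
from collections import deque

# The queue only stores tasks; something else has to pick them up.
tasks = deque()

# New tasks join at the back of the line...
tasks.append("send welcome email")
tasks.append("resize profile photo")
tasks.append("generate invoice")

# ...and are picked up from the front, first come, first served.
processed = []
while tasks:
    task = tasks.popleft()
    processed.append(task)

print(processed)
```

The `processed` list ends up in exactly the order the tasks were added, which is all "first come, first served" means here.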
Why do you need queuing systems?
Without getting into a very lengthy description, I’d say queuing systems are needed mainly for three things: background processing, parallel execution, and recovery from failure. Let’s look at these with the help of examples:
Background processing
Suppose you’re running an e-commerce marketing campaign where time is of the essence, and that your application is built so that it fires off a confirmation email after the customer completes the payment but before they are shown the “thank you” page. If the mail server you’re connecting to is down, the web page will just die, ruining the user experience.
Imagine the high number of support requests you’d be getting! In this case, it’s better to push this email-sending task to a job queue and show the customer the success page.
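Here is a rough sketch of that idea using only Python's standard library. Everything in it is an assumption made for illustration: `handle_checkout` stands in for your request handler, the `sent` list stands in for a real mail server, and a background thread plays the role of a separate worker process.

```python
import queue
import threading

email_jobs = queue.Queue()  # the job queue: it holds work, it does none of it
sent = []                   # stands in for a real mail server


def handle_checkout(order_id, customer_email):
    """Request handler: enqueue the email job and return right away."""
    email_jobs.put({"order_id": order_id, "to": customer_email})
    return "thank-you page"


def email_worker():
    """Background worker: pulls jobs off the queue one at a time."""
    while True:
        job = email_jobs.get()
        if job is None:  # sentinel value: time to shut down
            email_jobs.task_done()
            break
        sent.append(f"confirmation for order {job['order_id']} -> {job['to']}")
        email_jobs.task_done()


worker = threading.Thread(target=email_worker, daemon=True)
worker.start()

# The customer gets the page without waiting for the email to go out.
page = handle_checkout(1001, "customer@example.com")

email_jobs.put(None)  # ask the worker to stop
email_jobs.join()     # wait until everything queued has been handled
```

In a real deployment the queue would live in a separate backend and the worker would be its own process, so a slow or broken mail server could never take the checkout page down with it.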
Parallel execution
Many developers, especially those who mostly code simpler, low-traffic apps, are in the habit of using cron jobs for background processing. This is fine until the input grows so large that it can’t be cleared between runs. For example, suppose you have a cron job that compiles analytics reports and emails them to users, and that your system can process 100 reports per minute.
As soon as your app grows and starts receiving more than 100 report requests per minute on average, it will fall further and further behind and will never be able to complete all the jobs.
In a queuing system, this situation can be avoided by setting up multiple workers, each of which picks up a job (say, a batch of 100 reports) and works in parallel with the others to finish the backlog much, much sooner.
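The worker setup can be sketched like this. The worker count, the `build_report` function, and the 400 queued report requests are all hypothetical; threads are used only to keep the example self-contained.

```python
import queue
import threading

NUM_WORKERS = 4  # hypothetical; in practice you tune this to your load

report_jobs = queue.Queue()
completed = []
completed_lock = threading.Lock()


def build_report(user_id):
    """Placeholder for the real report-generation work."""
    return f"report for user {user_id}"


def worker():
    """Each worker keeps pulling jobs until it receives the stop signal."""
    while True:
        user_id = report_jobs.get()
        try:
            if user_id is None:  # stop signal
                return
            report = build_report(user_id)
            with completed_lock:
                completed.append(report)
        finally:
            report_jobs.task_done()


# Queue up a backlog of 400 report requests.
for user_id in range(1, 401):
    report_jobs.put(user_id)

threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()
for _ in threads:
    report_jobs.put(None)  # one stop signal per worker
for t in threads:
    t.join()

print(len(completed))
```

Note that Python threads won't speed up CPU-heavy work on their own. With a real queuing backend, the workers are usually separate processes or machines, and adding capacity is a matter of starting more of them.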
Recovery from failure
As web developers, we generally don’t think about failure. We kind of take it for granted that our servers and the APIs we use will always be online. But the reality is different: network outages are all too common, and the excellent APIs you rely on may be down due to infrastructure issues (before you say “not me!”, don’t forget the massive Amazon S3 outage). So, going back to the reporting example, if part of your report generation requires you to connect to the payments API and that connection is down for 2 minutes, what happens to the 200 reports that failed? With a cron job, they are simply lost unless you write your own bookkeeping; with a queuing system, the failed jobs can be put back on the queue and retried once the API is reachable again.
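Here is a minimal sketch of how a queue lets you recover from that kind of outage. The fake `fetch_payment_data` function, the `PaymentsApiDown` exception, and the retry limit are assumptions for illustration; real queuing systems typically also add a delay between retries.

```python
from collections import deque

MAX_ATTEMPTS = 3  # hypothetical retry limit


class PaymentsApiDown(Exception):
    """Raised when the (fake) payments API can't be reached."""


calls = {"count": 0}


def fetch_payment_data(report_id):
    """Fake payments API that is down for the first two calls."""
    calls["count"] += 1
    if calls["count"] <= 2:
        raise PaymentsApiDown("payments API unreachable")
    return f"payments for report {report_id}"


jobs = deque({"report_id": i, "attempts": 0} for i in (1, 2, 3))
done = []  # results of jobs that succeeded
dead = []  # jobs that failed too many times and need a human

while jobs:
    job = jobs.popleft()
    job["attempts"] += 1
    try:
        done.append(fetch_payment_data(job["report_id"]))
    except PaymentsApiDown:
        if job["attempts"] < MAX_ATTEMPTS:
            jobs.append(job)  # back of the queue, try again later
        else:
            dead.append(job)  # give up, but keep a record

print(done, dead)
```

Reports 1 and 2 fail while the API is down, go back on the queue, and succeed on their second attempt. Nothing is lost, and jobs that keep failing end up in `dead` instead of disappearing.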
Queuing systems do involve considerable overhead, though. The learning curve is pretty steep as you’re stepping into a whole new domain, the complexity of your application and deployment increases, and queued jobs can’t always be controlled with 100% precision. That said, there are situations when building an application without queues is just not possible.
With that out of the way, let’s take a look at some of the common options among queuing backends/systems today.
Redis
Redis is known as a key-value store that just stores, updates, and retrieves strings of data with no knowledge of the structure of data. While that might have been true earlier, today Redis has efficient and highly useful data structures like lists, sorted sets, and even a Pub-Sub system, making it highly desirable for queue implementations.
The advantages of Redis are:
- Completely in-memory database, resulting in faster read/writes.
- Highly efficient: Can easily support more than 100,000 read/write operations per second.
- Highly flexible persistence scheme. You can either go for max performance at the cost of possible data loss in the case of failures or set up in fully conservative mode to sacrifice performance for consistency.
- Clusters supported out of the box.
Please note that Redis does not have any messaging/queueing/recovery abstractions, so you either need to use a package or build a lightweight system yourself. For example, Redis is one of the queue backends supported by the Laravel PHP framework, where the queue workers and scheduling have been implemented by the framework authors.
Learning Redis is easy.
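To give a feel for what "building a lightweight system yourself" means, here is a sketch of a job queue on top of a Redis list: push onto one end with `LPUSH`, pop from the other with `RPOP`. So that it runs without a Redis server, it uses a small in-memory stand-in, `InMemoryListStore`, which is my own invention. The assumption is that you would swap in a real client such as redis-py's `redis.Redis()`, which offers `lpush` and `rpop` methods with the same names. The key name `jobs:emails` is hypothetical too.

```python
import json
from collections import defaultdict, deque


class InMemoryListStore:
    """Stand-in for a Redis client; mimics only LPUSH and RPOP."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def lpush(self, name, *values):
        for value in values:
            self._lists[name].appendleft(value)
        return len(self._lists[name])

    def rpop(self, name):
        items = self._lists[name]
        return items.pop() if items else None


QUEUE_KEY = "jobs:emails"  # hypothetical key name


def enqueue(client, job):
    """Serialize the job and push it onto the head of the list."""
    client.lpush(QUEUE_KEY, json.dumps(job))


def dequeue(client):
    """Pop the oldest job from the tail, or return None if the list is empty."""
    raw = client.rpop(QUEUE_KEY)
    return json.loads(raw) if raw is not None else None


client = InMemoryListStore()
enqueue(client, {"to": "a@example.com"})
enqueue(client, {"to": "b@example.com"})

first = dequeue(client)
second = dequeue(client)
empty = dequeue(client)
print(first, second, empty)
```

Notice what's missing: retries, acknowledgements, and any record of jobs a crashed worker was holding. That is exactly the gap that packages and frameworks fill on top of Redis.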
RabbitMQ
There are a few subtle differences between Redis and RabbitMQ, so let’s get them out of the way first.
First of all, RabbitMQ has a more specialized, well-defined role, messaging, and it is built to reflect that. In other words, its sweet spot is to act as an intermediary between two systems, which isn’t the case for Redis, which acts as a database. As a result, RabbitMQ provides a few more facilities that are missing in Redis: message routing, retries, load distribution, etc.
If you think about it, task queues can also be thought of as a messaging system, where the scheduler, the workers, and the job “submitters” can be thought of as entities participating in message passing.
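To show what message routing buys you, here is a toy in-memory model of the idea: publishers hand a message and a routing key to an exchange, and the exchange decides which queues receive it. This is not RabbitMQ's API, just an illustration; the class and queue names are made up.

```python
from collections import defaultdict, deque


class DirectExchange:
    """Toy router: delivers a message to every queue bound to its key."""

    def __init__(self):
        self._bindings = defaultdict(list)  # routing key -> bound queues

    def bind(self, target_queue, routing_key):
        self._bindings[routing_key].append(target_queue)

    def publish(self, routing_key, message):
        targets = self._bindings.get(routing_key, [])
        for target_queue in targets:
            target_queue.append(message)
        return len(targets)  # how many queues received the message


email_queue = deque()
report_queue = deque()
audit_queue = deque()

exchange = DirectExchange()
exchange.bind(email_queue, "email")
exchange.bind(report_queue, "report")
exchange.bind(audit_queue, "email")   # the audit queue listens to both keys
exchange.bind(audit_queue, "report")

exchange.publish("email", "send receipt #1")
exchange.publish("report", "build weekly report")
dropped = exchange.publish("sms", "nobody is bound to this key")
```

The publisher never needs to know which queues or workers exist; it only names the kind of message. With plain Redis lists, that routing logic would live in your application code.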
RabbitMQ has the following advantages:
- Better abstractions for message passing, reducing application-level work if message passing is what you need.
- More resilient to power failures and outages (than Redis, at least by default).
- Cluster and federation support for distributed deployments.
- Helpful tools for managing and monitoring your deployments.
- Support for practically all the non-trivial programming languages out there.
- Deployment with your tool of choice (Docker, Chef, Puppet, etc.).
When to use RabbitMQ? I’d say it’s a great choice when you know you need to use asynchronous message passing but are not ready to tackle the towering complexity of some of the other queuing options on this list (see below).
ActiveMQ
If you’re in the enterprise space (or building a highly distributed and large-scale app), and you don’t want to have to reinvent the wheel all the time (and make mistakes along the way), ActiveMQ is worth a look.
Here’s where ActiveMQ excels:
- It’s implemented in Java and so has really neat Java integration (follows the JMS standard).
- Multiple protocols supported: AMQP, MQTT, STOMP, OpenWire, etc.
- Handles security, routing, message expiry, analytics, etc., out of the box.
- Baked-in support for popular distributed messaging patterns, saving you time and costly mistakes.
That is not to say that ActiveMQ is available only for Java. It has clients for Python, C/C++, Node, .Net, and other ecosystems, so you shouldn’t have to worry about being stuck if your stack changes in the future. Besides, ActiveMQ is built on completely open standards, so building your own lightweight clients should be easy.
All that said and done, please be aware that ActiveMQ is just a message broker, not a complete job-processing system. It ships with a file-based message store (KahaDB) and can use a JDBC database instead, but the workers and job logic are yours to build. I included it here because it’s not tied to a particular programming language (unlike other popular solutions such as Celery and Sidekiq).
Amazon MQ
Amazon MQ deserves a quick but important mention here. If you think that ActiveMQ is the ideal solution for your needs but don’t want to deal with building and maintaining the infrastructure yourself, Amazon MQ offers a managed service to do that. It supports all the protocols ActiveMQ does, with no difference in features, since it uses ActiveMQ itself under the surface.
The advantage is that it’s a managed service, so you don’t need to worry about anything other than using it. It makes even more sense for those deployments that are on AWS, as you can leverage other services and offerings directly from within your deployment (faster data transfers, for example).
Amazon SQS
We can’t expect Amazon to sit quietly when it comes to critical infrastructure pieces, can we? 🙂
And so we have Amazon SQS, which is a fully hosted, simple queue service (quite literally) by the well-known giant AWS. Once again, subtle differences are important, so please note that SQS doesn’t have the concept of message passing. Like Redis, it’s a simple backend for accepting and distributing jobs in queues.
So, when would you want to use Amazon SQS? Here are some reasons:
- You are an AWS fan and won’t touch anything else (honestly, there are many folks out there like that, and I think there’s nothing wrong with that).
- You need a hosted solution to keep the failure rate as close to zero as possible and make sure no jobs are lost.
- You don’t want to build out a cluster and have to monitor it yourself. Or worse, have to build monitoring tools when you could be using that time to do productive development.
- You already have substantial investments in the AWS platform and staying locked in makes business sense.
- You want a focused, simple queuing system without any of the fluff associated with message passing, protocols, and whatnot.
All in all, Amazon SQS is a solid choice for anyone wanting to incorporate job queues into their system and not having to worry about installing/monitoring things by themselves.
Beanstalkd
Beanstalkd has been around for a long time and is a battle-tested, fast, easy backend for job queuing. There are a few characteristics of Beanstalkd that make it differ considerably from Redis:
- It’s strictly a job queuing system and nothing else. You push jobs to it, which get pulled by job workers later on. So if your application has even a tiny need for message passing, you’d want to avoid Beanstalkd.
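- There are no general-purpose data structures like sets or sorted sets; all you get are named queues (called “tubes”).
- Within a tube, Beanstalkd serves jobs First In, First Out (FIFO) among jobs of equal priority. Each job carries a numeric priority set when it is queued, and that is the main ordering control you get.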
- There are no options for clustering.
All this said, Beanstalkd makes for a slick and fast queue system for simple projects that live on a single server. For many, it’s faster and more stable than Redis. So if you’re having issues with Redis that you just can’t seem to solve no matter what, and your needs are simple, Beanstalkd is worth a try.
If you’ve read this far (or reached here skim-reading 😉 ), there’s a pretty good chance you’re interested in queuing systems or need one. If so, the list on this page will serve you well, unless you’re looking for a language/framework-specific queue system.
I wish I could tell you that queuing is simple and 100% reliable, but it’s not. It’s messy, and since it all happens in the background and very fast, mistakes can go unnoticed and become very costly. Still, queues are very much necessary beyond a point, and you’ll find that they’re a powerful weapon (maybe even the most powerful) in your arsenal. Good luck! 🙂