RabbitMQ

Queues

Keep your queue short

Many messages in a queue can put a heavy load on RAM usage. To free up RAM, RabbitMQ starts paging messages out to disk. Paging out takes time and, when there are many messages to move, blocks the queue from processing messages, which reduces queueing speed. Long queues can therefore degrade the performance of the broker.
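
One way to keep a queue short is to cap it with a message TTL or a maximum length. The following is a minimal sketch that builds the optional queue arguments x-max-length and x-message-ttl; the helper name queue_limit_arguments and the queue name "events" are hypothetical, and the commented call assumes the Python pika client.

```python
def queue_limit_arguments(max_length=None, message_ttl_ms=None):
    """Build optional queue arguments that cap queue length and message age."""
    arguments = {}
    if max_length is not None:
        if max_length < 0:
            raise ValueError("max_length must not be negative")
        # Maximum number of ready messages kept in the queue.
        arguments["x-max-length"] = max_length
    if message_ttl_ms is not None:
        if message_ttl_ms < 0:
            raise ValueError("message_ttl_ms must not be negative")
        # Per-queue message time-to-live, in milliseconds.
        arguments["x-message-ttl"] = message_ttl_ms
    return arguments


limits = queue_limit_arguments(max_length=10_000, message_ttl_ms=60_000)
# With pika (assumed client), on an open channel:
# channel.queue_declare(queue="events", arguments=limits)
print(limits)
```

Note that when a length limit is reached, RabbitMQ by default drops messages from the head of the queue (the oldest ones), so this only suits workloads that can tolerate losing old messages.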

Enable lazy queues to get predictable performance

A feature called lazy queues was added in RabbitMQ 3.6. Lazy queues write messages to disk automatically, which minimizes RAM usage but increases the time it takes a message to pass through the queue.
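
As a rough sketch, a queue can be declared lazy by setting the optional queue argument x-queue-mode to "lazy". The helper name lazy_queue_arguments and the queue name "orders" are hypothetical, and the commented call assumes the Python pika client.

```python
def lazy_queue_arguments(extra=None):
    """Return optional queue arguments that mark a queue as lazy."""
    arguments = dict(extra or {})  # copy so the caller's dict is not modified
    arguments["x-queue-mode"] = "lazy"
    return arguments


args = lazy_queue_arguments()
# With pika (assumed client), on an open channel:
# channel.queue_declare(queue="orders", durable=True, arguments=args)
print(args)
```

The same setting can also be applied through a policy instead of at declaration time, which avoids having to redeclare existing queues.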

Number of queues

Queues are single-threaded in RabbitMQ, and one queue can handle up to about 50 thousand messages per second. You will achieve better throughput on a multi-core system if you have multiple queues and consumers, ideally as many queues as there are cores on the underlying node(s).

Split your queues over different cores

Queue performance is limited to one CPU core. You will, therefore, get better performance if you spread your queues over different cores, and over different nodes if you have a RabbitMQ cluster.
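
One simple way to spread load over several queues is to let the publisher pick a queue from a stable hash of some message key. This is only a sketch of the idea: the function name pick_queue, the "orders" base name and the naming scheme base.index are all assumptions, and the queue count would typically match the number of cores.

```python
import zlib


def pick_queue(routing_key, base_name, queue_count):
    """Map a key to one of `queue_count` queues, stable across processes."""
    if queue_count < 1:
        raise ValueError("queue_count must be at least 1")
    # crc32 is deterministic, unlike Python's built-in hash() for strings.
    index = zlib.crc32(routing_key.encode("utf-8")) % queue_count
    return f"{base_name}.{index}"


# Messages for the same customer always land in the same queue.
for key in ("customer-1", "customer-2", "customer-3"):
    print(key, "->", pick_queue(key, "orders", 4))
```

Because the same key always maps to the same queue, ordering per key is preserved while the total load is divided between the queues.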

Auto-delete queues you are not using

Client connections can fail and leave unused resources (queues) behind, which can affect performance. There are three ways to delete a queue automatically: set an expiry (TTL) on the queue, declare it as auto-delete, or declare it as exclusive.

An auto-delete queue is deleted when its last consumer has canceled or when the channel/connection is closed (or when it has lost the TCP connection with the server).
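
RabbitMQ can clean up unused queues through a queue expiry (the x-expires argument), the auto-delete flag, or the exclusive flag. The sketch below collects those options in one place; the helper name temporary_queue_options is hypothetical, and the keyword names are chosen to match what the Python pika client accepts in queue_declare, which is an assumption about the client in use.

```python
def temporary_queue_options(idle_ms=None, auto_delete=False, exclusive=False):
    """Build declaration options for a queue that is removed automatically."""
    options = {"auto_delete": auto_delete, "exclusive": exclusive, "arguments": {}}
    if idle_ms is not None:
        if idle_ms <= 0:
            raise ValueError("idle_ms must be a positive number of milliseconds")
        # Delete the queue after it has been unused for this long.
        options["arguments"]["x-expires"] = idle_ms
    return options


# A queue that disappears after 30 minutes without use.
options = temporary_queue_options(idle_ms=30 * 60 * 1000)
# With pika (assumed client), on an open channel:
# channel.queue_declare(queue="replies", **options)
print(options)
```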

Payload

Keep in mind that the number of messages per second is a much larger bottleneck than the message size itself. Sending large messages is not good practice, but sending many small messages can be a bad alternative. A better idea is to bundle them into one larger message and let the consumer split it up. If you bundle messages, however, keep in mind that this can affect the processing time.
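
Bundling can be sketched as follows, assuming the individual messages are JSON-serializable. The function names bundle, unbundle and chunked, and the batch size, are illustrative choices rather than part of any client library.

```python
import json


def chunked(messages, batch_size):
    """Yield consecutive slices of at most `batch_size` messages."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(messages), batch_size):
        yield messages[start:start + batch_size]


def bundle(messages):
    """Publisher side: pack several small messages into one message body."""
    return json.dumps(messages, separators=(",", ":")).encode("utf-8")


def unbundle(body):
    """Consumer side: split one message body back into its messages."""
    return json.loads(body.decode("utf-8"))


events = [{"id": i, "type": "click"} for i in range(250)]
bodies = [bundle(batch) for batch in chunked(events, 100)]
print(len(events), "events sent as", len(bodies), "messages")
```

With a batch size of 100, the broker handles 3 messages instead of 250, at the cost of the consumer processing a whole batch per delivery.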

Connections and channels

Each connection uses about 100 KB of RAM (more, if TLS is used). Thousands of connections can be a heavy burden on a RabbitMQ server. In the worst case, the server can crash because it is out of memory. The AMQP protocol has a mechanism called channels that “multiplexes” a single TCP connection. It is recommended that each process only creates one TCP connection, using multiple channels in that connection for different threads. Connections should be long-lived.

Channels can be opened and closed more frequently if required, but they should also be long-lived where possible, e.g. reuse the same channel per publishing thread. Don’t open a channel each time you publish. The best practice is to reuse connections and multiplex a connection between threads with channels: ideally, have only one connection per process and one channel per thread in your application.
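
The "one connection per process, one channel per thread" pattern can be sketched with thread-local storage. FakeConnection below is a hypothetical stand-in that only counts opened channels; with a real client, check its documentation first, since not every client allows a connection object to be used from several threads.

```python
import threading


class FakeConnection:
    """Hypothetical stand-in for a client connection; counts opened channels."""

    def __init__(self):
        self._lock = threading.Lock()
        self.channels_opened = 0

    def channel(self):
        with self._lock:
            self.channels_opened += 1
            return f"channel-{self.channels_opened}"


class ChannelPerThread:
    """Hand each thread its own long-lived channel on a shared connection."""

    def __init__(self, connection):
        self._connection = connection
        self._local = threading.local()

    def get(self):
        channel = getattr(self._local, "channel", None)
        if channel is None:
            # First use in this thread: open a channel once and keep it.
            channel = self._connection.channel()
            self._local.channel = channel
        return channel


def run_demo(thread_count=3, publishes_per_thread=5):
    connection = FakeConnection()
    channels = ChannelPerThread(connection)

    def worker():
        for _ in range(publishes_per_thread):
            channels.get()  # a real worker would publish on this channel

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return connection.channels_opened


print("channels opened:", run_demo())
```

Each thread opens exactly one channel no matter how many times it publishes, so the number of channels equals the number of threads.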

Don’t share channels between threads

Make sure that you don’t share channels between threads, as most clients don’t make channels thread-safe (making them thread-safe would have a serious negative effect on performance).

Don’t open and close connections or channels repeatedly

Make sure that you don’t open and close connections or channels repeatedly. Every new connection or channel costs extra round trips to the server before any message can be sent, so doing this repeatedly increases latency.

Separate connections for publisher and consumer

Separate the connections for publishers and consumers to achieve high throughput. RabbitMQ can apply back pressure on the TCP connection when the publisher is sending too many messages for the server to handle. If you consume on the same TCP connection, the server might not receive the message acknowledgments from the client, thus affecting the consume performance.

Acknowledgements and Confirms

Messages in transit might get lost in the event of a connection failure and need to be retransmitted. Acknowledgments let the server and clients know when to retransmit messages. The client can ack the message either when it receives it or when it has completely processed it. Acknowledgment has a performance impact, so for the fastest possible throughput, manual acks should be disabled.

Publish confirm is the same concept for publishing. The server acks when it has received a message from a publisher. Publish confirm also has a performance impact; keep in mind, however, that it is required if the publisher needs at-least-once processing of messages.

Unacknowledged messages

All unacknowledged messages must reside in RAM on the servers. If you have too many unacknowledged messages, you will run out of memory. An efficient way to limit unacknowledged messages is to limit how many messages your clients prefetch.

Persistent messages and durable queues

If you cannot afford to lose any messages, make sure that your queue is declared as “durable” and that messages are sent with delivery mode “persistent”.

Persistent messages are heavier with regard to performance, as they have to be written to disk. Keep in mind that lazy queues will have the same effect on performance, even though you are sending transient messages. For high performance, the best practice is to use transient messages.
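
The trade-off between safety and speed comes down to two settings: whether the queue is durable and which delivery mode the messages use. In AMQP 0-9-1, delivery mode 2 means persistent and 1 means transient. The helper name delivery_settings and its dictionary keys are hypothetical, and the commented calls assume the Python pika client.

```python
TRANSIENT = 1   # AMQP 0-9-1 delivery mode: message is not persisted
PERSISTENT = 2  # AMQP 0-9-1 delivery mode: message is written to disk


def delivery_settings(survive_restart):
    """Choose queue durability and delivery mode from one requirement."""
    return {
        "queue_durable": survive_restart,
        "delivery_mode": PERSISTENT if survive_restart else TRANSIENT,
    }


safe = delivery_settings(survive_restart=True)
fast = delivery_settings(survive_restart=False)
# With pika (assumed client), on an open channel:
# channel.queue_declare(queue="payments", durable=safe["queue_durable"])
# properties = pika.BasicProperties(delivery_mode=safe["delivery_mode"])
print(safe, fast)
```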

TLS and AMQPS

You can connect to RabbitMQ over AMQPS, which is the AMQP protocol wrapped in TLS. TLS has a performance impact since all traffic has to be encrypted and decrypted. For maximum performance, we recommend using VPC peering instead, as the traffic is then private and isolated without involving the AMQP client/server.

Prefetch

The prefetch value is used to specify how many messages are being sent to the consumer at the same time. It is used to get as much out of your consumers as possible.

The RabbitMQ default prefetch setting gives clients an unlimited buffer, meaning that RabbitMQ by default sends as many messages as it can to any consumer that looks ready to accept them. Sent messages are cached by the RabbitMQ client library (in the consumer) until processed. Prefetch limits how many messages the client can receive before acknowledging a message. All pre-fetched messages are removed from the queue and invisible to other consumers.
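
To make the effect of a prefetch limit concrete, here is a toy model of a single consumer's prefetch window. It is not a real broker or client: the class PrefetchWindow and its methods are invented for illustration, with 0 meaning unlimited as in AMQP. With the Python pika client (an assumption), the real limit would be set with channel.basic_qos(prefetch_count=...).

```python
from collections import deque


class PrefetchWindow:
    """Toy model of one consumer's prefetch window (not a real broker)."""

    def __init__(self, messages, prefetch_count):
        self.queue = deque(messages)          # messages still in the queue
        self.prefetch_count = prefetch_count  # 0 means unlimited
        self.unacked = []                     # delivered but not yet acked

    def _window_open(self):
        return self.prefetch_count == 0 or len(self.unacked) < self.prefetch_count

    def deliver(self):
        """Push messages to the consumer until the window is full."""
        delivered = []
        while self.queue and self._window_open():
            message = self.queue.popleft()
            self.unacked.append(message)
            delivered.append(message)
        return delivered

    def ack(self, message):
        """Acknowledge one message, freeing a slot in the window."""
        self.unacked.remove(message)


window = PrefetchWindow(range(10), prefetch_count=3)
print(window.deliver())  # the first three messages
print(window.deliver())  # nothing more until something is acked
window.ack(0)
print(window.deliver())  # one slot was freed, so one more message arrives
```

With prefetch_count=0 the same model hands all ten messages to the consumer at once, which mirrors the unlimited default described above and leaves nothing in the queue for other consumers.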

HiPE

HiPE will increase server throughput at the cost of increased startup time. When you enable HiPE, RabbitMQ is compiled at startup. According to our benchmark tests, throughput increases by 20 to 80 percent. The drawback of HiPE is that the startup time increases considerably as well, by about 1 to 3 minutes. HiPE is still marked as experimental in RabbitMQ’s documentation.

Don’t enable HiPE if you require high availability.

Number of nodes in your cluster

When you create a CloudAMQP instance with one node, you will get one single node with high performance, because messages don’t need to be mirrored between multiple nodes.

Creating a CloudAMQP instance with two nodes, on the other hand, will get you half the performance compared to the same plan size for a single node.

When you create a CloudAMQP instance with three nodes, you will get one-quarter of the performance compared to the same plan size for a single node.

Disable unused plugins

Some plugins might be great, but they also consume a lot of CPU or may use a high amount of RAM. Because of this, they are not recommended for a production server. Disable plugins that are not in use.

Use updated RabbitMQ client libraries

Make sure that you are using the latest recommended version of client libraries. 

Use the latest stable RabbitMQ and Erlang versions

Stay up-to-date with the latest stable versions of RabbitMQ and Erlang.

RabbitMQ Mistakes

  • Don’t open and close connections or channels repeatedly.
  • Don’t use too many connections or channels.
  • Don’t share channels between threads.
  • Don’t have queues that are too large or too long.
  • Don’t use old RabbitMQ/Erlang versions or RabbitMQ clients/libraries.
  • Don’t have an unlimited prefetch value.
  • Don’t ignore lazy queues.
  • Limit queue size with TTL or max-length, if possible.
  • Use multiple queues and consumers.
  • Use persistent messages and durable queues if a message must survive a server restart.
  • Split your queues over different cores.
  • Consume (push), don’t poll (pull) for messages.
  • Don’t forget to add an HA policy when creating a new vhost on a cluster.