Why Dead Letter Queues Are a Must-Have for Reliable Messaging Systems

A Dead Letter Queue (DLQ) is a special queue that handles messages that cannot be successfully processed or delivered to their intended destination. In this post, we will explore what a Dead Letter Queue is, why it’s important, and the benefits of using one.

Rahul Pulikkot Nath

Have you ever wondered what happens to messages that repeatedly fail to process in your application queues?

Are they just lost forever?

Not if you are using a Dead Letter Queue, also referred to as a DLQ.

In this post, we will explore what a Dead Letter Queue is, why it’s important, and the benefits of using one.

I will use Amazon SQS to explore and demonstrate DLQ, but the core concepts apply to most queuing technologies.

Thanks to AWS for sponsoring this post.

Sample Application Message Handler

For this post, I will use the same message handler I used in the blog post where we explored the AWS Message Processing Framework.

A Step-by-Step Guide to AWS Message Processing with Amazon SQS in .NET
The AWS Message Processing Framework for .NET is an AWS-native toolkit for building .NET applications with messaging services like SQS and EventBridge. Let’s learn how to start using it when creating .NET applications on AWS.

The WeatherForecastAddedEventHandler processes messages coming into Amazon SQS. If the message's Summary contains the word 'Exception', the handler throws an exception.

Otherwise, it processes the message successfully, and the message is removed from the queue.

// Handles WeatherForecastAddedEvent messages received from the SQS queue.
class WeatherForecastAddedEventHandler : IMessageHandler<WeatherForecastAddedEvent>
{
    public Task<MessageProcessStatus> HandleAsync(
        MessageEnvelope<WeatherForecastAddedEvent> messageEnvelope, CancellationToken token = default)
    {
        // Simulate a processing failure for any message whose Summary contains "Exception".
        if (messageEnvelope.Message.Summary.Contains("Exception"))
        {
            Log.Error(
                "Error processing weather forecast added event message {DateTime} and {Temperature}",
                messageEnvelope.Message.DateTime, messageEnvelope.Message.TemperatureC);

            // Throwing means the message is not deleted, so it will be delivered again.
            throw new Exception(messageEnvelope.Message.Summary);
        }

        Log.Information(
            "Processed WeatherForecastAddedEvent with {DateTime} {TemperatureC}",
            messageEnvelope.Message.DateTime, messageEnvelope.Message.TemperatureC);

        // Reporting success marks the message as processed so it can be removed from the queue.
        return Task.FromResult(MessageProcessStatus.Success());
    }
}

What Happens When There Is No Dead Letter Queue?

To understand the use of a Dead Letter Queue, let's first understand what happens when we don't have one set up.

Any time a message with the content 'Exception' is processed, the handler code throws an exception.

Once the message's visibility timeout expires, the message becomes visible in the source SQS queue again, making it available for a consumer to pick up.

This behavior is useful when the error is transient.

💡
A transient error, also known as a transient fault or soft error, is a temporary error in a system or network that resolves itself quickly.

In cases where the error is transient, the handler is likely to succeed on a subsequent attempt, after which the message is removed from the queue.

However, when the error is not transient, the message gets stuck in a loop: it is received, fails, and returns to the queue, over and over. Null references, invalid business scenarios, unhandled code paths, etc., are some common causes of these errors.

Problems of Not Having a Dead Letter Queue

Failed messages become available again in the source SQS queue once the visibility timeout expires.

These messages take up unnecessary time and resources, as they are processed repeatedly only to be put back into the queue.

As the number of failed messages increases, they take up significant processing time and delay other messages coming into the queue from being picked up.

Some of the problems of not having a Dead Letter Queue are:

  1. Processing Bottlenecks
  2. Inefficient Resource Use
  3. Increased Manual Effort
  4. Message Loss

Cleaning up the source SQS queue, investigating error messages, and purging them involves additional developer time and effort.

Depending on the queue and its settings, these failed messages might also be lost completely once the queue's retention period expires.

Advantages of Having a Dead Letter Queue

💡
A Dead Letter Queue (DLQ) is a special queue that handles messages that cannot be successfully processed or delivered to their intended destination.

Messages are sent to the DLQ after exceeding a retry threshold or encountering errors, allowing you to isolate and inspect problematic messages without disrupting the main application flow.

Amazon SQS supports setting up a Dead Letter Queue directly on the source queue and configuring the 'Maximum receives' count along with it.

How you set this up differs based on the queuing technology and where it is hosted.

Setting up a Dead Letter Queue on Amazon SQS
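The same setup can also be done from code. Below is a minimal sketch using the AWS SDK for .NET (the AWSSDK.SQS package), assuming both queues already exist and that credentials and region come from the environment. The queue names and the receive count of 3 are hypothetical values chosen for illustration; they are not from the sample application.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using Amazon.SQS;
using Amazon.SQS.Model;

// Hypothetical queue names, used only for this example.
const string sourceQueueName = "weather-forecast-added";
const string deadLetterQueueName = "weather-forecast-added-dlq";

// Uses the default credentials and region from the environment.
using var sqs = new AmazonSQSClient();

// Look up the URLs of the two existing queues.
var sourceUrl = (await sqs.GetQueueUrlAsync(
    new GetQueueUrlRequest { QueueName = sourceQueueName })).QueueUrl;
var dlqUrl = (await sqs.GetQueueUrlAsync(
    new GetQueueUrlRequest { QueueName = deadLetterQueueName })).QueueUrl;

// The redrive policy refers to the DLQ by its ARN, so fetch that first.
var dlqAttributes = await sqs.GetQueueAttributesAsync(new GetQueueAttributesRequest
{
    QueueUrl = dlqUrl,
    AttributeNames = new List<string> { "QueueArn" }
});
var dlqArn = dlqAttributes.Attributes["QueueArn"];

// 'Maximum receives' in the console corresponds to maxReceiveCount here.
var redrivePolicy = JsonSerializer.Serialize(new
{
    deadLetterTargetArn = dlqArn,
    maxReceiveCount = "3" // hypothetical threshold
});

// Attach the redrive policy to the source queue.
await sqs.SetQueueAttributesAsync(new SetQueueAttributesRequest
{
    QueueUrl = sourceUrl,
    Attributes = new Dictionary<string, string> { ["RedrivePolicy"] = redrivePolicy }
});
```

Note that the policy is set on the source queue, not on the DLQ. The DLQ itself is an ordinary SQS queue that the source queue points to.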

Once a message has been received the configured number of times without being processed successfully, it is automatically moved to the Dead Letter Queue, taking it out of the standard message processing flow.
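To make the mechanics concrete, here is a small in-memory simulation of that behavior. It is only a sketch: it does not call SQS, the queue and DLQ are plain collections, and the limit of 3 receives is an assumed value. The Handle function mimics the sample handler by failing whenever the body contains 'Exception'.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-ins for a source queue and its DLQ; this is not the SQS API.
const int maxReceiveCount = 3; // hypothetical 'Maximum receives' value

var sourceQueue = new Queue<(string Body, int ReceiveCount)>();
var deadLetterQueue = new List<string>();

sourceQueue.Enqueue(("Sunny", 0));
sourceQueue.Enqueue(("Exception", 0)); // always fails, like the sample handler
sourceQueue.Enqueue(("Cloudy", 0));

while (sourceQueue.Count > 0)
{
    var (body, receiveCount) = sourceQueue.Dequeue();

    // Already received the maximum number of times:
    // move it to the DLQ instead of delivering it again.
    if (receiveCount >= maxReceiveCount)
    {
        deadLetterQueue.Add(body);
        Console.WriteLine($"Moved '{body}' to the DLQ after {receiveCount} receives");
        continue;
    }

    receiveCount++;
    try
    {
        Handle(body);
        Console.WriteLine($"Processed '{body}' on receive {receiveCount}");
    }
    catch (Exception)
    {
        // Failure: the message returns to the queue with its receive count increased.
        sourceQueue.Enqueue((body, receiveCount));
    }
}

Console.WriteLine($"Source queue: {sourceQueue.Count}, DLQ: {deadLetterQueue.Count}");

// Mimics the sample handler: fails for any body containing "Exception".
static void Handle(string body)
{
    if (body.Contains("Exception"))
        throw new Exception(body);
}
```

The run finishes with the source queue empty and one message in the DLQ. Without the receive-count check, the 'Exception' message would keep cycling through the loop and the program would never finish.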

With a Dead Letter Queue set up, failed messages are automatically moved out of the source queue, freeing it for new messages. This stops the unnecessary processing and resource usage of handling the same message repeatedly.

Developers only need to monitor the Dead Letter Queue for failed messages and identify their root cause.
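As a rough illustration of what that monitoring could look like, the sketch below checks how many messages are sitting in a DLQ and prints a few of them for inspection, using the AWS SDK for .NET. The queue name is hypothetical, and a production setup would more likely rely on an alarm on the queue depth than on a polling script like this.

```csharp
using System;
using System.Collections.Generic;
using Amazon.SQS;
using Amazon.SQS.Model;

// Hypothetical DLQ name, used only for this example.
const string deadLetterQueueName = "weather-forecast-added-dlq";

using var sqs = new AmazonSQSClient();
var dlqUrl = (await sqs.GetQueueUrlAsync(
    new GetQueueUrlRequest { QueueName = deadLetterQueueName })).QueueUrl;

// ApproximateNumberOfMessages is an estimate of the messages waiting in the queue.
var attributes = await sqs.GetQueueAttributesAsync(new GetQueueAttributesRequest
{
    QueueUrl = dlqUrl,
    AttributeNames = new List<string> { "ApproximateNumberOfMessages" }
});
var approximateCount = attributes.Attributes["ApproximateNumberOfMessages"];
Console.WriteLine($"Messages in DLQ (approximate): {approximateCount}");

// Peek at a few failed messages. They are not deleted here, so they become
// visible in the DLQ again once their visibility timeout expires.
var response = await sqs.ReceiveMessageAsync(new ReceiveMessageRequest
{
    QueueUrl = dlqUrl,
    MaxNumberOfMessages = 10,
    WaitTimeSeconds = 5
});

// Guard against a null list when no messages are returned.
foreach (var message in response.Messages ?? new List<Message>())
{
    Console.WriteLine($"{message.MessageId}: {message.Body}");
}
```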

You can also configure the retention period on these Dead Letter Queues so that failed messages are kept long enough to be investigated, rather than expiring before anyone looks at them.
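On Amazon SQS, the retention period is the MessageRetentionPeriod queue attribute, expressed in seconds. A minimal sketch of raising it to the 14-day maximum on a DLQ, again with a hypothetical queue name and the AWS SDK for .NET:

```csharp
using System;
using System.Collections.Generic;
using Amazon.SQS;
using Amazon.SQS.Model;

// Hypothetical DLQ name, used only for this example.
const string deadLetterQueueName = "weather-forecast-added-dlq";

using var sqs = new AmazonSQSClient();
var dlqUrl = (await sqs.GetQueueUrlAsync(
    new GetQueueUrlRequest { QueueName = deadLetterQueueName })).QueueUrl;

// SQS accepts 60 seconds up to 14 days; the default is 4 days.
var retention = TimeSpan.FromDays(14);

await sqs.SetQueueAttributesAsync(new SetQueueAttributesRequest
{
    QueueUrl = dlqUrl,
    Attributes = new Dictionary<string, string>
    {
        // The attribute value is the retention period in whole seconds.
        ["MessageRetentionPeriod"] = ((int)retention.TotalSeconds).ToString()
    }
});
```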
