AWS Lambda SQS provisioned mode is cheaper than therapy

There is a specific flavor of nausea reserved for serverless engineering teams. It usually strikes at 2 a.m., shortly after a major product launch, when someone posts a triumphant screenshot of user traffic in Slack. While the marketing team is virtually high-fiving, CloudWatch quietly begins to draw a perfect, vertical line that looks less like a growth chart and more like a cliff edge.

Your SQS queues swell. Lambda invocations crawl. Suddenly, the phrase “fully managed service” sounds less comforting and more like a cruel punchline delivered by a distant cloud provider.

For years, the relationship between Amazon SQS and AWS Lambda has been the backbone of event-driven architecture. You wire up an event source mapping, let Lambda poll the queue, and trust the system to scale as messages arrive. Most days, this works beautifully. On the wrong day, under the wrong kind of spike, it works “eventually.”

But in the world of high-frequency trading or flash sales, “eventually” is just a polite synonym for “too late.”

With the release of AWS Lambda SQS Provisioned Mode on November 14, Amazon is finally admitting that sometimes magic is too slow. It grants you explicit control over the invisible workers that poll SQS for your function. It ensures they are already awake, caffeinated, and standing in line before the mob shows up. It allows you to trade a bit of extra planning (and money) for the guarantee that your system won’t hit the snooze button while your backlog turns into a towering monument to failure.

The uncomfortable truth about standard SQS polling

To understand why we need Provisioned Mode, we have to look at the somewhat lazy nature of the standard behavior.

Out of the box, Lambda uses an event source mapping to poll SQS on your behalf. You give it a queue and some basic configuration, and Lambda spins up pollers to check for work. You never see these pollers. They are the ghosts in the machine.

The problem with ghosts is that they are not particularly urgent. When a massive spike hits your queue, Lambda realizes it needs more pollers and more concurrent function invocations. However, it does not do this instantly. It ramps up. It adds capacity in increments, like a cautious driver merging onto a freeway.

For a steady workload, you will never notice this ramp-up. But during a viral marketing campaign or a market crash, those minutes of warming up feel like an eternity. You are essentially watching a barista who refuses to start grinding coffee beans until the line of customers has already curled around the block.

Standard SQS polling gives you tools like batch size, but it gives you no direct influence over how urgently messages are consumed. You cannot tell the system, “I need ten workers ready right now.” You can only stand in line and hope the algorithm notices you are drowning.

This is acceptable for background jobs like resizing images or sending emails. It is decidedly less acceptable for payment processing or fraud detection. In those cases, watching twenty thousand messages pile up while your system “automatically scales” is not an architectural feature. It is a resume-generating event.

Paying for a standing army instead of volunteers

Provisioned Mode flips the script on this reactive behavior. Instead of letting Lambda decide how many pollers to use based purely on demand, you tell it the minimum and maximum number of event pollers you want reserved for that queue.

An event poller is a dedicated worker that reads from SQS and hands batches of messages to your function. In standard mode, these pollers are summoned from a shared pool when needed. In Provisioned Mode, you are paying to keep them on retainer.

Think of it as the difference between calling a ride-share service and hiring a private driver to sit in your driveway with the engine running. One is efficient for the general public; the other is necessary if you need to leave the house in exactly three seconds.

The benefits are stark when translated into human terms.

First, you get speed. AWS advertises significantly faster scaling for SQS event source mappings in Provisioned Mode. We are talking about adding up to one thousand new concurrent invocations per minute.

Second, you get capacity. Provisioned Mode can support massive concurrency per SQS mapping, far higher than the default capabilities.

Third, and perhaps most importantly, you get predictability. A single poller is not just a warm body. It is a unit of throughput (handling up to 1 MB per second or 10 concurrent invokes). By setting a minimum number of pollers, you are mathematically guaranteeing a baseline of throughput. You are no longer hoping the waiters show up; you have paid their salaries in advance.
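Those two per-poller figures (1 MB per second or 10 concurrent invokes) turn capacity planning into arithmetic. A minimal sketch of that math in Python; the workload numbers in the example are hypothetical:

```python
import math

# Advertised per-poller capacity in Provisioned Mode.
POLLER_MBPS = 1.0          # each poller handles up to 1 MB/s
POLLER_CONCURRENCY = 10    # ...or drives up to 10 concurrent invokes

def min_pollers(required_mbps: float, required_concurrency: int) -> int:
    """Smallest poller count that satisfies both throughput dimensions."""
    by_throughput = math.ceil(required_mbps / POLLER_MBPS)
    by_concurrency = math.ceil(required_concurrency / POLLER_CONCURRENCY)
    return max(by_throughput, by_concurrency, 1)

# Hypothetical workload: 6 MB/s of messages, 80 concurrent invokes needed.
print(min_pollers(6.0, 80))  # throughput needs 6, concurrency needs 8 -> 8
```

Whichever dimension bites first wins, which is why the function takes the max of the two.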

Configuring this without losing your mind

The good news is that Provisioned Mode is not a new service with its own terrifying learning curve. It is just a configuration toggle on the event source mapping you are already using. You can set it up in the AWS Console, the CLI, or your Infrastructure as Code tool of choice.
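On the CLI, enabling it is a single update to the event source mapping you already have. A sketch of what that call looks like; the `--provisioned-poller-config` flag with `MinimumPollers`/`MaximumPollers` keys follows the shape AWS uses for provisioned pollers on other event sources, so verify the exact names against the current CLI reference before relying on them, and the UUID is a placeholder:

```shell
# Hypothetical example: enable Provisioned Mode on an existing
# SQS event source mapping (the UUID below is a placeholder).
aws lambda update-event-source-mapping \
  --uuid "a1b2c3d4-5678-90ab-cdef-EXAMPLE11111" \
  --provisioned-poller-config 'MinimumPollers=5,MaximumPollers=100'
```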

The interface asks for two numbers, and this is where the engineering art form comes in.

First, it asks for Minimum Pollers. This is the number of workers you always want ready.

Second, it asks for Maximum Pollers. This is the ceiling, the limit you set to ensure you do not accidentally DDoS your own database.

Choosing these numbers feels a bit like gambling, but there is a logic to it. For the minimum, pick a number that comfortably handles your typical traffic plus a standard spike. Start small. Setting this to 100 when you usually need 2 is the serverless equivalent of buying a school bus to commute to work alone.

For the maximum, look at your downstream systems. There is no point in setting a maximum that allows 5,000 concurrent Lambda functions if your relational database curls into a fetal position at 500 connections.
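That downstream ceiling is also just arithmetic. A hedged sketch, assuming each poller can drive up to 10 concurrent invokes and each invocation holds one database connection; the connection budget and headroom figures are hypothetical:

```python
# Cap the poller maximum so Lambda cannot exhaust the database.
INVOKES_PER_POLLER = 10   # up to 10 concurrent invokes per poller

def max_pollers(db_connection_limit: int, connections_per_invoke: int = 1,
                headroom: float = 0.8) -> int:
    """Largest poller count that keeps DB connections under budget."""
    budget = int(db_connection_limit * headroom)  # leave room for other clients
    max_invokes = budget // connections_per_invoke
    return max(max_invokes // INVOKES_PER_POLLER, 1)

# A database that curls up at 500 connections supports at most:
print(max_pollers(500))  # 400 usable connections -> 40 pollers
```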

Once you enable it, you need to watch your metrics. Keep an eye on queue depth (ApproximateNumberOfMessagesVisible) and the age of the oldest message (ApproximateAgeOfOldestMessage) in CloudWatch. If the backlog clears too slowly, buy more pollers. If your database administrator starts sending you angry emails in all caps, reduce the maximum. The goal is not perfection on day one; it is to replace guesswork with a feedback loop.

The financial hangover

Nothing in life is free, and this applies doubly to AWS features that solve headaches.

When you enable Provisioned Mode, AWS begins charging you for “Event Poller Units.” You pay for the minimum pollers you configure, regardless of whether there are messages in the queue. You are paying for readiness.

This is a mental shift for serverless purists. The whole promise of serverless was “pay for what you use.” Provisioned Mode is “pay for what you might need.”

You are essentially renting a standing army. Most of the time, they will just stand there, playing cards and eating your budget. But when the enemy (traffic) attacks, they are already in position. Standard SQS polling is cheaper because it relies on volunteers. Volunteers are free, but they take a while to put on their boots.
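The standing army’s bill is easy to estimate once you know your region’s Event Poller Unit rate. A sketch of the always-on floor cost; the rate below is a made-up placeholder, not AWS’s actual price, so substitute the real figure from the Lambda pricing page:

```python
HOURS_PER_MONTH = 730  # average hours in a month

def monthly_floor_cost(min_pollers: int, rate_per_poller_hour: float) -> float:
    """Cost of the minimum pollers you pay for even when the queue is empty."""
    return min_pollers * HOURS_PER_MONTH * rate_per_poller_hour

# Hypothetical rate of $0.05 per poller-hour, 5 pollers on retainer:
print(round(monthly_floor_cost(5, 0.05), 2))  # 182.5
```

Note that this is only the floor; scaling above the minimum during spikes adds to it.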

From a FinOps perspective, or simply from the perspective of explaining the bill to your boss, the question is not “Is this expensive?” The question is “What is the cost of latency?”

For a background report generator, a five-minute delay costs nothing. For a high-frequency trading platform, a five-second delay costs everything. You should not enable Provisioned Mode on every queue in your account. That would be financial malpractice. You reserve it for the critical paths, the workflows where the price of slowness is measured in lost customers rather than just infrastructure dollars.

Why you should care about the fourth dial

Architecturally, Provisioned Mode gives us a new layer of control. Previously, we had three main dials in event-driven systems: how fast we write to the queue, how fast the consumers process messages, and how much concurrency Lambda is allowed.

Provisioned Mode adds a fourth dial: the aggression of the retrieval.

It allows you to reason about your system deterministically. If you know that one poller provides X amount of throughput, you can stack them to meet a specific Service Level Agreement. It turns a “best effort” system into a “calculated guarantee” system.

Serverless was sold to us as freedom from capacity planning. We were told we could just write code and let the cloud handle the undignified details of scaling. For many workloads, that promise holds true.

But as your workloads become more critical, you discover the uncomfortable corners where “just let it scale” is not enough. Latency budgets shrink. Compliance rules tighten. Customers grow less patient.

AWS Lambda SQS Provisioned Mode is a small, targeted answer to that discomfort. It allows you to say, “I want at least this much readiness,” and have the platform respect that wish, even when your traffic behaves like a toddler on a sugar high.

So, pick your most critical queue. The one that keeps you awake at night. Enable Provisioned Mode, set a modest minimum, and watch the metrics. Your future self, staring at a flat latency graph during the next Black Friday, will be grateful you decided to stop trusting in magic and started paying for physics.

Choosing your message queue: AWS SQS or GCP Pub/Sub

In the world of modern software, applications are rarely monolithic islands. Instead, they are bustling cities of interconnected services, each performing a specific job. For this city to function smoothly, its inhabitants (microservices, functions, and components) need a reliable way to communicate without being directly tethered to one another. This is where message brokers come in, acting as the city’s postal service, ensuring that messages are delivered efficiently and reliably.

Two of the most prominent cloud-based postal services are Amazon Web Services’ Simple Queue Service (SQS) and Google Cloud’s Pub/Sub. Both are exceptional at what they do, but they operate on different philosophies. Understanding their unique characteristics is crucial for any cloud architect or DevOps engineer aiming to build robust, scalable, and event-driven systems. This guide will explore their differences to help you choose the right service for your application’s needs.

A quick look at our contenders

Before we examine the details, let’s get a general feel for each service.

AWS SQS is the seasoned veteran of message queuing. Think of it as a highly organized system of mailboxes. A service writes a letter (a message) and places it into a specific mailbox (a queue). The recipient service then comes to that mailbox and picks up its mail when it has the capacity to process it. It’s a straightforward, incredibly reliable system that has been battle-tested for years.

GCP Pub/Sub operates more like a global newspaper subscription. A publisher (your service) doesn’t send a message to a specific recipient. Instead, it publishes a message to a “topic,” like a news flash for the “user-signup” channel. Any service that has subscribed to that topic instantly receives a copy of the message. It’s designed for broad, real-time distribution of information on a global scale.

The delivery dilemma: Push versus Pull

The most fundamental difference between SQS and Pub/Sub lies in how messages are delivered. This is often referred to as the “push vs. pull” model.

The pull model, which is SQS’s native approach, is like checking your P.O. box. The consumer application is responsible for periodically asking the queue, “Is there any mail for me?” This gives the consumer complete control over the rate of consumption. If it’s overwhelmed with work, it can slow down its requests or stop asking for new messages altogether. This is ideal for batch processing or any workload where you need to manage the processing pace carefully.

The push model, where Pub/Sub shines, is akin to home mail delivery. When a message is published, Pub/Sub actively “pushes” it to all subscribed endpoints, such as a serverless function or a webhook. The recipient doesn’t have to ask; the message just arrives. This is incredibly efficient for real-time notifications and event-driven workflows where immediate reaction is key. While Pub/Sub also supports a pull model, its architecture is optimized for push-based delivery.
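The difference is easiest to see in code. A toy simulation (plain Python, no AWS or GCP client involved) in which the pull consumer decides when and how much to take, while push delivery hands messages to subscribers the moment they arrive:

```python
from collections import deque

class PullQueue:
    """SQS-style: messages wait until a consumer asks for them."""
    def __init__(self):
        self._messages = deque()

    def send(self, msg):
        self._messages.append(msg)

    def receive(self, max_messages=10):
        """The consumer controls the pace: take up to a batch, or nothing."""
        batch = []
        while self._messages and len(batch) < max_messages:
            batch.append(self._messages.popleft())
        return batch

class PushTopic:
    """Pub/Sub-style: every subscriber gets each message immediately."""
    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, msg):
        for callback in self._subscribers:
            callback(msg)  # delivery happens now, ready or not

queue = PullQueue()
for i in range(5):
    queue.send(f"job-{i}")
print(queue.receive(max_messages=3))  # consumer takes only what it can handle
```

An overwhelmed pull consumer simply stops calling `receive`; an overwhelmed push subscriber has no such lever.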

Comparing key features

Let’s break down how these two services stack up in a few critical areas.

Message ordering

Sometimes, the sequence of events is just as important as the events themselves. For these cases, AWS SQS offers a specific FIFO (First-In, First-Out) queue type. This works exactly like a single-file line at a bank; the first person to get in line is the first one to be served. It provides a strict guarantee that messages will be processed in the exact order they were sent, which is critical for tasks like processing financial transactions or application logs.

GCP Pub/Sub, in contrast, does not have a dedicated FIFO queue type. Instead, it achieves partial ordering through the use of ordering keys. You can assign a key to messages (for example, a userId), and Pub/Sub will ensure that all messages with that specific key are delivered in order. However, it doesn’t guarantee order between different keys. To reuse the analogy, it’s less like a single line and more like a deli with separate ticket numbers for the butcher and the bakery. It keeps orders straight within each department, but not across the entire store.
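The deli analogy can be made concrete. A toy model of ordering keys in plain Python (not the Pub/Sub client library): delivery is free to interleave keys, but within any single key the original order survives:

```python
from collections import defaultdict
from itertools import zip_longest

def deliver_with_ordering_keys(messages):
    """Simulate partial ordering: interleave across keys, FIFO within a key."""
    per_key = defaultdict(list)
    for key, body in messages:
        per_key[key].append((key, body))
    delivered = []
    # Round-robin across keys: global order is NOT preserved...
    for batch in zip_longest(*per_key.values()):
        delivered.extend(m for m in batch if m is not None)
    return delivered

published = [("alice", "login"), ("alice", "purchase"), ("bob", "login"),
             ("alice", "logout"), ("bob", "logout")]
delivered = deliver_with_ordering_keys(published)
# ...but per-key order is intact:
for key in ("alice", "bob"):
    assert [b for k, b in delivered if k == key] == \
           [b for k, b in published if k == key]
```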

Scale and reach

This is where their architectural differences become clear. SQS is a regional service. It’s incredibly scalable and resilient, but its scope is confined to a single AWS region.

Pub/Sub is inherently global. You publish a message once, and it can be delivered to subscribers in any region around the world with low latency. If your application has a global user base and you need to propagate events worldwide, Pub/Sub has a distinct advantage.

Message size and retention

Think of SQS as being for postcards and letters. It supports messages up to 256 KB. It can hold onto these messages for up to 14 days, giving your consumers plenty of time to process them.

Pub/Sub, on the other hand, can handle larger packages, with a maximum message size of 10 MB. However, its standard retention period is shorter, at 7 days.

Special delivery options

SQS has a native feature called Delay Queues. This allows you to postpone the delivery of a new message for up to 15 minutes. It’s like writing a post-dated check; the message sits in the queue but is invisible to consumers until the timer expires. This is useful for scheduling tasks without a complex scheduling service. Pub/Sub does not offer a similar built-in feature.
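The post-dated check behaves like this. A toy delay queue in pure Python with an injected clock (so nothing actually sleeps); the only real SQS fact baked in is the 15-minute (900-second) cap on the delay:

```python
MAX_DELAY_SECONDS = 900  # SQS delay queues cap out at 15 minutes

class DelayQueue:
    """Messages stay invisible to consumers until their delay expires."""
    def __init__(self):
        self._messages = []  # list of (visible_at, body)

    def send(self, body, delay_seconds, now):
        delay = min(delay_seconds, MAX_DELAY_SECONDS)
        self._messages.append((now + delay, body))

    def receive(self, now):
        """Return only the messages whose delay has expired."""
        visible = [body for at, body in self._messages if at <= now]
        self._messages = [(at, body) for at, body in self._messages if at > now]
        return visible

q = DelayQueue()
q.send("retry-payment", delay_seconds=300, now=0)
print(q.receive(now=100))  # [] - still invisible
print(q.receive(now=300))  # ['retry-payment'] - timer expired
```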

When to choose AWS SQS

SQS is your go-to choice when you need a dependable, orderly mailroom for your application. It excels in scenarios where:

  • Strict ordering is non-negotiable. For task sequencing or financial ledgers, SQS FIFO is the gold standard.
  • You need to control the pace of consumption. The pull model is perfect for decoupling a fast producer from a slower consumer or for batch processing jobs.
  • Task scheduling is required. The native delay queue feature is a simple yet powerful tool.
  • Your application’s architecture is primarily contained within a single AWS region.

When to choose GCP Pub/Sub

Pub/Sub is the right tool when you’re building a global broadcasting system or a highly reactive, event-driven platform. Consider it when:

  • You need to fan-out messages to many consumers. Pub/Sub’s topic-and-subscription model is designed for this.
  • Global distribution with low latency is a priority. Its global nature is a massive benefit for distributed systems.
  • You are sending large messages. The 10 MB limit offers much more flexibility than SQS.
  • A push-based model fits your architecture. It integrates seamlessly with serverless functions for instant, event-triggered execution.

A final word

So, after all this technical deliberation, which digital courier should you entrust with your precious data packets? The one that meticulously forms a single, orderly queue, or the one that shouts your message through a global megaphone to anyone who will listen?

The truth is, there’s no single “best” service. There’s only the one whose particular brand of crazy best matches your application’s personality. Is your app a stickler for the rules, demanding every event be processed in perfect sequence, lest it have a digital panic attack? Then the quiet, predictable, and slightly obsessive SQS is your soulmate. Or is your app more of a drama queen, needing to announce every minor update to the entire world, immediately? Then the boisterous, globe-trotting Pub/Sub is probably already sliding into your DMs.

Ultimately, the best way to choose is to put them to the test. Think of it as a job interview. Give them both a trial run with your actual workload and see which one handles the pressure with more grace, or at least breaks in a more interesting, less catastrophic way. Go on, do it for science. And for the future sanity of your on-call engineer.

Let’s Party: Understanding Serverless Architecture on AWS

Imagine you’re throwing a big party, but instead of doing all the work yourself, you have a team of helpers who each specialize in different tasks. That’s what we’re doing with serverless architecture on AWS: we’re organizing a digital party where each AWS service is like a specialized helper.

Let’s start with AWS Lambda. Think of Lambda as your multitasking friend who’s always ready to help. Lambda springs into action whenever something happens, like a guest arriving (an API request) or someone bringing a dish (uploading a file). It doesn’t need to be told what to do beforehand; it just responds when needed. This is great because you don’t have to keep this friend around all the time; they show up only when there’s work to be done.

Now, let’s talk about API Gateway. This is like your doorman. It greets your guests (user requests), checks their invitations (authenticates them), and directs them to the right place in your party (routes the requests). It works closely with Lambda to ensure every guest gets the right experience.

For storing information, we have DynamoDB. Imagine this as a super-efficient filing cabinet that can hold and retrieve any piece of information instantly, no matter how many guests are at your party. It doesn’t matter if you have 10 guests or 10,000; this filing cabinet works just as fast.

Then there’s S3, which is like a magical closet. You can store anything in it: coats, party supplies, even leftover food, and it never runs out of space. Plus, it can alert Lambda whenever something new is put inside, so you can react to new items immediately.

For communication, we use SNS and SQS. Think of SNS as a loudspeaker system that can make announcements to everyone at once. SQS, on the other hand, is more like a ticket system at a delicatessen counter. It makes sure tasks are handled in an orderly fashion, even if a lot of requests come in at once.

Lastly, we have Step Functions. This is like your party planner who knows the sequence of events and makes sure everything happens in the right order. If something goes wrong, like the cake not arriving on time, the planner knows how to adjust and keep the party going.

Now, let’s see how all these helpers work together to throw an amazing party, or in our case, build a photo-sharing app:

  1. When a guest (user) wants to share a photo, they hand it to the doorman (API Gateway).
  2. The doorman calls over the multitasking friend (Lambda) to handle the photo.
  3. This friend puts the photo in the magical closet (S3).
  4. As soon as the photo is in the closet, S3 alerts another multitasking friend (Lambda) to create smaller versions of the photo (thumbnails).
  5. But what if lots of guests are sharing photos at once? That’s where our ticket system (SQS) comes in. It gives each photo a ticket and puts them in an orderly line.
  6. Our multitasking friends (Lambda functions) take photos from this line one by one, making sure no photo is left unprocessed, even during a photo-sharing frenzy.
  7. Information about each processed photo is written down and filed in the super-efficient cabinet (DynamoDB).
  8. The loudspeaker (SNS) announces to interested parties that a new photo has arrived.
  9. If there’s more to be done with the photo, like adding filters, the party planner (Step Functions) coordinates these additional steps.
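To make the party slightly less metaphorical, here is a toy, in-memory version of steps 1 through 8, with plain dicts and lists standing in for S3, SQS, DynamoDB, and SNS (no AWS SDK involved; the function names are invented for the sketch):

```python
from collections import deque

# In-memory stand-ins for the AWS services.
s3 = {}                  # the magical closet
sqs = deque()            # the ticket line
dynamodb = {}            # the filing cabinet
sns_announcements = []   # the loudspeaker

def upload_handler(photo_id, photo_bytes):
    """Steps 1-5: the doorman hands the photo to Lambda, which stores and enqueues it."""
    s3[photo_id] = photo_bytes
    sqs.append(photo_id)  # the S3 event becomes a ticket in the line

def thumbnail_worker():
    """Steps 6-8: drain the line, make thumbnails, record and announce each one."""
    while sqs:
        photo_id = sqs.popleft()
        s3[f"thumb-{photo_id}"] = s3[photo_id][:10]  # pretend resize
        dynamodb[photo_id] = {"status": "processed"}
        sns_announcements.append(f"new photo: {photo_id}")

for i in range(3):
    upload_handler(f"photo-{i}", b"x" * 100)
thumbnail_worker()
print(len(dynamodb), len(sns_announcements))  # 3 3
```

The decoupling is the point: `upload_handler` finishes as soon as the ticket is in the line, and the worker drains the line at its own pace.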

The beauty of this setup is that each helper does their job independently. If suddenly 100 guests arrive at once, you don’t need to panic and hire more help. Your existing team of AWS services can handle it, expanding their capacity as needed.

This serverless approach means you’re not paying for helpers to stand around when there’s no work to do. You only pay for the actual work done, making it very cost-effective. Plus, you don’t have to worry about managing these helpers or their equipment; AWS takes care of all that for you.

In essence, serverless architecture on AWS is about having a smart, flexible, and efficient team that can handle any party, big or small, without needing to micromanage. It lets you focus on making your app amazing, while AWS ensures everything runs smoothly behind the scenes.

In conclusion, understanding how to integrate AWS services is crucial for building effective serverless architectures. By leveraging the strengths of Lambda, API Gateway, DynamoDB, S3, SNS, SQS, and Step Functions, you can create robust applications that meet your business needs with minimal operational overhead. And just like that, you can enjoy the party with your guests, knowing everything is running smoothly in the background! 🥳🎉