June 2026

The sandbox finally gets its own apartment

There is a particular kind of optimism that appears in architecture diagrams. It is usually represented by a neat rectangle labelled “sandbox”, placed between “user code” and “production systems”, as if malicious input could be discouraged by tidy geometry.

For years, we have treated this rectangle with a level of trust usually reserved for elevators, bridges, and the one person in the meeting who says, “This should be simple.” Inside the box went user scripts, notebook cells, plugin code, CI jobs, vulnerability scanners, browser IDE sessions and, more recently, AI agents with the confidence of a junior developer and the impulse control of a caffeinated raccoon.

The problem was never running code. Computers are embarrassingly eager to do that. The problem was trusting the code not to bite the furniture.

AWS Lambda MicroVMs are interesting because they arrive at the exact moment when everyone is trying to run more untrusted code, more often, with fewer humans staring nervously at terminal output. They are not just another Lambda feature wearing a party hat. They are AWS trying to give each user, job, or agentic session its own small, isolated room, complete with VM-level walls, preserved state, and a serverless landlord who handles the plumbing.

That is worth paying attention to.

The old sandbox had suspicious drywall

Containers made modern platform engineering possible. They package applications beautifully, start quickly and behave well enough that we have built entire civilizations on top of them, some of which even survive Monday mornings.

But containers share the host kernel. That is not a moral failure. It is simply the bargain. You get speed and density, but the isolation boundary is not the same as a virtual machine boundary. For normal workloads, this is often a perfectly reasonable trade. For hostile or unknown code, it becomes a slightly more awkward dinner party.

Running untrusted code in containers usually requires extra hardening. Seccomp profiles, AppArmor or SELinux policies, rootless runtimes, careful filesystem restrictions, capability dropping, network controls, resource limits, image hygiene, and a collection of internal documents with titles like “please do not run customer code as root again”.

This does not mean containers are bad. It means they have spent years being asked to babysit workloads they would probably rather not meet in a dark data center.

Virtual machines have always had a stronger answer. Each workload gets a separate kernel and a more substantial isolation boundary. Security teams like that sort of thing. They sleep better when the boundary is not made entirely of namespace optimism.

The problem is that traditional VMs are heavier. They take longer to boot, require more lifecycle management and make interactive platforms feel like they are starting a small moon every time someone clicks “run”. If you are building a browser IDE, a notebook platform, or an AI coding assistant, waiting minutes for an execution environment is not a product experience. It is a hostage situation with a spinner.

So teams built their own sandboxes. Some used hardened containers. Some orchestrated pools of VMs. Some built elaborate custom schedulers. Many discovered that safely running other people’s code is less a feature and more a lifestyle choice involving pager alerts.

Firecracker was already doing the quiet work

The interesting part is that AWS already had a lightweight virtualization story hiding in plain sight. Firecracker has powered Lambda and Fargate-style isolation patterns for years. It was designed to provide lightweight MicroVMs with fast startup and strong boundaries, without dragging the full emotional luggage of traditional VM management into every request.

Lambda MicroVMs take that idea and expose it as a managed primitive for a very specific class of applications. These are not ordinary Lambda functions that got promoted. They are separate resources with a distinct lifecycle and a different purpose.

Regular Lambda functions are still excellent for event-driven work. A request arrives, a function runs, a result returns, and the function disappears back into the mist like a responsible woodland creature. That model works beautifully for stateless tasks.

Lambda MicroVMs are for the messier cases. A user opens an interactive coding environment. An AI agent needs to inspect files, run code, call tools, and maintain context. A data analytics session installs packages, loads working files, and waits while the human operator goes to lunch, a meeting, or an unexpected argument with Kubernetes.

These sessions need state. They need isolation. They need to start quickly. They need to pause without losing their belongings.

That combination used to be annoying. Now it has a first-class shape.

Lambda functions answer the doorbell, MicroVMs let the guest stay

The lifecycle is the real story.

You create a MicroVM image from a Dockerfile and a code artifact. AWS initializes the environment and captures a Firecracker snapshot of disk and memory. Later, when you launch a MicroVM from that image, it starts from the pre-initialized snapshot instead of beginning from scratch.

This is the difference between cooking dinner from raw ingredients and reheating yesterday’s lasagna. One is noble. The other is faster and, under the correct emotional conditions, superior.

Snapshots matter because many interactive environments spend a surprising amount of time getting ready to do the thing the user actually asked for. Installing packages, loading runtimes, preparing tools, arranging directories, initializing services, warming models, importing dependencies, and opening files. All of this can become part of the prepared environment.

For AI agents, this is especially useful. The environment can already contain the tools, SDKs, interpreters, and supporting processes the agent needs. When a session starts or resumes, it does not have to assemble itself from a cold and resentful universe.

There is a catch, because distributed systems are legally required to have at least one. Snapshotting state means engineers must think carefully about what is safe to preserve and clone. Unique identifiers, network connections, ephemeral credentials, time-sensitive tokens, and initialization-time randomness may all need special handling.

A snapshot is not just a performance trick. It is a memory of a machine. And memories, as anyone who has debugged a stale cache knows, can lie with great confidence.

Eight hours change the serverless conversation

The 8-hour runtime limit is one of the details that make this more than a faster function wrapper.

Traditional serverless thinking has usually pushed us toward short, stateless execution. Do a job, return a result, vanish. This is clean, scalable, and architecturally pleasing, which means real users immediately invent workflows that do not fit it.

Interactive sessions are not short. Vulnerability scans can run for a while. Data analysis can wander. AI agents can start with “inspect this repository” and somehow end up creating four temporary files, two package installs, and a philosophical dispute with a linter.

Lambda MicroVMs support longer-running sessions while preserving memory, disk, and process state. During idle periods, the environment can be suspended and later resumed. That matters because many workloads are not constantly busy. Humans pause. Agents wait. Browsers sleep. Developers get pulled into meetings where someone says “alignment” until the will to compute leaves the room.

Idle suspension changes the economics and the experience. The user can return to a session that still has its working state, but the platform does not have to pay as if the machine were actively sweating the entire time.

This is still serverless, but not the old cartoon version where everything is tiny, stateless, and over before the coffee cools. It is serverless with a memory.

That sentence would have terrified architects ten years ago. Now it may be exactly what agent platforms need.

AI agents need walls, not vibes

Agentic AI has made sandboxing fashionable in the same way kitchen fires make extinguishers fashionable.

An AI agent can generate code, run commands, inspect files, call APIs, and modify state. Sometimes it does this brilliantly. Sometimes it does this like a raccoon found a shell prompt and decided to modernize procurement.

The security model for agents must assume mistakes. It must assume prompt injection. It must assume unexpected tool use. It must assume that the agent is not evil, just helpful in the way a toddler with admin privileges might be helpful.

This is where per-user or per-session MicroVM isolation becomes important. If each agent run gets its own dedicated execution environment, the blast radius becomes easier to reason about. One confused agent should not be able to interfere with another user’s session. One malicious script should not get a friendly wave from a shared kernel and an invitation to explore the neighborhood.

But this is not magic safety dust. VM-level isolation is a stronger primitive, not a complete architecture.

You still need tight IAM roles. You still need network egress controls. You still need to decide whether the MicroVM can reach the internet, internal APIs, metadata services, package repositories or that one ancient endpoint nobody admits owns the business process. You still need secrets management, audit trails, logging, quotas, timeouts and cleanup.

The agent may now have its own padded serverless studio. You still need to check what tools you left in the drawers.

The security story is better, but not finished

The most dangerous sentence in cloud security is “the service handles that”.

Sometimes the service does handle that. Sometimes it handles one very important layer, and everything around it remains your problem, wearing a fake moustache.

Lambda MicroVMs improve the isolation layer. A dedicated MicroVM with no shared kernel between users is a meaningful architectural improvement for untrusted execution. It gives platform teams a stronger foundation than a basic container sandbox. It also reduces the amount of custom virtualization machinery teams need to build themselves.

But the surrounding design still decides whether the result is secure or just expensive theater.

Before production, architects should ask some unfashionable questions.

What can the MicroVM reach on the network?

What credentials exist inside the environment?

Can the workload call internal APIs?

Can it download arbitrary packages?

How is user code authenticated before execution?

How are sessions terminated?

What happens to files after suspension?

What gets written to logs?

How are snapshots built, versioned, and retired?

How are malicious or runaway workloads detected?

What prevents one customer from turning your lovely sandbox platform into a cryptocurrency-themed space heater?

These questions are not reasons to avoid the service. These are the reasons to use it properly. The stronger the primitive, the more tempting it is to stop thinking. Resist this. Architecture diagrams are very good at looking innocent.

The bill is also part of the architecture

Serverless has trained us to associate managed services with financial elegance. Sometimes this is true. Sometimes the invoice arrives carrying a clipboard and asking to speak with the adult who designed the retry policy.

Lambda MicroVMs have a cost model that deserves actual modeling. You pay for baseline resources while a MicroVM is running, and for additional active usage when the workload exceeds that baseline. That is reasonable, but it means usage patterns matter.

A short-lived scan that starts, does work, and terminates has one shape. An interactive coding session that runs, idles, suspends, resumes, and idles again has another. A swarm of AI agents that each receive their own environment has a third shape, one that may require finance to breathe into a paper bag if nobody adds quotas.

The point is not that this will be expensive. The point is that “serverless” does not automatically mean “cheap”. It means the operational model changes. Billing still obeys physics, usage, and the quiet vengeance of concurrency.

If you are building a platform on top of Lambda MicroVMs, model the boring details. Average session duration. Idle time. Resume frequency. Memory size. CPU requirements. Disk usage. Regional availability. Expected concurrency. Abuse scenarios. Free-tier fantasies, if your product team has recently been exposed to those.

Cloud bills are where optimistic architecture diagrams go to receive adult supervision.

Platform engineering gets a new building block

For platform teams, the most promising part of Lambda MicroVMs is not that they make one workload easier. It is that they could simplify a recurring pattern.

Many organizations need a safe way to run untrusted or semi-trusted code. Internal developer platforms need execution sandboxes. AI products need agent workspaces. Security platforms need isolated scanning environments. Data platforms need notebook-like sessions. Education platforms need per-student runtimes. CI systems need disposable workers.

Historically, building this well meant assembling a small private cloud inside your cloud. VM pools, schedulers, cleanup jobs, image builders, network isolation, capacity management, logging pipelines, and emergency scripts named things like final_cleanup_v7_really_final.sh.

Lambda MicroVMs do not remove all platform work. They move the boundary.

You still need golden images. You still need lifecycle policies. You still need guardrails. You still need quotas and tenant mapping. You still need observability. You still need deployment pipelines. You still need to decide how MicroVMs integrate with Lambda functions, S3, CloudWatch, IAM, and whatever control plane runs your product.

But you may not need to own as much of the virtualization substrate yourself. That is not glamorous. It is better. Glamour is what vendors put in keynote slides before someone asks about day two operations.

A sensible pattern might use regular Lambda functions for the event-driven control plane, APIs, and orchestration logic. Lambda MicroVMs handle the isolated execution path, where user code or AI-generated code actually runs. S3 stores build artifacts. CloudWatch captures logs. IAM defines what each environment can do. Network controls define what each environment can reach.

That division feels clean. Not simple, because nothing involving untrusted code is simple. But clean enough to be useful.

The uncomfortable production checklist

Before adopting Lambda MicroVMs, it is worth being deliberately boring. Boring is underrated in architecture. Boring systems sleep through the night.

Ask which workloads actually need VM-level isolation. Some do. Some are perfectly fine as regular Lambda functions, ECS tasks, or containers in Kubernetes. Using a stronger primitive everywhere can become expensive and operationally awkward.

Ask how long sessions really run. Not how long the demo runs. Not how long the product manager hopes they run. How long do users actually keep them alive while opening seven browser tabs and going to lunch?

Ask what state must survive suspension. Then ask what state must never survive suspension. Secrets, temporary credentials, one-time tokens, and sensitive intermediate files need careful treatment.

Ask how images are built and patched. A MicroVM image is still software. It needs supply chain controls, vulnerability scanning, versioning, and retirement. Yesterday’s golden image can become tomorrow’s antique security incident.

Ask what happens when a workload wants more. More CPU. More memory. More disk. GPU acceleration. x86 compatibility. A different region. A longer session. Some requests will fit the service. Some will not. That is not a failure. That is architecture doing its job before production does it for you with alarms.

Ask how abuse is stopped. Per-user quotas, concurrency limits, session expiration, network restrictions, and cost controls should be designed early. Waiting until after the first runaway agent is a traditional learning method, but not a relaxing one.

A better sandbox is not a finished safety plan

Lambda MicroVMs arrive at a useful moment. The industry is asking applications to execute more code that the application developer did not write. Some of that code comes from users. Some comes from AI systems. Some comes from a collaboration between both, which is how you get an incident report with a surprisingly emotional timeline.

The old choices were uncomfortable. Containers were fast but needed serious hardening for hostile workloads. Traditional VMs were safer but heavy. Regular serverless functions were elegant but not designed for long, stateful, interactive sessions.

Lambda MicroVMs offer a new middle ground. VM-level isolation. Fast launch and resume from snapshots. Stateful sessions. Idle suspension. Serverless lifecycle management. A dedicated environment per user, job, or agentic session.

That combination will not remove the need for security architecture. It will not absolve anyone from IAM, networking, secrets, observability, patching, or cost controls. It will not make AI agents safe by virtue of being placed inside a branded rectangle.

But it gives the sandbox real walls.

And that matters. Because the sandbox did not become safer because someone wrote “AI” on the diagram. It became safer because the walls got real.

Why multi cloud architecture is overrated

Somewhere along the way, multi cloud stopped being an architecture decision and became a personality trait. It shows up on slide decks the way kale shows up at weddings, nobody is quite sure who asked for it, but removing it now would feel like a confession. Say the words out loud in a meeting and watch the room nod with the solemn confidence of people who have read the headline but not the footnotes.

The pitch is seductive. Spread your workloads across two or three providers and you get freedom, resilience, and a vague aura of technical adulthood. What you actually get, in a surprising number of cases, is a second pager rotation and a networking bill that reads like a ransom note.

This is not an argument that multi cloud is always wrong. It is an argument that it is rarely the default answer to a question nobody bothered to ask. A lot of cloud decisions get made the way people buy treadmills, for the version of themselves they intend to become, not the one currently standing in the room.

What the multi cloud pitch is really selling

Strip away the architecture diagrams, and the multi cloud sales pitch is mostly emotional. It promises escape from vendor lock in, that primal fear of being trapped by a provider who knows it. It offers the warm feeling of resilience, the sense that if one cloud falls over, your business will calmly stroll to the next one. And it projects maturity, the impression that your team has graduated from amateur hour to serious infrastructure people.

These are real desires. The problem is that multi cloud answers them roughly the way buying a second house answers your fear of a leaky roof. Technically, you now have options. You also have two roofs.

Where the trouble starts

The fantasy version of multi cloud assumes a clean layer of abstraction sitting neatly on top of every provider, so your team writes once and runs anywhere. The practical version is that each cloud is its own country, with its own language, customs, and bewildering opinions about what a load balancer should be called.

Total abstraction almost never survives contact with production. Your engineers do not get to learn one platform deeply. They get to learn two platforms simultaneously, plus the seam between them, which is where the genuinely interesting bugs live. The architecture becomes harder to document, harder to operate, and harder to evolve, which is a polite way of saying nobody on the team fully understands all of it anymore, including the person who built it.

The costs nobody put on the slide

Here is where the war story usually turns grim. The expensive part of multi cloud is rarely the compute. It is everything wrapped around it.

You duplicate your observability stack, because the dashboards that work beautifully in one cloud are politely useless in the other. Your IAM and governance model doubles in surface area, since every permission, role, and policy now needs a twin that behaves identically and never quite does. Network egress, cross cloud replication, and data transfer charges accumulate quietly in the background like a subscription you forgot to cancel.

And then there is the human cost, which never appears in the cost calculator. Multi cloud raises the seniority floor of your entire team. The junior engineer who could safely ship in a single, well governed environment now needs to understand two of everything before they can be trusted near production. You do not just pay for more infrastructure. You pay for more expertise to keep that infrastructure from quietly drifting apart.

Security and compliance, now in stereo

Security people have a particular look they get when you mention multi cloud, the expression of someone who has been asked to childproof two houses for the price of one.

Every additional cloud is another attack surface, another set of policies to keep synchronized, and another generous opportunity for a misconfiguration to go unnoticed. Auditing becomes an exercise in translation, because a control that means one thing in the first provider means something subtly different in the second. The realistic end state is not airtight redundancy. It is an uneven security posture, strong where your team has invested attention and quietly soft where they ran out of hours.

Operations and the art of the unsolvable incident

Incidents are where multi cloud presents its invoice. At three in the morning, when something is on fire, you do not want your logs, metrics, and traces scattered across two providers like a crime scene split between jurisdictions.

Correlating an incident across clouds is slow, and slow is expensive when customers are watching. Mean time to detection creeps up. Mean time to resolution creeps up with it. Postmortems acquire an extra paragraph that always begins with some variation of “we lost time understanding which cloud was actually responsible.” To stitch the picture back together, you end up leaning on yet another external platform whose only job is to make your two clouds look like one, which, if you squint, is a strange amount of effort to undo a decision you made on purpose.

Portability you have versus portability you imagine

The intellectual cornerstone of multi cloud is portability, the comforting belief that you could pack up and move providers whenever you liked. In practice, portability tends to be the gym membership of architecture, fully paid for, rarely used, and quietly aspirational.

Most systems end up leaning on provider specific services anyway, because those services are good and saying no to them on principle is a luxury few teams can afford. Kubernetes and containers genuinely help, they smooth the edges, but they do not magically erase the dependency underneath. Real migrations are still slow, still costly, and still the kind of project that gets proposed with enthusiasm and abandoned with relief.

The honest goal is not total independence. It is a reasonable exit, an architecture where leaving would be painful but possible, rather than one where leaving is theoretically free and practically unthinkable. Designing for a sensible escape route beats promising a freedom you will never exercise.

When multi cloud actually earns its keep

None of this means multi cloud is a mistake. It means it is a tool with a narrow, legitimate set of jobs, and it deserves to be used for those rather than worn as a badge.

It earns its place when regulation or data sovereignty leaves you no choice, when the data legally must live in particular places. It makes sense when your organization is large enough that real negotiating leverage over providers translates into serious money. It is justified when the business risk is so high that genuine cross cloud redundancy is worth its considerable price. It is reasonable when a specific service on another cloud offers an advantage you cannot replicate elsewhere. And it works, crucially, only when the organization is mature enough to operate it, with the teams, processes, and discipline already in place rather than hopefully on order.

Notice that every one of these starts with a concrete business reason, not a vibe.

What I would do instead

If you handed me a blank slate and a reasonable budget, I would pick one primary cloud and govern it properly, because a single environment run well beats two environments run anxiously almost every time.

I would design for enough portability to sleep at night, not for the fantasy of frictionless migration. I would automate deployments and infrastructure from day one, before the shortcuts calcify into tradition. I would standardize observability, security, and cost management so the whole thing stays legible to the people who did not build it. And I would revisit the single cloud decision on a schedule, honestly, looking for a real reason to expand rather than a fashionable one.

That last part matters. The goal is not loyalty to one provider. It is refusing to add a second one until something other than anxiety is asking for it.

The question worth asking

Multi cloud can be genuinely useful. It just should not be the reflex, the thing you reach for because the alternative feels insufficiently ambitious. Far too often, what gets sold as strategy is simply complexity wearing a nicer outfit.

Cloud maturity is not measured in the number of providers on your invoice. It is measured by how well you use the one you actually need. The most sophisticated architecture in the room is frequently the one that resisted the urge to be impressive.

So the question is not how many clouds you run. It is whether each one earns its place.

June 16, 2026 by Fernando SRE Cloud stuff DevOps stuff SRE stuff

Your Ingress resource is living on borrowed time

There is a special kind of grief reserved for infrastructure that works fine. Nobody writes eulogies for the broken stuff; that gets deleted with enthusiasm. The painful goodbyes are for the things that still do their job every day, quietly, while the rest of the industry has already decided they belong in a museum. Your Ingress resources are in that category now. They route traffic, they terminate TLS, and they have not paged you in months. And they are, officially and by design, a dead end.

The Kubernetes project has been remarkably polite about this. Ingress is “frozen”, which is the standards body equivalent of moving someone to a nice farm upstate. No new features, no spec evolution, no fixes for the design decisions everyone now regrets. The replacement is called Gateway API, it reached general availability back in 2023, and it is one of those rare cases where the new thing is not just the old thing with more YAML. It actually fixes the organizational problem that made Ingress miserable, which, as we will see, was never really a technical problem at all.

The Ingress spec was always a rough draft

Here is the part of the story that usually gets left out. When Ingress shipped in 2015, the Kubernetes maintainers did not believe they had solved HTTP routing. They believed, correctly, that they had no idea what HTTP routing should look like, and they shipped a minimal spec on purpose. Host, path, backend service. That was essentially it. Everything else, the maintainers figured, could be handled by annotations until the community figured out what it actually wanted.

The community figured out what it wanted, all right. It wanted everything, and it wanted it via annotations.

If you have ever operated an nginx ingress controller in production, you know the genre. nginx.ingress.kubernetes.io/rewrite-target. nginx.ingress.kubernetes.io/canary-weight. nginx.ingress.kubernetes.io/configuration-snippet, which is the annotation equivalent of a hole in the wall that you push raw nginx config through and hope for the best. Traefik grew its own dialect. HAProxy grew another. At some point, the nginx controller alone supported well over a hundred proprietary annotations, each one a small confession that the spec underneath could not do the job.

The practical consequence is one that every platform engineer has lived. Your routing configuration is portable in theory and welded to your controller in practice. Migrating from nginx to anything else means translating a folklore of annotations by hand, and some of them have no translation, because they were never features of Kubernetes. They were features of one specific reverse proxy, smuggled in through a string field.

None of this makes Ingress bad design. It makes Ingress an honest admission, in 2015, that nobody agreed on what routing should look like. Gateway API is what happened after roughly eight years of arguing, when they finally agreed.

Three resources instead of one, and that is the whole upgrade

Gateway API replaces the single Ingress object with three, and before your YAML fatigue kicks in, stay with me, because the count is not the point. The ownership is.

GatewayClass is the template. It declares what kind of gateway infrastructure your cluster offers (Envoy, Cilium, or a cloud load balancer), and it gets written approximately once, by whoever runs the platform, and then mostly forgotten.

Gateway is a running instance of that template. It is the actual listener, the thing with an IP address and open ports, and it lives in an infrastructure namespace where application developers cannot poke it.

HTTPRoute is the routing rule. It says “traffic for this hostname and this path goes to this service”, and it lives in the application’s own namespace, right next to the Deployment it serves, owned by the team that owns the app.

That is the entire model. Three objects, three different owners, three different namespaces if you want them. Every interesting thing about Gateway API follows from that separation, which brings us to the actual argument.

The hallway belongs to the platform team, and the door belongs to the app team

Think about what an Ingress object actually is, organizationally. It is one resource that contains both infrastructure concerns (TLS certificates, load balancer behavior, controller tuning) and application concerns (which path goes to which service). One object, two very different audiences, and Kubernetes RBAC can only draw permission lines around whole objects.

So every organization running Ingress at scale ends up choosing between two bad options. Option one, the platform team owns all Ingress resources, and application teams file tickets to change a path rule, which is a magnificent way to turn a thirty-second change into a three-day wait. Option two, application teams own their Ingress resources, which means application teams can now set controller-level annotations, and somewhere in your cluster, there is a configuration snippet written by an intern in 2022 that nobody dares to remove. Both options are workarounds for the same flaw. The spec crammed two jobs into one object, and org charts do not bend that way.

Gateway API splits the object along exactly the line where your teams already split. The platform engineer provisions the Gateway in the infra namespace. They decide which ports are open, which TLS policy applies, and, crucially, which namespaces are allowed to attach routes to it. The application developer writes an HTTPRoute in their own namespace that says, in effect, “attach me to the gateway named external-web”. The route references the gateway by name; the gateway grants permission by policy. Cross-namespace routing is not a hack here, it is the core mechanic of the spec, with an explicit handshake on both sides.

If you read my past RBAC article, this will feel familiar, because it is the same principle wearing a different hat. Least privilege stopped being just about who can “kubectl delete” things and started applying to the network path itself. App teams get exactly the surface they need (their routes, their namespace) and nothing else. The platform team stops being a ticket-processing bottleneck and goes back to doing platform work. Nobody negotiates over annotations in a Slack thread at 6 p.m. on a Friday, which I am told does wonders for retention.

There is also a quieter benefit that only shows up in the postmortem. When routing rules live next to the application, the blast radius of a bad change is the application. When everything lives in one shared Ingress layer, a typo in one team’s path rule can take an unrelated team’s traffic with it. Separation of concerns is usually sold as elegance. In production, it is mostly sold as smaller incidents.

What Ingress made you beg your controller to do

Now for the features, briefly, because the features are genuinely less interesting than the reframe behind them.

Take canary deployments. With Ingress on nginx, weight-based traffic splitting means creating a second Ingress object, blessing it with ‘canary: “true”’ and ‘canary-weight: “10”’ annotations, and trusting that the controller interprets your strings correctly. With Gateway API, an HTTPRoute simply lists two backends with weights, 90 and 10, as ordinary structured fields. The API server validates them. Your canary rollout is now plain YAML instead of an incantation, and you did not have to install a service mesh to get it.

Header-based routing gets the same treatment. Routing requests with ‘x-beta-user: true’ to a different backend is a match condition in the spec, not a regex pasted into a controller-specific snippet. URL rewriting is a filter. Request mirroring, the trick where you copy live traffic to a new version without affecting real responses, is a filter too. Timeouts, header manipulation, traffic redirection, all first-class citizens with schemas.

Here is the reframe. None of these capabilities are new. Your reverse proxy could do all of this in 2016; reverse proxies are old and wise. What was missing was a portable way to ask for it. Under Ingress, every feature beyond host-and-path routing required learning the proprietary annotation dialect of whichever controller you happened to inherit, and your hard-won fluency in nginx annotations was worth exactly nothing the day someone migrated to Traefik. Gateway API moves those features into the spec itself, where they are typed, validated, and identical across implementations. The knowledge finally transfers. So do the manifests.

GatewayClass is the new vendor coupling point, and that is a better deal

Time for the honest section, because every article praising a new standard owes you one.

Gateway API does not eliminate vendor lock-in, and anyone telling you otherwise is selling a controller. The GatewayClass is where you commit. You pick Cilium, or Envoy Gateway, or Istio, or nginx-gateway-fabric, and from that moment your gateways run on that implementation’s machinery, with that implementation’s performance profile and that implementation’s extension features. Conformance across implementations is real but not absolute; the spec has core features everyone must support and extended ones they may.

What changed is the geometry of the coupling. With Ingress, the vendor dependency was smeared across your entire estate, hiding inside opaque annotation strings on every single routing object. You could not see it, measure it, or contain it; you discovered its true size on migration day, which is the worst possible day to discover anything. With Gateway API, the coupling is compressed into one object type. Everything above the GatewayClass (your routes, your matches, your filters, your weights) is portable standard YAML. Everything below it is the vendor’s problem. Swapping implementations becomes “change the GatewayClass and re-test”, not “translate three hundred annotations from one dialect to another and pray”.

The ecosystem, for the record, is not a science fair. Cilium ships a Gateway implementation on eBPF. Envoy Gateway is the CNCF’s straightforward Envoy packaging. Istio treats Gateway API as its preferred configuration surface these days. nginx-gateway-fabric exists for the sizable demographic that would like to keep nginx but lose the annotations. All of these run in production at companies whose outages would make the news.

You do not need to migrate everything to start

The best property of Gateway API for anyone with an existing cluster is that it demands nothing of your existing cluster. Gateway API and Ingress run side by side indefinitely. The controllers do not fight, the resources do not overlap, and your hundred working Ingress objects can keep working while you experiment two namespaces away.

The sensible entry point is not a migration project (migration projects are where enthusiasm goes to file status reports). It is one new service, or one feature branch, routed through an HTTPRoute while everything else stays put. You get a feel for the model, your platform team writes its first Gateway, and the canary feature gets a real audition on something low-stakes.

Whether your cluster is already prepared takes one command to find out.

kubectl get crds | grep gateway.networking.k8s.io

If that returns a list of CRDs, the welcome mat is already out; managed offerings like GKE ship them preinstalled. If it returns nothing, the installation is a single manifest from the Gateway API releases page, and then the welcome mat is out.

Ingress will keep working for years. Frozen APIs in Kubernetes enjoy long, comfortable retirements, and nobody is coming to delete your manifests. But every new routing feature, every new controller capability, and increasingly every new piece of documentation is being written for the other API now. Borrowed time is still time. It is just no longer the kind you should be building on.

June 10, 2026 by Fernando SRE Cloud stuff DevOps stuff Kubernetes SRE stuff

RBAC is not least privilege, and your cluster is the proof

Your security scanner ran last night. It came back green. RBAC is configured, there are no critical findings, and you closed the tab with the quiet satisfaction of someone who has done the responsible thing. The cluster is locked down. You can go to lunch.

Here is the uncomfortable part. A green scanner answers the question “Is access controlled?” It does not answer the question “Is access minimal?” Those are different questions, and most teams conflate them because the first one is easy to check and the second one requires reading things nobody wants to read on a Tuesday.

RBAC answers the first. Least privilege requires answering both. And a perfectly valid RBAC configuration can be, at the very same time, a perfectly generous one. The scanner has no opinion about generosity.

The ClusterRole you inherited from a Helm chart in March

Kubernetes ships three aggregated ClusterRoles out of the box (admin, edit, view), and they have a quietly alarming property. They absorb permissions. Any ClusterRole carrying the label ‘rbac.authorization.k8s.io/aggregate-to-edit: “true”’ gets automatically folded into ‘edit’, with no human in the loop and no diff to review.

This is convenient right up until it is not. When you installed that operator back in March, its Helm chart shipped a CRD and a ClusterRole with the aggregation label attached, because that is the polite, idiomatic way to do it. From the moment ‘helm install’ finished, every subject bound to ‘edit’ in your cluster silently gained permissions over a brand new resource type. Nobody approved it. Nobody saw it. The controller did exactly what it was designed to do, which is the part that should worry you.

So the RoleBinding still says ‘edit’. The word has not changed. What it grants has, several times, across several chart upgrades, and the only record of the expansion is scattered across ClusterRole objects nobody has opened since they were applied.

The takeaway is small and annoying: every time you install a chart, check what it aggregated. ‘kubectl get clusterrole -l rbac.authorization.k8s.io/aggregate-to-edit=true’ is two minutes of your life and occasionally a genuine surprise.

That ServiceAccount reads secrets, all of them, probably

Consider a ServiceAccount with ‘get’ on secrets in a single namespace. On paper, this looks narrow and tidy. The reviewer who approved it was right to approve it. The problem is that RBAC grants do not live in isolation; they live next to whatever else is running in that namespace.

If that namespace also hosts External Secrets Operator, a Vault Agent sidecar, or a CSI secrets driver, the secrets sitting there are not application trivia. They are the synced, materialized credentials that those tools pulled from somewhere more important. A grant that reads “can view secrets in ‘team-a’” can, depending on the architecture around it, mean “can read the cloud provider credentials that External Secrets faithfully copied into ‘team-a’ thirty seconds ago.”

Nothing here is broken. Every component is behaving as documented. That is exactly why it slips past review: each piece is reasonable, and the risk only exists in the seam between them, where no single Role definition is looking.

So when you audit a secrets grant, do not read the Role. Read the room. Ask what else lives in that namespace and what those neighbors keep in their pockets.

Creating a Pod sometimes creates a root shell on the node

This is the one people refuse to believe until you show them.

If Pod Security Admission is not enforced in ‘restricted’ mode, a subject with ‘create’ on pods is, functionally, a subject with a path to the node. They can define a pod that mounts the host root filesystem as a volume, sets ‘hostPID: true’, runs ‘privileged: true’, or maps a host port to quietly intercept traffic. From inside that pod, the node is no longer a node; it is a directory.

None of this is a vulnerability. There is no CVE to patch, because Kubernetes is doing precisely what the spec permits. The escalation lives in the gap between two true statements: “we have RBAC” and “nobody can reach the node.” Both can be accurate. Together, they can still be a hole you could drive a cluster through.

The fix is not more RBAC. It is admission control. Enforce PSA ‘restricted’ as the namespace default, and treat every exception as a decision someone wrote down and owns, rather than a default nobody chose.

Three commands that will ruin your afternoon

Theory is comfortable. Here is the part where you actually look.

‘kubectl-who-can’ answers the blunt question: who can perform this verb on this resource, right now. ‘kubectl who-can create pods -n production’ is a fast way to find out that the list is longer than you remembered.

‘rakkess’ produces a full access matrix for a given subject, so you can stare at an entire grid of green checkmarks belonging to a ServiceAccount that, in principle, only needed to read a config map.

‘rbac-tool lookup’ lists everything a specific subject can do across the whole cluster, which is the tool you run when you have a name and a bad feeling.

I will set an honest expectation. The first time you run any of these against a cluster older than a year, you will find at least one thing nobody intended, and there is a decent chance it will be something you granted. This is not a moral failing. It is entropy. Permissions accrete the same way junk drawers do, one reasonable decision at a time.

The scanner will still be green, that is no longer the point

Here is where I am supposed to hand you a fix that makes the scary parts go away. I cannot, because least privilege in Kubernetes is not a configuration state you reach and then defend. It is a process you keep doing, slightly grudgingly, forever.

Start subjects at zero and grant only what the audit log proves they actually use. Tools like ‘audit2rbac’ can generate tight RBAC from real API server audit events, which is to say from evidence rather than from optimism. Enforce PSA ‘restricted’ by default. Audit aggregated ClusterRoles every time you install a chart. Rotate ServiceAccount tokens, because a credential that never expires is just a future incident with good patience.

Do all of that, and run the scanner again. It will still be green. It was always going to be green. The result has not changed at all. The only thing that has changed is the question you now know to ask, and that, inconveniently, was the whole job.

There is no universal answer here, only better-informed trade-offs, and the faint suspicion that your next audit will find something too. It usually does.

June 6, 2026 by Fernando SRE DevOps stuff Kubernetes SRE stuff