GoogleCloud

What replaces Transit Gateway on Google cloud

Spoiler: There is no single magic box. There is a tidy drawer of parts that click together so cleanly you stop missing the box.

The first time I asked a team to set up “Transit Gateway on Google Cloud,” I received the sort of polite silence you reserve for relatives who ask where you keep the fax machine. On AWS, you reach for Transit Gateway and call it a day. On Azure, you reach for Virtual WAN and its Virtual Hubs. On Google Cloud, you reach for… a shorter shopping list: one global VPC, Network Connectivity Center with VPC spokes when you need a hub, VPC Peering when you do not, Private Service Connect for producer‑consumer traffic, and Cloud Router to keep routes honest.

Once you stop searching for a product name and start wiring the right parts, transit on Google Cloud turns out to be pleasantly boring.

The short answer

Inter‑VPC at scale → Network Connectivity Center (NCC) with VPC spokes
One‑to‑one VPC connectivity → VPC Peering (non‑transitive)
Private access to managed or third‑party services → Private Service Connect (PSC)
Hybrid connectivity → Cloud Router + HA VPN or Interconnect with dynamic routing mode set to Global

That’s the toolkit most teams actually need. The rest of this piece is simply: where each part shines, where it bites, and how to string them together without leaving teeth marks.

How do the other clouds solve it?

AWS: VPCs are regional. Transit Gateway acts as the hub; if you span regions, you peer TGWs. It is a well‑lit path and a single product name.
Azure: VNets are regional. Virtual WAN gives you a global fabric with per‑region Virtual Hubs, optionally “secured” with an integrated firewall.
Google Cloud: a VPC is global (routing table and firewalls are global, subnets remain regional). You do not need a separate “global transit” box to make two instances in different regions talk. When you outgrow simple, add NCC with VPC spokes for hub‑and‑spoke, PSC for services, and Cloud Router for dynamic routing.

Different philosophies, same goal. Google Cloud leans into a global network and small, specialized parts.

What a global VPC really means

A Google Cloud VPC gives you a global control plane. You define routes and firewall rules once, and they apply across regions; you place subnets per region where compute lives. That split is why multi‑region feels natural on GCP without an extra transit layer. Not everything is magic, though:

Cloud Router, VPN, and Interconnect are regional attachments. You can and often should set dynamic routing mode to Global so learned routes propagate across the VPC, but the physical attachment still sits in a region.
Global does not mean chaotic. IAM, firewall rules, hierarchical policies, and VPC Service Controls provide the guardrails you actually want.

Choosing the right part

Network connectivity center with VPC spokes

Use it when you have many VPCs and want managed transit without building a mesh of N×N peerings. NCC gives you a hub‑and‑spoke model where spokes exchange routes through the hub, including hybrid spokes via Cloud Router. Think “default” once your VPC count creeps into the double digits.

Use when you need inter‑VPC transit at scale, clear centralization, and easy route propagation.

Avoid when you have only two or three VPCs that will never grow. Simpler is nicer.

VPC Peering

Use it for simple 1:1 connectivity. It is non‑transitive by design. If A peers with B and B peers with C, A does not automatically reach C. This is not a bug; it is a guardrail. If you catch yourself drawing triangles, take the hint and move to NCC.

Use when two VPCs need to talk, and that’s the end of the story.

Avoid when you need full‑mesh or centralized inspection.

Private Service Connect

Use it when a consumer VPC needs private access to a producer (managed Google service like Cloud SQL, or a third‑party/SaaS running behind a producer endpoint). PSC is not inter‑VPC transit; it is producer‑consumer plumbing with private IPs and tight control.

Use when you want “just the sauce” from a service without crossing the public internet.

Avoid when you are trying to stitch two application VPCs together. That is a job for NCC or peering.

Cloud Router with HA VPN or Interconnect

Use it for hybrid. Cloud Router speaks BGP and exchanges routes dynamically with your on‑prem or colo edge. Set dynamic routing to Global so routes learned in one region are known across the VPC. Remember that the attachments are regional; plan for redundancy per region.

Use when you want fewer static routes and less drift between environments.

Avoid when you expected a single global attachment. That is not how physics—or regions—work.

Three quick patterns

Multi‑region application in one VPC

One global VPC, regional subnets in us‑east1, europe‑west1, and asia‑east1. Instances talk across regions without extra kit. If the app grows into multiple VPCs per domain (core, data, edge), bring in NCC as the hub.

Mergers and acquisitions without a month of rewiring

Projects in Google Cloud are movable between folders and even organizations, subject to permissions and policy guardrails. That turns “lift and splice” into a routine operation rather than a quarter‑long saga. Be upfront about prerequisites: billing, liens, org policy, and compliance can slow a move; plan them, do not hand‑wave them.

Shared services with clean tenancy

Run shared services in a host project via Shared VPC. Attach service projects for each team. For an external partner, use VPC Peering or PSC, depending on whether they need network adjacency or just a service endpoint. If many internal VPCs need those shared bits, let NCC be the meeting place.

ASCII sketch of the hub

Pitfalls you can dodge

Expecting peering to be transitive. It is not. If your diagram starts to look like spaghetti, stop and bring in NCC.
Treating Cloud Router as global. It is regional. The routing mode can be Global; the attachment is not. Plan per‑region redundancy.
Using PSC as inter‑VPC glue. PSC is for producer‑consumer privacy, not general transit.
Forgetting DNS. Cross‑project and cross‑VPC name resolution needs deliberate configuration. Decide where you publish private zones and who can see them.
Over‑centralizing inspection. The global VPC makes central stacks attractive, but latency budgets are still a thing. Place controls where the traffic lives.

Security that scales with freedom

A global VPC does not mean a free‑for‑all. The security model leans on identity and context rather than IP folklore.

IAM everywhere for least privilege and clear ownership.
VPC firewall rules with hierarchical policy for the sharp edges.
VPC Service Controls for data perimeter around managed services.
Cloud Armor and load balancers at the edge, where they belong.

The result is a network that is permissive where it should be and stubborn where it must be.

A tiny buying guide for your brain

Two VPCs, done in a week → VPC Peering
Ten VPCs, many teams, add partners next quarter → NCC with VPC spokes
Just need private access to Cloud SQL or third‑party services → PSC
Datacenter plus cloud, please keep routing sane → Cloud Router with HA VPN or Interconnect, dynamic routing Global

If you pick the smallest thing that works today and the most boring thing that still works next year, you will almost always land on the right square.

Where the magic isn’t

Transit Gateway is a great product name. It just happens to be the wrong shopping query on Google Cloud. You are not assembling a monolith; you are pulling the right pieces from a drawer that has been neatly labeled for years. NCC connects the dots, Peering keeps simple things simple, PSC keeps services private, and Cloud Router shakes hands with the rest of your world. None of them is glamorous. All of them are boring in the way electricity is boring when it works.

If you insist on a single giant box, you will end up using it as a hammer. Google Cloud encourages a tidier vice: choose the smallest thing that does the job, then let the global VPC and dynamic routing do the quiet heavy lifting. Need many VPCs to talk without spaghetti? NCC with spokes. Need two VPCs and a quiet life? Peering. Need only the sauce from Cloud SQL or a partner? PSC. Need the campus to meet the cloud without sticky notes of static routes? Cloud Router with HA VPN or Interconnect. Label the bag, not every screw.

The punchline is disappointingly practical. When teams stop hunting for a product name, they start shipping features. Incidents fall in number and in temperature. The network diagram loses its baroque flourishes and starts looking like something you could explain before your coffee cools.

So yes, keep admiring Transit Gateway as a name. Then close the tab and open the drawer you already own. Put the parts back in the same place when you are done, teach the interns what each one is for, and get back to building the thing your users actually came for. The box you were searching for was never the point; the drawer is how you move faster without surprises.

October 7, 2025 by Fernando SRE Cloud stuff

GKE key advantages over other Kubernetes platforms

Exploring the world of containerized applications reveals Kubernetes as the essential conductor for its intricate operations. It’s the common language everyone speaks, much like how standard shipping containers revolutionized global trade by fitting onto any ship or truck. Many cloud providers offer their own managed Kubernetes services, but Google Kubernetes Engine (GKE) often takes center stage. It’s not just another Kubernetes offering; its deep roots in Google Cloud, advanced automation, and unique optimizations make it a compelling choice.

Let’s see what sets GKE apart from alternatives like Amazon EKS, Microsoft AKS, and self-managed Kubernetes, and explore why it might be the most robust platform for your cloud-native ambitions.

Google’s inherent Kubernetes expertise

To truly understand GKE’s edge, we need to look at its origins. Google didn’t just adopt Kubernetes; they invented it, evolving it from their internal powerhouse, Borg. Think of it like learning a complex recipe. You could learn from a skilled chef who has mastered it, or you could learn from the very person who created the dish, understanding every nuance and ingredient choice. That’s GKE.

This “creator” status means:

Direct, Unfiltered Expertise: GKE benefits directly from the insights and ongoing contributions of the engineers who live and breathe Kubernetes.
Early Access to Innovation: GKE often supports the latest stable Kubernetes features before competitors can. It’s like getting the newest tools straight from the workshop.
Seamless Google Cloud Synergy: The integration with Google Cloud services like Cloud Logging, Cloud Monitoring, and Anthos is incredibly tight and natural, not an afterthought.

How Others Compare:

While Amazon EKS and Microsoft AKS are capable managed services, they don’t share this native lineage. Self-managed Kubernetes, whether on-premises or set up with tools like kops, places the full burden of upgrades, maintenance, and deep expertise squarely on your shoulders.

The simplicity of Autopilot fully managed Kubernetes

GKE offers a game-changing operational model called Autopilot, alongside its Standard mode (which is more akin to EKS/AKS where you manage node pools). Autopilot is like hiring an expert event planning team that also handles all the setup, catering, and cleanup for your party, leaving you to simply enjoy hosting. It offers a truly serverless Kubernetes experience.

Key benefits of Autopilot:

Zero Node Management: Google takes care of node provisioning, scaling, and all underlying infrastructure concerns. You focus on your applications, not the plumbing.
Optimized Cost Efficiency: You pay for the resources your pods actually consume, not for idle nodes. It’s like only paying for the electricity your appliances use, not a flat fee for being connected to the grid.
Built-in Enhanced Security: Security best practices are automatically applied and managed by Google, hardening your clusters by default.

How others compare:

EKS and AKS require you to actively manage and scale your node pools. Self-managed clusters demand significant, ongoing operational efforts to keep everything running smoothly and securely.

Unified multi-cluster and multi-cloud operations with Anthos

In an increasingly distributed world, managing applications across different environments can feel like juggling too many balls. GKE’s integration with Anthos, Google’s hybrid and multi-cloud platform, acts as a master control panel.

Anthos allows for:

Centralized command: Manage GKE clusters alongside those on other clouds like EKS and AKS, and even your on-premises deployments, all from a single viewpoint. It’s like having one universal remote for all your different entertainment systems.
Consistent policies everywhere: Apply uniform configurations and security policies across all your environments using Anthos Config Management, ensuring consistency no matter where your workloads run.
True workload portability: Design for flexibility and avoid vendor lock-in, moving applications where they make the most sense.

How Others Compare:

EKS and AKS generally lack such comprehensive, native multi-cloud management tools. Self-managed Kubernetes often requires integrating third-party solutions like Rancher to achieve similar multi-cluster oversight, adding complexity.

Sophisticated networking and security foundations

GKE comes packed with unique networking and security features that are deeply woven into the platform.

Networking highlights:

Global load balancing power: Native integration with Google’s global load balancer means faster, more scalable, and more resilient traffic management than many traditional setups.
Automated certificate management: Google-managed Certificate Authority simplifies securing your services.
Dataplane V2 advantage: This Cilium-based networking stack provides enhanced security, finer-grained policy enforcement, and better observability. Think of it as upgrading your building’s basic security camera system to one with AI-powered threat detection and detailed access logs.

Security fortifications:

Workload identity clarity: This is a more secure way to grant Kubernetes service accounts access to Google Cloud resources. Instead of managing static, exportable service account keys (like having physical keys that can be lost or copied), each workload gets a verifiable, short-lived identity, much like a temporary, auto-expiring digital pass.
Binary authorization assurance: Enforce policies that only allow trusted, signed container images to be deployed.
Shielded GKE nodes protection: These nodes benefit from secure boot, vTPM, and integrity monitoring, offering a hardened foundation for your workloads.

How Others Compare:

While EKS and AKS leverage AWS and Azure security tools respectively, achieving the same level of integration, Kubernetes-native security often requires more manual configuration and piecing together different services. Self-managed clusters place the entire burden of security hardening and ongoing vigilance on your team.

Smart cost efficiency and pricing structure

GKE’s pricing model is competitive, and Autopilot, in particular, can lead to significant savings.

No control plane fees for Autopilot: Unlike EKS, which charges an hourly fee per cluster control plane, GKE Autopilot clusters don’t have this charge. Standard GKE clusters have one free zonal cluster per billing account, with a small hourly fee for regional clusters or additional zonal ones.
Sustained use discounts: Automatic discounts are applied for workloads that run for extended periods.
Cost-Saving VM options: Support for Preemptible VMs and Spot VMs allows for substantial cost reductions for fault-tolerant or batch workloads.

How Others Compare:

EKS incurs control plane costs on top of node costs. AKS offers a free control plane but may not match GKE’s automation depth, potentially leading to other operational costs.

Optimized for AI ML and Big Data workloads

For teams working with Artificial Intelligence, Machine Learning, or Big Data, GKE offers a highly optimized environment.

Seamless GPU and TPU access: Effortless provisioning and utilization of GPUs and Google’s powerful TPUs.
Kubeflow integration: Streamlines the deployment and management of ML pipelines.
Strong BigQuery ML and Vertex AI synergy: Tight compatibility with Google’s leading data analytics and AI platforms.

How Others Compare:

EKS and AKS support GPUs, but native TPU integration is a unique Google Cloud advantage. Self-managed setups require manual configuration and integration of the entire ML stack.

Why GKE stands out

Choosing the right Kubernetes platform is crucial. While all managed services aim to simplify Kubernetes operations, GKE offers a unique blend of heritage, innovation, and deep integration.

GKE emerges as a firm contender if you prioritize:

A truly hands-off, serverless-like Kubernetes experience with Autopilot.
The benefits of Google’s foundational Kubernetes expertise and rapid feature adoption.
Seamless hybrid and multi-cloud capabilities through Anthos.
Advanced, built-in security and networking designed for modern applications.

If your workloads involve AI/ML, and big data analytics, or you’re deeply invested in the Google Cloud ecosystem, GKE provides an exceptionally integrated and powerful experience. It’s about choosing a platform that not only manages Kubernetes but elevates what you can achieve with it.

Comparing permissions management in GCP and AWS

Cloud security forms the foundation of building and maintaining modern digital infrastructures. Central to this security is Identity and Access Management, commonly known as IAM. Google Cloud Platform (GCP) and Amazon Web Services (AWS), two leading cloud providers, handle IAM differently. Understanding these distinctions is crucial for architects and DevOps engineers aiming to create secure, flexible systems tailored to each provider’s capabilities.

IAM fundamentals in Google Cloud Platform

In GCP, permissions management is driven by roles and policies. Consider a role as a keychain, with each key representing a specific permission. A role groups these permissions, streamlining the management by enabling you to grant multiple permissions at once.

GCP assigns roles to identities called members, including individual users, user groups, and service accounts. Here’s a straightforward example:

You have a developer named Alex, who needs to manage compute resources. In GCP, you would assign the Compute Admin role directly to Alex’s Google account, granting all associated permissions instantly.

Here’s an example of a simple GCP IAM policy:

{
  "bindings": [
    {
      "role": "roles/compute.admin",
      "members": [
        "user:alex@example.com"
      ]
    }
  ]
}

IAM fundamentals in Amazon Web Services

AWS uses policies defined as detailed JSON documents explicitly stating allowed or denied actions. Think of an AWS policy as a clear instruction manual that specifies exactly which tasks are permissible.

AWS utilizes three primary IAM entities: users, groups, and roles. A significant difference is how AWS manages roles, which are assumed temporarily rather than permanently assigned.

AWS achieves temporary access through the Security Token Service (STS). For example:

A developer named Jamie temporarily requires access to AWS Lambda functions. Rather than granting permanent access, AWS issues temporary credentials through STS, allowing Jamie to assume a Lambda execution role that expires automatically after a set duration.

Here’s an example of an AWS IAM policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "lambda:InvokeFunction"
      ],
      "Resource": "arn:aws:lambda:us-west-2:123456789012:function:my-function"
    }
  ]
}

Implementing temporary access in Google Cloud

Although GCP typically favors direct role assignments, it provides a similar capability to AWS’s temporary role assumption known as service account impersonation.

Service account impersonation in GCP allows temporary adoption of permissions associated with a service account, akin to borrowing someone else’s access badge briefly. This method provides temporary permissions without permanently altering the user’s existing access.

To illustrate clearly:

Emily needs temporary access to a storage bucket. Rather than assigning permanent permissions, Emily can impersonate a service account with those specific storage permissions. Once her task is complete, Emily automatically reverts to her original permission set.

While AWS’s STS and GCP’s impersonation achieve similar goals, their implementations differ notably in complexity and methodology.

Summary of differences

The primary distinction between GCP and AWS in managing permissions revolves around their approach to temporary versus permanent access:

GCP typically favors straightforward, persistent role assignments, enhanced by optional service account impersonation for temporary tasks.
AWS inherently integrates temporary credentials using its Security Token Service, embedding temporary role assumption deeply within its security framework.

Both systems are robust, and understanding their unique aspects is essential. Recognizing these IAM differences empowers architects and DevOps teams to optimize cloud security strategies, ensuring flexibility, robust security, and compliance specific to each cloud platform’s strengths.

May 22, 2025 by Fernando SRE Cloud stuff DevOps stuff SRE stuff