Security

Azure DevOps to GCP without static keys

Static service account keys have an odd domestic quality to them. They begin life as a sensible convenience and, after a few months, end up tucked into variable groups, copied into wikis, or lurking in a repository with the innocent menace of a spare house key under a flowerpot. They work, certainly. So does leaving your front door on the latch. The problem is not whether it works. The problem is how long you can keep pretending it is a good idea.

This article shows how to let Azure DevOps authenticate to Google Cloud without creating or storing a long-lived service account key. Instead, Azure DevOps presents a short-lived OIDC token, Google Cloud checks that token against a workload identity provider, and the pipeline receives temporary Google credentials only for the duration of the job.

The result is cleaner, safer, and far less likely to produce the sort of sentence nobody enjoys reading in a postmortem, namely, “we found an old credential in a place that should not have contained a credential.”

Why this setup is worth the trouble

The old pattern is familiar. You create a Google Cloud service account, download a JSON key, store it somewhere “temporary”, and then spend the next year hoping nobody has copied it into four other places. Even if the key never leaks, it still becomes one more secret to rotate, one more thing to explain to auditors, and one more awkward dependency between your pipeline and a file that should not really exist.

Workload Identity Federation replaces that with short-lived trust. Azure DevOps proves who it is at runtime. Google Cloud verifies that proof. No static key is issued, no secret needs to be rotated, and there is much less housekeeping disguised as security.

Strictly speaking, you can grant permissions directly to the federated principal in Google Cloud. In this article, I am using service account impersonation instead. It is a little easier to reason about, it fits neatly with how many teams already model CI identities, and it behaves consistently across a wide range of Google Cloud services.

What is actually happening

Under the hood, the flow is less mystical than it first appears.

Azure DevOps has a service connection that can mint an OIDC ID token for the running pipeline. Google Cloud has a workload identity pool and an OIDC provider configured to trust tokens issued by that Azure DevOps organization. When the pipeline runs, it retrieves the token, writes a small credential configuration file, and uses that file to exchange the token for temporary Google credentials. Those credentials are then used to impersonate a Google Cloud service account with the exact roles needed for the job.

If you prefer a more ordinary analogy, think of it as a reception desk in an office building. Azure DevOps arrives with a temporary visitor badge. Google Cloud checks whether the badge was issued by a reception desk it trusts, whether it belongs to the expected visitor, and whether that visitor is allowed through the next door. If all of that checks out, access is granted for a while and then expires. Nobody hands over the master keys to the building.

Preparing Azure DevOps

The Azure DevOps side is simpler than it first looks, although the menus do their best to suggest otherwise.

Create an Azure Resource Manager service connection in your Azure DevOps project and use these settings:

  • Identity type: App registration (automatic)
  • Credential: Workload identity federation
  • Scope level: Subscription

Yes, you still need to select a subscription even if your real destination is Google Cloud. It feels slightly like being asked for your train ticket while boarding a ferry, but that is the supported path.

Once the service connection is saved, note down two values from the Workload Identity federation details section:

  • Issuer
  • Subject identifier

The issuer identifies your Azure DevOps organization. The subject identifier identifies the service connection. In practice, the subject identifier follows this pattern:

sc://your-organization/your-project/your-service-connection

That detail matters because Google Cloud will ultimately trust this specific identity, not merely “some pipeline from somewhere in the general direction of Azure.”

A practical naming note is worth making here. Choose a stable, descriptive service connection name early. Renaming things later is always possible in the same way as replacing the plumbing in a bathroom is possible. The word possible is doing quite a lot of work.

Teaching Google Cloud to trust Azure DevOps

Now we move to Google Cloud, where the important trick is to trust the right thing in the right way.

Create a dedicated workload identity pool and OIDC provider. You can do this from the console, but the CLI version is easier to keep, review, and repeat.

export IDENTITY_PROJECT_ID="acme-identity-hub"
export IDENTITY_PROJECT_NUMBER="998877665544"
export POOL_ID="ado-pool"
export PROVIDER_ID="ado-oidc"
export ISSUER_URI="https://vstoken.dev.azure.com/11111111-2222-3333-4444-555555555555"

# Enable the required APIs

gcloud services enable \
  iam.googleapis.com \
  cloudresourcemanager.googleapis.com \
  iamcredentials.googleapis.com \
  sts.googleapis.com \
  --project="$IDENTITY_PROJECT_ID"

# Create the workload identity pool

gcloud iam workload-identity-pools create "$POOL_ID" \
  --project="$IDENTITY_PROJECT_ID" \
  --location="global" \
  --display-name="Azure DevOps pool" \
  --description="Federation trust for Azure DevOps pipelines"

# Create the OIDC provider

gcloud iam workload-identity-pools providers create-oidc "$PROVIDER_ID" \
  --project="$IDENTITY_PROJECT_ID" \
  --location="global" \
  --workload-identity-pool="$POOL_ID" \
  --display-name="Azure DevOps provider" \
  --issuer-uri="$ISSUER_URI" \
  --allowed-audiences="api://AzureADTokenExchange" \
  --attribute-mapping="google.subject=assertion.sub.extract('/sc/{service_connection}')"

There are two details here that are easy to get wrong.

First, the allowed audience for the provider is “api://AzureADTokenExchange”. It is not a random per-connection UUID, and it is not the audience string that later appears inside the external account credential file used by the pipeline.

Second, the attribute mapping should not map “google.subject” to “assertion.aud”. For Azure DevOps, the supported workaround for the 127-byte subject limit is to extract the service connection portion from the “sub” claim:

google.subject=assertion.sub.extract('/sc/{service_connection}')

This matters because the raw Azure DevOps subject can be too long for “google.subject”. Extracting the useful part solves the length issue neatly and still gives Google Cloud a stable subject to authorize.

You do not need an attribute condition for Azure DevOps. The issuer is already tenant-specific, which keeps this case pleasantly less dramatic than some other CI systems.

Creating the service account

Next, create the Google Cloud service account that your pipeline will impersonate.

The exact roles depend on what your pipeline needs to do. If the job only uploads artifacts to Cloud Storage, grant a storage role and stop there. If it deploys Cloud Run services, grant the Cloud Run roles it genuinely needs. This is one of those rare moments in cloud engineering where restraint is both morally admirable and operationally useful.

Here is a simple example:

export DEPLOY_PROJECT_ID="acme-observability-dev"
export SERVICE_ACCOUNT_NAME="ci-deployer"
export SERVICE_ACCOUNT_EMAIL="${SERVICE_ACCOUNT_NAME}@${DEPLOY_PROJECT_ID}.iam.gserviceaccount.com"
export FEDERATED_SUBJECT="your-organization/your-project/your-service-connection"

# Create the service account

gcloud iam service-accounts create "$SERVICE_ACCOUNT_NAME" \
  --project="$DEPLOY_PROJECT_ID" \
  --display-name="CI deployer for Azure DevOps"

# Grant only the roles your pipeline really needs

gcloud projects add-iam-policy-binding "$DEPLOY_PROJECT_ID" \
  --member="serviceAccount:${SERVICE_ACCOUNT_EMAIL}" \
  --role="roles/storage.objectAdmin"

# Allow the federated Azure DevOps identity to impersonate the service account

gcloud iam service-accounts add-iam-policy-binding "$SERVICE_ACCOUNT_EMAIL" \
  --project="$DEPLOY_PROJECT_ID" \
  --role="roles/iam.workloadIdentityUser" \
  --member="principal://iam.googleapis.com/projects/${IDENTITY_PROJECT_NUMBER}/locations/global/workloadIdentityPools/${POOL_ID}/subject/${FEDERATED_SUBJECT}"

The “FEDERATED_SUBJECT” value must match the subject produced by your attribute mapping. In plain English, that means the service connection identity that Google Cloud should trust. If the pool lives in one project and the service account lives in another, that is fine, but be careful to use the project number of the identity project in the principal URI.

Building the pipeline

Now for the part everyone actually came for.

The pipeline below uses the AzureCLI task to obtain the Azure DevOps OIDC token, stores it in a temporary file, writes an external account credential file for Google Cloud, signs in with “gcloud”, and then runs a test command.

trigger:
- main

pool:
  vmImage: 'ubuntu-latest'

variables:
  azureServiceConnection: 'gcp-federation-prod'
  gcpProjectId: 'acme-observability-dev'
  gcpProjectNumber: '998877665544'
  gcpPoolId: 'ado-pool'
  gcpProviderId: 'ado-oidc'
  gcpServiceAccount: 'ci-deployer@acme-observability-dev.iam.gserviceaccount.com'
  GOOGLE_APPLICATION_CREDENTIALS: '$(Pipeline.Workspace)/gcp-wif.json'

steps:
- checkout: self

- task: AzureCLI@2
  displayName: 'Authenticate to Google Cloud with workload identity federation'
  inputs:
    azureSubscription: '$(azureServiceConnection)'
    addSpnToEnvironment: true
    scriptType: 'bash'
    scriptLocation: 'inlineScript'
    inlineScript: |
      set -euo pipefail

      TOKEN_FILE="$(Pipeline.Workspace)/ado-token.jwt"
      printf '%s' "$idToken" > "$TOKEN_FILE"

      cat > "$GOOGLE_APPLICATION_CREDENTIALS" <<EOF
      {
        "type": "external_account",
        "audience": "//iam.googleapis.com/projects/$(gcpProjectNumber)/locations/global/workloadIdentityPools/$(gcpPoolId)/providers/$(gcpProviderId)",
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "token_url": "https://sts.googleapis.com/v1/token",
        "credential_source": {
          "file": "$TOKEN_FILE"
        },
        "service_account_impersonation_url": "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/$(gcpServiceAccount):generateAccessToken"
      }
      EOF

      gcloud auth login --cred-file="$GOOGLE_APPLICATION_CREDENTIALS" --quiet
      gcloud config set project "$(gcpProjectId)" --quiet

      echo "Authenticated as federated workload"
      gcloud storage buckets list --limit=5

A couple of details are doing more work here than they appear to be doing.

“addSpnToEnvironment: true” is essential. Without it, the task does not expose the “idToken” variable to your script. The pipeline then behaves like a very polite person who has shown up for an exam without bringing a pen.

The “audience” inside the generated JSON file is also important. This is the full resource name of the workload identity provider in Google Cloud. It is not the same thing as the allowed audience configured on the provider itself. The two values serve different purposes, which is perfectly reasonable once you know it and deeply annoying before you do.
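
If you want to double-check the exact value, you can ask Google Cloud for the provider’s full resource name and prefix it with “//iam.googleapis.com/”. A quick sanity check, assuming the variables exported earlier are still in your shell:

gcloud iam workload-identity-pools providers describe "$PROVIDER_ID" \
  --project="$IDENTITY_PROJECT_ID" \
  --location="global" \
  --workload-identity-pool="$POOL_ID" \
  --format="value(name)"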

An alternative credential file approach

If you prefer to generate the configuration file with “gcloud” rather than writing JSON inline, you can do that too:

gcloud iam workload-identity-pools create-cred-config \
  "projects/${gcpProjectNumber}/locations/global/workloadIdentityPools/${gcpPoolId}/providers/${gcpProviderId}" \
  --service-account="${gcpServiceAccount}" \
  --credential-source-file="$TOKEN_FILE" \
  --output-file="$GOOGLE_APPLICATION_CREDENTIALS"

That version is perfectly serviceable and often a little tidier if you dislike heredocs. I have shown the explicit JSON version in the main pipeline because it makes each moving part visible, which is useful while learning or troubleshooting.

Common pitfalls

There are a few places where people lose an afternoon.

The token exists, but the pipeline still fails

Make sure the AzureCLI task is using the correct service connection and that “addSpnToEnvironment” is enabled. If “$idToken” is empty, the problem is usually on the Azure DevOps side, not in Google Cloud.
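
A small guard at the top of the inline script, just after “set -euo pipefail”, turns that silent failure into a loud one. It is optional, but it beats staring at a confusing Google-side error later:

if [ -z "${idToken:-}" ]; then
  echo "idToken is empty: check the service connection and addSpnToEnvironment" >&2
  exit 1
fi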

The principal binding looks right, but impersonation is denied

Check the project number in the principal URI. It must be the project number that owns the workload identity pool, not necessarily the project where the service account lives.

Also, check the federated subject. Because of the attribute mapping, the subject is the extracted service connection path, not the raw OIDC subject, and not a made-up shorthand invented during a stressful coffee break.
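
When in doubt, look at the binding Google Cloud actually has rather than the one you remember creating. Assuming the variables from the earlier setup, this lists who may impersonate the service account:

gcloud iam service-accounts get-iam-policy "$SERVICE_ACCOUNT_EMAIL" \
  --project="$DEPLOY_PROJECT_ID" \
  --format="yaml(bindings)"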

The pipeline freezes on an authentication prompt

Use “--quiet” with “gcloud auth login” and similar commands. CI jobs are many things, but conversationalists they are not.

Hosted agents are not available

If your Azure DevOps organization has not yet been granted hosted parallelism, use a self-hosted agent temporarily. In that case, make sure the machine already has ‘az’ and ‘gcloud’ installed and available on the ‘PATH’.

A minimal self-hosted pool declaration looks like this:

pool:
  name: 'Default'

On Windows, remember to switch the script type to PowerShell or PowerShell Core and adjust the environment variable syntax accordingly.

Leaving the keys behind

This setup removes one of the more tiresome habits of cross-cloud automation, namely, manufacturing a secret only to spend the rest of its natural life protecting it from yourself. Azure DevOps can obtain a short-lived token, Google Cloud can verify it, and your pipeline can impersonate a tightly scoped service account without anybody downloading a JSON key and promising to delete it later.

That is the technical benefit. The practical benefit is even nicer. Once this is in place, your pipeline starts to feel less like a cupboard full of labelled jars, some of which may or may not contain explosives, and more like a system that knows who it is, proves it when asked, and then gets on with the job.

Which, in cloud engineering, is about as close as one gets to elegance.

Inside Kubernetes Container Runtimes

Containers have transformed how we build, deploy, and run software. We package our apps neatly into them, toss them onto Kubernetes, and sit back as things smoothly fall into place. But hidden beneath this simplicity is a critical component quietly doing all the heavy lifting: the container runtime. Let’s look at what a container runtime is, why it matters, and how it helps everything run seamlessly.

What exactly is a Container Runtime?

A container runtime is simply the software that takes your packaged application and makes it run. Think of it like the engine under the hood of your car; you rarely think about it, but without it, you’re not going anywhere. It manages tasks like starting containers, isolating them from each other, managing system resources such as CPU and memory, and handling storage and network connections. Thanks to runtimes, containers remain lightweight, portable, and predictable, regardless of where you run them.

Why should you care about Container Runtimes?

Container runtimes simplify what could otherwise become a messy job of managing isolated processes. Kubernetes relies heavily on these runtimes to guarantee the consistent behavior of applications every single time they’re deployed. Without runtimes, managing containers would be chaotic, like cooking without pots and pans: you’d end up with scattered ingredients everywhere, and things would quickly get messy.

Getting to know the popular Container Runtimes

Let’s explore some popular container runtimes that you’re likely to encounter:

Docker

Docker was the original popular runtime. It played a key role in popularizing containers, making them accessible to developers and enterprises alike. Docker provides an easy-to-use platform that allows applications to be packaged with all their dependencies into lightweight, portable containers.

One of Docker’s strengths is its extensive ecosystem, including Docker Hub, which offers a vast library of pre-built images. This makes it easy to find and deploy applications quickly. Additionally, Docker’s CLI and tooling simplify the development workflow, making container management straightforward even for those new to the technology.

However, as Kubernetes evolved, it moved away from relying directly on Docker. This was mainly because Docker was designed as a full-fledged container management platform rather than a lightweight runtime. Kubernetes required something leaner that focused purely on running containers efficiently without unnecessary overhead. While Docker still works well, most Kubernetes clusters now use containerd or CRI-O as their primary runtime for better performance and integration.

containerd

Containerd emerged from Docker as a lightweight, efficient, and highly optimized runtime that focuses solely on running containers. If Docker is like a full-service restaurant, handling everything from taking orders to cooking and serving, then containerd is just the kitchen. It does the cooking, and it does it well, but it leaves the extra fluff to other tools.

What makes containerd special? First, it’s built for speed and efficiency. It strips away the unnecessary components that Docker carries, focusing purely on running containers without the added baggage of a full container management suite. This means fewer moving parts, less resource consumption, and better performance in large-scale Kubernetes environments.

Containerd is now a graduated project under the Cloud Native Computing Foundation (CNCF), proving its reliability and widespread adoption. It’s the default runtime for many managed Kubernetes services, including Amazon EKS, Google GKE, and Microsoft AKS, largely because of its deep integration with Kubernetes through the Container Runtime Interface (CRI). This allows Kubernetes to communicate with containerd natively, eliminating extra layers and complexity.

Despite its strengths, containerd lacks some of the convenience features that Docker offers, like a built-in CLI for managing images and containers. Users often rely on tools like ctr or crictl to interact with it directly. But in a Kubernetes world, this isn’t a big deal: Kubernetes itself takes care of most of the higher-level container management.
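
As a rough illustration, inspecting a node’s workloads through crictl looks something like this; the socket path assumes containerd’s default location and may differ on your distribution:

# Point crictl at containerd's CRI socket (the default path on most setups)
export CONTAINER_RUNTIME_ENDPOINT=unix:///run/containerd/containerd.sock

# List pod sandboxes, containers, and pulled images as the kubelet sees them
crictl pods
crictl ps
crictl images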

With its low overhead, strong Kubernetes integration, and widespread industry support, containerd has become the go-to runtime for modern containerized workloads. If you’re running Kubernetes today, chances are containerd is quietly doing the heavy lifting in the background, ensuring your applications start up reliably and perform efficiently.

CRI-O

CRI-O is designed specifically to meet Kubernetes standards. It perfectly matches Kubernetes’ Container Runtime Interface (CRI) and focuses solely on running containers. If Kubernetes were a high-speed train, CRI-O would be the perfectly engineered rail system built just for it, streamlined, efficient, and without unnecessary distractions.

One of CRI-O’s biggest strengths is its tight integration with Kubernetes. It was built from the ground up to support Kubernetes workloads, avoiding the extra layers and overhead that come with general-purpose container platforms. Unlike Docker or even containerd, which have broader use cases, CRI-O is laser-focused on running Kubernetes workloads efficiently, with minimal resource consumption and a smaller attack surface.

Security is another area where CRI-O shines. Since it only implements the features Kubernetes needs, it reduces the risk of security vulnerabilities that might exist in larger, more feature-rich runtimes. CRI-O is also fully OCI-compliant, meaning it supports Open Container Initiative images and integrates well with other OCI tools.

However, CRI-O isn’t without its downsides. Because it’s so specialized, it lacks some of the broader ecosystem support and tooling that containerd and Docker enjoy. Its adoption is growing, but it’s not as widely used outside of Kubernetes environments, meaning you may not find as much community support compared to the more established runtimes.

Despite these trade-offs, CRI-O remains a great choice for teams that want a lightweight, Kubernetes-native runtime that prioritizes efficiency, security, and streamlined performance.

Kata Containers

Kata Containers offers stronger isolation by running containers within lightweight virtual machines. It’s perfect for highly sensitive workloads, providing a security level closer to traditional virtual machines. But this added security comes at a cost: it typically uses more resources and can be slower than other runtimes. Consider Kata Containers as placing your app inside a secure vault, ideal when security is your top priority.

gVisor

Developed by Google, gVisor offers enhanced security by running containers within a user-space kernel. This approach provides isolation closer to virtual machines without requiring traditional virtualization. It’s excellent for workloads needing stronger isolation than standard containers but less overhead than full VMs. However, gVisor can introduce a noticeable performance penalty, especially for resource-intensive applications, because system calls must pass through its user-space kernel.

Kubernetes and the Container Runtime Interface

Kubernetes interacts with container runtimes using something called the Container Runtime Interface (CRI). Think of CRI as a universal translator, allowing Kubernetes to clearly communicate with any runtime. Kubernetes sends instructions, like launching or stopping containers, through CRI. This simple interface lets Kubernetes remain flexible, easily switching runtimes based on your needs without fuss.
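
You can see the result of this arrangement on any cluster you have access to: each node reports which runtime it is talking to through the CRI endpoint.

# The CONTAINER-RUNTIME column shows the runtime behind each node,
# for example containerd://1.7.x or cri-o://1.29.x
kubectl get nodes -o wide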

Choosing the right Runtime for your needs

Selecting the best runtime depends on your priorities:

  • Efficiency: Does it maximize system performance?
  • Complexity: Does it avoid adding unnecessary complications?
  • Security: Does it provide the isolation level your applications demand?

If security is crucial, like handling sensitive financial or medical data, you might prefer runtimes like Kata Containers or gVisor, specifically designed for stronger isolation.
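
Kubernetes lets you make that choice per workload rather than per cluster through a RuntimeClass. As a sketch, assuming a gVisor handler named runsc has already been registered with the node’s runtime, it could look like this:

# Register a named runtime; the handler must match how the runtime
# is configured in containerd or CRI-O on the nodes
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc
---
# A pod opts in by naming the RuntimeClass
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed-app
spec:
  runtimeClassName: gvisor
  containers:
  - name: app
    image: nginx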

Final thoughts

Container runtimes might not grab headlines, but they’re crucial. They quietly handle the heavy lifting, making sure your containers run smoothly, securely, and efficiently. Even though they’re easy to overlook, runtimes are like the backstage crew of a theater production, diligently working behind the curtains. Without them, even the simplest container deployment would quickly turn into chaos, causing applications to crash, misbehave, or even compromise security.

Every time you launch an application effortlessly onto Kubernetes, it’s because the container runtime is silently solving complex problems for you. So, the next time your containers spin up flawlessly, take a moment to appreciate these hidden champions: they might not get applause, but they truly deserve it.

How ABAC and Cross-Account Roles Revolutionize AWS Permission Management

Managing permissions in AWS can quickly turn into a juggling act, especially when multiple AWS accounts are involved. As your organization grows, keeping track of who can access what becomes a real headache, leading to either overly permissive setups (a security risk) or endless policy updates. There’s a better approach: ABAC (Attribute-Based Access Control) and Cross-Account Roles. This combination offers fine-grained control, simplifies management, and significantly strengthens your security.

The fundamentals of ABAC and Cross-Account roles

Let’s break these down without getting lost in technicalities.

First, ABAC vs. RBAC. Think of RBAC (Role-Based Access Control) as assigning a specific key to a particular door. It works, but what if you have countless doors and constantly changing needs? ABAC is like having a key that adapts based on who you are and what you’re accessing. We achieve this using tags – labels attached to both resources and users.

  • RBAC: “You’re a ‘Developer,’ so you can access the ‘Dev’ database.” Simple, but inflexible.
  • ABAC: “You have the tag ‘Project: Phoenix,’ and the resource you’re accessing also has ‘Project: Phoenix,’ so you’re in!” Far more adaptable.

Now, Cross-Account Roles. Imagine visiting a friend’s house (another AWS account). Instead of getting a copy of their house key (a user in their account), you get a special “guest pass” (an IAM Role) granting access only to specific rooms (your resources). This “guest pass” has rules (a Trust Policy) stating, “I trust visitors from my friend’s house.”

Finally, AWS Security Token Service (STS). STS is like the concierge who verifies the guest pass and issues a temporary key (temporary credentials) for the visit. This is significantly safer than sharing long-term credentials.

Making it real

Let’s put this into practice.

Example 1: ABAC for resource control (S3 Bucket)

You have an S3 bucket holding important project files. Only team members on “Project Alpha” should access it.

Here’s a simplified IAM policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket"
      ],
      "Resource": "arn:aws:s3:::your-project-bucket",
      "Condition": {
        "StringEquals": {
          "aws:ResourceTag/Project": "${aws:PrincipalTag/Project}"
        }
      }
    }
  ]
}

This policy says: “Allow actions like getting, putting, and listing objects in ‘your-project-bucket’ if the ‘Project’ tag on the bucket matches the ‘Project’ tag on the user trying to access it.”

You’d tag your S3 bucket with Project: Alpha. Then, you’d ensure your “Project Alpha” team members have the Project: Alpha tag attached to their IAM user or role. See? Only the right people get in.
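
The tagging itself is ordinary CLI work. A minimal sketch, using the bucket from the example; the user name is illustrative:

# Tag the bucket with its project
aws s3api put-bucket-tagging \
  --bucket your-project-bucket \
  --tagging 'TagSet=[{Key=Project,Value=Alpha}]'

# Tag an IAM user (or a role, with tag-role) so the condition can match
aws iam tag-user \
  --user-name alice \
  --tags Key=Project,Value=Alpha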

Example 2: Cross-account resource sharing with ABAC

Let’s say you have a “hub” account where you manage shared resources, and several “spoke” accounts for different teams. You want to let the “DataScience” team from a spoke account access certain resources in the hub, but only if those resources are tagged for their project.

  • Create a Role in the Hub Account: Create a role called, say, DataScienceAccess.
    • Trust Policy (Hub Account): This policy, attached to the DataScienceAccess role, says who can assume the role:
    
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::SPOKE_ACCOUNT_ID:root"
          },
          "Action": "sts:AssumeRole",
          "Condition": {
                "StringEquals": {
                    "sts:ExternalId": "DataScienceExternalId"
                }
          }
        }
      ]
    }

    Replace SPOKE_ACCOUNT_ID with the actual ID of the spoke account, and it is good practice to use an ExternalId. A Principal of arn:aws:iam::SPOKE_ACCOUNT_ID:root delegates trust to the spoke account as a whole: any principal in that account whose own IAM policies allow sts:AssumeRole can assume this role, as long as it supplies the matching ExternalId.

    • Permission Policy (Hub Account): This policy, also attached to the DataScienceAccess role, defines what the role can do. This is where ABAC shines:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:GetObject",
            "s3:ListBucket"
          ],
          "Resource": "arn:aws:s3:::shared-resource-bucket/*",
          "Condition": {
            "StringEquals": {
              "aws:ResourceTag/Project": "${aws:PrincipalTag/Project}"
            }
          }
        }
      ]
    }

    This says, “Allow access to objects in ‘shared-resource-bucket’ only if the resource’s ‘Project’ tag matches the user’s ‘Project’ tag.”

  • In the Spoke Account: Data scientists in the spoke account need a policy allowing them to assume the DataScienceAccess role in the hub account. For the tag condition to work, the Project tag (e.g., Project: Gamma) also has to travel with the assumed-role session, either as a tag on the DataScienceAccess role itself or as a session tag passed during AssumeRole, as in the sketch after this list.

    The flow looks like this:

    Spoke Account User -> AssumeRole (Hub Account) -> STS provides temporary credentials -> Access Shared Resource (if tags match)
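
To make the flow concrete, here is roughly what the spoke-side call could look like with the AWS CLI. The account ID, session name, and tag value are illustrative, and passing session tags also requires the hub role’s trust policy to allow the sts:TagSession action:

aws sts assume-role \
  --role-arn "arn:aws:iam::HUB_ACCOUNT_ID:role/DataScienceAccess" \
  --role-session-name "data-science-session" \
  --external-id "DataScienceExternalId" \
  --tags Key=Project,Value=Gamma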

Advanced use cases and automation

  • Control Tower & Service Catalog: These services help automate the setup of cross-account roles and ABAC policies, ensuring consistency across your organization. Think of them as blueprints and a factory for your access control.
  • Auditing and Compliance: Imagine needing to prove compliance with PCI DSS, which requires strict data access controls. With ABAC, you can tag resources containing sensitive data with Scope: PCI and ensure only users with the same tag can access them. AWS Config and CloudTrail, along with IAM Access Analyzer, let you monitor access and generate reports, proving you’re meeting the requirements, as sketched below.
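
As a small illustration of the tag-driven audit angle, the Resource Groups Tagging API can enumerate everything carrying a given tag, which makes a handy starting point for a review. The Scope key follows the example above:

# List every resource in the current region tagged Scope=PCI
aws resourcegroupstaggingapi get-resources \
  --tag-filters Key=Scope,Values=PCI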

Best practices and troubleshooting

  • Tagging Strategy is Key: A well-defined tagging strategy is essential. Decide on naming conventions (e.g., Project, Environment, CostCenter) and enforce them consistently.
  • Common Pitfalls:
    – Inconsistent Tags: Make sure tags are applied uniformly. A typo can break access.
    – Overly Permissive Policies: Start with the principle of least privilege. Grant only the necessary access.
  • Tools and Resources:
    – IAM Access Analyzer: Helps identify overly permissive policies and potential risks.
    – AWS documentation provides detailed information.

Summarizing

ABAC and Cross-Account Roles offer a powerful way to manage access in a multi-account AWS environment. They provide the flexibility to adapt to changing needs, the security of fine-grained control, and the simplicity of centralized management. By embracing these tools, we can move beyond the limitations of traditional IAM and build a truly scalable and secure cloud infrastructure.

AWS Identity Management – Choosing the right Policy or Role

Let’s be honest, AWS Identity and Access Management (IAM) can feel like a jungle. You’ve got your policies, your roles, your managed this, and your inline that. It’s easy to get lost, and a wrong turn can lead to a security vulnerability or a frustrating roadblock. But fear not! Just like a curious explorer, we’re going to cut through the thicket and understand this thing. Why? Mastering IAM is crucial to keeping your AWS environment secure and efficient. So, which policy type is the right one for the job? Ever scratched your head over when to use a service-linked role? Stick with me, and we’ll figure it out with a healthy dose of curiosity and a dash of common sense.

Understanding Policies and Roles

First things first. Let’s get our definitions straight. Think of policies as rulebooks. They are written in a language called JSON, and they define what actions are allowed or denied on which AWS resources. Simple enough, right?

Now, roles are a bit different. They’re like temporary access badges. An entity, be it a user, an application, or even an AWS service itself, can “wear” a role to gain specific permissions for a limited time. A user or a service is not granted permissions directly; it’s the role that has the permissions.

AWS Policy types

Now, let’s explore the different flavors of policies.

AWS Managed Policies

These are like the standard-issue rulebooks created and maintained by AWS itself. You can’t change them, just like you can’t rewrite the rules of physics! But AWS keeps them updated, which is quite handy.

  • Use Cases: Perfect for common scenarios. Need to give someone basic access to S3? There’s probably an AWS-managed policy for that.
  • Pros: Easy to use, always up-to-date, less work for you.
  • Cons: Inflexible, you’re stuck with what AWS provides.
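
Attaching one is a single call. A minimal sketch with an illustrative role name and one of the AWS managed policies for read-only S3 access:

aws iam attach-role-policy \
  --role-name my-app-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess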

Customer Managed Policies

These are your rulebooks. You write them, you modify them, you control them.

  • Use Cases: When you need fine-grained control, like granting access to a very specific resource or creating custom permissions for your application, this is your go-to choice.
  • Pros: Total control, flexible, adaptable to your unique needs.
  • Cons: More responsibility, you need to know what you’re doing. You’ll be in charge of updating and maintaining them.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-specific-bucket/*"
        }
    ]
}

This simple policy allows getting objects only from my-specific-bucket. You have to adapt it to your needs.
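
Once the JSON says what you want, you create the policy once and attach it wherever it is needed. A sketch, assuming the document above is saved as s3-read-policy.json; the policy and role names are illustrative:

# Create the customer managed policy from the JSON document
aws iam create-policy \
  --policy-name S3SpecificBucketRead \
  --policy-document file://s3-read-policy.json

# Attach it to any role that needs it
aws iam attach-role-policy \
  --role-name my-app-role \
  --policy-arn arn:aws:iam::123456789012:policy/S3SpecificBucketRead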

Inline Policies

These are like sticky notes attached directly to a user, group, or role. They’re tightly bound and can’t be reused.

  • Use Cases: For precise, one-time permissions. Imagine a developer who needs temporary access to a particular resource for a single task.
  • Pros: Highly specific, good for exceptions.
  • Cons: A nightmare to manage at scale, not reusable.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "dynamodb:DeleteItem",
            "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/MyTable"
        }
    ]
}

This policy is embedded directly in a single user and permits that user to delete items from the MyTable DynamoDB table. It does not apply to any other user or resource.
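
Inline policies are embedded with put-user-policy (or put-role-policy for roles) rather than created as standalone objects, which is exactly why they cannot be reused. A sketch, assuming the JSON above is saved as delete-item-policy.json; the user and policy names are illustrative:

aws iam put-user-policy \
  --user-name temp-developer \
  --policy-name AllowDeleteMyTableItems \
  --policy-document file://delete-item-policy.json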

Service-Linked Roles: the smooth operators

These are special roles pre-configured by AWS services to interact with other AWS services securely. You don’t create them, the service does.

  • Use Cases: Think of Auto Scaling needing to launch EC2 instances or Elastic Load Balancing managing resources on your behalf. It’s like giving your trusted assistant a special key to access specific rooms in your house.
  • Pros: Simplifies setup and ensures security best practices are followed. AWS takes care of these roles behind the scenes, so you don’t need to worry about them.
  • Cons: You can’t modify them directly. So, it’s essential to understand what they do.

aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name my-asg \
  --launch-template "LaunchTemplateId=lt-0123456789abcdef0,Version=1" \
  --min-size 1 \
  --max-size 3 \
  --vpc-zone-identifier "subnet-0123456789abcdef0" \
  --service-linked-role-arn arn:aws:iam::123456789012:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling

This command creates an Auto Scaling group, and the service-linked-role-arn parameter specifies the ARN of the service-linked role for Auto Scaling. It’s usually created automatically by the service when needed.

Best practices

  • Least Privilege: Always, always, always grant only the necessary permissions. It’s like giving out keys only to the rooms people need to access, not the entire house!
  • Regular Review: Things change. Regularly review your policies and roles to make sure they’re still appropriate.
  • Use the Right Tools: AWS provides tools like IAM Access Analyzer to help you manage this stuff. Use them!
  • Document Everything: Keep track of your policies and roles, their purpose, and why they were created. It will save you headaches later.

In sum

The right policy or role depends on the specific situation. Choose wisely, keep things tidy, and you will have a secure and well-organized AWS environment.