EC2

Secure and simplify EC2 access with AWS Session Manager

Accessing EC2 instances used to be a hassle. Bastion hosts, SSH keys, firewall rules, each piece added another layer of complexity and potential security risks. You had to open ports, distribute keys, and constantly manage access. It felt like setting up an intricate vault just to perform simple administrative tasks.

AWS Session Manager changes the game entirely. No exposed ports, no key distribution nightmares, and a complete audit trail of every session. Think of it as replacing traditional keys and doors with a secure, on-demand teleportation system, one that logs everything.

How AWS Session Manager works

Session Manager is part of AWS Systems Manager, a fully managed service that provides secure, browser-based, and CLI-based access to EC2 instances without needing SSH or RDP. Here’s how it works:

An SSM Agent runs on the instance and communicates outbound to AWS Systems Manager.
When you start a session, AWS verifies your identity and permissions using IAM.
Once authorized, a secure channel is created between your local machine and the instance, without opening any inbound ports.

This approach significantly reduces the attack surface. There is no need to open port 22 (SSH) or 3389 (RDP) for bastion hosts. Moreover, since authentication and authorization are managed by IAM policies, you no longer have to distribute or rotate SSH keys.

Setting up AWS Session Manager

Getting started with Session Manager is straightforward. Here’s a step-by-step guide:

1. Ensure the SSM agent is installed

Most modern Amazon Machine Images (AMIs) come with the SSM Agent pre-installed. If yours doesn’t, install it manually using the following command (for Amazon Linux, Ubuntu, or RHEL):

sudo yum install -y amazon-ssm-agent
sudo systemctl enable amazon-ssm-agent
sudo systemctl start amazon-ssm-agent

2. Create an IAM Role for EC2

Your EC2 instance needs an IAM role to communicate with AWS Systems Manager. Attach a policy that grants at least the following permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:StartSession"
      ],
      "Resource": [
        "arn:aws:ec2:REGION:ACCOUNT_ID:instance/INSTANCE_ID"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "ssm:TerminateSession",
        "ssm:ResumeSession"
      ],
      "Resource": [
        "arn:aws:ssm:REGION:ACCOUNT_ID:session/${aws:username}-*"
      ]
    }
  ]
}

Replace REGION, ACCOUNT_ID, and INSTANCE_ID with your actual values. For best security practices, apply the principle of least privilege by restricting access to specific instances or tags.

3. Connect to your instance

Once the IAM role is attached, you’re ready to connect.

From the AWS Console: Navigate to EC2 > Instances, select your instance, click Connect, and choose Session Manager.

From the AWS CLI: Run:

aws ssm start-session --target i-xxxxxxxxxxxxxxxxx

That’s it, no SSH keys, no VPNs, no open ports.

Built-in security and auditing

Session Manager doesn’t just improve security, it also enhances compliance and auditing. Every session can be logged to Amazon S3 or CloudWatch Logs, capturing a full record of all executed commands. This ensures complete visibility into who accessed which instance and what actions were taken.

To enable logging, navigate to AWS Systems Manager > Session Manager, configure Session Preferences, and enable logging to an S3 bucket or CloudWatch Log Group.

Why Session Manager is better than traditional methods

Let’s compare Session Manager with traditional access methods:

Feature	Bastion Host & SSH	AWS Session Manager
Open inbound ports	Yes (22, 3389)	No
Requires SSH keys	Yes	No
Key rotation required	Yes	No
Logs session activity	Manual setup	Built-in
Works for on-premises	No	Yes

Session Manager removes unnecessary complexity. No more juggling bastion hosts, no more worrying about expired SSH keys, and no more open ports that expose your infrastructure to unnecessary risks.

Real-World applications and operational Benefits

Session Manager is not just a theoretical improvement, it delivers real-world value in multiple scenarios:

Developers can quickly access production or staging instances without security concerns.
System administrators can perform routine maintenance without managing SSH key distribution.
Security teams gain complete visibility into instance access and command history.
Hybrid cloud environments benefit from unified access across AWS and on-premises infrastructure.

With these advantages, Session Manager aligns perfectly with modern cloud-native security principles, helping teams focus on operations rather than infrastructure headaches.

In summary

AWS Session Manager isn’t just another tool, it’s a fundamental shift in how we access EC2 instances securely. If you’re still relying on bastion hosts and SSH keys, it’s time to rethink your approach.Try it out, configure logging, and experience a simpler, more secure way to manage your instances. You might never go back to the old ways.

February 14, 2025 by Fernando SRE Cloud stuff SRE stuff

Boost Performance and Resilience with AWS EC2 Placement Groups

There’s a hidden art to placing your EC2 instances in AWS. It’s not just about spinning up machines and hoping for the best, where they land in AWS’s vast infrastructure can make all the difference in performance, resilience, and cost. This is where Placement Groups come in.

You might have deployed instances before without worrying about placement, and for many workloads, that’s perfectly fine. But when your application needs lightning-fast communication, fault tolerance, or optimized performance, Placement Groups become a critical tool in your AWS arsenal.

Let’s break it down.

What are Placement Groups?

AWS Placement Groups give you control over how your EC2 instances are positioned within AWS’s data centers. Instead of leaving it to chance, you can specify how close, or how far apart, your instances should be placed. This helps optimize either latency, fault tolerance, or a balance of both.

There are three types of Placement Groups: Cluster, Spread, and Partition. Each serves a different purpose, and choosing the right one depends on your application’s needs.

Types of Placement Groups and when to use them

Cluster Placement Groups for speed over everything

Think of Cluster Placement Groups like a Formula 1 pit crew. Every millisecond counts, and your instances need to communicate at breakneck speeds. AWS achieves this by placing them on the same physical hardware, minimizing latency, and maximizing network throughput.

This is perfect for:
✅ High-performance computing (HPC) clusters
✅ Real-time financial trading systems
✅ Large-scale data processing (big data, AI, and ML workloads)

⚠️ The Trade-off: While these instances talk to each other at lightning speed, they’re all packed together on the same hardware. If that hardware fails, everything inside the Cluster Placement Group goes down with it.

Spread Placement Groups for maximum resilience

Now, imagine you’re managing a set of VIP guests at a high-profile event. Instead of seating them all at the same table (risking one bad spill ruining their night), you spread them out across different areas. That’s what Spread Placement Groups do, they distribute instances across separate physical machines to reduce the impact of hardware failure.

Best suited for:
✅ Mission-critical applications that need high availability
✅ Databases requiring redundancy across multiple nodes
✅ Low-latency, fault-tolerant applications

⚠️ The Limitation: AWS allows only seven instances per Availability Zone in a Spread Placement Group. If your application needs more, you may need to rethink your architecture.

Partition Placement Groups, the best of both worlds approach

Partition Placement Groups work like a warehouse with multiple sections, each with its power supply. If one section loses power, the others keep running. AWS follows the same principle, grouping instances into multiple partitions spread across different racks of hardware. This provides both high performance and resilience, a sweet spot between Cluster and Spread Placement Groups.

Best for:
✅ Distributed databases like Cassandra, HDFS, or Hadoop
✅ Large-scale analytics workloads
✅ Applications needing both performance and fault tolerance

⚠️ AWS’s Partitioning Rule: The number of partitions you can use depends on the AWS Region, and you must carefully plan how instances are distributed.

How to Configure Placement Groups

Setting up a Placement Group is straightforward, and you can do it using the AWS Management Console, AWS CLI, or an SDK.

Example using AWS CLI

Let’s create a Cluster Placement Group:

aws ec2 create-placement-group --group-name my-cluster-group --strategy cluster

Now, launch an instance into the group:

aws ec2 run-instances --image-id ami-12345678 --count 1 --instance-type c5.large --placement GroupName=my-cluster-group

For Spread and Partition Placement Groups, simply change the strategy:

aws ec2 create-placement-group --group-name my-spread-group --strategy spread
aws ec2 create-placement-group --group-name my-partition-group --strategy partition

Best practices for using Placement Groups

🚀 Combine with Multi-AZ Deployments: Placement Groups work within a single Availability Zone, so consider spanning multiple AZs for maximum resilience.

📊 Monitor Network Performance: AWS doesn’t guarantee placement if your instance type isn’t supported or there’s insufficient capacity. Always benchmark your performance after deployment.

💰 Balance Cost and Performance: Cluster Placement Groups give the fastest network speeds, but they also increase failure risk. If high availability is critical, Spread or Partition Groups might be a better fit.

Final thoughts

AWS Placement Groups are a powerful but often overlooked feature. They allow you to maximize performance, minimize downtime, and optimize costs, but only if you choose the right type.

The next time you deploy EC2 instances, don’t just launch them randomly, placement matters. Choose wisely, and your infrastructure will thank you for it.

Understanding and using AWS EC2 status checks

Picture yourself running a restaurant. Every morning before opening, you would check different things: Are the refrigerators working? Is there power in the building? Does the kitchen equipment function properly? These checks ensure your restaurant can serve customers effectively. Similarly, Amazon Web Services (AWS) performs various checks on your EC2 instances to ensure they’re running smoothly. Let’s break this down in simple terms.

What are EC2 status checks?

Think of EC2 status checks as your instance’s health monitoring system. Just like a doctor checks your heart rate, blood pressure, and temperature, AWS continuously monitors different aspects of your EC2 instances. These checks happen automatically every minute, and best of all, they are free!

The three types of status checks

1. System status checks as the building inspector

System status checks are like a building inspector. They focus on the infrastructure rather than what is happening in your instance. These checks monitor:

The physical server’s power supply
Network connectivity
System software
Hardware components

When a system status check fails, it is usually an issue outside your control. It is akin to when your apartment building loses power – there’s not much you can do personally to fix it. In these cases, AWS is responsible for the repairs.

What can you do if it fails?

Wait for AWS to fix the underlying problem (similar to waiting for the power company to restore electricity).
You can move your instance to a new “building” by stopping and starting it (note: this is different from simply rebooting).

2. Instance status checks as your personal space monitor

Instance status checks are like having a smart home system that monitors what is happening inside your apartment. These checks look at:

Your instance’s operating system
Network configuration
Software settings
Memory usage
File system status
Kernel compatibility

When these checks fail, it typically means there’s an issue you need to address. It is similar to accidentally tripping a circuit breaker in your apartment – the infrastructure is fine, but the problem is within your own space.

How to fix instance status check failures:

Restart your instance (like resetting that tripped circuit breaker).
Review and modify your instance configuration.
Make sure your instance has enough memory.
Check for corrupted file systems and repair them if needed.

3. EBS status checks as your storage guardian

EBS status checks are like monitoring your external storage unit. They monitor the health of your attached storage volumes and can detect issues like:

Hardware problems with the storage system
Connectivity problems between your instance and its storage
Physical host issues affecting storage access

What to do if EBS checks fail:

Restart your instance to try to restore connectivity.
Replace problematic EBS volumes.
Check and fix any connectivity issues.

How to monitor these checks

Monitoring status checks is straightforward, and you have several options:

Using the AWS management console
- Open the EC2 console.
- Select your instance.
- Look at the “Status Checks” tab.

It’s that simple! You’ll see either a green check (passing) or a red X (failing) for each type of check.

Setting up automated monitoring

Now, here’s where things get interesting. You can set up Amazon CloudWatch to alert you if something goes wrong. It is like having a security system that notifies you if there is an issue.

Here’s a simple example:

aws cloudwatch put-metric-alarm \
  --alarm-name "Instance-Health-Check" \
  --namespace "AWS/EC2" \
  --metric-name "StatusCheckFailed" \
  --dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:sns:region:account-id:topic-name

Each parameter here has its purpose:

–alarm-name: The name of your alarm.
–namespace and –metric-name: These identify the CloudWatch metric you are interested in.
–dimensions: Specifies the instance ID being monitored.
–period and –evaluation-periods: Define how often to check and for how long.
–threshold and –comparison-operator: Set the condition for triggering an alarm.
–alarm-actions: The action to take if the alarm state is triggered, like notifying you via SNS.

You could also set up these alarms through the AWS Management Console, which offers an intuitive UI for configuring CloudWatch.

Best practices for status checks

1. Don’t wait for problems

Set up CloudWatch alarms for all critical instances.
Monitor trends in status check results.
Document common issues and their solutions to improve response times.

2. Automate recovery

Configure automatic recovery actions for system status check failures.
Create automated backup systems and recovery procedures.
Test recovery processes regularly to ensure they work when needed.

3. Keep records

Log all status check failures.
Document steps taken to resolve issues.
Track recurring problems and implement solutions to prevent future failures.

Cost considerations

The good news? Status checks themselves are free! However, some recovery actions might incur costs, such as:

Starting and stopping instances (which might change your public IP).
Data transfer costs during recovery.
Additional EBS volumes if replacements are needed.

Real-World example

Imagine you receive an alert at 3 AM about a failed system status check. Here is how you might handle it:

Check the AWS status page: See if there is a broader AWS issue.
If it is isolated to your instance:
- Stop and start the instance (not just reboot).
- Check if the issue persists once the instance moves to new hardware.
If the problem continues:
- Review instance logs for more clues.
- Contact AWS Support if the issue is beyond your expertise or remains unresolved.

Final thoughts

EC2 status checks are your early warning system for potential problems. They are simple to understand but incredibly powerful for keeping your applications running smoothly. By monitoring these checks and setting up appropriate alerts, you can catch and address problems before they impact your users.

Remember: the best problems are the ones you prevent, not the ones you fix. Regular monitoring and proper setup of status checks will help you sleep better at night, knowing your instances are being watched over.

Next time you log into your AWS console, take a moment to check your status checks. They’re like a 24/7 health monitoring system for your cloud infrastructure, ensuring you maintain a healthy, reliable system.

November 15, 2024 by Fernando SRE Cloud stuff DevOps stuff

Exploring Containerization on AWS: Insights into ECS, EKS, Fargate, and ECR

Imagine exploring a vast universe, not of stars and galaxies, but of containers and cloud services. In AWS, this universe is populated by stellar services like ECS, EKS, Fargate, and ECR. Each, with its unique characteristics, serves different purposes, like stars in the constellation of cloud computing.

ECS: The Versatile Heart of AWS, ECS is like an experienced team of astronauts, managing entire fleets of containers efficiently. Picture a global logistics company using ECS to coordinate real-time shipping operations. Each container is a digital package, precisely transported to its destination. The scalability and security of ECS ensure that, even on the busiest days, like Black Friday, everything flows smoothly.

EKS: Kubernetes Orchestration in AWS, Think of EKS as a galactic explorer, harnessing the power of Kubernetes within the AWS cosmos. A university hospital uses EKS to manage electronic medical records. Like an advanced navigation system, EKS directs information through complex routes, maintaining the integrity and security of critical data, even as it expands into new territories of research and treatment.

Fargate: Containers without Server Chains, Fargate is like the anti-gravity of container services: it removes the weight of managing servers. Imagine a TV network using Fargate to broadcast live events. Like a spaceship that automatically adjusts to space conditions, Fargate scales resources to handle millions of viewers without the network having to worry about technical details.

ECR: The Image Warehouse in AWS Space, Finally, ECR can be seen as a digital archive in space, where container images are securely stored. A gaming startup stores versions of its software in ECR, ready to be deployed at any time. Like a well-organized archive, ECR allows this company to quickly retrieve what it needs, ensuring the latest games hit the market faster.

The Elegant Transition: From Complex Orchestration to Streamlined Efficiency

ECS: When Precision and Control Matter, Use ECS when you need fine-grained control over your container orchestration. It’s like choosing a manual transmission over automatic; you get to decide exactly how your containers run, network, and scale. It’s perfect for customized workflows and specific performance needs, much like a tailor-made suit.

EKS: For the Kubernetes Enthusiasts, Opt for EKS when you’re already invested in Kubernetes or when you need its specific features and community-driven plugins. It’s like using a Swiss Army knife; it offers flexibility and a range of tools, ideal for complex applications that require Kubernetes’ extensibility.

Fargate: Simplicity and Efficiency First, Choose Fargate when you want to focus on your application rather than infrastructure. It’s akin to flying autopilot; you define your destination (application), and Fargate handles the journey (server and cluster management). It’s best for straightforward applications where efficiency and ease of use are paramount.

ECR: Enhanced Container Registry for Docker and OCI Images

Leverage ECR for a secure, scalable environment to store and manage not just your Docker images but also OCI (Open Container Initiative) images. Envision ECR as a high-security vault that caters to the most utilized image format in the industry while also embracing the versatility of OCI standards. This dual compatibility ensures seamless integration with ECS and EKS and positions ECR as a comprehensive solution for modern container image management—crucial for organizations committed to security and forward compatibility.

Synthesizing Our Cosmic AWS Voyage

In this expedition through AWS’s container services, we’ve not only explored the distinct capabilities of ECS, EKS, Fargate, and ECR but also illuminated the scenarios where each shines brightest. Like celestial guides in the vast expanse of cloud computing, these services offer tailored paths to stellar solutions.

Choosing between them is less about picking the ‘best’ and more about aligning with your specific mission needs. Whether it’s the tailored precision of ECS, the expansive toolkit of EKS, the streamlined simplicity of Fargate, or the secure repository of ECR, each service is a specialized instrument in our technological odyssey.

Remember, understanding these services is not just about comprehending their technicalities but about appreciating their place in the grand scheme of cloud innovation. They are not just tools; they are the building blocks of modern digital architectures, each playing a pivotal role in scripting the future of technology.

December 9, 2023 by Fernando SRE Cloud stuff DevOps stuff Kubernetes