
The shortcuts I use on every project now, after learning that scale mostly changes the bill, not the mistakes.
Let me tell you how this started. I used to measure my productivity by how many AWS services I could haphazardly stitch together in a single afternoon. Big mistake.
One night, I was deploying what should have been a boring, routine feature. Nothing fancy. Just basic plumbing. Six hours later, I was still babysitting the deployment, clicking through the AWS console like a caffeinated lab rat, re-running scripts, and manually patching up tiny human errors.
That is when the epiphany hit me like a rogue server rack. I was not slow because AWS is a labyrinth of complexity. I was slow because I was doing things manually that AWS already knows how to do in its sleep.
The patterns below did not come from sanitized tutorials. They were forged in the fires of shipping systems under immense pressure and desperately wanting my weekends back.
Event-driven everything and absolutely no polling
If you are polling, you are essentially paying Jeff Bezos for the privilege of wasting your own time. Polling is the digital equivalent of sitting in the backseat of a car and constantly asking, “Are we there yet?” every five seconds.
AWS is an event machine. Treat it like one. Instead of writing cron jobs that anxiously ask the database if something changed, just let AWS tap you on the shoulder when it actually happens.
Where this shines:
- File uploads
- Database updates
- Infrastructure state changes
- Cross-account automation
Example of reacting to an S3 upload instantly:
def lambda_handler(event, context):
    for record in event['Records']:
        bucket_name = record['s3']['bucket']['name']
        object_key = record['s3']['object']['key']
        # Stop asking if the file is there. AWS just handed it to you.
        trigger_completely_automated_workflow(bucket_name, object_key)
No loops. No waiting. Just action.
Pro tip: Event-driven systems fail less frequently simply because they do less work. They are the lazy geniuses of the cloud world.
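The handler above only fires if the bucket is actually wired to the function. Here is a sketch of that wiring; the bucket name, Lambda ARN, and prefix are placeholders, and the Lambda also needs a resource policy allowing S3 to invoke it.

```python
# Sketch: configure an S3 bucket to invoke a Lambda on new uploads.
# Bucket, ARN, and prefix below are made-up examples, not real resources.

def build_upload_notification(lambda_arn, prefix="incoming/"):
    """Return an S3 notification configuration that fires on object creation."""
    return {
        "LambdaFunctionConfigurations": [{
            "LambdaFunctionArn": lambda_arn,
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {
                "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
            },
        }]
    }

# To apply it with boto3 (assumes the invoke permission already exists):
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_notification_configuration(
#     Bucket="my-upload-bucket",
#     NotificationConfiguration=build_upload_notification(
#         "arn:aws:lambda:eu-west-1:123456789012:function:HandleUpload"
#     ),
# )
```

Once this is in place, the "Are we there yet?" loop is gone for good: S3 calls you.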
Immutable deployments or nothing
SSH is not a deployment strategy. It is a desperate cry for help.
If your deployment plan involves SSH, SCP, or uttering the cursed phrase “just this one quick change in production”, you do not have a system. You have a fragile ecosystem built on hope and duct tape. I stopped “fixing” servers years ago. Now, I just murder them and replace them with fresh clones.
The pattern is brutally simple:
- Build once
- Deploy new
- Destroy old
Example of launching a new EC2 version programmatically:
import boto3

ec2_client = boto3.client('ec2', region_name='eu-west-1')

response = ec2_client.run_instances(
    ImageId='ami-0123456789abcdef0',  # Totally fake AMI
    InstanceType='t3a.nano',
    MinCount=1,
    MaxCount=1,
    TagSpecifications=[{
        'ResourceType': 'instance',
        'Tags': [{'Key': 'Purpose', 'Value': 'EphemeralClone'}]
    }]
)
It is like doing open-heart surgery. Instead of trying to fix the heart while the patient is running a marathon, just build a new patient with a healthy heart and disintegrate the old one. When something breaks, I do not debug the server. I debug the build process. That is where the real parasites live.
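The "destroy old" half of the pattern deserves its own sketch: find every running clone except the one you just launched, then terminate the rest. The tag name mirrors the launch example above, and the nested structure matches what `ec2.describe_instances()` returns.

```python
# Sketch of the "destroy old" step. The selection logic is pure Python so
# it can be tested without touching AWS; the boto3 calls are shown below.

def instances_to_terminate(reservations, keep_instance_id):
    """Collect running clone instance IDs, minus the freshly launched one."""
    doomed = []
    for reservation in reservations:
        for instance in reservation["Instances"]:
            if instance["InstanceId"] == keep_instance_id:
                continue
            if instance["State"]["Name"] != "running":
                continue
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            if tags.get("Purpose") == "EphemeralClone":
                doomed.append(instance["InstanceId"])
    return doomed

# With boto3:
# old = ec2_client.describe_instances(
#     Filters=[{"Name": "tag:Purpose", "Values": ["EphemeralClone"]}]
# )["Reservations"]
# ids = instances_to_terminate(old, new_instance_id)
# if ids:
#     ec2_client.terminate_instances(InstanceIds=ids)
```

Filtering by tag rather than by a hardcoded list is the whole trick: the fleet describes itself, so the teardown never goes stale.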
Infrastructure as code for the forgettable things
Most teams only use IaC for the big, glamorous stuff. VPCs. Kubernetes clusters. Massive databases.
This is completely backwards. It is like wearing a bespoke tuxedo but forgetting your underwear. The small, forgettable resources are the ones that will inevitably bite you when you least expect it.
What I automate with religious fervor:
- IAM roles
- Alarms
- Schedules
- Policies
- Log retention
Example of creating a CloudWatch alarm in code:
import boto3

cloudwatch = boto3.client('cloudwatch', region_name='eu-west-1')

cloudwatch.put_metric_alarm(
    AlarmName="QueueIsExploding",
    MetricName="ApproximateNumberOfMessagesVisible",
    Namespace="AWS/SQS",
    Threshold=10000,
    ComparisonOperator="GreaterThanThreshold",
    EvaluationPeriods=1,
    Period=300,
    Statistic="Sum"
)
If it matters in production, it lives in code. No exceptions.
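Log retention is a good example of a forgettable resource worth codifying, because CloudWatch Logs only accepts specific retention periods and silently keeping logs forever is the default. A sketch, with a placeholder log group name and the accepted values as documented at the time of writing:

```python
# Sketch: log retention as code. CloudWatch Logs rejects arbitrary day
# counts, so validate before calling the API. Log group name is made up.

ALLOWED_RETENTION_DAYS = {
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731,
    1096, 1827, 2192, 2557, 2922, 3288, 3653,
}  # per the PutRetentionPolicy docs; check for newer values

def retention_params(log_group, days):
    """Build the PutRetentionPolicy request, failing fast on bad values."""
    if days not in ALLOWED_RETENTION_DAYS:
        raise ValueError(f"{days} is not a valid CloudWatch retention period")
    return {"logGroupName": log_group, "retentionInDays": days}

# logs = boto3.client("logs")
# logs.put_retention_policy(**retention_params("/aws/lambda/DoOneThingWell", 30))
```

Thirty seconds of code, and no log group ever again accumulates three years of debug noise at full price.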
Let Step Functions own the flow
Early in my career, I crammed all my business logic into Lambdas. Retries, branching, timeouts, bizarre edge cases. I treated them like a digital junk drawer.
I do not do that anymore. Lambdas should be as dumb and fast as a golden retriever chasing a tennis ball.
The new rule: One Lambda equals one job. If you need a workflow, use Step Functions. They are the micromanaging middle managers your architecture desperately needs.
Example of a simple workflow state:
{
  "Type": "Task",
  "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:DoOneThingWell",
  "Retry": [
    {
      "ErrorEquals": ["States.TaskFailed"],
      "IntervalSeconds": 3,
      "MaxAttempts": 2
    }
  ],
  "Next": "CelebrateSuccess"
}
This separation makes debugging highly visual, makes retries explicit, and makes onboarding the new guy infinitely less painful. Your future self will thank you.
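For context, a task state only makes sense inside a full machine. Here is a minimal two-state definition wrapping that task, built as a Python dict and serialized for `create_state_machine`; the function names, account ID, and role ARN are placeholders.

```python
import json

# A minimal state machine around the task state shown above. The
# "CelebrateSuccess" state uses the standard ASL "Succeed" type.
# ARNs and names below are illustrative, not real.

definition = {
    "StartAt": "DoOneThingWell",
    "States": {
        "DoOneThingWell": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:DoOneThingWell",
            "Retry": [{
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 3,
                "MaxAttempts": 2,
            }],
            "Next": "CelebrateSuccess",
        },
        "CelebrateSuccess": {"Type": "Succeed"},
    },
}

# sfn = boto3.client("stepfunctions")
# sfn.create_state_machine(
#     name="OneJobPerLambda",
#     definition=json.dumps(definition),
#     roleArn="arn:aws:iam::123456789012:role/StepFunctionsExecutionRole",
# )
```

Keeping the definition as a dict in code means retries and branching live in version control, not in someone's memory.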
Kill cron jobs and use managed schedulers
Cron jobs are perfectly fine until they suddenly are not.
They are the ghosts of your infrastructure. They are completely invisible until they fail, and when they do fail, they die in absolute silence like a ninja with a sudden heart condition. AWS gives you managed scheduling. Just use it.
Why this is fundamentally faster:
- Central visibility
- Built-in retries
- IAM-native permissions
Example of creating a scheduled rule:
import boto3

eventbridge = boto3.client('events', region_name='eu-west-1')

eventbridge.put_rule(
    Name="TriggerNightlyChaos",
    ScheduleExpression="cron(0 2 * * ? *)",
    State="ENABLED",
    Description="Wakes up the system when nobody is looking"
)
Automation should be highly observable. Cron jobs are just waiting in the dark to ruin your Tuesday.
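One gotcha: a rule with no target fires into the void. The schedule only does something once you attach a target and grant EventBridge permission to invoke it; a sketch, with placeholder names and ARNs:

```python
# Sketch: attaching a Lambda target to the schedule rule above.
# Function name and ARN are made-up examples.

def nightly_target(function_arn):
    """Target entry linking the rule to a Lambda function."""
    return {"Id": "NightlyChaosLambda", "Arn": function_arn}

# eventbridge.put_targets(
#     Rule="TriggerNightlyChaos",
#     Targets=[nightly_target(
#         "arn:aws:lambda:eu-west-1:123456789012:function:NightlyChaos")],
# )
#
# The function also needs a resource policy allowing EventBridge to call it:
# lambda_client.add_permission(
#     FunctionName="NightlyChaos",
#     StatementId="AllowEventBridgeInvoke",
#     Action="lambda:InvokeFunction",
#     Principal="events.amazonaws.com",
# )
```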
Bake cost controls into automation
Speed without cost awareness is just a highly efficient way to bankrupt your employer. The fastest teams I have ever worked with were not just shipping fast. They were failing cheaply.
What I automate now with the ruthlessness of a debt collector:
- Budget alerts
- Resource TTLs
- Auto-shutdowns for non-production environments
Example of tagging resources with an expiration date:
import boto3

ec2 = boto3.client('ec2', region_name='eu-west-1')

ec2.create_tags(
    Resources=['i-0deadbeef12345678'],
    Tags=[
        {"Key": "TerminateAfter", "Value": "2026-12-31"},
        {"Key": "Owner", "Value": "TheVoid"}
    ]
)
Leaving resources without an owner or an expiration date is like leaving the stove on, except this stove bills you by the millisecond. Anything without a TTL is just technical debt waiting to invoice you.
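The tag only bites if something enforces it. A sketch of the reaper logic, run on a schedule: the tag name matches the example above, and dates are assumed to be YYYY-MM-DD.

```python
from datetime import date

# Sketch of the reaper that makes TerminateAfter mean something. The
# decision is pure Python; the terminate loop with boto3 is shown below.

def is_expired(tags, today=None):
    """True if the resource's TerminateAfter tag is strictly in the past."""
    today = today or date.today()
    tag_map = {t["Key"]: t["Value"] for t in tags}
    ttl = tag_map.get("TerminateAfter")
    if ttl is None:
        return False  # untagged resources need a different policy
    return date.fromisoformat(ttl) < today

# On a daily schedule (EventBridge rule + Lambda, for instance):
# for reservation in ec2.describe_instances()["Reservations"]:
#     for instance in reservation["Instances"]:
#         if is_expired(instance.get("Tags", [])):
#             ec2.terminate_instances(InstanceIds=[instance["InstanceId"]])
```

Pair the reaper with the scheduler pattern from earlier and the cleanup runs itself.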
A quote I live by: “Automation does not cut costs by magic. It cuts costs by quietly preventing the expensive little mistakes humans call normal.”
The death of the cloud hero
These patterns did not make me faster because they are particularly clever. They made me faster because they completely eliminated the need to make decisions.
Less clicking. Less remembering. Absolutely zero heroics.
If you want to move ten times faster on AWS, stop asking what to build next and start asking what you can stop doing by hand. Once automation is in charge, real speed arrives as work you no longer have to remember.
