AWS for DevOps Engineers: Complete Guide to Cloud Automation
📅 Published: June 2026
⏱️ Estimated Reading Time: 28 minutes
🏷️ Tags: AWS, DevOps, Cloud Computing, EC2, S3, IAM, Lambda, CloudFormation
Introduction: Why AWS for DevOps?
Amazon Web Services (AWS) is the world's most comprehensive cloud platform. For DevOps engineers, AWS provides building blocks to create scalable, reliable, and automated infrastructure.
Why DevOps engineers need AWS:
AWS is one of the most widely adopted cloud platforms among enterprises
DevOps job postings frequently list AWS experience as a requirement
AWS services are designed for automation and infrastructure as code
The AWS ecosystem has tools for every DevOps need
This guide covers the essential AWS services every DevOps engineer must know, organized by how they are used in DevOps workflows.
Part 1: Core Compute Services
EC2 (Elastic Compute Cloud)
EC2 provides virtual servers in the cloud. It is the foundation of most AWS applications.
Key EC2 Concepts:
| Concept | Description |
|---|---|
| AMI | Amazon Machine Image (template for EC2 instances) |
| Instance Type | CPU, memory, storage configuration (t2.micro, m5.large, etc.) |
| Security Group | Virtual firewall for instance |
| Key Pair | SSH access credentials |
| EBS Volume | Persistent block storage |
| User Data | Script that runs at launch |
EC2 CLI Commands:
    # List instances
    aws ec2 describe-instances

    # Launch instance (the AMI, key pair, and security group IDs are placeholders)
    aws ec2 run-instances \
      --image-id ami-0c55b159cbfafe1f0 \
      --instance-type t2.micro \
      --key-name MyKeyPair \
      --security-group-ids sg-12345678 \
      --user-data file://user-data.sh

    # Start/stop/terminate
    aws ec2 start-instances --instance-ids i-1234567890abcdef0
    aws ec2 stop-instances --instance-ids i-1234567890abcdef0
    aws ec2 terminate-instances --instance-ids i-1234567890abcdef0
Terraform Example:
    resource "aws_instance" "web" {
      ami                    = "ami-0c55b159cbfafe1f0"
      instance_type          = "t2.micro"
      key_name               = aws_key_pair.my_key.key_name
      vpc_security_group_ids = [aws_security_group.web.id]

      # Runs once at first boot
      user_data = <<-EOF
        #!/bin/bash
        yum update -y
        yum install -y nginx
        systemctl start nginx
      EOF

      tags = {
        Name        = "web-server"
        Environment = "production"
      }
    }
EC2 Auto Scaling
Auto Scaling automatically adjusts the number of EC2 instances based on demand.
    resource "aws_autoscaling_group" "web" {
      name             = "web-asg"
      min_size         = 2
      max_size         = 10
      desired_capacity = 2

      launch_template {
        id      = aws_launch_template.web.id
        version = "$Latest"
      }

      vpc_zone_identifier = aws_subnet.public[*].id

      tag {
        key                 = "Name"
        value               = "web-server"
        propagate_at_launch = true
      }
    }

    # Simple scaling policy: add one instance each time the alarm fires
    resource "aws_autoscaling_policy" "cpu_policy" {
      name                   = "cpu-policy"
      scaling_adjustment     = 1
      adjustment_type        = "ChangeInCapacity"
      cooldown               = 300
      autoscaling_group_name = aws_autoscaling_group.web.name
    }

    # Fires when average CPU across the group exceeds 80% for two 120-second periods
    resource "aws_cloudwatch_metric_alarm" "high_cpu" {
      alarm_name          = "high-cpu"
      comparison_operator = "GreaterThanThreshold"
      evaluation_periods  = 2
      metric_name         = "CPUUtilization"
      namespace           = "AWS/EC2"
      period              = 120
      statistic           = "Average"
      threshold           = 80

      dimensions = {
        AutoScalingGroupName = aws_autoscaling_group.web.name
      }

      alarm_actions = [aws_autoscaling_policy.cpu_policy.arn]
    }
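To make the interaction between the alarm and the scaling policy concrete, here is a minimal Python sketch of the decision logic. It is a simplified model, not how AWS implements it: the class and function names are hypothetical, the cooldown is ignored, and only the scale-out direction from the configuration above is modeled.

```python
from dataclasses import dataclass


@dataclass
class ScalingConfig:
    # Values mirror the Terraform example; the class itself is hypothetical.
    min_size: int = 2
    max_size: int = 10
    threshold: float = 80.0      # CPU percent
    evaluation_periods: int = 2  # consecutive breaching periods required
    scaling_adjustment: int = 1  # ChangeInCapacity step


def alarm_breached(cpu_samples, cfg):
    """True if the most recent `evaluation_periods` samples all exceed the threshold."""
    recent = cpu_samples[-cfg.evaluation_periods:]
    return len(recent) == cfg.evaluation_periods and all(
        sample > cfg.threshold for sample in recent
    )


def next_capacity(current, cpu_samples, cfg):
    """Capacity after one alarm evaluation, clamped to the group's min/max."""
    if alarm_breached(cpu_samples, cfg):
        current += cfg.scaling_adjustment
    return max(cfg.min_size, min(cfg.max_size, current))
```

For example, `next_capacity(2, [85, 90], ScalingConfig())` adds one instance, while a single breaching sample leaves the capacity unchanged.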
Part 2: Storage Services
S3 (Simple Storage Service)
S3 is object storage for any type of data. It offers virtually unlimited capacity and is designed for 99.999999999% (11 nines) durability.
S3 Storage Classes:
| Class | Use Case | Retrieval |
|---|---|---|
| Standard | Frequently accessed data | Instant |
| Standard-IA | Infrequently accessed | Instant |
| One Zone-IA | Non-critical, infrequent | Instant |
| Glacier Instant Retrieval | Archive, fast access | Milliseconds |
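| Glacier Flexible Retrieval | Archive, occasional access | Minutes to hours (1-5 minutes expedited, 3-5 hours standard, 5-12 hours bulk) |
| Glacier Deep Archive | Long-term archive, rare access | Within 12 hours (standard) or 48 hours (bulk) |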
S3 CLI Commands:
    # Create bucket
    aws s3 mb s3://my-bucket --region us-east-1

    # List buckets
    aws s3 ls

    # Copy files
    aws s3 cp file.txt s3://my-bucket/
    aws s3 sync ./local-folder s3://my-bucket/folder/

    # Set bucket policy
    aws s3api put-bucket-policy --bucket my-bucket --policy file://policy.json
Terraform Example:
    resource "aws_s3_bucket" "data" {
      bucket = "my-app-data-${var.environment}"

      tags = {
        Environment = var.environment
        ManagedBy   = "Terraform"
      }
    }

    resource "aws_s3_bucket_versioning" "data" {
      bucket = aws_s3_bucket.data.id

      versioning_configuration {
        status = "Enabled"
      }
    }

    resource "aws_s3_bucket_server_side_encryption_configuration" "data" {
      bucket = aws_s3_bucket.data.id

      rule {
        apply_server_side_encryption_by_default {
          sse_algorithm = "AES256"
        }
      }
    }

    resource "aws_s3_bucket_lifecycle_configuration" "data" {
      bucket = aws_s3_bucket.data.id

      rule {
        id     = "transition-to-ia"
        status = "Enabled"

        filter {} # empty filter: the rule applies to every object in the bucket

        transition {
          days          = 30
          storage_class = "STANDARD_IA"
        }
      }

      rule {
        id     = "archive-old"
        status = "Enabled"

        filter {}

        transition {
          days          = 90
          storage_class = "GLACIER"
        }
      }
    }
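The effect of the two lifecycle rules can be sketched in a few lines of Python. This is an illustrative model only: the function name is hypothetical, and real lifecycle transitions run asynchronously, so an object may move somewhat later than the exact day shown here.

```python
# (days, storage_class) pairs mirroring the two transition rules above
TRANSITIONS = [(30, "STANDARD_IA"), (90, "GLACIER")]


def storage_class_for_age(age_days, transitions=TRANSITIONS):
    """Return the storage class an object is expected to be in at a given age."""
    current = "STANDARD"
    # Apply transitions in order of age; the last one reached wins.
    for days, storage_class in sorted(transitions):
        if age_days >= days:
            current = storage_class
    return current
```

Under these assumptions, an object stays in Standard for its first 30 days, sits in Standard-IA until day 90, and is archived after that.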
EBS (Elastic Block Store)
EBS provides block storage volumes for EC2 instances.
    resource "aws_ebs_volume" "data" {
      # A volume can only be attached to an instance in the same Availability Zone
      availability_zone = aws_instance.web.availability_zone
      size              = 100 # GiB
      type              = "gp3"

      tags = {
        Name = "data-volume"
      }
    }

    resource "aws_volume_attachment" "data_attach" {
      device_name = "/dev/sdf"
      volume_id   = aws_ebs_volume.data.id
      instance_id = aws_instance.web.id
    }
Part 3: Networking Services
VPC (Virtual Private Cloud)
VPC is your private network in AWS.
    # Availability Zones in the current region (referenced by the subnets below)
    data "aws_availability_zones" "available" {
      state = "available"
    }

    resource "aws_vpc" "main" {
      cidr_block           = "10.0.0.0/16"
      enable_dns_hostnames = true
      enable_dns_support   = true

      tags = {
        Name = "main-vpc"
      }
    }

    # Subnets
    resource "aws_subnet" "public" {
      count                   = 2
      vpc_id                  = aws_vpc.main.id
      cidr_block              = "10.0.${count.index}.0/24"
      availability_zone       = data.aws_availability_zones.available.names[count.index]
      map_public_ip_on_launch = true

      tags = {
        Name = "public-subnet-${count.index + 1}"
      }
    }

    # Internet Gateway
    resource "aws_internet_gateway" "main" {
      vpc_id = aws_vpc.main.id

      tags = {
        Name = "main-igw"
      }
    }

    # Route Table
    resource "aws_route_table" "public" {
      vpc_id = aws_vpc.main.id

      route {
        cidr_block = "0.0.0.0/0"
        gateway_id = aws_internet_gateway.main.id
      }

      tags = {
        Name = "public-rt"
      }
    }

    # Route Table Association
    resource "aws_route_table_association" "public" {
      count          = 2
      subnet_id      = aws_subnet.public[count.index].id
      route_table_id = aws_route_table.public.id
    }
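The subnet addressing above carves consecutive /24 blocks out of the 10.0.0.0/16 VPC range. A short sketch with Python's standard `ipaddress` module shows the same arithmetic; the function names are hypothetical, and the usable-address count relies on AWS reserving five addresses in every subnet.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses plus the last one


def public_subnet_cidrs(vpc_cidr, count, new_prefix=24):
    """Return the first `count` subnets of the VPC range, like 10.0.<index>.0/24."""
    subnets = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    return [str(next(subnets)) for _ in range(count)]


def usable_addresses(subnet_cidr):
    """Addresses left for resources after AWS's per-subnet reservations."""
    return ipaddress.ip_network(subnet_cidr).num_addresses - AWS_RESERVED_PER_SUBNET
```

Calling `public_subnet_cidrs("10.0.0.0/16", 2)` yields the same two CIDR blocks that `count = 2` produces in the Terraform example.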
Load Balancers
Application Load Balancer (ALB):
    resource "aws_lb" "web" {
      name               = "web-alb"
      internal           = false
      load_balancer_type = "application"
      security_groups    = [aws_security_group.alb.id]
      subnets            = aws_subnet.public[*].id
    }

    resource "aws_lb_target_group" "web" {
      name     = "web-tg"
      port     = 80
      protocol = "HTTP"
      vpc_id   = aws_vpc.main.id

      health_check {
        path                = "/health"
        port                = "traffic-port"
        healthy_threshold   = 2
        unhealthy_threshold = 2
        timeout             = 5  # seconds
        interval            = 30 # seconds
      }
    }

    resource "aws_lb_listener" "web" {
      load_balancer_arn = aws_lb.web.arn
      port              = 80
      protocol          = "HTTP"

      default_action {
        type             = "forward"
        target_group_arn = aws_lb_target_group.web.arn
      }
    }
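The `healthy_threshold` and `unhealthy_threshold` settings mean a target changes state only after that many consecutive check results disagree with its current state. The following Python sketch models that behavior under simplifying assumptions: the function name is hypothetical, and timeouts, intervals, and the load balancer's initial registration handling are left out.

```python
def track_health(results, healthy_threshold=2, unhealthy_threshold=2,
                 initially_healthy=False):
    """Return the target's health state after each check result (True = passed)."""
    healthy = initially_healthy
    streak = 0  # consecutive results that disagree with the current state
    states = []
    for passed in results:
        if passed == healthy:
            streak = 0
        else:
            streak += 1
            needed = healthy_threshold if passed else unhealthy_threshold
            if streak >= needed:
                healthy = passed
                streak = 0
        states.append(healthy)
    return states
```

A single failed check between passing checks does not take the target out of rotation; two failures in a row do.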
Part 4: Compute - Serverless
Lambda
Lambda runs code without provisioning servers.
    # lambda_function.py
    import json


    def lambda_handler(event, context):
        # Log the incoming event (print output goes to CloudWatch Logs)
        print(f"Received event: {json.dumps(event)}")

        return {
            'statusCode': 200,
            'body': json.dumps({
                'message': 'Hello from Lambda!',
                'event': event
            })
        }
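Because a handler is an ordinary Python function, it can be exercised locally before deployment. The sketch below repeats the handler and adds a hypothetical `invoke_locally` helper that passes `None` as the context, which works here only because this handler never reads it.

```python
import json


def lambda_handler(event, context):
    print(f"Received event: {json.dumps(event)}")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "Hello from Lambda!", "event": event}),
    }


def invoke_locally(event):
    """Call the handler directly and decode its JSON body (no AWS involved)."""
    response = lambda_handler(event, None)
    return response["statusCode"], json.loads(response["body"])


if __name__ == "__main__":
    status, body = invoke_locally({"key": "value"})
    print(status, body["message"])
```

This kind of direct call fits well into unit tests; it does not exercise IAM permissions, timeouts, or the deployed runtime.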
Terraform Lambda Deployment:
    # Package the handler into a zip archive
    data "archive_file" "lambda" {
      type        = "zip"
      source_file = "lambda_function.py"
      output_path = "lambda.zip"
    }

    resource "aws_lambda_function" "my_function" {
      filename         = data.archive_file.lambda.output_path
      source_code_hash = data.archive_file.lambda.output_base64sha256 # redeploy when the code changes
      function_name    = "my-function"
      role             = aws_iam_role.lambda.arn
      handler          = "lambda_function.lambda_handler"
      runtime          = "python3.12" # use a runtime version that AWS still supports

      environment {
        variables = {
          ENVIRONMENT = var.environment
        }
      }
    }

    # Execution role that the Lambda service is allowed to assume
    resource "aws_iam_role" "lambda" {
      name = "lambda-role"

      assume_role_policy = jsonencode({
        Version = "2012-10-17"
        Statement = [{
          Action = "sts:AssumeRole"
          Effect = "Allow"
          Principal = {
            Service = "lambda.amazonaws.com"
          }
        }]
      })
    }
Part 5: Security - IAM
IAM (Identity and Access Management)
IAM controls who can access what.
    # IAM User
    resource "aws_iam_user" "developer" {
      name = "developer"
    }

    # IAM Group
    resource "aws_iam_group" "developers" {
      name = "developers"
    }

    resource "aws_iam_group_membership" "developers" {
      name  = "developers-membership"
      users = [aws_iam_user.developer.name]
      group = aws_iam_group.developers.name
    }

    # IAM Policy
    resource "aws_iam_policy" "s3_read_only" {
      name = "s3-read-only"

      policy = jsonencode({
        Version = "2012-10-17"
        Statement = [
          {
            Effect = "Allow"
            Action = [
              "s3:GetObject",
              "s3:ListBucket"
            ]
            Resource = [
              "arn:aws:s3:::my-bucket",
              "arn:aws:s3:::my-bucket/*"
            ]
          }
        ]
      })
    }

    resource "aws_iam_group_policy_attachment" "developers_s3" {
      group      = aws_iam_group.developers.name
      policy_arn = aws_iam_policy.s3_read_only.arn
    }

    # IAM Role for EC2
    resource "aws_iam_role" "ec2_role" {
      name = "ec2-role"

      assume_role_policy = jsonencode({
        Version = "2012-10-17"
        Statement = [{
          Action = "sts:AssumeRole"
          Effect = "Allow"
          Principal = {
            Service = "ec2.amazonaws.com"
          }
        }]
      })
    }

    resource "aws_iam_instance_profile" "ec2_profile" {
      name = "ec2-profile"
      role = aws_iam_role.ec2_role.name
    }

    # Attach policy to role
    resource "aws_iam_role_policy_attachment" "ec2_s3" {
      role       = aws_iam_role.ec2_role.name
      policy_arn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
    }
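To illustrate what the `s3-read-only` policy permits, here is a deliberately simplified Python sketch of allow-statement matching. It is not the real IAM evaluation logic: explicit denies, conditions, permission boundaries, and case-insensitive action matching are all ignored, and `is_allowed` is a hypothetical helper. Anything not matched by an Allow statement is treated as denied, which mirrors IAM's default deny.

```python
from fnmatch import fnmatchcase

# Same statement as the s3-read-only policy above
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"],
        }
    ],
}


def is_allowed(policy, action, resource):
    """True if any Allow statement matches both the action and the resource."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        # Wildcard matching approximates IAM's `*` and `?` patterns.
        action_ok = any(fnmatchcase(action, a) for a in statement["Action"])
        resource_ok = any(fnmatchcase(resource, r) for r in statement["Resource"])
        if action_ok and resource_ok:
            return True
    return False  # default deny
```

For real policies, the IAM policy simulator is the authoritative way to check access; this sketch only shows why both the bucket ARN and the `/*` object ARN appear in the policy.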
Part 6: Database Services
RDS (Relational Database Service)
    resource "aws_db_instance" "postgres" {
      identifier        = "my-database"
      engine            = "postgres"
      engine_version    = "15.3"
      instance_class    = "db.t3.micro"
      allocated_storage = 20 # GiB
      storage_type      = "gp3"
      storage_encrypted = true

      db_name  = "myapp"
      username = "dbadmin" # "admin" is a reserved word for RDS PostgreSQL
      password = random_password.db_password.result

      vpc_security_group_ids = [aws_security_group.database.id]
      db_subnet_group_name   = aws_db_subnet_group.main.name

      backup_retention_period = 7 # days
      backup_window           = "03:00-04:00"
      maintenance_window      = "Mon:04:00-Mon:05:00"

      # Keep a final snapshot and block deletion in production only
      skip_final_snapshot = var.environment != "prod"
      deletion_protection = var.environment == "prod"

      tags = {
        Environment = var.environment
      }
    }

    resource "random_password" "db_password" {
      length  = 16
      special = false
    }

    # Assumes private subnets (aws_subnet.private) are defined elsewhere
    resource "aws_db_subnet_group" "main" {
      name       = "main-subnet-group"
      subnet_ids = aws_subnet.private[*].id
    }
Part 7: Monitoring - CloudWatch
CloudWatch Monitoring
    # CloudWatch Metric Alarm
    # (named instance_high_cpu so it does not collide with the Auto Scaling alarm in Part 1)
    resource "aws_cloudwatch_metric_alarm" "instance_high_cpu" {
      alarm_name          = "high-cpu-alarm"
      comparison_operator = "GreaterThanThreshold"
      evaluation_periods  = 2
      metric_name         = "CPUUtilization"
      namespace           = "AWS/EC2"
      period              = 300
      statistic           = "Average"
      threshold           = 80

      dimensions = {
        InstanceId = aws_instance.web.id
      }

      alarm_actions = [aws_sns_topic.alerts.arn]
    }

    # SNS Topic for Alerts
    resource "aws_sns_topic" "alerts" {
      name = "alerts-topic"
    }

    # Email subscriptions must be confirmed by the recipient before alerts arrive
    resource "aws_sns_topic_subscription" "email" {
      topic_arn = aws_sns_topic.alerts.arn
      protocol  = "email"
      endpoint  = "alerts@example.com"
    }

    # CloudWatch Dashboard
    resource "aws_cloudwatch_dashboard" "main" {
      dashboard_name = "main-dashboard"

      dashboard_body = jsonencode({
        widgets = [
          {
            type = "metric"
            properties = {
              metrics = [
                ["AWS/EC2", "CPUUtilization", "InstanceId", aws_instance.web.id]
              ]
              period = 300
              stat   = "Average"
              region = "us-west-2"
              title  = "CPU Utilization"
            }
          }
        ]
      })
    }

    # CloudWatch Log Group
    resource "aws_cloudwatch_log_group" "application" {
      name              = "/aws/application/myapp"
      retention_in_days = 30
    }
Part 8: Infrastructure as Code - CloudFormation
CloudFormation Basics
CloudFormation is AWS's native Infrastructure as Code tool.
    # template.yaml
    AWSTemplateFormatVersion: '2010-09-09'
    Description: 'S3 Bucket with Versioning'

    Parameters:
      Environment:
        Type: String
        Default: dev
        AllowedValues:
          - dev
          - staging
          - prod

    Resources:
      MyBucket:
        Type: AWS::S3::Bucket
        Properties:
          BucketName: !Sub my-app-${Environment}-${AWS::AccountId}
          VersioningConfiguration:
            Status: Enabled
          Tags:
            - Key: Environment
              Value: !Ref Environment
            - Key: ManagedBy
              Value: CloudFormation

    Outputs:
      BucketName:
        Description: Name of the S3 bucket
        Value: !Ref MyBucket
      BucketArn:
        Description: ARN of the S3 bucket
        Value: !GetAtt MyBucket.Arn
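The template combines an `AllowedValues` constraint with a `!Sub` expression to build the bucket name. As a rough illustration of what CloudFormation resolves at deploy time, the Python sketch below reproduces both steps; `render_bucket_name` is a hypothetical helper, and the length check reflects S3's 3 to 63 character limit on bucket names.

```python
ALLOWED_ENVIRONMENTS = ("dev", "staging", "prod")  # mirrors AllowedValues


def render_bucket_name(environment, account_id):
    """Mimic `!Sub my-app-${Environment}-${AWS::AccountId}` with basic validation."""
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"Environment must be one of {ALLOWED_ENVIRONMENTS}")
    name = f"my-app-{environment}-{account_id}"
    if not 3 <= len(name) <= 63:
        raise ValueError("S3 bucket names must be 3 to 63 characters long")
    return name
```

Including the account ID in the name is a common way to keep bucket names globally unique across accounts.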
CloudFormation CLI:
    # Create stack
    aws cloudformation create-stack \
      --stack-name my-stack \
      --template-body file://template.yaml \
      --parameters ParameterKey=Environment,ParameterValue=prod

    # Update stack (reuse the existing parameter value so it does not fall back to the default)
    aws cloudformation update-stack \
      --stack-name my-stack \
      --template-body file://template.yaml \
      --parameters ParameterKey=Environment,UsePreviousValue=true

    # Delete stack
    aws cloudformation delete-stack --stack-name my-stack

    # Describe stack
    aws cloudformation describe-stacks --stack-name my-stack
Part 9: DevOps Tools
CodePipeline (CI/CD)
    Parameters:
      # Hypothetical parameters: pre-existing IAM roles and an S3 artifact bucket
      BuildServiceRoleArn:
        Type: String
      PipelineRoleArn:
        Type: String
      ArtifactBucketName:
        Type: String

    Resources:
      SourceRepo:
        Type: AWS::CodeCommit::Repository
        Properties:
          RepositoryName: my-app-repo
          RepositoryDescription: Source code repository

      BuildProject:
        Type: AWS::CodeBuild::Project
        Properties:
          Name: my-app-build
          ServiceRole: !Ref BuildServiceRoleArn
          Source:
            Type: CODEPIPELINE # source is handed over by the pipeline
          Artifacts:
            Type: CODEPIPELINE
          Environment:
            Type: LINUX_CONTAINER
            ComputeType: BUILD_GENERAL1_SMALL
            Image: aws/codebuild/amazonlinux2-x86_64-standard:5.0

      Pipeline:
        Type: AWS::CodePipeline::Pipeline
        Properties:
          Name: my-app-pipeline
          RoleArn: !Ref PipelineRoleArn
          ArtifactStore:
            Type: S3
            Location: !Ref ArtifactBucketName
          Stages:
            - Name: Source
              Actions:
                - Name: SourceAction
                  ActionTypeId:
                    Category: Source
                    Owner: AWS
                    Provider: CodeCommit
                    Version: '1'
                  Configuration:
                    # !Ref on a repository returns its ID, so use the Name attribute
                    RepositoryName: !GetAtt SourceRepo.Name
                    BranchName: main
                  OutputArtifacts:
                    - Name: SourceOutput
            - Name: Build
              Actions:
                - Name: BuildAction
                  ActionTypeId:
                    Category: Build
                    Owner: AWS
                    Provider: CodeBuild
                    Version: '1'
                  Configuration:
                    ProjectName: !Ref BuildProject
                  InputArtifacts:
                    - Name: SourceOutput
                  OutputArtifacts:
                    - Name: BuildOutput
CodeDeploy (Deployment Automation)
    # appspec.yml
    version: 0.0
    os: linux
    files:
      - source: /
        destination: /var/www/html
    hooks:
      # Hooks are listed in the order CodeDeploy runs them
      ApplicationStop:
        - location: scripts/stop_server.sh
          timeout: 300
          runas: root
      BeforeInstall:
        - location: scripts/install_dependencies.sh
          timeout: 300
          runas: root
      ApplicationStart:
        - location: scripts/start_server.sh
          timeout: 300
          runas: root
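The position of a hook in `appspec.yml` does not decide when it runs; CodeDeploy follows a fixed lifecycle order for in-place EC2 deployments. The Python sketch below, with a hypothetical `hooks_in_run_order` helper, sorts a set of hook names into that order. `DownloadBundle` and `Install` are included for completeness even though they are handled by the CodeDeploy agent and cannot have scripts attached.

```python
# Lifecycle events of an in-place EC2/on-premises deployment, in execution order
LIFECYCLE_ORDER = [
    "ApplicationStop",
    "DownloadBundle",   # reserved for the CodeDeploy agent
    "BeforeInstall",
    "Install",          # reserved for the CodeDeploy agent
    "AfterInstall",
    "ApplicationStart",
    "ValidateService",
]


def hooks_in_run_order(appspec_hooks):
    """Return the hook names from an appspec `hooks` mapping in execution order."""
    return [name for name in LIFECYCLE_ORDER if name in appspec_hooks]


if __name__ == "__main__":
    hooks = {
        "BeforeInstall": "scripts/install_dependencies.sh",
        "ApplicationStart": "scripts/start_server.sh",
        "ApplicationStop": "scripts/stop_server.sh",
    }
    print(hooks_in_run_order(hooks))
```

One consequence worth knowing: the `ApplicationStop` script that runs is the one from the previously deployed revision, so a fix to `stop_server.sh` takes effect one deployment later.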
Part 10: Cost Management
AWS Cost Explorer CLI
    # Get cost for one month (End is exclusive, so this range covers May 2025)
    aws ce get-cost-and-usage \
      --time-period Start=2025-05-01,End=2025-06-01 \
      --granularity MONTHLY \
      --metrics "BlendedCost" \
      --group-by Type=DIMENSION,Key=SERVICE

    # Get daily cost for the last 7 days (date -d requires GNU date)
    aws ce get-cost-and-usage \
      --time-period Start=$(date -d '7 days ago' +%Y-%m-%d),End=$(date +%Y-%m-%d) \
      --granularity DAILY \
      --metrics "UnblendedCost"

    # Cost by tag (the tag must be activated as a cost allocation tag)
    aws ce get-cost-and-usage \
      --time-period Start=2025-05-01,End=2025-06-01 \
      --granularity MONTHLY \
      --metrics "BlendedCost" \
      --group-by Type=TAG,Key=Environment
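Hard-coded dates go stale, and `date -d` is not portable across platforms. As one possible alternative, this Python sketch computes the `Start` and `End` values with the standard library; the function names are hypothetical. The resulting dictionaries use the same keys as the `--time-period` option and could, for instance, be passed as the `TimePeriod` argument to boto3's Cost Explorer `get_cost_and_usage` call.

```python
from datetime import date, timedelta


def previous_month_period(today):
    """Start/End for the full calendar month before `today` (End is exclusive)."""
    first_of_this_month = today.replace(day=1)
    last_of_previous = first_of_this_month - timedelta(days=1)
    return {
        "Start": last_of_previous.replace(day=1).isoformat(),
        "End": first_of_this_month.isoformat(),
    }


def last_n_days_period(today, days=7):
    """Start/End covering the `days` days before `today`."""
    return {
        "Start": (today - timedelta(days=days)).isoformat(),
        "End": today.isoformat(),
    }
```

Stepping back one day from the first of the month handles year boundaries without any special casing.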
Budget Setup
    resource "aws_budgets_budget" "monthly" {
      name              = "monthly-budget"
      budget_type       = "COST"
      limit_amount      = "1000"
      limit_unit        = "USD"
      time_period_start = "2025-01-01_00:00"
      time_unit         = "MONTHLY"

      # Alert when actual spend passes 80% of the limit
      notification {
        comparison_operator        = "GREATER_THAN"
        threshold                  = 80
        threshold_type             = "PERCENTAGE"
        notification_type          = "ACTUAL"
        subscriber_email_addresses = ["alerts@example.com"]
      }

      # Alert when forecasted spend passes 100% of the limit
      notification {
        comparison_operator        = "GREATER_THAN"
        threshold                  = 100
        threshold_type             = "PERCENTAGE"
        notification_type          = "FORECASTED"
        subscriber_email_addresses = ["alerts@example.com"]
      }
    }
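The two notification blocks compare different quantities against the limit: one looks at actual spend, the other at forecasted spend. A small Python sketch makes the thresholds explicit; `triggered_notifications` is a hypothetical helper that only models the comparison, not how AWS Budgets computes its forecast.

```python
# (notification_type, threshold percentage) pairs from the budget above
NOTIFICATION_RULES = [("ACTUAL", 80), ("FORECASTED", 100)]


def triggered_notifications(limit, actual, forecasted, rules=NOTIFICATION_RULES):
    """Return the rules whose spend is strictly GREATER_THAN the threshold amount."""
    spend = {"ACTUAL": actual, "FORECASTED": forecasted}
    fired = []
    for notification_type, threshold_pct in rules:
        if spend[notification_type] > limit * threshold_pct / 100:
            fired.append((notification_type, threshold_pct))
    return fired
```

With a 1000 USD limit, actual spend of 850 USD triggers the first alert, while spend of exactly 800 USD does not because the comparison is strict.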
AWS DevOps Best Practices
Security
Enable MFA for all IAM users
Use IAM roles (instance profiles) on EC2 instead of long-lived access keys
Encrypt everything (S3, EBS, RDS)
Rotate credentials regularly
Enable CloudTrail for auditing
Cost Optimization
Use Auto Scaling to match capacity to demand
Use Spot Instances for fault-tolerant workloads
Use Reserved Instances for steady-state workloads
Set up budgets and alerts
Delete unused resources
Reliability
Deploy across multiple Availability Zones
Use load balancers for high availability
Implement health checks
Create a disaster recovery plan
Practice failure scenarios
Performance
Right-size instances based on metrics
Use caching (ElastiCache, CloudFront)
Optimize database queries
Use CDN for static content
Operational Excellence
Infrastructure as Code (CloudFormation/Terraform)
Immutable infrastructure
Automated testing
Comprehensive monitoring
Regular backups
AWS CLI Cheat Sheet
    # EC2
    aws ec2 describe-instances
    aws ec2 start-instances --instance-ids i-xxx
    aws ec2 stop-instances --instance-ids i-xxx

    # S3
    aws s3 ls
    aws s3 cp file.txt s3://bucket/
    aws s3 sync ./local s3://bucket/

    # IAM
    aws iam list-users
    aws iam create-user --user-name newuser

    # CloudFormation
    aws cloudformation create-stack --stack-name my-stack --template-body file://template.yaml
    aws cloudformation describe-stacks --stack-name my-stack

    # Lambda
    aws lambda list-functions
    aws lambda invoke --function-name my-function output.json

    # RDS (engine, storage, and master credentials are required for a new instance)
    aws rds describe-db-instances
    aws rds create-db-instance --db-instance-identifier mydb --db-instance-class db.t3.micro \
      --engine postgres --allocated-storage 20 \
      --master-username dbadmin --manage-master-user-password

    # CloudWatch (get-metric-statistics requires a time range, period, and statistic)
    aws cloudwatch put-metric-data --namespace MyApp --metric-name CPU --value 75
    aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name CPUUtilization \
      --dimensions Name=InstanceId,Value=i-xxx \
      --start-time 2025-06-01T00:00:00Z --end-time 2025-06-02T00:00:00Z \
      --period 300 --statistics Average
AWS Services Map for DevOps
┌─────────────────────────────────────────────────────────────────┐
│                         Developer Tools                         │
│ CodeCommit │ CodeBuild │ CodeDeploy │ CodePipeline │ CodeStar   │
├─────────────────────────────────────────────────────────────────┤
│                             Compute                             │
│ EC2 │ ECS │ EKS │ Lambda │ Lightsail │ Batch │ Elastic Beanstalk│
├─────────────────────────────────────────────────────────────────┤
│                             Storage                             │
│ S3 │ EBS │ EFS │ Glacier │ Storage Gateway │ FSx                │
├─────────────────────────────────────────────────────────────────┤
│                            Database                             │
│ RDS │ DynamoDB │ Aurora │ Redshift │ ElastiCache │ DocumentDB   │
├─────────────────────────────────────────────────────────────────┤
│                           Networking                            │
│ VPC │ CloudFront │ Route 53 │ API Gateway │ Direct Connect      │
├─────────────────────────────────────────────────────────────────┤
│                            Security                             │
│ IAM │ KMS │ Secrets Manager │ WAF │ Shield │ GuardDuty          │
├─────────────────────────────────────────────────────────────────┤
│                           Monitoring                            │
│ CloudWatch │ CloudTrail │ X-Ray │ Config │ Health Dashboard     │
└─────────────────────────────────────────────────────────────────┘
Summary
| Category | Key Services | Primary Use |
|---|---|---|
| Compute | EC2, Lambda, ECS | Run applications |
| Storage | S3, EBS, EFS | Store data |
| Networking | VPC, ALB, Route 53 | Connect resources |
| Database | RDS, DynamoDB | Store structured data |
| Security | IAM, KMS, WAF | Control access |
| Monitoring | CloudWatch, X-Ray | Observe systems |
| CI/CD | CodePipeline, CodeBuild | Automate delivery |
| IaC | CloudFormation, CDK | Infrastructure as code |
Mastering these core AWS services will enable you to build, deploy, and operate applications at scale.
Learn More
Practice AWS DevOps with hands-on exercises in our interactive labs:
https://devops.trainwithsky.com/