Building Highly Available AWS Infrastructure: ECS with EC2 - Part 2
Move beyond traditional EC2 to containerized deployments with Amazon ECS. Discover the power of containers and the complexity of managing two-dimensional scaling: both your instances AND your tasks.
In Part 1, we built a solid foundation with Application Load Balancers, Auto Scaling Groups, and EC2 instances. You now have a highly available setup that can handle traffic spikes and instance failures.
But there's a problem: every time you deploy a new version of your application, you're deploying an entire EC2 instance. Want to run multiple services on the same infrastructure? You'll need to manage complex deployment scripts, port conflicts, and dependency isolation.
Enter Amazon Elastic Container Service (ECS)—AWS's container orchestration platform that changes the game entirely. In this article, we'll explore how ECS with the EC2 launch type gives you the power of containers while introducing a fascinating new challenge: two-dimensional scaling.
🐳 The Container Revolution
Before we dive into ECS, let's understand why containers matter.
Traditional Deployment (What We Did in Part 1)
EC2 Instance
├── Operating System
├── Python 3.11
├── Your App v1.0
├── Dependencies
└── Configuration
Deploy process: Build new AMI → Update launch template → Trigger instance refresh
Problems:
- Slow deployments (minutes to launch new instances)
- Large deployment units (entire OS + app)
- Difficult to run multiple apps on same instance
- Environment drift ("works on my machine")
Container Deployment
EC2 Instance (Container Host)
├── Operating System
├── Docker Runtime
└── Containers
├── Container 1: Web App v1.0
├── Container 2: API Service v2.3
└── Container 3: Background Worker v1.5
Deploy process: Build new Docker image → Update ECS task definition → Deploy new tasks
Benefits:
- Fast deployments (seconds to start a container)
- Small deployment units (just your app + dependencies)
- Multiple isolated apps on same instance
- Consistency across environments
🏗️ ECS Architecture: The Big Picture
Amazon ECS adds a new layer of abstraction between your application and your infrastructure:
Application Load Balancer
↓
ECS Service
↓
ECS Tasks (Containers)
↓
EC2 Instances (Container Hosts)
Let's break down the new concepts:
ECS Cluster
A logical grouping of container instances (EC2 instances running the ECS agent)
ECS Task Definition
A blueprint for your application—a JSON manifest (similar in spirit to a docker-compose file) that specifies:
- Which Docker images to run
- CPU and memory requirements
- Port mappings
- Environment variables
- Logging configuration
ECS Task
A running instance of a task definition—essentially your actual running containers
ECS Service
Ensures a specified number of tasks are running and integrates with the ALB
🎯 From EC2 to ECS: The Migration
Let's take our Part 1 setup and containerize it. Here's what changes:
Before: Traditional EC2 Setup
# Launch Template with user data
#!/bin/bash
yum update -y
yum install -y python3
pip3 install -r requirements.txt
python3 app.py
After: ECS with EC2
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8080
CMD ["python", "app.py"]
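The article never shows app.py itself. As a hypothetical stand-in, here is a minimal stdlib-only server that would satisfy the Dockerfile's `CMD` and `EXPOSE 8080`, including a `/health` path that an ALB target group health check could hit:

```python
# app.py — hypothetical minimal stand-in for the application this Dockerfile runs.
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # /health is the path an ALB health check would probe
        body = b'{"status":"ok"}' if self.path == "/health" else b"Hello from ECS"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep container stdout quiet for this sketch

def run(port: int = 8080) -> None:
    """Entry point; a main guard calling run() makes `python app.py` serve on 8080."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```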
// ECS Task Definition
{
"family": "web-app",
"networkMode": "bridge", // Common for EC2; awsvpc is required on Fargate, optional on EC2
"requiresCompatibilities": ["EC2"],
"cpu": "1024", // Adjust based on your workload
"memory": "1536", // Leave room for OS and ECS agent
"containerDefinitions": [{
"name": "web-app-container",
"image": "123456789.dkr.ecr.us-east-1.amazonaws.com/web-app:latest",
"cpu": 768, // Container-level allocation
"memory": 1024,
"portMappings": [{
"containerPort": 8080, // Matches EXPOSE 8080 in the Dockerfile
"hostPort": 8080, // Set to 0 for dynamic host ports in bridge mode
"protocol": "tcp"
}],
"environment": [
{"name": "ENV", "value": "production"}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/web-app",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
}
}]
}
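ECS rejects a task definition whose container-level cpu or memory exceeds the task-level values. A quick pre-flight check you might run locally before registering (a hypothetical helper, not an AWS API):

```python
def validate_task_def(task_def: dict) -> list:
    """Return sizing problems: container-level cpu/memory must fit the task level."""
    problems = []
    task_cpu = int(task_def["cpu"])
    task_mem = int(task_def["memory"])
    for c in task_def["containerDefinitions"]:
        if c.get("cpu", 0) > task_cpu:
            problems.append(c["name"] + ": container cpu exceeds task cpu")
        if c.get("memory", 0) > task_mem:
            problems.append(c["name"] + ": container memory exceeds task memory")
    return problems
```

With the sizes from the task definition above (768/1024 inside 1024/1536), the check returns an empty list.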
🚀 Setting Up ECS with EC2: Step by Step
Step 1: Create an ECS Cluster
aws ecs create-cluster \
--cluster-name production-cluster \
--tags key=Environment,value=Production
Step 2: Create EC2 Instances for ECS
The key difference: these EC2 instances run the ECS agent and register with your cluster.
# Launch Template for ECS-optimized instances
# Note: ImageId must be an ECS-optimized AMI; UserData (base64) configures the ECS agent
aws ec2 create-launch-template \
--launch-template-name ecs-instance-template \
--launch-template-data '{
"ImageId": "ami-0c55b159cbfafe1f0",
"InstanceType": "t3.medium",
"IamInstanceProfile": {
"Name": "ecsInstanceRole"
},
"UserData": "<base64-encoded>",
"SecurityGroupIds": ["sg-12345678"]
}'
The user data script tells the instance which cluster to join:
#!/bin/bash
echo ECS_CLUSTER=production-cluster >> /etc/ecs/ecs.config
echo ECS_ENABLE_TASK_IAM_ROLE=true >> /etc/ecs/ecs.config
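The launch template's "UserData" field must be base64-encoded. A small sketch of producing that value from the script above:

```python
import base64

# Encode the ECS cluster-join script for the launch template's "UserData" field.
user_data_script = """#!/bin/bash
echo ECS_CLUSTER=production-cluster >> /etc/ecs/ecs.config
echo ECS_ENABLE_TASK_IAM_ROLE=true >> /etc/ecs/ecs.config
"""
encoded_user_data = base64.b64encode(user_data_script.encode()).decode()
```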
Step 3: Create Auto Scaling Group for ECS Instances
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name ecs-instance-asg \
--launch-template LaunchTemplateName=ecs-instance-template \
--min-size 2 \
--max-size 10 \
--desired-capacity 2 \
--vpc-zone-identifier "subnet-1a,subnet-1b,subnet-1c" \
--tags Key=Name,Value=ecs-container-host
Step 4: Create ECS Task Definition
aws ecs register-task-definition \
--cli-input-json file://task-definition.json
Step 5: Create ECS Service
This is where it gets interesting. The ECS service manages your tasks and integrates with the ALB:
aws ecs create-service \
--cluster production-cluster \
--service-name web-app-service \
--task-definition web-app:1 \
--desired-count 4 \
--launch-type EC2 \
--load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:...,containerName=web-app-container,containerPort=8080" \
--health-check-grace-period-seconds 60 \
--placement-constraints type=distinctInstance # Only 1 task per EC2 instance
📍 Task Placement: Constraints and Strategies
You can control where ECS places tasks:
Placement Constraints - Hard rules that tasks MUST satisfy:
# Only 1 task per instance (prevents resource contention)
--placement-constraints type=distinctInstance
# Only place on instances with specific attributes
--placement-constraints "type=memberOf,expression=attribute:ecs.instance-type == t3.small"
Placement Strategies - Soft preferences for task distribution:
# Spread tasks across availability zones (high availability)
--placement-strategies type=spread,field=attribute:ecs.availability-zone
# Spread tasks across instances (within each AZ)
--placement-strategies type=spread,field=instanceId
# Pack tasks tightly (bin-packing for cost efficiency)
--placement-strategies type=binpack,field=cpu # or memory
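To build intuition for spread vs binpack, here is a toy placement chooser. This is a simplification for illustration, not the actual ECS scheduler, which evaluates constraints and strategies in sequence:

```python
def place(instances, task_cpu, strategy):
    """Pick an instance for a task needing task_cpu CPU units, or None if PENDING.

    instances: {instance_id: {"free_cpu": int, "tasks": int}}
    """
    candidates = {i: s for i, s in instances.items() if s["free_cpu"] >= task_cpu}
    if not candidates:
        return None  # no capacity anywhere: the task would sit in PENDING
    if strategy == "spread":
        # fewest running tasks first: even distribution across instances
        return min(candidates, key=lambda i: candidates[i]["tasks"])
    if strategy == "binpack":
        # least free CPU first: pack tightly so whole instances stay empty
        return min(candidates, key=lambda i: candidates[i]["free_cpu"])
    raise ValueError(f"unknown strategy: {strategy}")
```

With one busy and one empty host, spread picks the empty one while binpack keeps filling the busy one.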
Real-World Example: High Availability Setup
aws ecs create-service \
--cluster production-cluster \
--service-name web-app-service \
--task-definition web-app:1 \
--desired-count 4 \
--placement-constraints type=distinctInstance \
--placement-strategies \
type=spread,field=attribute:ecs.availability-zone \
type=spread,field=instanceId
Why this configuration?
✅ distinctInstance: Each task gets dedicated instance resources
✅ Spread by AZ: Survives entire AZ failure
✅ Spread by instance: Distributes load evenly
✅ True blue/green deploys: Can double task count without resource conflicts
Trade-offs:
- ❌ Less efficient bin-packing (can't stack multiple tasks per instance)
- ❌ Need more EC2 instances for same task count
- ✅ Predictable performance (no "noisy neighbor" tasks)
- ✅ Clean deployments (1 new instance = 1 new task)
When to use distinctInstance:
- Resource-intensive workloads (.NET apps, data processing)
- Smaller instance types (t3.small, t3.medium)
- Predictable performance requirements
- Blue/green deployment strategy
When NOT to use it:
- Large instances (can host many small tasks efficiently)
- Lightweight workloads (Node.js microservices)
- Cost optimization priority over performance isolation
🎭 The Two-Dimensional Scaling Challenge
Here's where ECS with EC2 gets complex. You now have TWO things that need to scale:
Dimension 1: ECS Tasks (Your Containers)
What: The number of running containers executing your application code
Scaling triggers:
- Request count per task
- CPU utilization of tasks
- Memory utilization of tasks
- Custom CloudWatch metrics
Example: You have 4 tasks running, each handling 100 requests/second. Traffic doubles → you need 8 tasks.
Dimension 2: EC2 Instances (Container Hosts)
What: The number of EC2 instances providing compute capacity to run your containers
Scaling triggers:
- Cluster CPU reservation
- Cluster memory reservation
- Number of pending tasks (tasks that can't be placed due to insufficient capacity)
Example: Your 2 EC2 instances can each host 2 tasks. You now need 8 tasks → you need 4 EC2 instances.
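The instance math is a ceiling division; a one-liner makes the relationship explicit:

```python
import math

def instances_needed(task_count, tasks_per_instance):
    """Container hosts required for a task count, rounded up (a partial host counts)."""
    return math.ceil(task_count / tasks_per_instance)
```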
🤔 The Chicken and Egg Problem
Imagine this scenario:
Current State:
- 2 EC2 instances (t3.medium with 2 vCPU, 4 GB RAM each)
- 4 ECS tasks running (each needs 1 vCPU, 1 GB RAM)
- Each instance is hosting 2 tasks
Traffic Spike:
- ECS Service Auto Scaling triggers: "Need 8 tasks!"
- ECS tries to place 4 more tasks...
- 🚨 ERROR: Insufficient CPU/memory on existing instances
- Tasks go into PENDING state
- Your application can't scale to meet demand!
You've hit the wall. Your task scaling is blocked by insufficient instance capacity.
The Solution: Multi-Layer Scaling
You need both layers to scale in coordination:
Layer 1: ECS Service Auto Scaling
Scales the number of tasks based on application metrics:
# Create Target Tracking Scaling Policy for Tasks
aws application-autoscaling register-scalable-target \
--service-namespace ecs \
--resource-id service/production-cluster/web-app-service \
--scalable-dimension ecs:service:DesiredCount \
--min-capacity 2 \
--max-capacity 20
aws application-autoscaling put-scaling-policy \
--service-namespace ecs \
--resource-id service/production-cluster/web-app-service \
--scalable-dimension ecs:service:DesiredCount \
--policy-name cpu-tracking \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration '{
"TargetValue": 75.0,
"PredefinedMetricSpecification": {
"PredefinedMetricType": "ECSServiceAverageCPUUtilization"
},
"ScaleInCooldown": 300,
"ScaleOutCooldown": 60
}'
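The target tracking policy above adjusts DesiredCount roughly in proportion to how far the metric sits from TargetValue. A simplified model of the scale-out math (the real CloudWatch algorithm also applies cooldowns and is more conservative on scale-in):

```python
import math

def target_tracking_desired(current_tasks, metric_value, target_value=75.0):
    """Approximate new DesiredCount after one target tracking evaluation."""
    return max(1, math.ceil(current_tasks * metric_value / target_value))
```

Four tasks running at 150% CPU against a 75% target would roughly double to eight.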
Layer 2: Capacity Provider Auto Scaling
Automatically scales EC2 instances based on task demand:
# Create Capacity Provider
aws ecs create-capacity-provider \
--name ecs-ec2-capacity-provider \
--auto-scaling-group-provider '{
"autoScalingGroupArn": "arn:aws:autoscaling:...",
"managedScaling": {
"status": "ENABLED",
"targetCapacity": 100, // 100 = scale to exact demand (1-task-per-instance model); lower keeps spare headroom
"minimumScalingStepSize": 1,
"maximumScalingStepSize": 10, // Faster scale-out
"instanceWarmupPeriod": 180 // 3 min for app initialization
},
"managedTerminationProtection": "ENABLED"
}'
# Associate with Cluster
aws ecs put-cluster-capacity-providers \
--cluster production-cluster \
--capacity-providers ecs-ec2-capacity-provider \
--default-capacity-provider-strategy capacityProvider=ecs-ec2-capacity-provider,weight=1,base=2
What happens now:
- Traffic increases → Task CPU hits 75%
- ECS Service Auto Scaling adds more tasks
- Cluster capacity crosses the capacity provider's targetCapacity threshold (80% in this example) → Capacity Provider triggers
- EC2 Auto Scaling Group launches new instances
- New instances join cluster
- Pending tasks get placed on new instances
- Your application successfully scales! 🎉
📊 Understanding Capacity Provider Metrics
The targetCapacity setting is critical. Conceptually (AWS surfaces this as the CapacityProviderReservation metric), it uses this formula:
Cluster Capacity = (Running Tasks * 100) / (Total Available Capacity)
Example:
- 4 EC2 instances, each can host 4 tasks = 16 total capacity
- 12 tasks running
- Capacity = (12 * 100) / 16 = 75%
If targetCapacity = 80: Still OK, no scaling needed
If task count increases to 14:
- Capacity = (14 * 100) / 16 = 87.5%
- 🚀 Exceeds 80% → Capacity Provider scales out instances!
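The worked example above can be checked in a few lines (hypothetical helper names):

```python
def cluster_capacity(running_tasks, total_task_slots):
    """Percent of the cluster's task slots in use, per the formula above."""
    return running_tasks * 100.0 / total_task_slots

def should_scale_out(running_tasks, total_task_slots, target_capacity=80.0):
    """True once utilization crosses the capacity provider's targetCapacity."""
    return cluster_capacity(running_tasks, total_task_slots) > target_capacity
```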
🔄 The Complete Scaling Flow
Let's trace a complete scaling event:
09:00 AM - Normal Traffic
├── 2 EC2 instances
├── 4 ECS tasks
└── Cluster capacity: 50%
09:15 AM - Traffic Spike Begins
├── ALB sees increased requests
├── Task CPU utilization increases to 80%
└── ⚡ ECS Service Auto Scaling triggers
09:16 AM - Task Scaling
├── Service desired count: 4 → 8
├── ECS tries to place 4 new tasks
├── 2 tasks placed successfully
├── 2 tasks PENDING (insufficient capacity)
└── Cluster capacity: 75% → 🔔 Approaching threshold
09:17 AM - Capacity Provider Triggers
├── Cluster capacity exceeds 80%
├── Capacity Provider signals Auto Scaling Group
└── ASG desired capacity: 2 → 4
09:18 AM - New Instances Launch
├── 2 new EC2 instances launching
└── 2 tasks still PENDING
09:20 AM - Instances Join Cluster
├── New instances run ECS agent
├── Instances register with cluster
├── Available capacity increases
└── ⚡ PENDING tasks get placed
09:21 AM - Fully Scaled
├── 4 EC2 instances
├── 8 ECS tasks (all RUNNING)
├── Cluster capacity: 50%
└── Traffic handled successfully
🔵🟢 Blue/Green Deployments with Capacity Providers
With capacity providers and proper configuration, you get automatic blue/green deployments:
// Service Deployment Configuration
{
"deploymentConfiguration": {
"maximumPercent": 200, // Can double task count temporarily
"minimumHealthyPercent": 100 // Keep all current tasks during deploy
}
}
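These two percentages bound how many tasks ECS may run during a deployment. A quick sketch of the arithmetic:

```python
def deployment_window(desired_count, maximum_percent=200, minimum_healthy_percent=100):
    """Bounds on running task count that ECS respects during a rolling deployment."""
    lower = desired_count * minimum_healthy_percent // 100
    upper = desired_count * maximum_percent // 100
    return lower, upper
```

With a desired count of 2 and the 200/100 settings above, ECS may run between 2 and 4 tasks mid-deploy, which is exactly what drives the temporary instance doubling below.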
How a deployment works (with 1-task-per-instance):
Before Deploy:
├── Instance-A: Task-1 (old version)
├── Instance-B: Task-2 (old version)
└── Total: 2 instances, 2 tasks
Deploy New Version:
├── ECS wants 4 tasks total (2 new + 2 old = 200% of desired count)
├── Capacity at 200% → Capacity Provider triggers
├── ASG scales to 4 instances
├── Instance-C launches → Task-3 (new version) starts
├── Instance-D launches → Task-4 (new version) starts
├── New tasks pass health checks
├── Old tasks drain and stop
├── Instance-A and Instance-B terminate
└── Final: 2 instances, 2 tasks (all new version)
Result: Zero-downtime deployment with infrastructure that automatically grows and shrinks! 🎉
Key Settings:
# ECS Service
--deployment-configuration \
maximumPercent=200,minimumHealthyPercent=100
# Capacity Provider
--target-capacity 100 # Tight coupling: 1 task = 1 instance
--maximum-scaling-step-size 10 # Can add many instances quickly
# Auto Scaling Group
--new-instances-protected-from-scale-in # Let ECS manage termination
⚙️ Can This Be Done Automatically?
Short answer: Yes, but it requires careful configuration.
Long answer: You need to set up:
- ECS Service Auto Scaling (task-level)
- Capacity Providers (instance-level)
- Proper task sizing (CPU/memory reservations)
- Deployment configuration (for blue/green)
- Monitoring and alerts
Common Pitfalls
Pitfall 1: Task Sizing Depends on Your Strategy
Multi-Task Strategy (bin-packing for efficiency):
// Right-sized for bin-packing
{
"cpu": "512", // 0.5 vCPU
"memory": "1024" // 1 GB
}
// On t3.medium (2 vCPU, 4 GB) → 4 tasks per instance
// ✅ Efficient resource usage
// ✅ Lower cost per task
// ❌ Tasks compete for resources
// ❌ Complex deployments
Single-Task Strategy (dedicated resources):
// Sized for 1-task-per-instance on t3.small
{
"cpu": "1024", // 1 vCPU (50% of 2048)
"memory": "1536" // 1.5 GB (75% of 2048, leaves room for OS)
}
// On t3.small (2 vCPU, 2 GB) → 1 task per instance
// ✅ Predictable performance
// ✅ Clean blue/green deploys
// ✅ No resource contention
// ❌ Higher cost per task
// ❌ More instances needed
Which to choose?
It depends on your workload:
- High-traffic, lightweight apps: Multi-task (microservices, APIs)
- Resource-intensive apps: Single-task (.NET apps, data processing)
- Cost-sensitive: Multi-task
- Performance-critical: Single-task
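Either way, the tasks-per-instance figure is just integer division against the tighter of the two resources. A sketch (CPU in units, memory in MiB):

```python
def tasks_per_instance(instance_cpu, instance_mem, task_cpu, task_mem):
    """Copies of a task that fit on one instance: the tighter resource wins."""
    return min(instance_cpu // task_cpu, instance_mem // task_mem)
```

This reproduces both strategies above: 512/1024 tasks pack 4 per t3.medium, while 1024/1536 tasks fit only 1 per t3.small (memory is the binding constraint).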
Pitfall 2: Mismatched Scaling Speeds
# BAD: Instance scaling too slow for demand
Capacity Provider:
instanceWarmupPeriod: 600 # 10 minutes
maximumScalingStepSize: 1 # Only 1 instance at a time
ECS Service Auto Scaling:
scaleOutCooldown: 60 # Can add tasks every minute
# Result: Tasks scale faster than instances → PENDING tasks pile up!
# GOOD: Instance scaling faster than task scaling
ScaleOutCooldown: 60 # 1 minute for tasks
ASG Cooldown: 0 # No cooldown for instances
MinimumScalingStepSize: 2 # Launch 2 instances at once
Pitfall 3: Forgetting Reserved Memory
The ECS agent itself uses memory! Always account for overhead:
t3.medium: 4 GB total
- ECS agent: ~200 MB
- OS overhead: ~300 MB
= Available for tasks: ~3.5 GB
If your tasks reserve 1 GB each:
- Theoretical capacity: 4 tasks
- Real capacity: 3 tasks
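A sketch of the adjusted math (the overhead figures are the rough estimates above, not exact values):

```python
def usable_task_slots(instance_mem_mib, task_mem_mib, agent_mib=200, os_mib=300):
    """Tasks that actually fit once ECS agent and OS overhead are subtracted."""
    return (instance_mem_mib - agent_mib - os_mib) // task_mem_mib
```

On a 4 GiB t3.medium with 1 GiB tasks, the naive answer is 4 but the real answer is 3.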
🎯 Can This Be Easier?
By now, you might be thinking: "This is getting complicated. I have to manage:
- EC2 instances
- Auto Scaling Groups
- ECS tasks
- Capacity providers
- Two scaling dimensions
- Task bin-packing
- Instance capacity planning"
You're absolutely right. ECS with EC2 gives you fine-grained control and cost optimization, but it comes with operational complexity.
What if you could eliminate an entire dimension of this complexity? What if you didn't have to manage EC2 instances at all?
That's where AWS Fargate comes in.
Fargate is AWS's serverless compute engine for containers. With Fargate:
- ✅ No EC2 instances to manage
- ✅ No capacity providers to configure
- ✅ No cluster capacity to monitor
- ✅ Only ONE scaling dimension: tasks
But Fargate has trade-offs too. Is it right for your use case? What about cost? Performance?
We'll explore all of this in Part 3.
🔍 When Should You Use ECS with EC2?
Despite the complexity, ECS with EC2 launch type has legitimate use cases:
Good Fits for ECS + EC2
1. Cost Optimization with Reserved Instances or Savings Plans
- You can purchase Reserved Instances or Compute Savings Plans
- Long-running, predictable workloads can be 50-70% cheaper than Fargate
2. Specialized Instance Types
- Need GPUs? Use g4dn.xlarge instances
- Need high memory? Use r5.large instances
- Fargate offers only fixed CPU/memory combinations (and no GPU support)
3. Large-Scale Deployments
- Running hundreds of tasks continuously
- Can achieve better bin-packing and cost efficiency
- Have dedicated ops team to manage infrastructure
4. Hybrid Requirements
- Some workloads need instance-level access
- Custom AMIs with pre-installed tools
- Special kernel modules or system configurations
Better Fits for Fargate (Preview of Part 3)
- Microservices with variable traffic
- Batch jobs and scheduled tasks
- Teams without dedicated ops resources
- Rapid prototyping and development
- Applications that need to scale to zero
📊 Monitoring Your ECS Cluster
Key metrics to track:
ECS Service Metrics
# CloudWatch Metrics
- CPUUtilization (task-level)
- MemoryUtilization (task-level)
- RunningTaskCount
- PendingTaskCount # 🚨 Alert if this stays elevated!
- DesiredTaskCount
Cluster Capacity Metrics
- CPUReservation # Percentage of CPU reserved by tasks
- MemoryReservation # Percentage of memory reserved by tasks
- RegisteredContainerInstancesCount
- ActiveServicesCount
Critical Alerts to Set Up
# Alert 1: Tasks stuck in PENDING
Metric: PendingTaskCount
Threshold: > 0 for more than 5 minutes
Action: Check cluster capacity!
# Alert 2: Cluster capacity too high
Metric: CPUReservation
Threshold: > 90% for more than 10 minutes
Action: Check Capacity Provider scaling
# Alert 3: Task failure rate
Metric: FailedTaskCount
Threshold: > 5 in 5 minutes
Action: Check task logs and health checks
🎬 Real-World Example: Putting It All Together
Here's a complete example deploying a production web application:
# 1. Create the cluster
aws ecs create-cluster --cluster-name production
# 2. Build and push Docker image
docker build -t web-app .
aws ecr get-login-password | docker login --username AWS --password-stdin <account>.dkr.ecr.us-east-1.amazonaws.com
docker tag web-app:latest <account>.dkr.ecr.us-east-1.amazonaws.com/web-app:latest
docker push <account>.dkr.ecr.us-east-1.amazonaws.com/web-app:latest
# 3. Register task definition
aws ecs register-task-definition --cli-input-json file://task-def.json
# 4. Create ALB target group (for ECS tasks)
# Note: bridge network mode registers instance targets; use target-type ip only with awsvpc
aws elbv2 create-target-group \
--name ecs-tasks \
--protocol HTTP \
--port 8080 \
--target-type instance \
--vpc-id vpc-12345
# 5. Create ECS service with auto scaling
aws ecs create-service \
--cluster production \
--service-name web-app \
--task-definition web-app:1 \
--desired-count 4 \
--launch-type EC2 \
--load-balancers "targetGroupArn=<tg-arn>,containerName=web-app,containerPort=8080"
# 6. Configure service auto scaling
aws application-autoscaling register-scalable-target \
--service-namespace ecs \
--resource-id service/production/web-app \
--scalable-dimension ecs:service:DesiredCount \
--min-capacity 2 \
--max-capacity 20
aws application-autoscaling put-scaling-policy \
--service-namespace ecs \
--resource-id service/production/web-app \
--scalable-dimension ecs:service:DesiredCount \
--policy-name cpu-scaling \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration file://scaling-policy.json
# 7. Create capacity provider for instance scaling
aws ecs create-capacity-provider \
--name ec2-capacity \
--auto-scaling-group-provider "autoScalingGroupArn=<asg-arn>,managedScaling={status=ENABLED,targetCapacity=80}"
aws ecs put-cluster-capacity-providers \
--cluster production \
--capacity-providers ec2-capacity \
--default-capacity-provider-strategy capacityProvider=ec2-capacity,weight=1
🎯 Key Takeaways
- ECS with EC2 gives you containers but you still manage the underlying instances
- Two-dimensional scaling is powerful but complex—you scale both tasks AND instances
- Capacity Providers solve the coordination problem between task and instance scaling
- Proper task sizing is critical for efficient bin-packing and scaling
- Monitoring both layers (tasks and instances) is essential
- There IS an easier way—which we'll explore in Part 3 with Fargate
🚀 Coming Up in Part 3: AWS Fargate
In Part 3, we'll explore AWS Fargate—the serverless compute engine that eliminates the need to manage EC2 instances entirely.
We'll cover:
- What is Fargate and how does it work?
- Migrating from ECS with EC2 to Fargate
- Single-dimensional scaling (just tasks!)
- Cost comparison: EC2 vs Fargate
- When to use each launch type
- A brief history of Fargate for those still reading 😉
And in Part 4, we'll tackle the reality that even with perfect infrastructure, dependencies like databases can fail. Learn how to handle failures gracefully and turn ugly 500 errors into elegant maintenance pages.
Ready to go serverless? See you in Part 3!
Managing infrastructure is about finding the right balance between control and complexity. ECS with EC2 gives you the control—Fargate gives you simplicity. Choose wisely based on your needs.