Building Highly Available AWS Infrastructure: ECS with EC2 - Part 2
Move beyond traditional EC2 to containerized deployments with Amazon ECS. Discover the power of containers and the complexity of managing two-dimensional scaling: both your instances AND your tasks.
In Part 1, we built a solid foundation with Application Load Balancers, Auto Scaling Groups, and EC2 instances. You now have a highly available setup that can handle traffic spikes and instance failures.
But there's a problem: every time you deploy a new version of your application, you're deploying an entire EC2 instance. Want to run multiple services on the same infrastructure? You'll need to manage complex deployment scripts, port conflicts, and dependency isolation.
Enter Amazon Elastic Container Service (ECS)—AWS's container orchestration platform that changes the game entirely. In this article, we'll explore how ECS with the EC2 launch type gives you the power of containers while introducing a fascinating new challenge: two-dimensional scaling.
🐳 The Container Revolution
Before we dive into ECS, let's understand why containers matter.
Traditional Deployment (What We Did in Part 1)
EC2 Instance
├── Operating System
├── Python 3.11
├── Your App v1.0
├── Dependencies
└── Configuration
Deploy process: Build new AMI → Update launch template → Trigger instance refresh
Problems:
- Slow deployments (minutes to launch new instances)
- Large deployment units (entire OS + app)
- Difficult to run multiple apps on same instance
- Environment drift ("works on my machine")
Container Deployment
EC2 Instance (Container Host)
├── Operating System
├── Docker Runtime
└── Containers
├── Container 1: Web App v1.0
├── Container 2: API Service v2.3
└── Container 3: Background Worker v1.5
Deploy process: Build new Docker image → Update ECS task definition → Deploy new tasks
Benefits:
- Fast deployments (seconds to start a container)
- Small deployment units (just your app + dependencies)
- Multiple isolated apps on same instance
- Consistency across environments
🏗️ ECS Architecture: The Big Picture
Amazon ECS adds a new layer of abstraction between your application and your infrastructure:
Application Load Balancer
↓
ECS Service
↓
ECS Tasks (Containers)
↓
EC2 Instances (Container Hosts)
Let's break down the new concepts:
ECS Cluster
A logical grouping of container instances (EC2 instances running the ECS agent)
ECS Task Definition
A blueprint for your application—a JSON manifest (similar in spirit to a docker-compose file) that specifies:
- Which Docker images to run
- CPU and memory requirements
- Port mappings
- Environment variables
- Logging configuration
ECS Task
A running instance of a task definition—essentially your actual running containers
ECS Service
Ensures a specified number of tasks are running and integrates with the ALB
🎯 From EC2 to ECS: The Migration
Let's take our Part 1 setup and containerize it. Here's what changes:
Before: Traditional EC2 Setup
# Launch Template with user data
#!/bin/bash
yum update -y
yum install -y python3
pip3 install -r requirements.txt
python3 app.py
After: ECS with EC2
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8080
CMD ["python", "app.py"]
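The article never shows app.py itself. As a hypothetical stand-in, here is a minimal stdlib-only server that would satisfy the Dockerfile's `CMD` and `EXPOSE 8080`, including a `/health` path that an ALB target group health check could hit:

```python
# app.py — hypothetical minimal stand-in for the application this Dockerfile runs.
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # /health is the path an ALB health check would probe
        body = b'{"status":"ok"}' if self.path == "/health" else b"Hello from ECS"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep container stdout quiet for this sketch

def run(port: int = 8080) -> None:
    """Entry point; a main guard calling run() makes `python app.py` serve on 8080."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```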
// ECS Task Definition
{
"family": "web-app",
"networkMode": "bridge", // Common for EC2; awsvpc is required on Fargate, optional on EC2
"requiresCompatibilities": ["EC2"],
"cpu": "1024", // Adjust based on your workload
"memory": "1536", // Leave room for OS and ECS agent
"containerDefinitions": [{
"name": "web-app-container",
"image": "123456789.dkr.ecr.us-east-1.amazonaws.com/web-app:latest",
"cpu": 768, // Container-level allocation
"memory": 1024,
"portMappings": [{
"containerPort": 8080, // Matches EXPOSE 8080 in the Dockerfile
"hostPort": 8080, // Set to 0 for dynamic host ports in bridge mode
"protocol": "tcp"
}],
"environment": [
{"name": "ENV", "value": "production"}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/web-app",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
}
}]
}
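ECS rejects a task definition whose container-level cpu or memory exceeds the task-level values. A quick pre-flight check you might run locally before registering (a hypothetical helper, not an AWS API):

```python
def validate_task_def(task_def: dict) -> list:
    """Return sizing problems: container-level cpu/memory must fit the task level."""
    problems = []
    task_cpu = int(task_def["cpu"])
    task_mem = int(task_def["memory"])
    for c in task_def["containerDefinitions"]:
        if c.get("cpu", 0) > task_cpu:
            problems.append(c["name"] + ": container cpu exceeds task cpu")
        if c.get("memory", 0) > task_mem:
            problems.append(c["name"] + ": container memory exceeds task memory")
    return problems
```

With the sizes from the task definition above (768/1024 inside 1024/1536), the check returns an empty list.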
🚀 Setting Up ECS with EC2: Step by Step
Step 1: Create an ECS Cluster
aws ecs create-cluster \
--cluster-name production-cluster \
--tags key=Environment,value=Production
Step 2: Create EC2 Instances for ECS
The key difference: these EC2 instances run the ECS agent and register with your cluster.
# Launch Template for ECS-optimized instances
# Note: ImageId must be an ECS-optimized AMI; UserData (base64) configures the ECS agent
aws ec2 create-launch-template \
--launch-template-name ecs-instance-template \
--launch-template-data '{
"ImageId": "ami-0c55b159cbfafe1f0",
"InstanceType": "t3.medium",
"IamInstanceProfile": {
"Name": "ecsInstanceRole"
},
"UserData": "<base64-encoded>",
"SecurityGroupIds": ["sg-12345678"]
}'
The user data script tells the instance which cluster to join:
#!/bin/bash
echo ECS_CLUSTER=production-cluster >> /etc/ecs/ecs.config
echo ECS_ENABLE_TASK_IAM_ROLE=true >> /etc/ecs/ecs.config
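The launch template's "UserData" field must be base64-encoded. A small sketch of producing that value from the script above:

```python
import base64

# Encode the ECS cluster-join script for the launch template's "UserData" field.
user_data_script = """#!/bin/bash
echo ECS_CLUSTER=production-cluster >> /etc/ecs/ecs.config
echo ECS_ENABLE_TASK_IAM_ROLE=true >> /etc/ecs/ecs.config
"""
encoded_user_data = base64.b64encode(user_data_script.encode()).decode()
```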
Step 3: Create Auto Scaling Group for ECS Instances
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name ecs-instance-asg \
--launch-template LaunchTemplateName=ecs-instance-template \
--min-size 2 \
--max-size 10 \
--desired-capacity 2 \
--vpc-zone-identifier "subnet-1a,subnet-1b,subnet-1c" \
--tags Key=Name,Value=ecs-container-host
Step 4: Create ECS Task Definition
aws ecs register-task-definition \
--cli-input-json file://task-definition.json
Step 5: Create ECS Service
This is where it gets interesting. The ECS service manages your tasks and integrates with the ALB:
aws ecs create-service \
--cluster production-cluster \
--service-name web-app-service \
--task-definition web-app:1 \
--desired-count 4 \
--launch-type EC2 \
--load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:...,containerName=web-app-container,containerPort=8080" \
--health-check-grace-period-seconds 60 \
--placement-constraints type=distinctInstance # Only 1 task per EC2 instance
📍 Task Placement: Constraints and Strategies
You can control where ECS places tasks:
Placement Constraints - Hard rules that tasks MUST satisfy:
# Only 1 task per instance (prevents resource contention)
--placement-constraints type=distinctInstance
# Only place on instances with specific attributes
--placement-constraints "type=memberOf,expression=attribute:ecs.instance-type == t3.small"
Placement Strategies - Soft preferences for task distribution:
# Spread tasks across availability zones (high availability)
--placement-strategies type=spread,field=attribute:ecs.availability-zone
# Spread tasks across instances (within each AZ)
--placement-strategies type=spread,field=instanceId
# Pack tasks tightly (bin-packing for cost efficiency)
--placement-strategies type=binpack,field=cpu # or memory
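To build intuition for spread vs binpack, here is a toy placement chooser. This is a simplification for illustration, not the actual ECS scheduler, which evaluates constraints and strategies in sequence:

```python
def place(instances, task_cpu, strategy):
    """Pick an instance for a task needing task_cpu CPU units, or None if PENDING.

    instances: {instance_id: {"free_cpu": int, "tasks": int}}
    """
    candidates = {i: s for i, s in instances.items() if s["free_cpu"] >= task_cpu}
    if not candidates:
        return None  # no capacity anywhere: the task would sit in PENDING
    if strategy == "spread":
        # fewest running tasks first: even distribution across instances
        return min(candidates, key=lambda i: candidates[i]["tasks"])
    if strategy == "binpack":
        # least free CPU first: pack tightly so whole instances stay empty
        return min(candidates, key=lambda i: candidates[i]["free_cpu"])
    raise ValueError(f"unknown strategy: {strategy}")
```

With one busy and one empty host, spread picks the empty one while binpack keeps filling the busy one.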
Real-World Example: High Availability Setup
aws ecs create-service \
--cluster production-cluster \
--service-name web-app-service \
--task-definition web-app:1 \
--desired-count 4 \
--placement-constraints type=distinctInstance \
--placement-strategies \
type=spread,field=attribute:ecs.availability-zone \
type=spread,field=instanceId
Why this configuration?
✅ distinctInstance: Each task gets dedicated instance resources
✅ Spread by AZ: Survives entire AZ failure
✅ Spread by instance: Distributes load evenly
✅ True blue/green deploys: Can double task count without resource conflicts
Trade-offs:
- ❌ Less efficient bin-packing (can't stack multiple tasks per instance)
- ❌ Need more EC2 instances for same task count
- ✅ Predictable performance (no "noisy neighbor" tasks)
- ✅ Clean deployments (1 new instance = 1 new task)
When to use distinctInstance:
- Resource-intensive workloads (.NET apps, data processing)
- Smaller instance types (t3.small, t3.medium)
- Predictable performance requirements
- Blue/green deployment strategy
When NOT to use it:
- Large instances (can host many small tasks efficiently)
- Lightweight workloads (Node.js microservices)
- Cost optimization priority over performance isolation
🎭 The Two-Dimensional Scaling Challenge
Here's where ECS with EC2 gets complex. You now have TWO things that need to scale:
Dimension 1: ECS Tasks (Your Containers)
What: The number of running containers executing your application code
Scaling triggers:
- Request count per task
- CPU utilization of tasks
- Memory utilization of tasks
- Custom CloudWatch metrics
Example: You have 4 tasks running, each handling 100 requests/second. Traffic doubles → you need 8 tasks.
Dimension 2: EC2 Instances (Container Hosts)
What: The number of EC2 instances providing compute capacity to run your containers
Scaling triggers:
- Cluster CPU reservation
- Cluster memory reservation
- Number of pending tasks (tasks that can't be placed due to insufficient capacity)
Example: Your 2 EC2 instances can each host 2 tasks. You now need 8 tasks → you need 4 EC2 instances.
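The instance math is a ceiling division; a one-liner makes the relationship explicit:

```python
import math

def instances_needed(task_count, tasks_per_instance):
    """Container hosts required for a task count, rounded up (a partial host counts)."""
    return math.ceil(task_count / tasks_per_instance)
```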
🤔 The Chicken and Egg Problem
Imagine this scenario:
Current State:
- 2 EC2 instances (t3.medium with 2 vCPU, 4 GB RAM each)
- 4 ECS tasks running (each needs 1 vCPU, 1 GB RAM)
- Each instance is hosting 2 tasks
Traffic Spike:
- ECS Service Auto Scaling triggers: "Need 8 tasks!"
- ECS tries to place 4 more tasks...
- 🚨 ERROR: Insufficient CPU/memory on existing instances
- Tasks go into PENDING state
- Your application can't scale to meet demand!
You've hit the wall. Your task scaling is blocked by insufficient instance capacity.
The Solution: Multi-Layer Scaling
You need both layers to scale in coordination:
Layer 1: ECS Service Auto Scaling
Scales the number of tasks based on application metrics:
# Create Target Tracking Scaling Policy for Tasks
aws application-autoscaling register-scalable-target \
--service-namespace ecs \
--resource-id service/production-cluster/web-app-service \
--scalable-dimension ecs:service:DesiredCount \
--min-capacity 2 \
--max-capacity 20
aws application-autoscaling put-scaling-policy \
--service-namespace ecs \
--resource-id service/production-cluster/web-app-service \
--scalable-dimension ecs:service:DesiredCount \
--policy-name cpu-tracking \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration '{
"TargetValue": 75.0,
"PredefinedMetricSpecification": {
"PredefinedMetricType": "ECSServiceAverageCPUUtilization"
},
"ScaleInCooldown": 300,
"ScaleOutCooldown": 60
}'
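The target tracking policy above adjusts DesiredCount roughly in proportion to how far the metric sits from TargetValue. A simplified model of the scale-out math (the real CloudWatch algorithm also applies cooldowns and is more conservative on scale-in):

```python
import math

def target_tracking_desired(current_tasks, metric_value, target_value=75.0):
    """Approximate new DesiredCount after one target tracking evaluation."""
    return max(1, math.ceil(current_tasks * metric_value / target_value))
```

Four tasks running at 150% CPU against a 75% target would roughly double to eight.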
Layer 2: Capacity Provider Auto Scaling
Automatically scales EC2 instances based on task demand:
# Create Capacity Provider
aws ecs create-capacity-provider \
--name ecs-ec2-capacity-provider \
--auto-scaling-group-provider '{
"autoScalingGroupArn": "arn:aws:autoscaling:...",
"managedScaling": {
"status": "ENABLED",
"targetCapacity": 100, // 100 = scale to exact demand (1-task-per-instance model); lower keeps spare headroom
"minimumScalingStepSize": 1,
"maximumScalingStepSize": 10, // Faster scale-out
"instanceWarmupPeriod": 180 // 3 min for app initialization
},
"managedTerminationProtection": "ENABLED"
}'
# Associate with Cluster
aws ecs put-cluster-capacity-providers \
--cluster production-cluster \
--capacity-providers ecs-ec2-capacity-provider \
--default-capacity-provider-strategy capacityProvider=ecs-ec2-capacity-provider,weight=1,base=2
What happens now:
- Traffic increases → Task CPU hits 75%
- ECS Service Auto Scaling adds more tasks
- Cluster capacity crosses the capacity provider's targetCapacity threshold (80% in this example) → Capacity Provider triggers
- EC2 Auto Scaling Group launches new instances
- New instances join cluster
- Pending tasks get placed on new instances
- Your application successfully scales! 🎉
📊 Understanding Capacity Provider Metrics
The targetCapacity setting is critical. Conceptually (AWS surfaces this as the CapacityProviderReservation metric), it uses this formula:
Cluster Capacity = (Running Tasks * 100) / (Total Available Capacity)
Example:
- 4 EC2 instances, each can host 4 tasks = 16 total capacity
- 12 tasks running
- Capacity = (12 * 100) / 16 = 75%
If targetCapacity = 80: Still OK, no scaling needed
If task count increases to 14:
- Capacity = (14 * 100) / 16 = 87.5%
- 🚀 Exceeds 80% → Capacity Provider scales out instances!
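The worked example above can be checked in a few lines (hypothetical helper names):

```python
def cluster_capacity(running_tasks, total_task_slots):
    """Percent of the cluster's task slots in use, per the formula above."""
    return running_tasks * 100.0 / total_task_slots

def should_scale_out(running_tasks, total_task_slots, target_capacity=80.0):
    """True once utilization crosses the capacity provider's targetCapacity."""
    return cluster_capacity(running_tasks, total_task_slots) > target_capacity
```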
🔄 The Complete Scaling Flow
Let's trace a complete scaling event:
09:00 AM - Normal Traffic
├── 2 EC2 instances
├── 4 ECS tasks
└── Cluster capacity: 50%
09:15 AM - Traffic Spike Begins
├── ALB sees increased requests
├── Task CPU utilization increases to 80%
└── ⚡ ECS Service Auto Scaling triggers
09:16 AM - Task Scaling
├── Service desired count: 4 → 8
├── ECS tries to place 4 new tasks
├── 2 tasks placed successfully
├── 2 tasks PENDING (insufficient capacity)
└── Cluster capacity: 75% → 🔔 Approaching threshold
09:17 AM - Capacity Provider Triggers
├── Cluster capacity exceeds 80%
├── Capacity Provider signals Auto Scaling Group
└── ASG desired capacity: 2 → 4
09:18 AM - New Instances Launch
├── 2 new EC2 instances launching
└── 2 tasks still PENDING
09:20 AM - Instances Join Cluster
├── New instances run ECS agent
├── Instances register with cluster
├── Available capacity increases
└── ⚡ PENDING tasks get placed
09:21 AM - Fully Scaled
├── 4 EC2 instances
├── 8 ECS tasks (all RUNNING)
├── Cluster capacity: 50%
└── Traffic handled successfully
🔵🟢 Blue/Green Deployments with Capacity Providers
With capacity providers and proper configuration, you get automatic blue/green deployments:
// Service Deployment Configuration
{
"deploymentConfiguration": {
"maximumPercent": 200, // Can double task count temporarily
"minimumHealthyPercent": 100 // Keep all current tasks during deploy
}
}
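These two percentages bound how many tasks ECS may run during a deployment. A quick sketch of the arithmetic:

```python
def deployment_window(desired_count, maximum_percent=200, minimum_healthy_percent=100):
    """Bounds on running task count that ECS respects during a rolling deployment."""
    lower = desired_count * minimum_healthy_percent // 100
    upper = desired_count * maximum_percent // 100
    return lower, upper
```

With a desired count of 2 and the 200/100 settings above, ECS may run between 2 and 4 tasks mid-deploy, which is exactly what drives the temporary instance doubling below.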
How a deployment works (with 1-task-per-instance):
Before Deploy:
├── Instance-A: Task-1 (old version)
├── Instance-B: Task-2 (old version)
└── Total: 2 instances, 2 tasks
Deploy New Version:
├── ECS wants 4 tasks total (2 new + 2 old = 200% of desired count)
├── Capacity at 200% → Capacity Provider triggers
├── ASG scales to 4 instances
├── Instance-C launches → Task-3 (new version) starts
├── Instance-D launches → Task-4 (new version) starts
├── New tasks pass health checks
├── Old tasks drain and stop
├── Instance-A and Instance-B terminate
└── Final: 2 instances, 2 tasks (all new version)
Result: Zero-downtime deployment with infrastructure that automatically grows and shrinks! 🎉
Key Settings:
# ECS Service
--deployment-configuration \
maximumPercent=200,minimumHealthyPercent=100
# Capacity Provider
--target-capacity 100 # Tight coupling: 1 task = 1 instance
--maximum-scaling-step-size 10 # Can add many instances quickly
# Auto Scaling Group
--new-instances-protected-from-scale-in # Let ECS manage termination
⚙️ Can This Be Done Automatically?
Short answer: Yes, but it requires careful configuration.
Long answer: You need to set up:
- ECS Service Auto Scaling (task-level)
- Capacity Providers (instance-level)
- Proper task sizing (CPU/memory reservations)
- Deployment configuration (for blue/green)
- Monitoring and alerts
Common Pitfalls
Pitfall 1: Task Sizing Depends on Your Strategy
Multi-Task Strategy (bin-packing for efficiency):
// Right-sized for bin-packing
{
"cpu": "512", // 0.5 vCPU
"memory": "1024" // 1 GB
}
// On t3.medium (2 vCPU, 4 GB) → 4 tasks per instance
// ✅ Efficient resource usage
// ✅ Lower cost per task
// ❌ Tasks compete for resources
// ❌ Complex deployments
Single-Task Strategy (dedicated resources):
// Sized for 1-task-per-instance on t3.small
{
"cpu": "1024", // 1 vCPU (50% of 2048)
"memory": "1536" // 1.5 GB (75% of 2048, leaves room for OS)
}
// On t3.small (2 vCPU, 2 GB) → 1 task per instance
// ✅ Predictable performance
// ✅ Clean blue/green deploys
// ✅ No resource contention
// ❌ Higher cost per task
// ❌ More instances needed
Which to choose?
It depends on your workload:
- High-traffic, lightweight apps: Multi-task (microservices, APIs)
- Resource-intensive apps: Single-task (.NET apps, data processing)
- Cost-sensitive: Multi-task
- Performance-critical: Single-task
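Either way, the tasks-per-instance figure is just integer division against the tighter of the two resources. A sketch (CPU in units, memory in MiB):

```python
def tasks_per_instance(instance_cpu, instance_mem, task_cpu, task_mem):
    """Copies of a task that fit on one instance: the tighter resource wins."""
    return min(instance_cpu // task_cpu, instance_mem // task_mem)
```

This reproduces both strategies above: 512/1024 tasks pack 4 per t3.medium, while 1024/1536 tasks fit only 1 per t3.small (memory is the binding constraint).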
Pitfall 2: Mismatched Scaling Speeds
# BAD: Instance scaling too slow for demand
Capacity Provider:
instanceWarmupPeriod: 600 # 10 minutes
maximumScalingStepSize: 1 # Only 1 instance at a time
ECS Service Auto Scaling:
scaleOutCooldown: 60 # Can add tasks every minute
# Result: Tasks scale faster than instances → PENDING tasks pile up!
# GOOD: Instance scaling faster than task scaling
ScaleOutCooldown: 60 # 1 minute for tasks
ASG Cooldown: 0 # No cooldown for instances
MinimumScalingStepSize: 2 # Launch 2 instances at once
Pitfall 3: Forgetting Reserved Memory
The ECS agent itself uses memory! Always account for overhead:
t3.medium: 4 GB total
- ECS agent: ~200 MB
- OS overhead: ~300 MB
= Available for tasks: ~3.5 GB
If your tasks reserve 1 GB each:
- Theoretical capacity: 4 tasks
- Real capacity: 3 tasks
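A sketch of the adjusted math (the overhead figures are the rough estimates above, not exact values):

```python
def usable_task_slots(instance_mem_mib, task_mem_mib, agent_mib=200, os_mib=300):
    """Tasks that actually fit once ECS agent and OS overhead are subtracted."""
    return (instance_mem_mib - agent_mib - os_mib) // task_mem_mib
```

On a 4 GiB t3.medium with 1 GiB tasks, the naive answer is 4 but the real answer is 3.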
🎯 Can This Be Easier?
By now, you might be thinking: "This is getting complicated. I have to manage:
- EC2 instances
- Auto Scaling Groups
- ECS tasks
- Capacity providers
- Two scaling dimensions
- Task bin-packing
- Instance capacity planning"
You're absolutely right. ECS with EC2 gives you fine-grained control and cost optimization, but it comes with operational complexity.
What if you could eliminate an entire dimension of this complexity? What if you didn't have to manage EC2 instances at all?
That's where AWS Fargate comes in.
Fargate is AWS's serverless compute engine for containers. With Fargate:
- ✅ No EC2 instances to manage
- ✅ No capacity providers to configure
- ✅ No cluster capacity to monitor
- ✅ Only ONE scaling dimension: tasks
But Fargate has trade-offs too. Is it right for your use case? What about cost? Performance?
We'll explore all of this in Part 3.
🔍 When Should You Use ECS with EC2?
Despite the complexity, ECS with EC2 launch type has legitimate use cases:
Good Fits for ECS + EC2
1. Cost Optimization with Reserved Instances or Savings Plans
- You can purchase Reserved Instances or Compute Savings Plans
- Long-running, predictable workloads can be 50-70% cheaper than Fargate
2. Specialized Instance Types
- Need GPUs? Use g4dn.xlarge instances
- Need high memory? Use r5.large instances
- Fargate offers only fixed CPU/memory combinations (and no GPU support)
3. Large-Scale Deployments
- Running hundreds of tasks continuously
- Can achieve better bin-packing and cost efficiency
- Have dedicated ops team to manage infrastructure
4. Hybrid Requirements
- Some workloads need instance-level access
- Custom AMIs with pre-installed tools
- Special kernel modules or system configurations
Better Fits for Fargate (Preview of Part 3)
- Microservices with variable traffic
- Batch jobs and scheduled tasks
- Teams without dedicated ops resources
- Rapid prototyping and development
- Applications that need to scale to zero
📊 Monitoring Your ECS Cluster
Key metrics to track:
ECS Service Metrics
# CloudWatch Metrics
- CPUUtilization (task-level)
- MemoryUtilization (task-level)
- RunningTaskCount
- PendingTaskCount # 🚨 Alert if this stays elevated!
- DesiredTaskCount
Cluster Capacity Metrics
- CPUReservation # Percentage of CPU reserved by tasks
- MemoryReservation # Percentage of memory reserved by tasks
- RegisteredContainerInstancesCount
- ActiveServicesCount
Critical Alerts to Set Up
# Alert 1: Tasks stuck in PENDING
Metric: PendingTaskCount
Threshold: > 0 for more than 5 minutes
Action: Check cluster capacity!
# Alert 2: Cluster capacity too high
Metric: CPUReservation
Threshold: > 90% for more than 10 minutes
Action: Check Capacity Provider scaling
# Alert 3: Task failure rate
Metric: FailedTaskCount
Threshold: > 5 in 5 minutes
Action: Check task logs and health checks
🎬 Real-World Example: Putting It All Together
Here's a complete example deploying a production web application:
# 1. Create the cluster
aws ecs create-cluster --cluster-name production
# 2. Build and push Docker image
docker build -t web-app .
aws ecr get-login-password | docker login --username AWS --password-stdin <account>.dkr.ecr.us-east-1.amazonaws.com
docker tag web-app:latest <account>.dkr.ecr.us-east-1.amazonaws.com/web-app:latest
docker push <account>.dkr.ecr.us-east-1.amazonaws.com/web-app:latest
# 3. Register task definition
aws ecs register-task-definition --cli-input-json file://task-def.json
# 4. Create ALB target group (for ECS tasks)
# Note: bridge network mode registers instance targets; use target-type ip only with awsvpc
aws elbv2 create-target-group \
--name ecs-tasks \
--protocol HTTP \
--port 8080 \
--target-type instance \
--vpc-id vpc-12345
# 5. Create ECS service with auto scaling
aws ecs create-service \
--cluster production \
--service-name web-app \
--task-definition web-app:1 \
--desired-count 4 \
--launch-type EC2 \
--load-balancers "targetGroupArn=<tg-arn>,containerName=web-app,containerPort=8080"
# 6. Configure service auto scaling
aws application-autoscaling register-scalable-target \
--service-namespace ecs \
--resource-id service/production/web-app \
--scalable-dimension ecs:service:DesiredCount \
--min-capacity 2 \
--max-capacity 20
aws application-autoscaling put-scaling-policy \
--service-namespace ecs \
--resource-id service/production/web-app \
--scalable-dimension ecs:service:DesiredCount \
--policy-name cpu-scaling \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration file://scaling-policy.json
# 7. Create capacity provider for instance scaling
aws ecs create-capacity-provider \
--name ec2-capacity \
--auto-scaling-group-provider "autoScalingGroupArn=<asg-arn>,managedScaling={status=ENABLED,targetCapacity=80}"
aws ecs put-cluster-capacity-providers \
--cluster production \
--capacity-providers ec2-capacity \
--default-capacity-provider-strategy capacityProvider=ec2-capacity,weight=1
🎯 Key Takeaways
- ECS with EC2 gives you containers but you still manage the underlying instances
- Two-dimensional scaling is powerful but complex—you scale both tasks AND instances
- Capacity Providers solve the coordination problem between task and instance scaling
- Proper task sizing is critical for efficient bin-packing and scaling
- Monitoring both layers (tasks and instances) is essential
- There IS an easier way—which we'll explore in Part 3 with Fargate
🚀 Coming Up in Part 3: AWS Fargate
In Part 3, we'll explore AWS Fargate—the serverless compute engine that eliminates the need to manage EC2 instances entirely.
We'll cover:
- What is Fargate and how does it work?
- Migrating from ECS with EC2 to Fargate
- Single-dimensional scaling (just tasks!)
- Cost comparison: EC2 vs Fargate
- When to use each launch type
- A brief history of Fargate for those still reading 😉
And in Part 4, we'll tackle the reality that even with perfect infrastructure, dependencies like databases can fail. Learn how to handle failures gracefully and turn ugly 500 errors into elegant maintenance pages.
Ready to go serverless? See you in Part 3!
Managing infrastructure is about finding the right balance between control and complexity. ECS with EC2 gives you the control—Fargate gives you simplicity. Choose wisely based on your needs.