Module 10: Containers and Amazon ECS
Learning Objectives
By the end of this module, you will be able to:
- Compare containers and virtual machines, and differentiate the isolation, resource usage, and startup characteristics of each approach
- Build a container image from a Dockerfile and troubleshoot common image build errors
- Construct an end-to-end container workflow by pushing images to Amazon Elastic Container Registry (Amazon ECR) and deploying them on Amazon Elastic Container Service (Amazon ECS)
- Differentiate ECS core concepts including clusters, task definitions, services, and tasks, and integrate them into a working deployment
- Compare the Fargate and EC2 launch types and troubleshoot deployment issues specific to each
- Integrate an Application Load Balancer (ALB) with an ECS service for dynamic port mapping, health checks, and traffic distribution
- Differentiate when to use Amazon ECS, Amazon Elastic Kubernetes Service (Amazon EKS), and AWS Lambda based on workload requirements
- Build container images that follow security best practices including non-root users, image scanning, and secrets management
Prerequisites
- Completion of Module 03: Networking Basics (VPC) (VPCs, subnets, security groups, and Availability Zones for placing ECS tasks and load balancers)
- Completion of Module 04: Compute with Amazon EC2 (EC2 instance concepts, IAM instance profiles, and Auto Scaling groups that underpin the EC2 launch type)
- Completion of Module 07: Load Balancing and DNS (ALB listeners, target groups, and health checks used for ECS service load balancing)
Concepts
Containers 101: What Containers Are and Why They Matter
A container is a lightweight, standalone package that includes everything needed to run a piece of software: the application code, runtime, system libraries, and settings. Containers share the host operating system kernel rather than bundling a full OS, which makes them smaller and faster to start than traditional virtual machines (VMs).
In Module 04, you launched EC2 instances, each running a complete operating system. Containers take a different approach. Instead of virtualizing the hardware, containers virtualize the operating system. Multiple containers run on the same OS kernel, isolated from each other through Linux kernel features such as namespaces and cgroups.
Containers vs. Virtual Machines
| Characteristic | Containers | Virtual Machines |
|---|---|---|
| Isolation level | Process-level (shared kernel) | Hardware-level (separate kernel per VM) |
| Startup time | Seconds | Minutes |
| Image size | Megabytes (typically 50 to 500 MB) | Gigabytes (typically 1 to 20 GB) |
| Resource overhead | Low (no guest OS) | High (full guest OS per VM) |
| Density | Hundreds per host | Tens per host |
| Portability | Runs identically on any system with a container runtime | Tied to hypervisor and OS configuration |
| Use case | Microservices, CI/CD pipelines, stateless workloads | Legacy applications, workloads requiring full OS isolation |
Containers solve the "it works on my machine" problem. Because a container packages the application with its dependencies, it runs the same way in development, testing, and production. This consistency simplifies deployments and reduces environment-related bugs.
Tip: Containers and VMs are not mutually exclusive. In production, containers often run on top of VMs (such as EC2 instances) to combine the hardware isolation of VMs with the lightweight packaging of containers.
Docker Basics: Images, Containers, and the Build Workflow
Docker is the most widely used container platform. It builds on three core concepts:
- Image. A read-only template that contains the application code, runtime, libraries, and configuration. Images are built in layers, where each layer represents a set of file system changes.
- Container. A running instance of an image. You can run multiple containers from the same image, each with its own writable layer on top of the read-only image layers.
- Dockerfile. A text file with instructions for building an image. Each instruction creates a layer in the image.
Dockerfile Structure
A Dockerfile defines the steps to build your container image. Here is an example for a simple Node.js web application:
# Start from an official Node.js base image
FROM node:20-alpine
# Set the working directory inside the container
WORKDIR /app
# Copy dependency manifests first (for layer caching)
COPY package.json package-lock.json ./
# Install production dependencies only (--omit=dev replaces the deprecated --production flag)
RUN npm ci --omit=dev
# Copy application source code
COPY . .
# Create a non-root user and switch to it
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
# Expose the application port
EXPOSE 3000
# Define the command to run the application
CMD ["node", "server.js"]
Build and Run Workflow
The typical Docker workflow follows these steps:
- Write a Dockerfile that defines your application environment.
- Build the image using docker build.
- Test the image locally using docker run.
- Push the image to a container registry (such as Amazon ECR).
- Deploy the image on a container orchestrator (such as Amazon ECS).
# Build the image and tag it
docker build -t my-web-app:1.0 .
# Run the container locally
docker run -d -p 8080:3000 my-web-app:1.0
# Verify the container is running
docker ps
Expected output:
CONTAINER ID IMAGE COMMAND STATUS PORTS
a1b2c3d4e5f6 my-web-app:1.0 "node server.js" Up 10 seconds 0.0.0.0:8080->3000/tcp
Tip: Order your Dockerfile instructions from least to most frequently changing. Place dependency installation before source code copying so that Docker can cache the dependency layer and skip reinstalling packages when only your application code changes.
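The caching behavior behind this tip can be modeled in a few lines. The sketch below is a simplified model, not Docker's actual cache implementation: it assumes each layer's cache key depends on its instruction, on a fingerprint of the files it reads, and on the key of the layer before it. The function names and the fingerprint strings (deps-v1, src-v1, src-v2) are made up for illustration.

```python
import hashlib


def layer_keys(steps):
    """Return one cache key per step; each key also depends on the previous key."""
    keys, parent = [], ""
    for instruction, inputs in steps:
        digest = hashlib.sha256(f"{parent}|{instruction}|{inputs}".encode()).hexdigest()
        keys.append(digest)
        parent = digest
    return keys


def rebuilt_layers(old_steps, new_steps):
    """List the instructions whose cache key differs between two builds."""
    old, new = layer_keys(old_steps), layer_keys(new_steps)
    return [step[0] for step, o, n in zip(new_steps, old, new) if o != n]


# Each step is (instruction, fingerprint of the files that instruction reads).
def cached_order(src):
    return [
        ("COPY package.json package-lock.json ./", "deps-v1"),
        ("RUN npm ci", ""),  # flags omitted for brevity
        ("COPY . .", src),
    ]


def slow_order(src):
    return [("COPY . .", src), ("RUN npm ci", "")]


# Only the application source changes between the two builds.
print(rebuilt_layers(cached_order("src-v1"), cached_order("src-v2")))
print(rebuilt_layers(slow_order("src-v1"), slow_order("src-v2")))
```

In this model the first ordering rebuilds only the final COPY layer, while the second ordering also reruns the dependency installation, because every layer after a changed layer loses its cache entry.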
Amazon ECR: Storing and Managing Container Images
Amazon Elastic Container Registry (Amazon ECR) is where you store your container images on AWS. It integrates directly with ECS, EKS, and Lambda, so these services can pull images using IAM permissions instead of separate registry credentials. Think of ECR as a private Docker Hub that lives inside your AWS account.
Each AWS account gets a default private registry in each Region. Within a registry, you organize images into repositories. A repository holds multiple versions of a related image, identified by tags (such as latest, v1.0, or a Git commit hash).
Pushing an Image to ECR
To push a locally built image to ECR:
# Authenticate Docker to your ECR registry
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
# Create a repository (if it does not exist)
aws ecr create-repository --repository-name my-web-app --region us-east-1
# Tag the local image with the ECR repository URI
docker tag my-web-app:1.0 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-web-app:1.0
# Push the image to ECR
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-web-app:1.0
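The image URI used in the tag and push commands follows a fixed pattern: registry host, then repository, then tag. As a small illustration, the helper below assembles that URI from its parts. The function name ecr_image_uri is hypothetical, and the account ID and Region are the placeholder values from the commands above.

```python
def ecr_image_uri(account_id, region, repository, tag):
    """Build a private ECR image URI from its components."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{repository}:{tag}"


uri = ecr_image_uri("123456789012", "us-east-1", "my-web-app", "1.0")
print(uri)
```

Generating the URI in one place helps keep the docker tag command, the docker push command, and the image field of a task definition in agreement.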
Image Lifecycle Policies
Over time, repositories accumulate old images that consume storage and increase costs. ECR lifecycle policies automate image cleanup by defining rules that expire images based on age or count. For example, you can create one rule that keeps only the 10 most recent images and another rule that expires images older than 30 days.
Lifecycle policy rules evaluate images based on:
- Tag status. Whether the image is tagged or untagged.
- Tag prefix. A pattern that matches specific tag prefixes (for example, prod- or dev-).
- Count type. Expire images by count (keep the N most recent) or by age (expire images older than N days).
Tip: Always configure a lifecycle policy on your ECR repositories. Untagged images from failed builds and old tagged images accumulate quickly. A simple rule to expire untagged images older than 1 day and keep only the last 20 tagged images covers most use cases.
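The rule described in the tip can be written as a lifecycle policy document. The sketch below builds one possible version as a Python dictionary and prints it as JSON. The helper name build_lifecycle_policy is hypothetical, and the prod- tag prefix is an assumption for illustration: a rule that selects tagged images must say which tags it applies to, so adjust the prefix list to match your own tagging scheme.

```python
import json


def build_lifecycle_policy(untagged_days=1, keep_tagged=20, tag_prefixes=("prod-",)):
    """Return an ECR lifecycle policy document as a Python dict."""
    return {
        "rules": [
            {
                "rulePriority": 1,
                "description": f"Expire untagged images older than {untagged_days} day(s)",
                "selection": {
                    "tagStatus": "untagged",
                    "countType": "sinceImagePushed",
                    "countUnit": "days",
                    "countNumber": untagged_days,
                },
                "action": {"type": "expire"},
            },
            {
                "rulePriority": 2,
                "description": f"Keep only the {keep_tagged} most recent tagged images",
                "selection": {
                    "tagStatus": "tagged",
                    "tagPrefixList": list(tag_prefixes),  # assumed prefix, see lead-in
                    "countType": "imageCountMoreThan",
                    "countNumber": keep_tagged,
                },
                "action": {"type": "expire"},
            },
        ]
    }


policy = build_lifecycle_policy()
print(json.dumps(policy, indent=2))
```

One way to apply the result is to save the printed JSON to a file and pass it to aws ecr put-lifecycle-policy with the --lifecycle-policy-text option.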
ECS Concepts: Clusters, Task Definitions, Services, and Tasks
Amazon Elastic Container Service (Amazon ECS) orchestrates your containers so you do not have to. It launches containers, monitors their health, replaces failures, and scales capacity based on demand. You tell ECS what to run (task definitions), where to run it (clusters), and how many copies to keep alive (services), and it handles the rest.
Clusters
An ECS cluster is a logical grouping of tasks and services. A cluster serves as the boundary for your container workloads. You can create separate clusters for different environments (development, staging, production) or for different applications.
A cluster contains the infrastructure that runs your tasks. Depending on the launch type you choose, this infrastructure is either EC2 instances that you manage or AWS Fargate capacity that AWS manages on your behalf.
Task Definitions
A task definition is a blueprint for your application. It describes one or more containers that form your application, similar to a docker-compose.yml file. A task definition specifies:
- Which container images to use
- How much CPU and memory each container needs
- Which ports to expose
- Environment variables to pass to the containers
- IAM roles for the task
- Logging configuration
- Volume mounts
Task definitions are versioned. Each time you update a task definition, ECS creates a new revision. You can roll back to a previous revision if a new deployment causes issues.
Tasks
A task is a running instance of a task definition. When ECS launches a task, it pulls the container images specified in the task definition, creates the containers, and starts them. A task can contain one or more containers that run together on the same host and share networking and storage resources.
Services
An ECS service maintains a specified number of running tasks. If a task fails or stops, the service scheduler launches a replacement to maintain the desired count. Services also integrate with load balancers to distribute traffic across tasks and with auto scaling to adjust the number of tasks based on demand.
The relationship between these concepts:
Cluster
├── Service A (desired count: 3)
│   ├── Task 1 (running task definition revision 5)
│   ├── Task 2 (running task definition revision 5)
│   └── Task 3 (running task definition revision 5)
└── Service B (desired count: 2)
    ├── Task 1 (running task definition revision 12)
    └── Task 2 (running task definition revision 12)
Task Definitions: Container Configuration in Detail
A task definition is a JSON document that tells ECS exactly how to run your containers. Understanding its key parameters is essential for building reliable container deployments.
Container Definitions
Each task definition contains one or more container definitions. Each container definition specifies the image, resource limits, port mappings, and environment configuration for a single container.
CPU and Memory
You allocate CPU and memory at two levels:
- Task level. The total CPU and memory available to all containers in the task. Required for Fargate tasks.
- Container level. The CPU and memory reserved for (or limited to) each individual container within the task.
For Fargate tasks, you must choose from specific CPU and memory combinations:
| CPU (vCPU) | Memory Options |
|---|---|
| 0.25 vCPU | 0.5 GB, 1 GB, 2 GB |
| 0.5 vCPU | 1 GB, 2 GB, 3 GB, 4 GB |
| 1 vCPU | 2 GB, 3 GB, 4 GB, 5 GB, 6 GB, 7 GB, 8 GB |
| 2 vCPU | 4 GB through 16 GB (in 1 GB increments) |
| 4 vCPU | 8 GB through 30 GB (in 1 GB increments) |
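A quick way to catch an invalid size before registering a task definition is to encode the table as a lookup. The sketch below covers only the combinations listed in the table above and uses the units that task definitions use: CPU units (1024 units = 1 vCPU) and memory in MiB. The names FARGATE_MEMORY_MIB and is_valid_fargate_size are hypothetical.

```python
# Allowed memory values (MiB) per CPU value (CPU units), taken from the table above.
FARGATE_MEMORY_MIB = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def is_valid_fargate_size(cpu_units, memory_mib):
    """Return True if the CPU and memory pair appears in the table."""
    return memory_mib in FARGATE_MEMORY_MIB.get(cpu_units, [])


print(is_valid_fargate_size(256, 512))    # 0.25 vCPU with 0.5 GB
print(is_valid_fargate_size(256, 4096))   # 0.25 vCPU with 4 GB
print(is_valid_fargate_size(1024, 1024))  # 1 vCPU with 1 GB
```

The first pair is listed in the table, while the other two are not: 0.25 vCPU tops out at 2 GB, and 1 vCPU starts at 2 GB.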
Port Mappings
Port mappings connect a port on the container to a port on the host. For Fargate tasks, the host port and container port must be the same (because each task gets its own elastic network interface). For EC2 launch type tasks that use the bridge network mode, you can use dynamic port mapping by setting the host port to 0. The container runtime on the instance then assigns an available port from the ephemeral range, and ECS registers that port with the ALB target group.
Environment Variables
You pass configuration to containers through environment variables. You can define them directly in the task definition or reference values stored in AWS Systems Manager Parameter Store or AWS Secrets Manager for sensitive values such as database passwords and API keys.
IAM Task Roles
ECS supports two types of IAM roles for tasks:
| Role Type | Purpose | Example Permissions |
|---|---|---|
| Task execution role | Permissions that the ECS agent needs to manage the task (pull images, write logs, retrieve secrets) | ecr:GetAuthorizationToken, logs:CreateLogStream, secretsmanager:GetSecretValue |
| Task role | Permissions that your application code inside the container needs to access AWS services | s3:GetObject, dynamodb:PutItem, sqs:SendMessage |
The task execution role is used by the ECS infrastructure. The task role is used by your application. Keep them separate and apply the principle of least privilege to each, as you learned in Module 02.
Warning: Do not embed AWS credentials in your container images or pass them as environment variables. Use IAM task roles instead. The ECS agent automatically provides temporary credentials for the task role through a local credentials endpoint, which the AWS SDKs and the AWS CLI query without any extra configuration.
Example Task Definition
{
  "family": "my-web-app",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/myAppTaskRole",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-web-app:1.0",
      "portMappings": [
        {
          "containerPort": 3000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "NODE_ENV",
          "value": "production"
        }
      ],
      "secrets": [
        {
          "name": "DB_PASSWORD",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:my-db-password"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/my-web-app",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "web"
        }
      },
      "essential": true
    }
  ]
}
Fargate vs. EC2 Launch Type
When you run tasks in an ECS cluster, you choose how to provide the underlying compute capacity for them. ECS supports two primary launch types: AWS Fargate (serverless) and EC2 (self-managed instances).
Fargate Launch Type
With Fargate, AWS manages the infrastructure. You do not provision, configure, or scale EC2 instances. You specify the CPU and memory your task needs, and Fargate allocates the right amount of compute. Each Fargate task runs in its own isolated environment with a dedicated elastic network interface (ENI).
EC2 Launch Type
With the EC2 launch type, you manage a fleet of EC2 instances that form the cluster's capacity. You are responsible for choosing instance types, patching the operating system, scaling the instance fleet, and monitoring instance health. In return, you get more control over the underlying infrastructure and access to features like GPU instances and custom AMIs.
Comparison
| Feature | Fargate | EC2 Launch Type |
|---|---|---|
| Infrastructure management | AWS manages everything | You manage EC2 instances |
| Scaling | Per-task scaling (automatic) | Instance-level scaling (Auto Scaling groups) plus task-level scaling |
| Networking | Each task gets its own ENI (awsvpc mode) | Tasks use bridge, host, or awsvpc network mode |
| Pricing | Pay per task (vCPU and memory per second) | Pay for EC2 instances regardless of task utilization |
| Startup time | Slightly longer (infrastructure provisioning) | Faster if instances are already running |
| GPU support | Not supported | Supported (P and G instance families) |
| Custom AMIs | Not applicable | Supported (custom ECS-optimized AMIs) |
| Persistent storage | Ephemeral storage (20 GB default, up to 200 GB) | EBS volumes, instance store, EFS |
| Best for | Most workloads, teams that want to focus on applications | GPU workloads, large-scale cost optimization, workloads needing custom OS configuration |
When to Use Each
Choose Fargate when:
- You want to minimize operational overhead and avoid managing servers.
- Your workloads have variable or unpredictable traffic patterns.
- You are running many small, independent services.
- Your team is small and you want to focus on application development rather than infrastructure.
Choose EC2 launch type when:
- You need GPU instances for machine learning or graphics workloads.
- You have large, steady-state workloads where Reserved Instances or Savings Plans reduce costs significantly.
- You need custom kernel parameters, specific OS configurations, or specialized storage.
- You need to run Windows containers with specific OS version requirements.
Tip: Start with Fargate for new workloads. It removes the undifferentiated heavy lifting of managing EC2 instances. Move to the EC2 launch type only when you have a specific requirement that Fargate cannot meet, such as GPU access or cost optimization at scale.
ECS Services: Desired Count, Deployments, and Auto Scaling
An ECS service ensures that a specified number of task instances are running at all times. If a task fails, the service scheduler automatically launches a replacement. Services also manage deployments when you update your task definition.
Desired Count
The desired count is the number of task instances the service tries to maintain. If you set the desired count to 3, the service scheduler ensures that 3 tasks are always running. If a task stops (due to a crash, health check failure, or host issue), the scheduler launches a new task to replace it.
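The replacement behavior can be pictured as a reconciliation step that compares the desired count with the number of running tasks. The sketch below is a deliberately simplified model of that idea, not the actual ECS scheduler, and the function name reconcile is hypothetical.

```python
def reconcile(desired_count, task_statuses):
    """Return how many tasks to launch (positive) or stop (negative)."""
    running = sum(1 for status in task_statuses if status == "RUNNING")
    return desired_count - running


# One of three tasks has crashed, so one replacement is needed.
print(reconcile(3, ["RUNNING", "STOPPED", "RUNNING"]))

# The desired count was lowered from 3 to 2, so one task should stop.
print(reconcile(2, ["RUNNING", "RUNNING", "RUNNING"]))
```

A positive result means tasks are missing, and a negative result means the service is running more tasks than the desired count.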
Deployment Strategies
When you update a service (for example, by deploying a new container image), ECS replaces the running tasks with new ones. ECS supports two deployment strategies:
Rolling Update
The rolling update strategy replaces tasks incrementally. ECS stops a batch of old tasks and starts new tasks, repeating until all tasks run the new version. You control the pace with two parameters:
- minimumHealthyPercent. The minimum percentage of tasks that must remain running during the deployment. For example, 50% means ECS can stop up to half the tasks before starting replacements.
- maximumPercent. The maximum percentage of tasks (relative to the desired count) that can run during the deployment. For example, 200% means ECS can temporarily run twice the desired count to ensure zero downtime.
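To see what these two percentages mean for a concrete service, the sketch below converts them into task counts. It assumes the rounding described in the ECS documentation: the minimum healthy count is rounded up and the maximum count is rounded down. The function name rolling_update_bounds is hypothetical.

```python
def rolling_update_bounds(desired_count, minimum_healthy_percent, maximum_percent):
    """Return (minimum running tasks, maximum total tasks) during a deployment."""
    # Integer ceiling division: round the lower limit up.
    min_running = -(-desired_count * minimum_healthy_percent // 100)
    # Integer floor division: round the upper limit down.
    max_total = desired_count * maximum_percent // 100
    return min_running, max_total


print(rolling_update_bounds(4, 50, 200))   # (2, 8)
print(rolling_update_bounds(3, 50, 200))   # (2, 6)
print(rolling_update_bounds(3, 100, 150))  # (3, 4)
```

With a desired count of 4, a 50% minimum and a 200% maximum let ECS drop to 2 running tasks or grow to 8 while it swaps versions. Setting the minimum to 100% keeps full capacity at all times, at the cost of needing room for extra tasks.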
Blue/Green Deployment
Blue/green deployments use AWS CodeDeploy to create a complete replacement set of tasks (the "green" environment) alongside the existing tasks (the "blue" environment). Traffic shifts from blue to green gradually or all at once. If the green environment fails health checks, CodeDeploy automatically rolls back to the blue environment.
| Feature | Rolling Update | Blue/Green |
|---|---|---|
| Managed by | ECS service scheduler | AWS CodeDeploy |
| Rollback | Manual by default (redeploy the previous task definition); automatic if the deployment circuit breaker is enabled with rollback | Automatic (CodeDeploy rolls back on failure) |
| Traffic shift | Gradual (task by task) | Configurable (all at once, linear, or canary) |
| Cost during deployment | Slightly above normal (temporary extra tasks) | Double capacity during transition |
| Complexity | Simple (built into ECS) | More complex (requires CodeDeploy configuration) |
| Best for | Most deployments | Mission-critical services requiring instant rollback |
Deployment Circuit Breaker
The ECS deployment circuit breaker automatically detects when a rolling update deployment is failing. If new tasks repeatedly fail to reach a healthy state, the circuit breaker stops the deployment and optionally rolls back to the last successful deployment. Enable the circuit breaker to prevent a bad deployment from replacing all healthy tasks with failing ones.
Service Auto Scaling
ECS service auto scaling adjusts the desired count of tasks in a service based on demand. It uses Application Auto Scaling and supports three scaling policy types:
- Target tracking. Maintain a target value for a specific metric. For example, keep average CPU utilization at 50%. ECS adds tasks when utilization rises above the target and removes tasks when it drops below.
- Step scaling. Define scaling adjustments based on CloudWatch alarm thresholds. For example, add 2 tasks when CPU exceeds 70% and add 4 tasks when CPU exceeds 90%.
- Scheduled scaling. Scale based on a schedule. For example, increase the desired count to 10 tasks every weekday at 8:00 AM and decrease to 2 tasks at 8:00 PM.
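The step scaling example above (add 2 tasks above 70% CPU, add 4 tasks above 90% CPU) can be expressed as a small lookup. This is an illustrative model of the decision only, not how Application Auto Scaling is configured or implemented. The names STEPS, step_adjustment, and new_desired_count are hypothetical, and the maximum task count is an assumed limit.

```python
# (CPU threshold in percent, tasks to add), ordered from highest threshold down.
STEPS = [(90, 4), (70, 2)]


def step_adjustment(cpu_percent):
    """Return how many tasks to add for the given average CPU utilization."""
    for threshold, tasks_to_add in STEPS:
        if cpu_percent > threshold:
            return tasks_to_add
    return 0


def new_desired_count(current_count, cpu_percent, max_tasks):
    """Apply the step adjustment without exceeding the configured maximum."""
    return min(current_count + step_adjustment(cpu_percent), max_tasks)


print(step_adjustment(75))           # 2
print(step_adjustment(95))           # 4
print(new_desired_count(3, 95, 6))   # 6, capped by the maximum
```

Checking the highest threshold first ensures that a reading of 95% triggers the larger step rather than the smaller one.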
In Module 04, you learned about EC2 Auto Scaling groups that scale instances. ECS service auto scaling is similar but operates at the task level. For Fargate, task-level scaling is all you need. For the EC2 launch type, you may need both instance-level scaling (to add EC2 capacity) and task-level scaling (to add tasks onto that capacity).
Tip: Start with target tracking on CPU or memory utilization. It is the simplest policy to configure and handles most scaling scenarios. Add step scaling or scheduled scaling only when target tracking does not meet your requirements.
Service Discovery and Load Balancing with ALB
In Module 07, you learned how an Application Load Balancer distributes traffic across EC2 instances. ECS extends this pattern to containers, with some important differences.
ECS and ALB Integration
When you associate an ALB with an ECS service, the service automatically registers new tasks with the ALB target group and deregisters tasks that stop. This means the ALB always knows which tasks are healthy and available to receive traffic.
For Fargate tasks (which use awsvpc network mode), each task gets its own private IP address and ENI within your VPC subnets, as you configured in Module 03. The ALB routes traffic to each task's IP address on the container port.
Dynamic Port Mapping
With the EC2 launch type, multiple tasks can run on the same EC2 instance. If every task tried to bind container port 3000 to host port 3000, the tasks would conflict. Dynamic port mapping solves this by assigning a random available host port to each task. ECS registers each instance and host port pair with the ALB target group, so the ALB knows which host port maps to which task.
ALB (port 443)
├── EC2 Instance A
│   ├── Task 1: host port 32768 -> container port 3000
│   └── Task 2: host port 32769 -> container port 3000
└── EC2 Instance B
    └── Task 3: host port 32768 -> container port 3000
To enable dynamic port mapping, set the host port to 0 in your task definition's port mappings. The ALB target group must use the instance target type.
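The following sketch simulates the port assignment shown in the diagram. It is a toy model: the real host port is chosen by the container runtime from the instance's ephemeral port range and is not necessarily sequential. The Instance class and the starting port 32768 are assumptions made for illustration.

```python
EPHEMERAL_START = 32768  # assumed first port of the ephemeral range


class Instance:
    """A container instance that hands out one free host port per task."""

    def __init__(self, name):
        self.name = name
        self.used_ports = set()

    def start_task(self, container_port):
        # Pick the lowest free host port on this instance.
        host_port = EPHEMERAL_START
        while host_port in self.used_ports:
            host_port += 1
        self.used_ports.add(host_port)
        # The (instance, host port) pair is what the target group tracks.
        return (self.name, host_port, container_port)


instance_a, instance_b = Instance("instance-a"), Instance("instance-b")
targets = [
    instance_a.start_task(3000),
    instance_a.start_task(3000),
    instance_b.start_task(3000),
]
for name, host_port, container_port in targets:
    print(f"{name}:{host_port} -> container port {container_port}")
```

Two tasks on the same instance receive different host ports even though both containers listen on port 3000, while the same host port can be reused on a different instance.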
Tip: With Fargate, you do not need dynamic port mapping because each task has its own IP address. The ALB target group uses the ip target type and routes directly to each task's IP and container port.
Health Checks
The ALB performs health checks against your ECS tasks, just as it does for EC2 instances. Configure the health check path to an endpoint in your application that verifies the application is ready to serve traffic (for example, /health). If a task fails its health check, the ALB stops sending traffic to it, and the ECS service scheduler replaces it.
Health check parameters to configure:
| Parameter | Recommended Value | Reason |
|---|---|---|
| Path | /health | Dedicated endpoint that checks application readiness |
| Interval | 30 seconds | Balances detection speed with request overhead |
| Healthy threshold | 2 | Confirms recovery before sending traffic |
| Unhealthy threshold | 3 | Avoids marking tasks unhealthy due to transient issues |
| Timeout | 5 seconds | Allows time for the health endpoint to respond |
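These values determine roughly how long the ALB takes to change its view of a task. The arithmetic below is an approximation that multiplies the interval by the threshold; it ignores the timeout and the moment within an interval at which a failure begins. The function name seconds_to_change_state is hypothetical.

```python
def seconds_to_change_state(interval_seconds, threshold):
    """Approximate time for consecutive checks to flip a target's health state."""
    return interval_seconds * threshold


time_to_unhealthy = seconds_to_change_state(30, 3)
time_to_healthy = seconds_to_change_state(30, 2)
print(f"Marked unhealthy after about {time_to_unhealthy} seconds")
print(f"Marked healthy after about {time_to_healthy} seconds")
```

With the recommended values, a failing task keeps receiving traffic for about 90 seconds, and a new task waits about 60 seconds before it is considered healthy. Shorter intervals detect failures faster at the cost of more health check requests.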
Service Discovery with AWS Cloud Map
For service-to-service communication that does not go through a load balancer, ECS integrates with AWS Cloud Map for DNS-based service discovery. When you enable service discovery, ECS automatically registers each task's IP address with a Cloud Map namespace. Other services can find your tasks by querying a DNS name (for example, api.my-app.local).
Service discovery is useful for internal microservices that communicate directly with each other. For services that receive external traffic, use an ALB instead.
ECS vs. EKS vs. Lambda: Choosing the Right Compute Option
AWS offers multiple compute services for running application code. Choosing the right one depends on your workload characteristics, team expertise, and operational requirements. In Module 09, you built serverless applications with Lambda. Now you can compare Lambda with the container orchestration options.
| Feature | Amazon ECS | Amazon EKS | AWS Lambda |
|---|---|---|---|
| Orchestration | AWS-native (ECS scheduler) | Kubernetes (open-source) | Event-driven (no orchestration needed) |
| Unit of deployment | Container (task definition) | Container (Pod spec) | Function (code package) |
| Max execution time | No limit (long-running services) | No limit (long-running services) | 15 minutes |
| Scaling granularity | Task level | Pod level | Function invocation level |
| Cold start | Seconds (container pull and start) | Seconds (container pull and start) | Milliseconds to seconds |
| Pricing model | Per task (Fargate) or per instance (EC2) | Per Pod (Fargate) or per instance (EC2), plus an hourly fee per cluster | Per invocation and duration |
| Portability | AWS-specific | Kubernetes-portable across clouds | AWS-specific |
| Operational complexity | Low to medium | High (Kubernetes expertise required) | Very low |
| Best for | Containerized web services, APIs, background workers | Teams with Kubernetes expertise, multi-cloud strategy, complex scheduling needs | Event-driven processing, APIs with variable traffic, short-duration tasks |
When to Use Each
Choose ECS when:
- You want a managed container orchestration service without the complexity of Kubernetes.
- Your team is already using AWS services and wants tight integration with the AWS ecosystem.
- You are running long-running web services, APIs, or background workers in containers.
Choose EKS when:
- Your team has existing Kubernetes expertise and tooling.
- You need portability across cloud providers or on-premises environments.
- You require advanced scheduling features, custom controllers, or the Kubernetes ecosystem of tools (Helm, Istio, Argo).
Choose Lambda when:
- Your workload is event-driven (responding to S3 uploads, API requests, queue messages).
- Individual executions complete within 15 minutes.
- Traffic is highly variable or unpredictable, with periods of zero traffic.
- You want to minimize operational overhead completely.
Tip: These services are not mutually exclusive. Many production architectures combine them. For example, you might use ECS for your core web application, Lambda for event-driven processing (such as image resizing on S3 upload), and EKS for workloads that your team already manages with Kubernetes.
Container Security Best Practices
Running containers in production requires attention to security at every layer: the image, the runtime, and the orchestration platform. The ECS security best practices guide provides detailed recommendations.
Run as a Non-Root User
By default, containers run as the root user. If an attacker exploits a vulnerability in your application, they gain root access inside the container. To limit the blast radius, create a non-root user in your Dockerfile and switch to it:
# Create a non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
# Switch to the non-root user
USER appuser
Scan Images for Vulnerabilities
Amazon ECR image scanning identifies software vulnerabilities in your container images. ECR supports two scanning types:
- Basic scanning. Uses the Common Vulnerabilities and Exposures (CVE) database to scan for OS package vulnerabilities. You can configure scan-on-push to automatically scan every image when it is pushed to a repository.
- Enhanced scanning. Uses Amazon Inspector to continuously monitor images for both OS and programming language package vulnerabilities. Enhanced scanning provides more comprehensive results and continuous monitoring.
Warning: Image scanning identifies known vulnerabilities but does not guarantee that your image is secure. Combine scanning with other practices such as using minimal base images, keeping dependencies updated, and performing static code analysis.
Secrets Management
Never store sensitive values (database passwords, API keys, tokens) in your container images or as plain-text environment variables in task definitions. Instead, use one of these approaches:
- AWS Secrets Manager. Store secrets in Secrets Manager and reference them in your task definition using the secrets field. ECS injects the secret value as an environment variable at runtime.
- AWS Systems Manager Parameter Store. Store configuration values in Parameter Store as SecureString parameters. Reference them in your task definition the same way as Secrets Manager values.
Both approaches require the task execution role to have permission to read the secrets.
Use Read-Only File Systems
Configure your containers with a read-only root file system to prevent attackers from writing malicious files. In your task definition, set readonlyRootFilesystem to true. If your application needs to write temporary files, mount a writable volume at a specific path (such as /tmp).
Keep Images Minimal
Start from minimal base images such as Alpine Linux or distroless images. Smaller images have fewer packages, which means fewer potential vulnerabilities. Remove build tools, package caches, and temporary files in the same Dockerfile layer where they are created.
Summary of Container Security Practices
| Practice | Implementation | Benefit |
|---|---|---|
| Non-root user | USER appuser in Dockerfile | Limits privilege escalation |
| Image scanning | ECR scan-on-push or enhanced scanning | Detects known vulnerabilities |
| Secrets management | Secrets Manager or Parameter Store references in task definition | Prevents credential exposure |
| Read-only file system | readonlyRootFilesystem: true in task definition | Prevents file system tampering |
| Minimal base images | Use Alpine or distroless base images | Reduces attack surface |
| Immutable tags | Use image digest or unique tags (not latest) | Ensures deployment consistency |
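Several of these practices can be checked mechanically before a task definition is registered. The sketch below audits one container definition, represented as a Python dictionary, for three of the rows above: a pinned image tag, a read-only root file system, and secret-looking names in plain-text environment variables. The function name audit_container and the SECRET_HINTS list are hypothetical, and name matching is only a heuristic. The non-root user check is left out because the user is usually set in the Dockerfile rather than in the task definition.

```python
SECRET_HINTS = ("PASSWORD", "SECRET", "TOKEN", "API_KEY")  # heuristic name fragments


def audit_container(container):
    """Return a list of findings for one container definition (a dict)."""
    findings = []

    # Immutable tags: accept a digest, or a tag other than "latest".
    image = container.get("image", "")
    last_segment = image.rsplit("/", 1)[-1]
    pinned_by_digest = "@sha256:" in image
    has_tag = ":" in last_segment
    if not pinned_by_digest and (not has_tag or last_segment.endswith(":latest")):
        findings.append("image uses 'latest' or no tag")

    # Read-only file system: the field is treated as false when it is absent.
    if not container.get("readonlyRootFilesystem", False):
        findings.append("root file system is writable")

    # Secrets management: flag plain-text variables whose names look sensitive.
    for variable in container.get("environment", []):
        if any(hint in variable["name"].upper() for hint in SECRET_HINTS):
            findings.append(f"possible secret in plain-text variable {variable['name']}")

    return findings


risky = {
    "image": "my-web-app:latest",
    "environment": [{"name": "DB_PASSWORD", "value": "example-only"}],
}
for finding in audit_container(risky):
    print(finding)
```

A check like this could run in a CI pipeline against each entry in containerDefinitions, failing the build when the list of findings is not empty.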
Instructor Notes
Estimated lecture time: 90 minutes
Common student questions:
- Q: What is the difference between the task execution role and the task role? A: The task execution role is used by the ECS agent (the infrastructure layer) to perform actions such as pulling container images from ECR, writing logs to CloudWatch, and retrieving secrets from Secrets Manager. The task role is used by your application code running inside the container to access AWS services such as S3, DynamoDB, or SQS. Think of the execution role as "what ECS needs to set up the task" and the task role as "what your application needs to do its job." See the task execution role and task role documentation for details.
- Q: When should I use Fargate versus the EC2 launch type? A: Start with Fargate for most workloads. It eliminates the need to manage EC2 instances, patch operating systems, and configure Auto Scaling groups for the underlying infrastructure. Use the EC2 launch type when you need GPU instances, when you have large steady-state workloads where Reserved Instances significantly reduce costs, or when you need custom OS-level configuration. See the Fargate documentation for supported configurations.
- Q: How does ECS know when to replace a failed task? A: ECS monitors task health through multiple mechanisms. First, if a container exits (crashes), ECS detects the stopped task and launches a replacement. Second, if you configure an ALB health check, the load balancer marks unhealthy tasks, and the ECS service scheduler replaces them. Third, the deployment circuit breaker detects when new tasks repeatedly fail to start and can automatically roll back the deployment. See the ECS service documentation for details on the service scheduler.
- Q: Can I run both Fargate and EC2 tasks in the same cluster? A: Yes. A single ECS cluster can run tasks on both Fargate and EC2 capacity. Each service or standalone task selects its capacity through a launch type or a capacity provider strategy. A single capacity provider strategy cannot mix Fargate and Auto Scaling group capacity providers, so run the Fargate workloads and the EC2 workloads (for example, GPU tasks) as separate services in the same cluster. See the capacity providers section of the cluster documentation.
Teaching tips:
- Start by connecting containers to the EC2 concepts from Module 04. Draw an EC2 instance on the whiteboard, then show how multiple containers run inside it, sharing the OS kernel. Compare this to running multiple separate EC2 instances, each with its own OS. This visual helps students understand the efficiency gain of containers.
- When explaining the ECS component hierarchy (cluster, service, task definition, task), use an analogy: the cluster is a factory, the task definition is a blueprint for a product, the service is the production line that ensures a certain number of products are always being made, and tasks are the individual products rolling off the line.
- Walk through the example task definition JSON field by field. Ask students to identify which fields correspond to concepts they already know (IAM roles from Module 02, port mappings from Module 03, log groups from CloudWatch). This reinforces cross-module connections.
- For the Fargate vs. EC2 comparison, present scenarios and ask students to choose: "Your startup has 3 developers and needs to deploy 5 microservices. Which launch type?" (Fargate, to minimize ops burden.) "Your company runs a machine learning pipeline that needs GPU access. Which launch type?" (EC2, because Fargate does not support GPUs.)
Pause points:
- After Containers 101: ask students to name three advantages of containers over VMs (faster startup, smaller size, higher density, portability). Then ask for a scenario where a VM is still the better choice (legacy application requiring a specific OS kernel version, or workloads needing full hardware isolation).
- After the task definition walkthrough: ask students what would happen if you set essential: true on a sidecar container and it crashes (answer: the entire task stops, because an essential container failure stops the task).
- After the Fargate vs. EC2 comparison: present a cost scenario. "You run 10 tasks 24/7 on Fargate at 0.25 vCPU and 0.5 GB each. Would it be cheaper on EC2 with a Reserved Instance?" (Possibly yes for steady-state workloads. The tasks need 2.5 vCPU and 5 GB in total, which is more than a t3.medium with 2 vCPU and 4 GB provides, so size the instance for the full load, for example a t3.xlarge with 4 vCPU and 16 GB, and compare its reserved rate with the Fargate charges.)
- After the ECS vs. EKS vs. Lambda comparison: ask students which service they would choose for a webhook handler that processes 100 requests per day and takes 2 seconds per request (answer: Lambda, because the traffic is low and sporadic, and each execution is short).
Key Takeaways
- Containers package applications with their dependencies for consistent deployment across environments. They are lighter and faster than VMs, sharing the host OS kernel instead of running a separate guest OS.
- Amazon ECS orchestrates containers using four core concepts: clusters (where tasks run), task definitions (how tasks are configured), services (how many tasks to maintain), and tasks (running instances of a task definition).
- AWS Fargate removes the need to manage EC2 instances for container workloads. Start with Fargate for most use cases and move to the EC2 launch type only when you need GPU support, custom OS configuration, or cost optimization at scale with Reserved Instances.
- ECS integrates with ALB for load balancing, with service auto scaling for demand-based capacity, and with ECR for secure image storage. Use lifecycle policies to clean up old images and scan-on-push to detect vulnerabilities.
- Follow container security best practices: run as a non-root user, scan images for vulnerabilities, manage secrets through Secrets Manager or Parameter Store (never in images or plain-text environment variables), and use read-only file systems where possible.
AWS Bootcamp: From Novice to Architect Author: Samuel Ogunti License: CC BY-NC 4.0