Operations & Observability
Operate with confidence. This phase covers CloudWatch metrics and logs, X-Ray distributed tracing, cost management and resource optimization, automated backups, disaster recovery, and Systems Manager.
Modules in This Phase
Module 61: CloudWatch Metrics & Alarms
Every AWS service you deploy produces metrics. EC2 instances report CPU utilization every 5 minutes by default, or every minute with detailed monitoring enabled. Lambda functions report invocation count, duration, and errors after every execution. RDS databases report connection count, freeable memory, and IOPS consumption. ALBs report request count, target respons
Module 62: CloudWatch Logs
Metrics tell you something is wrong. Logs tell you why.
Module 63: Distributed Tracing with AWS X-Ray
In a monolithic application, debugging is straightforward. A request enters one process, executes sequentially, and you can trace the entire path through a single log file or debugger session. When that request fails or runs slowly, you know exactly where to look.
Module 64: Cost Management
AWS bills you for what you use. This is simultaneously the greatest advantage and the greatest risk of cloud computing. There is no upfront capital expenditure gate that forces you to justify resource allocation before deployment. You can launch a fleet of instances, provision a terabyte database, a
Module 65: Resource Optimization
Cost management tells you where money is going. Resource optimization tells you where money is being wasted. The distinction matters because most organizations skip straight to commitment purchases (Reserved Instances, Savings Plans) without first eliminating the waste they are committing to.
Module 66: AWS Backup
Every AWS service that stores data has its own backup mechanism. RDS has automated backups and manual snapshots. EBS has snapshots. DynamoDB has on-demand backups and point-in-time recovery. EFS has its own backup process. S3 has versioning and replication.
Module 67: Disaster Recovery
Disaster recovery is not about whether a disaster will happen. It is about when. AWS Regions are highly available, but they are not invincible. The February 2017 S3 outage in us-east-1 cascaded across dozens of major services. The November 2020 Kinesis failure in us-east-1 brought down services across the Region for
Module 68: AWS Systems Manager
Managing one EC2 instance is simple. You SSH in, run commands, install patches, and check configurations. Managing 50 instances is tedious but possible with scripts and SSH key distribution. Managing 500 instances across multiple accounts and Regions over SSH is a security and operational nightmare.
Phase 11 Exam
Test your knowledge of all 8 modules in this phase. 25 questions, 70% required to pass.