---
name: cloudwatch
description: AWS CloudWatch monitoring for logs, metrics, alarms, and dashboards. Use when setting up monitoring, creating alarms, querying logs with Insights, configuring metric filters, building dashboards, or troubleshooting application issues.
last_updated: "2026-01-07"
doc_source: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/
---

# AWS CloudWatch

Amazon CloudWatch provides monitoring and observability for AWS resources and applications. It collects metrics, logs, and events, enabling you to monitor, troubleshoot, and optimize your AWS environment.

## Table of Contents

- [Core Concepts](#core-concepts)
- [Common Patterns](#common-patterns)
- [CLI Reference](#cli-reference)
- [Best Practices](#best-practices)
- [Troubleshooting](#troubleshooting)
- [References](#references)

## Core Concepts

### Metrics

Time-ordered data points published to CloudWatch. Key components:
- **Namespace**: Container for metrics (e.g., `AWS/Lambda`)
- **Metric name**: Name of the measurement (e.g., `Invocations`)
- **Dimensions**: Name-value pairs for filtering (e.g., `FunctionName=MyFunc`)
- **Statistics**: Aggregations (Sum, Average, Min, Max, SampleCount, pN)

### Logs

Log data from AWS services and applications:
- **Log groups**: Collections of log streams
- **Log streams**: Sequences of log events from same source
- **Log events**: Individual log entries with timestamp and message

### Alarms

Automated actions based on metric thresholds:
- **States**: OK, ALARM, INSUFFICIENT_DATA
- **Actions**: SNS notifications, Auto Scaling, EC2 actions

## Common Patterns

### Create a Metric Alarm

**AWS CLI:**

```bash
# CPU utilization alarm for EC2
aws cloudwatch put-metric-alarm \
  --alarm-name "HighCPU-i-1234567890abcdef0" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 2 \
  --dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:alerts \
  --ok-actions arn:aws:sns:us-east-1:123456789012:alerts
```

**boto3:**

```python
import boto3

cloudwatch = boto3.client('cloudwatch')

cloudwatch.put_metric_alarm(
    AlarmName='HighCPU-i-1234567890abcdef0',
    MetricName='CPUUtilization',
    Namespace='AWS/EC2',
    Statistic='Average',
    Period=300,
    Threshold=80.0,
    ComparisonOperator='GreaterThanThreshold',
    EvaluationPeriods=2,
    Dimensions=[
        {'Name': 'InstanceId', 'Value': 'i-1234567890abcdef0'}
    ],
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:alerts'],
    OKActions=['arn:aws:sns:us-east-1:123456789012:alerts']
)
```

### Lambda Error Rate Alarm

```bash
aws cloudwatch put-metric-alarm \
  --alarm-name "LambdaErrorRate-MyFunction" \
  --metrics '[
    {
      "Id": "errors",
      "MetricStat": {
        "Metric": {
          "Namespace": "AWS/Lambda",
          "MetricName": "Errors",
          "Dimensions": [{"Name": "FunctionName", "Value": "MyFunction"}]
        },
        "Period": 60,
        "Stat": "Sum"
      },
      "ReturnData": false
    },
    {
      "Id": "invocations",
      "MetricStat": {
        "Metric": {
          "Namespace": "AWS/Lambda",
          "MetricName": "Invocations",
          "Dimensions": [{"Name": "FunctionName", "Value": "MyFunction"}]
        },
        "Period": 60,
        "Stat": "Sum"
      },
      "ReturnData": false
    },
    {
      "Id": "errorRate",
      "Expression": "errors/invocations*100",
      "Label": "Error Rate",
      "ReturnData": true
    }
  ]' \
  --threshold 5 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 3 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:alerts
```

### Query Logs with Insights

```bash
# Find errors in Lambda logs
aws logs start-query \
  --log-group-name /aws/lambda/MyFunction \
  --start-time $(date -d '1 hour ago' +%s) \
  --end-time $(date +%s) \
  --query-string '
    fields @timestamp, @message
    | filter @message like /ERROR/
    | sort @timestamp desc
    | limit 50
  '

# Get query results
aws logs get-query-results --query-id <query-id>
```

**boto3:**

```python
import boto3
import time

logs = boto3.client('logs')

# Start query
response = logs.start_query(
    logGroupName='/aws/lambda/MyFunction',
    startTime=int(time.time()) - 3600,
    endTime=int(time.time()),
    queryString='''
        fields @timestamp, @message
        | filter @message like /ERROR/
        | sort @timestamp desc
        | limit 50
    '''
)

query_id = response['queryId']

# Wait for results
while True:
    result = logs.get_query_results(queryId=query_id)
    if result['status'] == 'Complete':
        break
    time.sleep(1)

for row in result['results']:
    print(row)
```

### Create Metric Filter

Extract metrics from log patterns:

```bash
# Create metric filter for error count
aws logs put-metric-filter \
  --log-group-name /aws/lambda/MyFunction \
  --filter-name ErrorCount \
  --filter-pattern "ERROR" \
  --metric-transformations \
    metricName=ErrorCount,metricNamespace=MyApp,metricValue=1,defaultValue=0
```

### Publish Custom Metrics

```python
import boto3

cloudwatch = boto3.client('cloudwatch')

cloudwatch.put_metric_data(
    Namespace='MyApp',
    MetricData=[
        {
            'MetricName': 'OrdersProcessed',
            'Value': 1,
            'Unit': 'Count',
            'Dimensions': [
                {'Name': 'Environment', 'Value': 'Production'},
                {'Name': 'OrderType', 'Value': 'Standard'}
            ]
        }
    ]
)
```

### Create Dashboard

```bash
cat > dashboard.json << 'EOF'
{
  "widgets": [
    {
      "type": "metric",
      "x": 0, "y": 0, "width": 12, "height": 6,
      "properties": {
        "title": "Lambda Invocations",
        "metrics": [
          ["AWS/Lambda", "Invocations", "FunctionName", "MyFunction"]
        ],
        "period": 60,
        "stat": "Sum",
        "region": "us-east-1"
      }
    },
    {
      "type": "log",
      "x": 12, "y": 0, "width": 12, "height": 6,
      "properties": {
        "title": "Recent Errors",
        "query": "SOURCE '/aws/lambda/MyFunction' | filter @message like /ERROR/ | limit 20",
        "region": "us-east-1"
      }
    }
  ]
}
EOF

aws cloudwatch put-dashboard \
  --dashboard-name MyAppDashboard \
  --dashboard-body file://dashboard.json
```

## CLI Reference

### Metrics Commands

| Command | Description |
|---------|-------------|
| `aws cloudwatch put-metric-data` | Publish custom metrics |
| `aws cloudwatch get-metric-data` | Retrieve metric values |
| `aws cloudwatch get-metric-statistics` | Get aggregated statistics |
| `aws cloudwatch list-metrics` | List available metrics |

### Alarms Commands

| Command | Description |
|---------|-------------|
| `aws cloudwatch put-metric-alarm` | Create or update alarm |
| `aws cloudwatch describe-alarms` | List alarms |
| `aws cloudwatch set-alarm-state` | Manually set alarm state |
| `aws cloudwatch delete-alarms` | Delete alarms |

### Logs Commands

| Command | Description |
|---------|-------------|
| `aws logs create-log-group` | Create log group |
| `aws logs put-log-events` | Write log events |
| `aws logs filter-log-events` | Search log events |
| `aws logs start-query` | Start Insights query |
| `aws logs put-metric-filter` | Create metric filter |
| `aws logs put-retention-policy` | Set log retention |

## Best Practices

### Metrics

- **Use dimensions wisely** — too many creates metric explosion
- **Aggregate before publishing** — batch custom metrics
- **Use high-resolution metrics** (1-second) only when needed
- **Set meaningful units** for custom metrics

### Alarms

- **Use composite alarms** for complex conditions
- **Set appropriate evaluation periods** to avoid flapping
- **Include OK actions** to track recovery
- **Use anomaly detection** for dynamic thresholds

### Logs

- **Set retention policies** — don't keep logs forever
- **Use structured logging** (JSON) for better querying
- **Create metric filters** for key events
- **Use Contributor Insights** for top-N analysis

### Cost Optimization

- **Delete unused dashboards**
- **Reduce log retention** for non-critical logs
- **Avoid high-resolution metrics** unless necessary
- **Use log subscription filters** instead of polling

## Troubleshooting

### Missing Metrics

**Causes:**
- Service not publishing yet (wait 1-5 minutes)
- Wrong namespace/dimensions
- Detailed monitoring not enabled (EC2)

**Debug:**

```bash
# List metrics for a namespace
aws cloudwatch list-metrics \
  --namespace AWS/Lambda \
  --dimensions Name=FunctionName,Value=MyFunction
```

### Alarm Stuck in INSUFFICIENT_DATA

**Causes:**
- Metric not being published
- Dimensions mismatch
- Evaluation period too short

**Debug:**

```bash
# Check if metric has data
aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name Invocations \
  --dimensions Name=FunctionName,Value=MyFunction \
  --start-time $(date -d '1 hour ago' -u +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 60 \
  --statistics Sum
```

### Log Events Not Appearing

**Causes:**
- IAM permissions missing
- CloudWatch Logs agent not running
- Log group doesn't exist

**Debug:**

```bash
# Check log streams
aws logs describe-log-streams \
  --log-group-name /aws/lambda/MyFunction \
  --order-by LastEventTime \
  --descending \
  --limit 5
```

### High CloudWatch Costs

**Check usage:**

```bash
# Get PutLogEvents usage
aws cloudwatch get-metric-statistics \
  --namespace AWS/Logs \
  --metric-name IncomingBytes \
  --dimensions Name=LogGroupName,Value=/aws/lambda/MyFunction \
  --start-time $(date -d '7 days ago' -u +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 86400 \
  --statistics Sum
```

## References

- [CloudWatch User Guide](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/)
- [CloudWatch Logs User Guide](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/)
- [CloudWatch API Reference](https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/)
- [CloudWatch CLI Reference](https://docs.aws.amazon.com/cli/latest/reference/cloudwatch/)
- [Logs Insights Query Syntax](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html)
- [boto3 CloudWatch](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/cloudwatch.html)