AWS 云监控
https://github.com/prometheus/cloudwatch_exporter
docker pull prom/cloudwatch-exporter:cloudwatch_exporter-0.11.0
启动
配置文件
---
region: cn-north-1
metrics:
- aws_namespace: AWS/ELB
aws_metric_name: HealthyHostCount
aws_dimensions: [AvailabilityZone, LoadBalancerName]
aws_statistics: [Average]
- aws_namespace: AWS/ELB
aws_metric_name: UnHealthyHostCount
aws_dimensions: [AvailabilityZone, LoadBalancerName]
aws_statistics: [Average]
- aws_namespace: AWS/ELB
aws_metric_name: RequestCount
aws_dimensions: [AvailabilityZone, LoadBalancerName]
aws_statistics: [Sum]
当有多个指标的时候,每个指标都需要列举。以 ELB 为例,可以使用的 Metric 可以在下列链接中找到。 https://docs.amazonaws.cn/elasticloadbalancing/latest/classic/elb-cloudwatch-metrics.html
这些指标的汇总链接在这里 https://docs.amazonaws.cn/AmazonCloudWatch/latest/monitoring/aws-services-cloudwatch-metrics.html
将配置文件挂载到 缺省路径
docker run -p 9106 -v /path/on/host/config.yml:/config/config.yml prom/cloudwatch-exporter
将配置文件挂载到任意位置,然后通过 Docker 的 CMD 命令来指定配置文件路径
docker run -p 9106 -v /path/on/host/us-west-1.yml:/config/us-west-1.yml --env AWS_ACCESS_KEY_ID=AKIxxxx --env AWS_SECRET_ACCESS_KEY=LExxxxxx prom/cloudwatch-exporter:v0.11.0 /config/us-west-1.yml
如果有多个 AWS 账号需要监控,那么要每个账号启动一个 cloudwatch-exporter 。
AWS 的所有监控的服务和对应的指标
附录一 完整的配置示例文件
---
region: eu-west-1
metrics:
- aws_namespace: AWS/NetworkELB
aws_metric_name: HealthyHostCount
aws_dimensions: [AvailabilityZone, LoadBalancer, TargetGroup]
aws_statistics: [Average]
- aws_namespace: AWS/NetworkELB
aws_metric_name: UnHealthyHostCount
aws_dimensions: [AvailabilityZone, LoadBalancer, TargetGroup]
aws_statistics: [Average]
- aws_namespace: AWS/ApplicationELB
aws_metric_name: HealthyHostCount
aws_dimensions: [AvailabilityZone, LoadBalancer, TargetGroup]
aws_statistics: [Average]
- aws_namespace: AWS/ApplicationELB
aws_metric_name: UnHealthyHostCount
aws_dimensions: [AvailabilityZone, LoadBalancer, TargetGroup]
aws_statistics: [Average]
- aws_namespace: AWS/ApplicationELB
aws_metric_name: RequestCount
aws_dimensions: [AvailabilityZone, LoadBalancer]
aws_statistics: [Average]
- aws_namespace: AWS/ElastiCache
aws_metric_name: CPUUtilization
aws_dimensions: [CacheClusterId]
aws_statistics: [Average]
- aws_namespace: AWS/ElastiCache
aws_metric_name: NetworkBytesIn
aws_dimensions: [CacheClusterId]
aws_statistics: [Average]
- aws_namespace: AWS/ElastiCache
aws_metric_name: NetworkBytesOut
aws_dimensions: [CacheClusterId]
aws_statistics: [Average]
- aws_namespace: AWS/ElastiCache
aws_metric_name: FreeableMemory
aws_dimensions: [CacheClusterId]
aws_statistics: [Average]
- aws_namespace: AWS/ElastiCache
aws_metric_name: BytesUsedForCache
aws_dimensions: [CacheClusterId]
aws_statistics: [Average]
- aws_namespace: AWS/ElastiCache
aws_metric_name: CurrConnections
aws_dimensions: [CacheClusterId]
aws_statistics: [Average]
- aws_namespace: AWS/ElastiCache
aws_metric_name: NewConnections
aws_dimensions: [CacheClusterId]
aws_statistics: [Average]
- aws_namespace: AWS/ElastiCache
aws_metric_name: CacheHits
aws_dimensions: [CacheClusterId]
aws_statistics: [Average]
- aws_namespace: AWS/ElastiCache
aws_metric_name: CacheMisses
aws_dimensions: [CacheClusterId]
aws_statistics: [Average]
- aws_namespace: AWS/ElastiCache
aws_metric_name: ReplicationLag
aws_dimensions: [CacheClusterId]
aws_statistics: [Average]
- aws_namespace: AWS/Redshift
aws_metric_name: DatabaseConnections
aws_dimensions: [ClusterIdentifier]
aws_statistics: [Average]
- aws_namespace: AWS/Redshift
aws_metric_name: HealthStatus
aws_dimensions: [ClusterIdentifier]
aws_statistics: [Average]
- aws_namespace: AWS/Redshift
aws_metric_name: MaintenanceMode
aws_dimensions: [ClusterIdentifier]
aws_statistics: [Average]
- aws_namespace: AWS/Redshift
aws_metric_name: CPUUtilization
aws_dimensions: [NodeID, ClusterIdentifier]
aws_statistics: [Average]
- aws_namespace: AWS/Redshift
aws_metric_name: PercentageDiskSpaceUsed
aws_dimensions: [NodeID, ClusterIdentifier]
aws_statistics: [Average]
- aws_namespace: AWS/Redshift
aws_metric_name: NetworkReceiveThroughput
aws_dimensions: [NodeID, ClusterIdentifier]
aws_statistics: [Average]
- aws_namespace: AWS/Redshift
aws_metric_name: NetworkTransmitThroughput
aws_dimensions: [NodeID, ClusterIdentifier]
aws_statistics: [Average]
- aws_namespace: AWS/Redshift
aws_metric_name: ReadLatency
aws_dimensions: [NodeID, ClusterIdentifier]
aws_statistics: [Average]
- aws_namespace: AWS/Redshift
aws_metric_name: ReadThroughput
aws_dimensions: [NodeID, ClusterIdentifier]
aws_statistics: [Average]
- aws_namespace: AWS/Redshift
aws_metric_name: ReadIOPS
aws_dimensions: [NodeID, ClusterIdentifier]
aws_statistics: [Average]
- aws_namespace: AWS/Redshift
aws_metric_name: WriteLatency
aws_dimensions: [NodeID, ClusterIdentifier]
aws_statistics: [Average]
- aws_namespace: AWS/Redshift
aws_metric_name: WriteThroughput
aws_dimensions: [NodeID, ClusterIdentifier]
aws_statistics: [Average]
- aws_namespace: AWS/Redshift
aws_metric_name: WriteIOPS
aws_dimensions: [NodeID, ClusterIdentifier]
aws_statistics: [Average]
# S3 Storage metrics are published to Cloudwatch 1x per day with a timestamp of midnight UTC, hence period_seconds: 86400
# Publishing does not always occur at the same time, but it will occur before the next day, hence range_seconds: 172800
- aws_namespace: AWS/S3
aws_metric_name: BucketSizeBytes
aws_dimensions: [BucketName, StorageType]
aws_statistics: [Average] # Valid statistics (https://docs.aws.amazon.com/AmazonS3/latest/dev/cloudwatch-monitoring.html): Average
range_seconds: 172800
period_seconds: 86400
- aws_namespace: AWS/S3
aws_metric_name: NumberOfObjects
aws_dimensions: [BucketName, StorageType]
aws_statistics: [Average] # Valid statistics (https://docs.aws.amazon.com/AmazonS3/latest/dev/cloudwatch-monitoring.html): Average
range_seconds: 172800
period_seconds: 86400
# For CloudFront metrics, you have to set the region to us-east-1
- aws_namespace: AWS/CloudFront
aws_metric_name: Requests
aws_statistics: [Sum]
aws_dimensions: [DistributionId, Region]
aws_dimension_select:
Region: [Global]
- aws_namespace: AWS/CloudFront
aws_metric_name: BytesDownloaded
aws_statistics: [Sum]
aws_dimensions: [DistributionId, Region]
aws_dimension_select:
Region: [Global]
- aws_namespace: AWS/CloudFront
aws_metric_name: 4xxErrorRate
aws_statistics: [Average]
aws_dimensions: [DistributionId, Region]
aws_dimension_select:
Region: [Global]
- aws_namespace: AWS/CloudFront
aws_metric_name: 5xxErrorRate
aws_statistics: [Average]
aws_dimensions: [DistributionId, Region]
aws_dimension_select:
Region: [Global]
- aws_namespace: AWS/CloudFront
aws_metric_name: BytesUploaded
aws_statistics: [Sum]
aws_dimensions: [DistributionId, Region]
aws_dimension_select:
Region: [Global]
- aws_namespace: AWS/CloudFront
aws_metric_name: TotalErrorRate
aws_statistics: [Average]
aws_dimensions: [DistributionId, Region]
aws_dimension_select:
Region: [Global]