AWS Grafana and Prometheus setup, part 8: monitoring account B (1)
plli
2025. 5. 30. 14:09
Account B
Now configure monitoring for the EC2 instance in account B.
First, create an EC2 instance in account B named b_acount_ec2_1.
As in account A, install the CloudWatch agent once the instance is created.
sudo yum install amazon-cloudwatch-agent -y
sudo vim /opt/aws/amazon-cloudwatch-agent/bin/config.json
config.json
{
  "agent": {
    "metrics_collection_interval": 10,
    "run_as_user": "root"
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/secure",
            "log_group_name": "b_acount_ec2_1/var/log/secure",
            "log_stream_name": "{instance_id}",
            "retention_in_days": -1
          }
        ]
      }
    }
  },
  "metrics": {
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    },
    "metrics_collected": {
      "collectd": {
        "metrics_aggregation_interval": 60
      },
      "disk": {
        "measurement": [
          "used_percent"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "/"
        ]
      },
      "mem": {
        "measurement": [
          "mem_used_percent"
        ],
        "metrics_collection_interval": 60
      },
      "statsd": {
        "metrics_aggregation_interval": 60,
        "metrics_collection_interval": 10,
        "service_address": ":8125"
      }
    }
  }
}
# The collectd section expects types.db to exist; without it the agent fails to start
sudo mkdir -p /usr/share/collectd
sudo touch /usr/share/collectd/types.db
After that, attach an IAM role that includes the CloudWatchAgentServerPolicy managed policy to the EC2 instance.
Restart the CloudWatch agent with the command below.
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
Check in CloudWatch that the mem and disk metrics are being collected.
Look them up by instance ID.
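Before opening the CloudWatch console, it can help to confirm on the instance itself that the agent started. The sketch below assumes the `status` action of `amazon-cloudwatch-agent-ctl` prints a small JSON document containing a `"status"` field; the helper name `agent_is_running` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads the JSON printed by
#   amazon-cloudwatch-agent-ctl -m ec2 -a status
# on stdin and succeeds only if it reports "status": "running".
# The leading quote in the pattern keeps "configstatus" from matching.
agent_is_running() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"running"'
}
```

Usage on the instance: `sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status | agent_is_running && echo "agent running"`. From a machine with the AWS CLI configured, `aws cloudwatch list-metrics --namespace CWAgent --dimensions Name=InstanceId,Value=INSTANCE_ID` (INSTANCE_ID is a placeholder) should list mem_used_percent and disk_used_percent once data has arrived.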
Account A monitoring server
# The role is different, so create a separate exporter directory
cp -r a_account_ec2/ b_acount_ec2
cd b_acount_ec2
vi config.yml
====================================================
region: us-west-2
# Enter the ARN of the role in account B (ACCOUNT_B_ID stands for the account ID part)
role_arn: arn:aws:iam::ACCOUNT_B_ID:role/A-B_Assume_Role
# Metrics to collect
metrics:
  - aws_dimensions:
      - InstanceId
    # Collect CPU
    aws_metric_name: CPUUtilization
    # Namespace collected by AWS itself
    aws_namespace: AWS/EC2
    # Average; Maximum and Minimum are also available
    aws_statistics:
      - Average
    aws_tag_select:
      resource_type_selection: ec2:instance
      resource_id_dimension: InstanceId
      tag_selections:
        # Collect from EC2 instances whose Name tag matches this value
        Name: ["b_acount_ec2_1"]
  ##### mem #####
  - aws_dimensions:
      - InstanceId
    aws_metric_name: mem_used_percent
    aws_namespace: CWAgent
    aws_statistics:
      - Average
    aws_tag_select:
      resource_type_selection: ec2:instance
      resource_id_dimension: InstanceId
      tag_selections:
        Name: ["b_acount_ec2_1"]
  ##### disk #####
  - aws_dimensions: [InstanceId,path,device,fstype]
    aws_dimension_select:
      path: ['/']
    aws_metric_name: disk_used_percent
    aws_namespace: CWAgent
    aws_statistics:
      - Average
    aws_tag_select:
      resource_type_selection: ec2:instance
      resource_id_dimension: InstanceId
      tag_selections:
        Name: ["b_acount_ec2_1"]
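The exporter can only read account B metrics if the monitoring server in account A is allowed to assume the role named in role_arn, so it may be worth testing that before building the image. A minimal sketch: the helper `role_arn` is hypothetical, and 123456789012 is a placeholder, not the real account B ID.

```shell
#!/bin/sh
# Hypothetical helper: print the IAM role ARN for an account ID and role name.
role_arn() {
  printf 'arn:aws:iam::%s:role/%s\n' "$1" "$2"
}

# 123456789012 is a placeholder account ID; A-B_Assume_Role comes from config.yml.
ARN=$(role_arn 123456789012 A-B_Assume_Role)
echo "$ARN"
```

With the real account ID filled in, `aws sts assume-role --role-arn "$ARN" --role-session-name exporter-check` run on the monitoring server should return temporary credentials. An AccessDenied error here usually points at the trust policy or the permissions on the account A side.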
docker build -t b_acount_ec2:v1 .
docker run -d -p 9109:9106 --name b_acount_ec2 b_acount_ec2:v1
After starting the container, check its IP.
docker inspect b_acount_ec2 | grep IPAddress
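Once the container is up, a quick look at its /metrics output shows whether the exporter is returning account B data before Prometheus is pointed at it. The sketch below is one way to check; `exporter_has_metric` is a hypothetical helper, and the host port 9109 comes from the docker run mapping above.

```shell
#!/bin/sh
# Hypothetical helper: reads exporter /metrics text on stdin and succeeds
# if at least one sample line for the given metric name is present.
# "# HELP" and "# TYPE" lines are ignored because they start with "#".
exporter_has_metric() {
  grep -Eq "^$1(\{| )"
}
```

Usage: `curl -s http://localhost:9109/metrics | exporter_has_metric aws_ec2_cpuutilization_average && echo ok`. CloudWatch data can lag by a few minutes, so an empty result right after startup is not necessarily a configuration error.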
vi prometheus/prometheus.yml
global:
  scrape_interval: 1m
  evaluation_interval: 1m
scrape_configs:
  - job_name: 'a_acount_ec2'
    static_configs:
      - targets: ['172.17.0.2:9106']
    metric_relabel_configs:
      - source_labels: [tag_Name]
        target_label: instance_name
  - job_name: 'a_acount_rds'
    static_configs:
      - targets: ['172.17.0.5:9106']
  - job_name: 'a_acount_cache'
    static_configs:
      - targets: ['172.17.0.6:9106']
  #### Add the newly created b_acount_ec2
  - job_name: 'b_acount_ec2'
    static_configs:
      - targets: ['172.17.0.7:9106']
    metric_relabel_configs:
      - source_labels: [tag_Name]
        target_label: instance_name
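An indentation mistake in prometheus.yml stops Prometheus from starting, so validating the file before rebuilding or restarting can save a round trip. The sketch below wraps `promtool check config`, which ships with Prometheus; the helper name `check_prometheus_config` is hypothetical, and it assumes promtool is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: validate a Prometheus config file with promtool.
# Returns promtool's exit status, or 127 if promtool is not installed.
check_prometheus_config() {
  if command -v promtool >/dev/null 2>&1; then
    promtool check config "$1"
  else
    echo "promtool not found on PATH" >&2
    return 127
  fi
}
```

Usage: `check_prometheus_config prometheus/prometheus.yml`. If promtool is only available inside the container, something like `docker exec prometheus promtool check config /etc/prometheus/prometheus.yml` may work, though the config path inside this custom image is an assumption.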
Either start a new container with the config file above, or edit the file inside the running container via exec.
If you edit prometheus.yml on the server:
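cd prometheus
docker build -t prometheus:v2 .
# Remove the existing container first, since the name prometheus must be free
docker rm -f prometheus
docker run -d -p 9090:9090 --name prometheus prometheus:v2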
Or,
if you edit the file inside the prometheus container:
docker exec -it -u root prometheus /bin/sh
vi prometheus.yml   # edit the file, then exit the container
docker restart prometheus
Open monitoring server IP:9090.
If b_acount_ec2 shows up there (for example under Status > Targets), the Prometheus side is done.
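The same check can be done from the command line through the Prometheus HTTP API by querying the built-in `up` metric for the new job. A rough sketch: `job_is_up` is a hypothetical helper that pattern-matches the compact JSON returned by /api/v1/query rather than parsing it properly.

```shell
#!/bin/sh
# Hypothetical helper: reads a /api/v1/query response for up{job="..."}
# on stdin and succeeds if any returned sample has the value "1".
job_is_up() {
  grep -Eq '"value":\[[0-9.e+]+,"1"\]'
}
```

Usage on the monitoring server: `curl -s -G http://localhost:9090/api/v1/query --data-urlencode 'query=up{job="b_acount_ec2"}' | job_is_up && echo "target up"`. A value of "0" means Prometheus knows the job but the scrape is failing, which usually points at the container IP or port in prometheus.yml.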
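Now move on to Grafana.
Open monitoring server IP:3000.
Copy the a_count_ec2 dashboard and change the queries, and the graphs will show up.
Metrics
The selector instance_name=~"(b).*" works because of the following part of the EC2 cloudwatch exporter config. The Name tag selected here is exported as tag_Name, and metric_relabel_configs in prometheus.yml copies it to instance_name.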
aws_tag_select:
  resource_type_selection: ec2:instance
  resource_id_dimension: InstanceId
  tag_selections:
    Name: ["b_acount_ec2_1"]
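Prometheus regex matchers are fully anchored, so "(b).*" matches only label values that begin with b, not values that merely contain a b. The sketch below imitates that behaviour with grep -E by adding ^ and $ explicitly. `matches_selector` is a hypothetical helper, web_b_1 is a made-up name for contrast, and grep's ERE is only an approximation of the RE2 syntax Prometheus uses.

```shell
#!/bin/sh
# Hypothetical helper: succeed if NAME matches REGEX the way a Prometheus
# =~ matcher would, i.e. with the regex anchored at both ends.
matches_selector() {
  printf '%s\n' "$1" | grep -Eq "^($2)\$"
}

matches_selector b_acount_ec2_1 '(b).*' && echo "b_acount_ec2_1 matches"
matches_selector web_b_1 '(b).*' || echo "web_b_1 does not match"
```

Any future account B instance whose Name tag starts with b would be picked up by the same dashboard, as long as the exporter's tag_selections also includes it.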
# CPU
aws_ec2_cpuutilization_average * on(instance_id) group_left(instance_name, tag_Name) (
max by (instance_id, instance_name, tag_Name) (aws_resource_info{instance_name=~"(b).*"})
)
# Memory
cwagent_mem_used_percent_average * on(instance_id) group_left(instance_name, tag_Name) (
max by (instance_id, instance_name, tag_Name) (aws_resource_info{instance_name=~"(b).*"})
)
# Disk
cwagent_disk_used_percent_average * on(instance_id) group_left(instance_name, tag_Name) (
max by (instance_id, instance_name, tag_Name) (aws_resource_info{instance_name=~"(b).*"})
)
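The three queries differ only in the metric name, so they can be generated from one template and tried against the Prometheus API before pasting them into Grafana. A small sketch, with `dashboard_query` as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the join expression used above for one metric name.
dashboard_query() {
  printf '%s * on(instance_id) group_left(instance_name, tag_Name) (max by (instance_id, instance_name, tag_Name) (aws_resource_info{instance_name=~"(b).*"}))\n' "$1"
}

# Print the CPU, memory and disk queries in turn.
for m in aws_ec2_cpuutilization_average cwagent_mem_used_percent_average cwagent_disk_used_percent_average; do
  dashboard_query "$m"
done
```

Usage: `curl -s -G http://localhost:9090/api/v1/query --data-urlencode "query=$(dashboard_query cwagent_mem_used_percent_average)"` on the monitoring server. An empty result list suggests either that the metric has not been scraped yet or that no aws_resource_info series carries an instance_name starting with b.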