Monitor memory usage, disk usage, and processes for CentOS 7 EC2 instances with CloudWatch

Introduction

I had the opportunity to set up CloudWatch monitoring for CentOS 7 EC2 instances, so I am leaving this as a note. First, according to the official AWS documentation (https://docs.aws.amazon.com/ja_jp/AWSEC2/latest/UserGuide/mon-scripts.html), collecting metrics with the traditional CloudWatch monitoring scripts is deprecated. The monitoring scripts do not support CentOS 7, but the CloudWatch Agent does, so this note uses the CloudWatch Agent.

Monitoring settings

Grant EC2 instances access to CloudWatch

The official documentation says:

The following example assumes that you have specified an IAM role or the awscreds.conf file. Otherwise, you must use the --aws-access-key-id and --aws-secret-key parameters in these commands to specify your credentials.

However, I think it is better to use an IAM role when granting permissions on AWS resources. So, attach an IAM policy that grants the permissions the agent needs to the IAM role of the EC2 instance.

Also, don't forget to attach the IAM role to the EC2 instance after creating the IAM policy. (The console operations for this step are omitted.)
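
The policy body is not reproduced in this note, so the following is only a sketch of what a minimal policy document might look like, built with Python's standard json module. The action list is an assumption derived from what this note does (publishing metrics, resolving EC2 dimensions, and reading and writing the agent configuration in the Parameter Store). AWS also provides the managed policies CloudWatchAgentServerPolicy and CloudWatchAgentAdminPolicy, which are probably the simpler choice. The function name and its parameter are hypothetical.

```python
import json


def build_agent_policy(parameter_prefix="AmazonCloudWatch-"):
    """Return an IAM policy document (as a dict) for the CloudWatch Agent.

    Assumption: the action list is a minimal guess for the setup in this
    note, not a copy of any AWS managed policy.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Publish metrics and look up EC2 tags/volumes for dimensions.
                "Effect": "Allow",
                "Action": [
                    "cloudwatch:PutMetricData",
                    "ec2:DescribeTags",
                    "ec2:DescribeVolumes",
                ],
                "Resource": "*",
            },
            {
                # Read (fetch-config) and write (wizard) the agent config.
                "Effect": "Allow",
                "Action": ["ssm:GetParameter", "ssm:PutParameter"],
                "Resource": f"arn:aws:ssm:*:*:parameter/{parameter_prefix}*",
            },
        ],
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the IAM policy editor.
    print(json.dumps(build_agent_policy(), indent=2))
```

The Parameter Store statement is scoped to the `AmazonCloudWatch-` prefix because that is the default name the wizard proposes later in this note.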

Install CloudWatch Agent

CloudWatch Agent installation command


$ sudo yum install https://s3.amazonaws.com/amazoncloudwatch-agent/centos/amd64/latest/amazon-cloudwatch-agent.rpm

Monitoring item settings

Start the monitoring item setting wizard with the following command and set the required monitoring items.

CloudWatch Agent configuration file creation wizard start command


$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

For each prompt, type the number of your choice and press Enter, or just press Enter to accept the default, as shown below.

Setting Example


$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
=============================================================
= Welcome to the AWS CloudWatch Agent Configuration Manager =
=============================================================
#Select the OS type to be installed
On which OS are you planning to use the agent?
1. linux
2. windows
default choice: [1]:

#Choose between EC2 or on-premises
Trying to fetch the default region based on ec2 metadata...
Are you using EC2 or On-Premises hosts?
1. EC2
2. On-Premises
default choice: [1]:

#Specify the user who runs CloudWatch Agent
Which user are you planning to run the agent?
1. root
2. cwagent
3. others
default choice: [1]:

#Whether to turn on the StatsD daemon
Do you want to turn on StatsD daemon?
1. yes
2. no
default choice: [1]:

#Specify the StatsD listen port
Which port do you want StatsD daemon to listen to?
default choice: [8125]

#Specify the interval at which the StatsD daemon collects metrics
What is the collect interval for StatsD daemon?
1. 10s
2. 30s
3. 60s
default choice: [1]:

#Specify the aggregation interval for metrics collected by StatsD
What is the aggregation interval for metrics collected by StatsD daemon?
1. Do not aggregate
2. 10s
3. 30s
4. 60s
default choice: [4]:

#Whether to monitor metrics with collectd
Do you want to monitor metrics from CollectD?
1. yes
2. no
default choice: [1]:
2

#Whether to monitor host metrics
Do you want to monitor any host metrics? e.g. CPU, memory, etc.
1. yes
2. no
default choice: [1]:

#Whether to monitor CPU metrics per core
Do you want to monitor cpu metrics per core? Additional CloudWatch charges may apply.
1. yes
2. no
default choice: [1]:

#Whether to add EC2 dimensions to all metrics if information is available
Do you want to add ec2 dimensions (ImageId, InstanceId, InstanceType, AutoScalingGroupName) into all of your metrics if the info is available?
1. yes
2. no
default choice: [1]:

#Whether to collect metrics in high resolution
Would you like to collect your metrics at high resolution (sub-minute resolution)? This enables sub-minute resolution for all metrics, but you can customize for specific metrics in the output json file.
1. 1s
2. 10s
3. 30s
4. 60s
default choice: [4]:

#Which default metric setting to use
Which default metrics config do you want?
1. Basic
2. Standard
3. Advanced
4. None
default choice: [1]:

#Current settings
Current config as follows:
{
	"agent": {
		"metrics_collection_interval": 60,
		"run_as_user": "root"
	},
	"metrics": {
		"append_dimensions": {
			"AutoScalingGroupName": "${aws:AutoScalingGroupName}",
			"ImageId": "${aws:ImageId}",
			"InstanceId": "${aws:InstanceId}",
			"InstanceType": "${aws:InstanceType}"
		},
		"metrics_collected": {
			"disk": {
				"measurement": [
					"used_percent"
				],
				"metrics_collection_interval": 60,
				"resources": [
					"*"
				]
			},
			"mem": {
				"measurement": [
					"mem_used_percent"
				],
				"metrics_collection_interval": 60
			},
			"statsd": {
				"metrics_aggregation_interval": 60,
				"metrics_collection_interval": 10,
				"service_address": ":8125"
			}
		}
	}
}

#Confirmation message whether the above settings are okay
Are you satisfied with the above config? Note: it can be manually customized after the wizard completes to add additional items.
1. yes
2. no
default choice: [1]:

#Whether there is an existing CloudWatch Log Agent configuration file to import for migration
Do you have any existing CloudWatch Log Agent (http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AgentReference.html) configuration file to import for migration?
1. yes
2. no
default choice: [2]:

#Whether to monitor log files
Do you want to monitor any log files?
1. yes
2. no
default choice: [1]:
2

#Message confirming that the settings were saved to /opt/aws/amazon-cloudwatch-agent/bin/config.json
Saved config file to /opt/aws/amazon-cloudwatch-agent/bin/config.json successfully.
Current config as follows:
{
	"agent": {
		"metrics_collection_interval": 60,
		"run_as_user": "root"
	},
	"metrics": {
		"append_dimensions": {
			"AutoScalingGroupName": "${aws:AutoScalingGroupName}",
			"ImageId": "${aws:ImageId}",
			"InstanceId": "${aws:InstanceId}",
			"InstanceType": "${aws:InstanceType}"
		},
		"metrics_collected": {
			"disk": {
				"measurement": [
					"used_percent"
				],
				"metrics_collection_interval": 60,
				"resources": [
					"*"
				]
			},
			"mem": {
				"measurement": [
					"mem_used_percent"
				],
				"metrics_collection_interval": 60
			},
			"statsd": {
				"metrics_aggregation_interval": 60,
				"metrics_collection_interval": 10,
				"service_address": ":8125"
			}
		}
	}
}

#Whether to save the config in the SSM parameter store
Please check the above content of the config.
The config file is also located at /opt/aws/amazon-cloudwatch-agent/bin/config.json.
Edit it manually if needed.
Do you want to store the config in the SSM parameter store?
1. yes
2. no
default choice: [1]:

#Select a name to save in the parameter store
What parameter store name do you want to use to store your config? (Use 'AmazonCloudWatch-' prefix if you use our managed AWS policy)
default choice: [AmazonCloudWatch-linux]

#Message confirming default region based on EC2 metadata
Trying to fetch the default region based on ec2 metadata...
Which region do you want to store the config in the parameter store?
default choice: [ap-northeast-1]

#Credential confirmation message to send configuration file to parameter store
Which AWS credential should be used to send json config to parameter store?
1. ASIA****************(From SDK)
2. Other
default choice: [1]:

Successfully put config to parameter store AmazonCloudWatch-linux.
Program exits now.

Launch CloudWatch Agent

Start the agent with the following command. The fetch-config action loads the configuration named by -c (here, the AmazonCloudWatch-linux parameter in the SSM Parameter Store), and -s restarts the agent so that the loaded configuration takes effect.

CloudWatch Agent launch command


$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c ssm:AmazonCloudWatch-linux -s
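
To confirm that the agent actually started, the same control script accepts `-a status`, which prints a small JSON document. Below is a minimal sketch of interpreting that output in Python. The sample text is illustrative only (its values are made up), and the helper name is hypothetical.

```python
import json


def agent_is_running(status_output):
    """Return True if the control script's status JSON reports 'running'."""
    try:
        status = json.loads(status_output)
    except json.JSONDecodeError:
        # Anything that is not valid JSON is treated as "not running".
        return False
    return isinstance(status, dict) and status.get("status") == "running"


# Illustrative sample only; real output comes from:
#   sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status
SAMPLE = '{"status": "running", "configstatus": "configured"}'

if __name__ == "__main__":
    print(agent_is_running(SAMPLE))
```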

Check the metric transmission status

In the CloudWatch console, the metrics sent by the CloudWatch Agent appear under Metrics, in the CWAgent custom namespace.

Disk usage, CPU usage, and memory usage are listed on separate pages (grouped by dimension), so check each one as needed. Note that it takes some time before the metrics are actually displayed as graphs.

When adding process monitoring settings

If you want to add process monitoring separately, add the following to the configuration saved in the Parameter Store.

mysql process monitoring example


{
  "metrics": {
    "metrics_collected": {
      "procstat": [
        {
          "exe": "mysql",
          "measurement": [
            "pid_count"
          ],
          "metrics_collection_interval": 60
        }
      ]
    }
  }
}

After saving, don't forget to run the agent with the fetch-config action again on the server so that it loads the latest configuration.

$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c ssm:AmazonCloudWatch-linux -s
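
The procstat fragment above has to be combined with the configuration already stored in the Parameter Store rather than replace it. One way to think of that edit is as a recursive merge of two JSON objects. The following is a minimal sketch under that assumption; `merge_config` is a hypothetical helper, not part of the agent.

```python
import copy
import json


def merge_config(base, extra):
    """Recursively merge `extra` into a copy of `base` and return the result.

    Nested dicts are merged key by key; any other value (including lists)
    from `extra` replaces the one in `base`.
    """
    merged = copy.deepcopy(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Trimmed version of the wizard output shown earlier in this note.
existing = {
    "agent": {"metrics_collection_interval": 60, "run_as_user": "root"},
    "metrics": {
        "metrics_collected": {
            "mem": {"measurement": ["mem_used_percent"]},
        }
    },
}

# The procstat fragment from this section.
procstat = {
    "metrics": {
        "metrics_collected": {
            "procstat": [
                {
                    "exe": "mysql",
                    "measurement": ["pid_count"],
                    "metrics_collection_interval": 60,
                }
            ]
        }
    }
}

if __name__ == "__main__":
    print(json.dumps(merge_config(existing, procstat), indent=2))
```

The merged result keeps the existing `agent` and `mem` sections and gains a `procstat` entry next to them, which is the shape the Parameter Store value should have after editing.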

Creating a dashboard

Once you have verified that the metrics you need are being sent to CloudWatch, create a dashboard. Line graphs work well for CPU usage, disk usage, and memory usage, while a number widget may be easier to read for process monitoring. (The console operations for this step are omitted.)
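
Dashboards can also be described as a JSON body instead of being built by hand in the console. The sketch below builds such a body with one line-graph widget and one number widget. All instance identifiers are placeholders, the helper is hypothetical, and the procstat metric and dimension names are assumptions that should be checked against what actually appears in the CWAgent namespace.

```python
import json


def metric_widget(title, namespace, metric_name, dimensions, view, region, y=0):
    """Build one dashboard metric widget as a dict.

    `dimensions` must match the dimensions the agent publishes for the
    metric, otherwise the widget shows no data.
    """
    row = [namespace, metric_name]
    for name, value in dimensions.items():
        row.extend([name, value])
    return {
        "type": "metric",
        "x": 0, "y": y, "width": 12, "height": 6,
        "properties": {
            "title": title,
            "metrics": [row],
            "view": view,  # "timeSeries" (line graph) or "singleValue" (number)
            "stat": "Average",
            "period": 60,
            "region": region,
        },
    }


# Placeholder identifiers; replace with the values of your own instance.
DIMS = {
    "InstanceId": "i-0123456789abcdef0",
    "ImageId": "ami-0123456789abcdef0",
    "InstanceType": "t3.micro",
}
REGION = "ap-northeast-1"

dashboard_body = {
    "widgets": [
        metric_widget("Memory used (%)", "CWAgent", "mem_used_percent",
                      DIMS, "timeSeries", REGION, y=0),
        # Assumed metric/dimension names for the procstat example.
        metric_widget("mysql processes", "CWAgent", "procstat_lookup_pid_count",
                      {**DIMS, "exe": "mysql", "pid_finder": "native"},
                      "singleValue", REGION, y=6),
    ]
}

if __name__ == "__main__":
    print(json.dumps(dashboard_body, indent=2))
```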


Alarm settings

Configure alarms so that they fire when the value of a metric crosses its threshold. I will omit the operations, but I think it is better to monitor at least CPU usage, memory usage, and disk usage. Also, if you are using a T-family instance such as t3.micro, it is a good idea to monitor the CPU credit metrics (such as CPUCreditBalance) as well, because CPU bursting can exhaust the CPU credits.
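
As a rough illustration of what such alarms consist of, the sketch below builds parameter sets in the shape of CloudWatch's PutMetricAlarm API. The thresholds, alarm names, and instance identifiers are made-up placeholders, and `alarm_params` is a hypothetical helper. With boto3 installed (an assumption, it is not used here), each dict could be passed as `put_metric_alarm(**params)` on a CloudWatch client.

```python
def alarm_params(name, namespace, metric, dimensions, threshold,
                 comparison="GreaterThanThreshold"):
    """Build keyword arguments in the shape of the PutMetricAlarm API."""
    return {
        "AlarmName": name,
        "Namespace": namespace,
        "MetricName": metric,
        # Dimensions must match exactly what is published for the metric.
        "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
        "Statistic": "Average",
        "Period": 300,            # seconds per evaluation period
        "EvaluationPeriods": 2,   # consecutive periods before the alarm fires
        "Threshold": float(threshold),
        "ComparisonOperator": comparison,
    }


# Placeholder identifiers; replace with the values of your own instance.
AGENT_DIMS = {
    "InstanceId": "i-0123456789abcdef0",
    "ImageId": "ami-0123456789abcdef0",
    "InstanceType": "t3.micro",
}

alarms = [
    # Memory usage from the agent (CWAgent namespace), example threshold 80%.
    alarm_params("mem-high", "CWAgent", "mem_used_percent", AGENT_DIMS, 80),
    # CPU credit balance for T-family instances (standard AWS/EC2 metric).
    alarm_params("cpu-credit-low", "AWS/EC2", "CPUCreditBalance",
                 {"InstanceId": AGENT_DIMS["InstanceId"]}, 20,
                 comparison="LessThanThreshold"),
]

if __name__ == "__main__":
    for params in alarms:
        print(params["AlarmName"], params["ComparisonOperator"], params["Threshold"])
```

Note that the credit alarm compares in the opposite direction: it should fire when the balance falls below the threshold, not above it.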

Troubleshooting

CloudWatch Agent fails to start

Error message


======== Error Log ========
2020-10-23T01:01:11Z E![telegraf] Error running agent: Error parsing /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml, open /usr/share/collectd/types.db: no such file or directory

If you enable collectd metrics when configuring the agent in the wizard while collectd is not installed, the agent fails to start.

Wizard prompt asking whether to monitor metrics from collectd


Do you want to monitor metrics from CollectD?
1. yes
2. no
default choice: [1]:

The cause is that collectd is not installed. If you need collectd metrics, install collectd by referring to "Getting Custom Metrics Using collectd" in the official documentation.

$ sudo yum install collectd
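
The failure condition described above can be checked before starting the agent. A minimal sketch, assuming the wizard's config.json is still at the path shown earlier and that collectd metrics appear as a `collectd` key under `metrics_collected`; the helper name is hypothetical.

```python
import json
import os

TYPES_DB = "/usr/share/collectd/types.db"  # path taken from the error log
CONFIG = "/opt/aws/amazon-cloudwatch-agent/bin/config.json"


def collectd_misconfigured(config, types_db_exists):
    """Return True if collectd metrics are enabled but types.db is missing."""
    collected = config.get("metrics", {}).get("metrics_collected", {})
    return "collectd" in collected and not types_db_exists


if __name__ == "__main__":
    # Only inspect the real files when the config exists on this machine.
    if os.path.exists(CONFIG):
        with open(CONFIG) as f:
            cfg = json.load(f)
        if collectd_misconfigured(cfg, os.path.exists(TYPES_DB)):
            print("collectd is enabled in the config but types.db is missing")
```

If the check reports a problem, either install collectd or remove the collectd section from the configuration and run fetch-config again.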
