An alert fired, so get server information the low-effort way with Fabric

Overview

A low-effort Fabric use case for checking server status after an alert fires, rather than full-blown automation.

Get disk usage

The result of df -h.

fabfile.py


...
def df_stat():
	with hide('everything', 'status'):
		print green(run("df -h"))
...

console



$ fab df_stat -H server1,server2
[server1] Executing task 'df_stat'
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        20G  2.8G   16G  15% /
devtmpfs        240M     0  240M   0% /dev
tmpfs           246M     0  246M   0% /dev/shm
tmpfs           246M   29M  217M  12% /run
tmpfs           246M     0  246M   0% /sys/fs/cgroup
[server2] Executing task 'df_stat'
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        20G  2.8G   16G  15% /
devtmpfs        240M     0  240M   0% /dev
...
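
If you always target the same set of servers, Fabric 1.x also lets you define the hosts in the fabfile itself so you don't have to pass -H every time. A minimal sketch, where the role name 'web' is a placeholder of my own and the host names are the ones from the example above:

fabfile.py


...
from fabric.api import env, roles

# hypothetical role definition; replace with your real host names
env.roledefs = {
	'web': ['server1', 'server2'],
}

@roles('web')
def df_stat():
	with hide('everything', 'status'):
		print green(run("df -h"))
...

console


$ fab df_stat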

Get CPU load for a specified time range

The result of sar -q and sar -u with the time range specified via the -s and -e options.

fabfile.py


...
def cpu_stat(start_time, end_time):
	with hide('everything', 'status'):
		print green(run("sar -q -s " + start_time + " -e " + end_time))
		print green(run("sar -u -s " + start_time + " -e " + end_time))
...

console



$ fab cpu_stat:start_time="08:50:00",end_time="09:10:00" -H server1,server2,server3
[server1] Executing task 'cpu_stat'
08:50:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
08:51:01 AM         1       116      0.00      0.01      0.05         0
08:52:01 AM         1       116      0.00      0.01      0.05         0
08:53:01 AM         1       116      0.00      0.01      0.05         0
08:54:01 AM         1       116      0.00      0.01      0.05         0
...
08:50:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
08:51:01 AM     all      0.08      0.00      0.07      0.02      0.00     99.83
08:52:01 AM     all      0.07      0.00      0.05      0.00      0.00     99.88
08:53:01 AM     all      0.03      0.00      0.05      0.00      0.00     99.92
08:54:01 AM     all      0.08      0.00      0.03      0.00      0.00     99.88
08:55:01 AM     all      0.05      0.00      0.05      0.00      0.00     99.90
08:56:01 AM     all      0.07      0.00      0.05      0.02      0.00     99.87
...
[server2] Executing task 'cpu_stat'
08:50:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
08:51:01 AM         1       116      0.00      0.01      0.05         0
...


Get Apache access_log statistics

Roughly summarize the access_log with cut and sort | uniq -c.

fabfile.py


...
def access_log_report(hour):
	base = "/var/log/hoge/logs/"
	with cd(base):
		with hide('everything', 'status'):
			ip_cmd = "cat access_log |grep `date +%Y`:" + hour + ":| cut -f 1 -d ' ' | sort | uniq -c | sort -nr | head -10"
			contents_cmd = "cat access_log |grep `date +%Y`:" + hour + ":| cut -f 7 -d ' ' | sort | uniq -c | sort -nr | head -10"
			
			print "#### top ip"
			ip = run(ip_cmd, shell_escape=False)
			print ip
			print "#### top contents"
			contents = run(contents_cmd, shell_escape=False)
			print contents
...

console



$ fab access_log_report:hour=17 -H server1,server2
[server1] Executing task 'access_log_report'
#### top ip
      20 XXX.XXX.XXX.XXX
      10 YYY.YYY.YYY.YYY
#### top contents
      15 /
      15 /hoge
[server2] Executing task 'access_log_report'
...
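
The same pattern extends to other fields of the log line. For example, assuming the common/combined log format where the HTTP status code is the 9th space-separated field (an assumption about the log format, like the field numbers above), a hypothetical status-code breakdown could look like this:

fabfile.py


...
def status_code_report(hour):
	base = "/var/log/hoge/logs/"
	with cd(base):
		with hide('everything', 'status'):
			# 9th field of the (assumed) combined log format is the HTTP status code
			status_cmd = "cat access_log |grep `date +%Y`:" + hour + ":| cut -f 9 -d ' ' | sort | uniq -c | sort -nr"
			print "#### status codes"
			print run(status_cmd, shell_escape=False)
...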

Summary

If you swap in other sar options, or commands like ps, free, or whatever you like, this covers most of what you would normally check by logging into a server and running commands by hand.
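
For example, a memory check along the same lines might look like the sketch below; the exact commands (free -m and a ps sorted by memory usage) are just one possible choice:

fabfile.py


...
def mem_stat():
	with hide('everything', 'status'):
		# overall memory usage plus the top memory consumers
		print green(run("free -m"))
		print green(run("ps aux --sort=-%mem | head -10"))
...

console


$ fab mem_stat -H server1,server2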

For reference

The actual full fabfile looks like the following.

fabfile.py



from fabric.colors import green
from fabric.api import run, env, hide, cd

# keep going even if a remote command exits with a non-zero status
env.warn_only = True

def access_log_report(hour):
	base = "/var/log/hoge/logs/"
	with cd(base):
		with hide('everything', 'status'):
			# top client IPs (field 1) and top requested paths (field 7) for the given hour
			ip_cmd = "cat access_log |grep `date +%Y`:" + hour + ":| cut -f 1 -d ' ' | sort | uniq -c | sort -nr | head -10"
			contents_cmd = "cat access_log |grep `date +%Y`:" + hour + ":| cut -f 7 -d ' ' | sort | uniq -c | sort -nr | head -10"

			print "#### top ip"
			ip = run(ip_cmd, shell_escape=False)
			print ip
			print "#### top contents"
			contents = run(contents_cmd, shell_escape=False)
			print contents

def df_stat():
	# hide fabric's own chatter and print only the command output
	with hide('everything', 'status'):
		print green(run("df -h"))

def cpu_stat(start_time, end_time):
	with hide('everything', 'status'):
		print green(run("sar -q -s " + start_time + " -e " + end_time))
		print green(run("sar -u -s " + start_time + " -e " + end_time))


Poem

For various reasons you end up having to log into servers by hand, and managing them fully automatically is also difficult. (Automation actually involves a lot of thinking, and if an existing setup is already in place, it is quite hard to adapt automation to it.)

So, for the time being, I tried automating what I usually do over the console, which is how I ended up using Fabric like this.

You could also call this a kind of low-effort automation.

Fabric tutorials show that you can do quite a lot with it, but I didn't find many concrete use cases, so I summarized mine here.

By the way, for monitoring, fluentd and norikra seem useful for watching application logs, and for checking server status I think you should use something like Sensu (see "Next generation monitoring tool Sensu reference" on Qiita).

However, sometimes those tools don't have exactly what you want... and besides, my fingers reflexively log into the server and run a command to check directly...

Also, serverspec feels a bit too high-minded, so for now it's Fabric. I'm more used to Python (or rather, to the server's shell commands as they are), and it's quite convenient to just SSH in and run commands casually.

That's all.
