If you search for how to download logs stored on an RDS instance, most of the results use download_db_log_file_portion. That approach has several problems, so I wrote a script that uses the downloadCompleteLogFile API instead.
- With awscli, the download is cut off partway through (pagination is interrupted).
- Even if you call the API directly, the log has to be fetched in small pieces, so the download takes a long time.
- Mojibake: if the log contains Japanese, those characters are replaced with "?".
downloadCompleteLogFile avoids these problems.
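For comparison, the piece-by-piece approach looks roughly like the sketch below. It assumes a client object that exposes boto3's `download_db_log_file_portion` method; `FakeRdsClient` is a hypothetical stand-in so the loop can be tried without AWS access, and its integer markers are a simplification (real markers are opaque strings).

```python
def download_in_portions(client, instance_id, log_file_name, lines_per_call=1000):
    """Fetch one log file piece by piece and return the joined text."""
    marker = "0"  # "0" asks for the file from its beginning
    chunks = []
    while True:
        resp = client.download_db_log_file_portion(
            DBInstanceIdentifier=instance_id,
            LogFileName=log_file_name,
            Marker=marker,
            NumberOfLines=lines_per_call,
        )
        chunks.append(resp.get("LogFileData") or "")
        # Stop once the service reports that nothing is left to fetch
        if not resp.get("AdditionalDataPending"):
            break
        marker = resp["Marker"]
    return "".join(chunks)


class FakeRdsClient:
    """Hypothetical stand-in for the boto3 RDS client (not part of boto3)."""

    def __init__(self, pieces):
        self.pieces = pieces
        self.calls = 0

    def download_db_log_file_portion(self, DBInstanceIdentifier, LogFileName,
                                     Marker, NumberOfLines):
        index = int(Marker)
        self.calls += 1
        return {
            "LogFileData": self.pieces[index],
            "Marker": str(index + 1),
            "AdditionalDataPending": index + 1 < len(self.pieces),
        }
```

Every piece costs one API round trip, which is why a large log file takes so long to download this way.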
To access downloadCompleteLogFile, you have to sign the request with SigV4 yourself. This requires the credentials of an IAM user or an IAM role.
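To give a feel for what SigV4 signing involves, here is a standard-library-only sketch of the calculation for a simple GET request. It is an illustration, not botocore's implementation: it assumes no query string, an empty body, a path made of unreserved characters, and long-term credentials without a session token. The function name and the credentials used with it are made up for the example.

```python
import hashlib
import hmac
from urllib.parse import quote, urlsplit


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def sigv4_authorization(access_key, secret_key, region, service, url, amz_date):
    """Return the Authorization header value for a GET request.

    amz_date is a UTC timestamp such as "20201105T231307Z".
    """
    parts = urlsplit(url)
    date_stamp = amz_date[:8]

    # 1. Canonical request: method, path, query, headers, signed header names, body hash
    canonical_uri = quote(parts.path or "/", safe="/-_.~")
    canonical_headers = f"host:{parts.netloc}\nx-amz-date:{amz_date}\n"
    signed_headers = "host;x-amz-date"
    payload_hash = hashlib.sha256(b"").hexdigest()  # a GET has an empty body
    canonical_request = "\n".join(
        ["GET", canonical_uri, "", canonical_headers, signed_headers, payload_hash]
    )

    # 2. String to sign: algorithm, timestamp, credential scope, hash of step 1
    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # 3. Signing key: a chain of HMACs starting from the secret key
    key = ("AWS4" + secret_key).encode("utf-8")
    for part in (date_stamp, region, service, "aws4_request"):
        key = _hmac_sha256(key, part)

    # 4. Signature and final header value
    signature = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
    return (
        f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )
```

The request must then be sent with exactly the signed `Host` and `X-Amz-Date` values. In practice it is safer to let botocore's `SigV4Auth` do this work, as the script in this article does.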
This Python script generates curl commands for the download, each signed with SigV4. The script itself does not download anything; you run the generated curl commands afterwards.
It assumes that ~/.aws/config and ~/.aws/credentials are set up with the appropriate permissions.
import boto3
from botocore.awsrequest import AWSRequest
import botocore.auth as auth
import urllib.request

profile = "default"
instance_id = "database-1"
region = "ap-northeast-1"

session = boto3.session.Session(profile_name=profile)
credentials = session.get_credentials()
sigv4auth = auth.SigV4Auth(credentials, "rds", region)

# Use the same region for the client as for the signed URL
rds_client = session.client("rds", region_name=region)
files = rds_client.describe_db_log_files(DBInstanceIdentifier=instance_id)

for file in files["DescribeDBLogFiles"]:
    file_name = file["LogFileName"]

    # Decide from the file name whether to skip the download
    if not file_name.startswith("error/"):
        continue
    if file_name == "error/postgres.log":
        continue

    # downloadCompleteLogFile API URL
    remote_host = "rds." + region + ".amazonaws.com"
    url = "https://" + remote_host + "/v13/downloadCompleteLogFile/" + instance_id + "/" + file_name

    # SigV4 signature
    awsreq = AWSRequest(method="GET", url=url)
    sigv4auth.add_auth(awsreq)

    req = urllib.request.Request(url, headers={
        "Authorization": awsreq.headers['Authorization'],
        "Host": remote_host,
        "X-Amz-Date": awsreq.context['timestamp'],
    })
    # Temporary credentials (e.g. an IAM role) also sign a session token header
    if "X-Amz-Security-Token" in awsreq.headers:
        req.add_header("X-Amz-Security-Token", awsreq.headers["X-Amz-Security-Token"])

    # echo command to show download progress
    echo_cmd = "echo '" + file_name + "' >&2"
    print(echo_cmd)

    # curl command
    header = " ".join(["-H '" + k + ": " + v + "'" for (k, v) in req.headers.items()])
    cmd = "curl " + header + " '" + url + "'"
    print(cmd)
The script prints output like the following:
echo 'error/postgresql.log.2020-11-05-23' >&2
curl -H 'Authorization: AWS4-HMAC-SHA256 Credential=AKIAXXXXXXXXXXXXXXXX/20201105/ap-northeast-1/rds/aws4_request, SignedHeaders=host;x-amz-date, Signature=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' -H 'Host: rds.ap-northeast-1.amazonaws.com' -H 'X-amz-date: 20201105T231307Z' 'https://rds.ap-northeast-1.amazonaws.com/v13/downloadCompleteLogFile/database-1/error/postgresql.log.2020-11-05-23'
Pipe the script's output to Bash to run the download:
$ python download-rds-log.py | bash > log.txt
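Because the generated lines are executed by Bash, quoting matters. The script wraps each value in single quotes by hand, which is fine for typical log file names but would break if a value ever contained a single quote. A possible alternative, sketched here with a hypothetical helper name, is to let `shlex.quote` from the standard library do the quoting:

```python
import shlex


def build_curl_command(url, headers):
    """Return a curl command line with every argument shell-quoted."""
    args = ["curl"]
    for name, value in headers.items():
        args += ["-H", f"{name}: {value}"]
    args.append(url)
    # shlex.quote leaves safe strings alone and quotes everything else
    return " ".join(shlex.quote(arg) for arg in args)
```

In the script, this could be called with `dict(req.headers)` and `url` in place of the manual string concatenation.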
Because the output is just a list of curl commands, it is easy to adapt for parallel execution. Even as is, it is much faster than using download_db_log_file_portion.
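One way to parallelize, as a rough sketch: collect the curl command lines in a list and run them on a thread pool. The helper names below are hypothetical. The sketch assumes each command writes to its own file (for example by adding curl's `-o` option), since several downloads writing to one shared stdout would interleave.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_shell(command):
    """Run one shell command and return its exit code."""
    return subprocess.run(command, shell=True).returncode


def run_in_parallel(commands, max_workers=4, runner=run_shell):
    """Run the commands concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(runner, commands))
```

Threads are enough here because the work is done by the curl processes; Python only waits for them. Keep in mind that the SigV4 signature is time-limited, so the commands should be run soon after they are generated.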
I have also written about SigV4 signing in the following articles.
- Accessing an AWS API Gateway with IAM authentication by signing with SigV4 from Python
- Accessing an AWS API Gateway with IAM authentication from C# with a SigV4 signature as an IAM user
- Accessing an AWS API Gateway with IAM authentication from C# with a SigV4 signature using an IAM role