Format AWS ALB access logs as JSON

If you want to analyze access logs seriously, you should use Athena.

Sometimes, though, you just want to look at the access log in front of you. At those moments the space-delimited format is hard on the eyes, so convert it to JSON, which is easier to read.

A space-delimited format is a kind of CSV, so it can be parsed with Python's csv module, which also keeps double-quoted fields that contain spaces in one piece.
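As a small illustration of why the csv module fits here, the sketch below parses one shortened line. The line is made up for this example (it is not a complete ALB entry); the reader options are the same ones the script uses.

```python
import csv

# A shortened, made-up line in the style of an ALB access log entry.
line = (
    'h2 2020-03-08T23:50:58.701251Z 302 '
    '"GET https://example.com:443/ HTTP/2.0" "Mozilla/5.0 (Windows NT 10.0)"'
)

# csv.reader accepts any iterable of strings, so a one-element list works.
fields = next(csv.reader([line], delimiter=' ', quotechar='"', escapechar='\\'))

# Each quoted value stays in one field, spaces included.
for field in fields:
    print(field)

# A plain split() would instead cut the quoted values apart at every space.
naive_fields = line.split()
```

The csv reader yields five fields here, while the plain split produces more because it breaks the request and user agent at each space.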

#!/usr/bin/env python3
# alb_access_log_to_json.py

import fileinput
import json
import csv

# https://docs.aws.amazon.com/ja_jp/elasticloadbalancing/latest/application/load-balancer-access-logs.html#access-log-entry-format
FIELD_KEYS = """
type
timestamp
elb
client:port
target:port
request_processing_time
target_processing_time
response_processing_time
elb_status_code
target_status_code
received_bytes
sent_bytes
request
user_agent
ssl_cipher
ssl_protocol
target_group_arn
trace_id
domain_name
chosen_cert_arn
matched_rule_priority
request_creation_time
actions_executed
redirect_url
error_reason
target:port_list
target_status_code_list
""".split()

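# fileinput.input() reads the files named on the command line, or stdin if none are given.
reader = csv.reader(fileinput.input(), delimiter=' ', quotechar='"', escapechar='\\')
for fields in reader:
    # Pair each field with its key; zip() stops at the shorter of the two sequences.
    j = dict(zip(FIELD_KEYS, fields))
    # Print one JSON object per log line.
    print(json.dumps(j))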
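One thing to keep in mind: because zip() stops at the shorter input, any fields beyond those listed in FIELD_KEYS are silently dropped. If that matters, a variation like the following keeps them. This is only a sketch: the to_record helper and the "extra_fields" key are hypothetical names chosen for this example, and the key list and sample line are shortened.

```python
import csv
import json

# Shortened key list for illustration; the real script lists every field.
FIELD_KEYS = ["type", "timestamp", "elb"]


def to_record(fields, keys=FIELD_KEYS):
    """Map fields to keys and keep any unnamed trailing fields."""
    record = dict(zip(keys, fields))
    extra = fields[len(keys):]
    if extra:
        # "extra_fields" is a hypothetical key, not part of the ALB format.
        record["extra_fields"] = extra
    return record


# Made-up line with two more fields than FIELD_KEYS names.
line = 'h2 2020-03-08T23:50:58.701251Z app/xxxxxx-prod-alb/xxxxxxxxx "unknown one" "-"'
fields = next(csv.reader([line], delimiter=' ', quotechar='"', escapechar='\\'))
record = to_record(fields)
print(json.dumps(record))
```

With this approach, unexpected fields still show up in the output, which makes a mismatch between the key list and the log format easier to notice.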

Example run:

$ head -1 access_log.txt
h2 2020-03-08T23:50:58.701251Z app/xxxxxx-prod-alb/xxxxxxxxx 222.222.222.222:64202 - -1 -1 -1 302 - 1254 224 "GET https://example.com:443/action_store HTTP/2.0" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 - "Root=1-xxxxxxx-xxxxxxxxxxxx" "at.m3.com" "arn:aws:acm:ap-northeast-1:xxxxxxxxxx:certificate/xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx" 300 2020-03-08T23:50:58.701000Z "redirect" "https://example.com:443/at/action_store" "-" "-" "-"

$ cat access_log.txt | python3 alb_access_log_to_json.py | jq .
{
  "type": "h2",
  "timestamp": "2020-03-08T23:50:58.701251Z",
  "elb": "app/xxxxxx-prod-alb/xxxxxxxxx",
  "client:port": "222.222.222.222:64202",
  "target:port": "-",
  "request_processing_time": "-1",
  "target_processing_time": "-1",
  "response_processing_time": "-1",
  "elb_status_code": "302",
  "target_status_code": "-",
  "received_bytes": "1254",
  "sent_bytes": "224",
  "request": "GET https://example.com:443/action_store HTTP/2.0",
  "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362",
  "ssl_cipher": "ECDHE-RSA-AES128-GCM-SHA256",
  "ssl_protocol": "TLSv1.2",
  "target_group_arn": "-",
  "trace_id": "Root=1-xxxxxxx-xxxxxxxxxxxx",
  "domain_name": "at.m3.com",
  "chosen_cert_arn": "arn:aws:acm:ap-northeast-1:xxxxxxxxxx:certificate/xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx",
  "matched_rule_priority": "300",
  "request_creation_time": "2020-03-08T23:50:58.701000Z",
  "actions_executed": "redirect",
  "redirect_url": "https://example.com:443/at/action_store",
  "error_reason": "-",
  "target:port_list": "-",
  "target_status_code_list": "-"
}
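Every value in the output above is a string, because the csv module returns strings. If numeric values are more convenient (for example, to filter with jq), a conversion step could be added before json.dumps. The following is a sketch under assumptions: the convert_numbers helper and the choice of which keys to treat as integers or floats are mine, not part of the original script, and "-" is left untouched.

```python
import json

# Assumed groupings for this sketch; adjust to the fields you care about.
INT_KEYS = {"elb_status_code", "target_status_code", "received_bytes", "sent_bytes"}
FLOAT_KEYS = {
    "request_processing_time",
    "target_processing_time",
    "response_processing_time",
}


def convert_numbers(record):
    """Return a copy of record with numeric-looking fields converted."""
    converted = dict(record)
    for key in INT_KEYS | FLOAT_KEYS:
        value = converted.get(key)
        if value is None or value == "-":
            continue  # missing or placeholder value: leave as-is
        try:
            converted[key] = int(value) if key in INT_KEYS else float(value)
        except ValueError:
            pass  # keep the original string if it is not a number
    return converted


# A few values taken from the example output above.
sample = {
    "request_processing_time": "-1",
    "elb_status_code": "302",
    "target_status_code": "-",
    "sent_bytes": "224",
}
result = convert_numbers(sample)
print(json.dumps(result))
```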
