Output CloudWatch Logs to S3 with AWS Lambda (Python ver)

I wrote a Lambda function (Python runtime) that exports CloudWatch Logs to S3. You can run the function manually, or on a regular schedule with CloudWatch Events.
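For the scheduled case, here is a minimal sketch of creating a daily rule with boto3. The function name `schedule_daily_export`, the rule name, the target Id, and the 00:10 JST start time are all assumptions made for illustration and are not part of the original setup. The client is passed in so the sketch can be tried without AWS credentials.

```python
def schedule_daily_export(events_client, lambda_arn, rule_name="daily-log-export"):
    """Create (or update) a daily schedule rule and point it at the Lambda.

    events_client is expected to be boto3.client("events").
    rule_name and the target Id are hypothetical names for this sketch.
    """
    rule = events_client.put_rule(
        Name=rule_name,
        # Schedule expressions are evaluated in UTC: 15:10 UTC == 00:10 JST
        ScheduleExpression="cron(10 15 * * ? *)",
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "export-logs-lambda", "Arn": lambda_arn}],
    )
    return rule["RuleArn"]
```

In practice the Lambda function also needs a resource-based permission that lets the events service invoke it, which this sketch leaves out.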

Key points

1. The log export API works asynchronously

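With boto3, an export is started like this:

logs = boto3.client("logs")
response = logs.create_export_task(**kwargs)

However, create_export_task runs asynchronously: the call returns as soon as the task is registered, not when the export has finished. If you start the next export without confirming that the previous one has ended, an error may occur, because CloudWatch Logs allows only one active (PENDING or RUNNING) export task per account at a time. So, when you export multiple log groups, be sure to check that each export has finished before starting the next one.

To do that, poll the task status with:

logs.describe_export_tasks(taskId = response["taskId"])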
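The waiting step can be sketched as a small helper. This is an illustration under stated assumptions, not the code used later in this post: the name `wait_for_export_task`, the `timeout` limit, and the injectable `sleep` parameter are all hypothetical additions. The timeout is there because a Lambda invocation can run for at most 15 minutes, so an unbounded wait is risky.

```python
import time

# Status codes that mean the export task is still in progress.
ACTIVE_STATUSES = {"PENDING", "PENDING_CANCEL", "RUNNING"}


def wait_for_export_task(logs_client, task_id, interval=10, timeout=600, sleep=time.sleep):
    """Poll describe_export_tasks until the task leaves an active state.

    Returns the final status code (for example "COMPLETED" or "FAILED").
    Raises TimeoutError if the task is still active after `timeout` seconds
    of waiting. `sleep` is injectable only so the helper can be tested.
    """
    waited = 0
    while True:
        response = logs_client.describe_export_tasks(taskId=task_id)
        status = response["exportTasks"][0]["status"]["code"]
        if status not in ACTIVE_STATUSES:
            return status
        if waited >= timeout:
            raise TimeoutError(f"export task {task_id} still {status} after {waited}s")
        sleep(interval)
        waited += interval
```

Returning the final status lets the caller tell a COMPLETED export apart from a FAILED or CANCELLED one, instead of treating every finished task as a success.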

2. Environment variables

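The function reads the following environment variables.

Variable       Value
BUCKET_NAME    Name of the S3 bucket the logs are exported to
WAITING_TIME   10 (seconds to wait between export status checks)
LOG_GROUPS     CloudWatch Logs log group names, separated by commas

The code also reads an optional LOG_LEVEL variable, which defaults to DEBUG.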
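As a rough sketch of how these variables could be read and tidied up in one place (the function name `load_settings`, the returned dict keys, and the default of 10 seconds for WAITING_TIME are assumptions for illustration; the source code below reads the variables directly and requires all three):

```python
import os


def load_settings(env=os.environ):
    """Read the export settings from a mapping of environment variables."""
    # Drop empty entries and surrounding spaces, e.g. "a, b," -> ["a", "b"]
    log_groups = [name.strip() for name in env["LOG_GROUPS"].split(",") if name.strip()]
    return {
        "bucket_name": env["BUCKET_NAME"],
        # Assumption: fall back to 10 seconds when WAITING_TIME is not set
        "waiting_time": int(env.get("WAITING_TIME", "10")),
        "log_groups": log_groups,
    }
```

Passing a plain dict instead of `os.environ` makes it easy to check the parsing locally before deploying.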

Source code



# -*- coding: utf-8 -*-
from datetime import datetime,timezone,timedelta
import os
import boto3
import time
import logging
import traceback

#Log settings
logger = logging.getLogger()
logger.setLevel(os.getenv('LOG_LEVEL', logging.DEBUG))
logs = boto3.client("logs")
BUCKET_NAME = os.environ["BUCKET_NAME"]
WAITING_TIME = int(os.environ["WAITING_TIME"])

#Set timezone to Japan time (JST)
JST = timezone(timedelta(hours=9),"JST")

#Date format used in the S3 key prefix of the exported logs
DATE_FORMAT = "%Y-%m-%d"

def lambda_handler(event, context):
    """
    Export one day's worth of CloudWatch Logs to S3.
    The target range is the previous day in JST:
    00:00:00.000000 to 23:59:59.999999
    """
    try:
        #End of the range: yesterday 23:59:59.999999 (JST)
        tmp_today = datetime.now(JST).replace(hour=0,minute=0,second=0,microsecond=0) - timedelta(microseconds=1)
        #Start of the range: yesterday 00:00:00.000000 (JST)
        tmp_yesterday = (tmp_today - timedelta(days=1)) + timedelta(microseconds=1)
        #Date string used in the S3 key prefix
        target_date = tmp_yesterday.strftime(DATE_FORMAT)
        #Convert to epoch milliseconds, the unit create_export_task expects (anything finer is truncated)
        today = int(tmp_today.timestamp() * 1000)
        yesterday = int(tmp_yesterday.timestamp() * 1000)
        

        #Read the comma-separated log group names from the environment variable
        logGroups = os.environ["LOG_GROUPS"].split(",")
        for logGroupName in logGroups:
            try:
                #Start the export task for this log group.
                #Log group names usually begin with "/", so the prefix
                #becomes "Logs/<log group name>/<YYYY-MM-DD>".
                response = logs.create_export_task(
                    logGroupName = logGroupName,
                    fromTime = yesterday,
                    to = today,
                    destination = BUCKET_NAME,
                    destinationPrefix = "Logs" + logGroupName + "/" + target_date
                )
                
                #Wait until the export task is no longer pending or running.
                taskId = response["taskId"]
                while True:
                    response = logs.describe_export_tasks(
                        taskId = taskId
                    )
                    status = response["exportTasks"][0]["status"]["code"]
                    #Stop polling once the task has left the active states
                    #(the final status may be COMPLETED, FAILED, or CANCELLED)
                    if status not in ("PENDING", "PENDING_CANCEL", "RUNNING"):
                        logger.info(f"taskId {taskId} has finished with status {status}")
                        break
                    logger.info(f"taskId {taskId} is still exporting (status {status})")
                    time.sleep(WAITING_TIME)
                
            except Exception as e:
                traceback.print_exc()
                logger.warning(f"type = {type(e)} , message = {e}",exc_info=True)

    except Exception as e:
        traceback.print_exc()
        logger.error(f"type = {type(e)} , message = {e}",exc_info=True)
        raise
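The time range calculation in the handler can also be pulled out into a small function that is easy to check on its own. The following is only a sketch: the name `previous_day_window` and the choice to pass `now` in as an argument are assumptions, and it uses integer arithmetic on milliseconds instead of subtracting one microsecond, which is a slightly different technique from the code above but aims at the same range.

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9), "JST")


def previous_day_window(now):
    """Return (from_ms, to_ms, date_label) covering the JST day before `now`.

    `now` must be a timezone-aware datetime. Both bounds are epoch milliseconds.
    """
    today_start = now.astimezone(JST).replace(hour=0, minute=0, second=0, microsecond=0)
    yesterday_start = today_start - timedelta(days=1)
    from_ms = int(yesterday_start.timestamp() * 1000)
    # End one millisecond before today's midnight (yesterday 23:59:59.999 JST)
    to_ms = int(today_start.timestamp() * 1000) - 1
    return from_ms, to_ms, yesterday_start.strftime("%Y-%m-%d")
```

For example, calling `previous_day_window(datetime.now(JST))` gives values that could be passed as fromTime, to, and the date part of destinationPrefix.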
        
