Output the list of keys in an S3 bucket to a file

I started using S3 at work.

The S3 keys are stored in our DB, which is usually fine, but it becomes a problem if the DB and the bucket ever get out of sync.

So I used boto to get a list of S3 keys.

I think it should be efficient, because only key metadata is requested through the bucket listing; the objects themselves are never downloaded.
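To illustrate why listing stays cheap: S3 returns at most 1000 keys per list request, in lexicographic order, and the client resumes from the last key it saw. boto's `bucket.list()` handles this paging for you. The toy below only simulates the idea with an in-memory list; `fake_list_page` and `iter_keys` are hypothetical names, not boto APIs, and the snippet is written for Python 3.

```python
from bisect import bisect_right

PAGE_SIZE = 1000  # S3 returns at most 1000 keys per list request


def fake_list_page(all_keys, marker=None, max_keys=PAGE_SIZE):
    """Hypothetical stand-in for one list request.

    Returns (page, is_truncated) for the keys strictly after `marker`.
    `all_keys` must be sorted, mirroring S3's lexicographic ordering.
    """
    start = 0 if marker is None else bisect_right(all_keys, marker)
    page = all_keys[start:start + max_keys]
    return page, start + max_keys < len(all_keys)


def iter_keys(all_keys, stats):
    """Yield every key, fetching one page at a time and counting requests."""
    marker = None
    while True:
        page, truncated = fake_list_page(all_keys, marker)
        stats['requests'] += 1
        for name in page:
            yield name
        if not truncated:
            break
        marker = page[-1]  # resume after the last key of this page


keys = sorted('img/%05d.png' % i for i in range(2500))
stats = {'requests': 0}
listed = list(iter_keys(keys, stats))
print(len(listed), stats['requests'])  # 2500 3
```

So a bucket with 2500 keys needs only three small requests, regardless of how large the stored objects are.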

#! /usr/bin/env python
# -*- coding: utf-8 -*-
"""
Output the list of files contained in the target bucket to TSV.
"""
import sys
import os
import csv
from ConfigParser import SafeConfigParser
from getpass import getpass

from boto import connect_s3


AWS_CLI_CONFIG_PATH = os.path.expanduser('~/.aws/config')


def get_aws_config(config_path=AWS_CLI_CONFIG_PATH):
    """
    Returns the following keys from the aws cli config:
    - aws_access_key_id
    - aws_secret_access_key
    """
    keys = ['aws_access_key_id', 'aws_secret_access_key']
    cfg = SafeConfigParser()
    with open(config_path, 'r') as fp:
        cfg.readfp(fp)
    return tuple(cfg.get('default', x) for x in keys)


def get_bucket(aws_access_key_id, aws_secret_access_key, bucket_name):
    """
    Returns the boto S3 bucket.
    """
    if not aws_access_key_id and not aws_secret_access_key:
        aws_access_key_id, aws_secret_access_key = get_aws_config()
    return connect_s3(aws_access_key_id, aws_secret_access_key).get_bucket(bucket_name)


def write_tsv(aws_access_key_id, aws_secret_access_key, bucket_name, file_name):
    """
    Exports the list of key.name values in the S3 bucket to file_name as TSV.
    """
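    # Resolve the output path to an absolute path
    file_path = os.path.abspath(file_name)

    def _writerows(rows):
        # Append mode, so each batch is added after the previous one
        with open(file_path, 'a') as fp:
            writer = csv.writer(fp, dialect='excel-tab')
            writer.writerows(rows)

    # Write the header row
    _writerows([('key_name', )])

    # Write the body, flushing to disk every 1000 keys
    rows = []
    for key in get_bucket(aws_access_key_id, aws_secret_access_key, bucket_name).list():
        # Each row must be a sequence; a bare string would be split
        # into one column per character
        rows.append((key.name, ))
        if len(rows) >= 1000:
            _writerows(rows)
            rows = []
    # Write whatever is left after the last full batch
    if rows:
        _writerows(rows)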


if __name__ == '__main__':
    if len(sys.argv) != 2:
        print('Please specify output filename.')
        sys.exit(1)

    else:
        print('Please input the aws_access_key_id/aws_secret_access_key and a target bucket name.')
        print('If you don\'t input the aws_access_key_id/aws_secret_access_key, then we use awscli config.')
        aws_access_key_id = getpass('aws_access_key_id: ')
        aws_secret_access_key = getpass('aws_secret_access_key: ')
        bucket_name = raw_input('target bucket name: ')

        if not aws_access_key_id and not aws_secret_access_key and not os.path.isfile(AWS_CLI_CONFIG_PATH):
            print('Please specify the aws_access_key_id/aws_secret_access_key or create awscli config.')
            sys.exit(1)

        write_tsv(
            aws_access_key_id,
            aws_secret_access_key,
            bucket_name,
            sys.argv[1])
        print('Output: {}'.format(sys.argv[1]))
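
Once the TSV exists, checking it against the keys stored in the DB comes down to a set difference. The following is only a minimal sketch, written for Python 3 rather than the Python 2 of the script above. It assumes the TSV has a single `key_name` column with one header row; `read_key_list` and `diff_keys` are hypothetical helper names, and the DB keys are represented by a plain list instead of a real query.

```python
import csv
import os
import tempfile


def read_key_list(tsv_path):
    """Read the single-column TSV (one header row) into a set of key names."""
    with open(tsv_path, newline='') as fp:
        reader = csv.reader(fp, dialect='excel-tab')
        next(reader, None)  # skip the 'key_name' header
        return {row[0] for row in reader if row}


def diff_keys(db_keys, s3_keys):
    """Return (keys only in the DB, keys only in S3), each sorted."""
    db_keys = set(db_keys)
    s3_keys = set(s3_keys)
    return sorted(db_keys - s3_keys), sorted(s3_keys - db_keys)


# Demo with a throwaway TSV standing in for the script's output
fd, demo_path = tempfile.mkstemp(suffix='.tsv')
os.close(fd)
with open(demo_path, 'w', newline='') as fp:
    csv.writer(fp, dialect='excel-tab').writerows(
        [('key_name',), ('a.txt',), ('b.txt',)])

# Assumption: in practice these would come from a DB query
db_keys = ['b.txt', 'c.txt']

missing_in_s3, orphaned_in_s3 = diff_keys(db_keys, read_key_list(demo_path))
os.remove(demo_path)
print(missing_in_s3)   # ['c.txt']
print(orphaned_in_s3)  # ['a.txt']
```

Keys that appear only on the DB side point at missing objects, while keys that appear only on the S3 side are objects the DB no longer knows about.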

**Shameless plug** The company I work for seems to be hiring. If you would like to try writing Python, please apply.
