Format the Git log and get the committed file name in csv format

Purpose

I wanted to get git commit information in CSV format. I couldn't get the format I wanted with the `--pretty=format` option of `git log`, so I write the log to a file first and then format that file with Python.

Environment

macOS Catalina, PyCharm CE, Python 3.7

Preparation

Get git log

Running the following command in a console such as Git Bash writes the log to a file.

--date-order --date=format:'%Y/%m/%d %H:%M:%S' > git.log
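
The same step can also be driven from Python. This is a minimal sketch, not the original command: the beginning of the command line above is not shown, so the `git log --name-status` part is an assumption on my side, and the helper names `build_git_log_command` and `write_git_log` are hypothetical.

```python
import subprocess

# Date format taken from the command shown above.
DATE_FORMAT = '%Y/%m/%d %H:%M:%S'


def build_git_log_command():
    """Return the argument list for git log.

    ASSUMPTION: '--name-status' is a guess at the part of the original
    command that is not shown; it makes git print lines such as 'A<TAB>path'.
    """
    return [
        'git', 'log',
        '--name-status',
        '--date-order',
        '--date=format:' + DATE_FORMAT,
    ]


def write_git_log(repo_dir, out_path='git.log'):
    """Run git log inside repo_dir and save its output to out_path."""
    result = subprocess.run(
        build_git_log_command(),
        cwd=repo_dir,
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError if git fails
    )
    with open(out_path, 'w') as f:
        f.write(result.stdout)
    return out_path
```

Calling `write_git_log('.')` from inside a repository would then produce `git.log` in the current directory, without relying on shell redirection.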

##### Expected git log to capture


#### **`git.log`**
```log
commit f36da445d06d2db7b4f08a508be835f5464ded
Author: nomori<[email protected]>
Date:   2020/10/10 23:50:29
    first commit.
A	.gitignore
A	perse_git_log.py
```

Implementation

perse_git_log.py


import re
import csv
import os

COMMIT_ID = 'commit '
STATUS_ADD = 'A	'
STATUS_MOD = 'M	'
STATUS_DEL = 'D	'
GIT_AUTHOR = 'Author: '
GIT_DATE = 'Date:   '

path = './input/git.log'

# Read the git log file.
array_commit_info = []
with open(path) as git_log_file:
    data = git_log_file.readlines()

for item in data:
    # Remove the trailing newline.
    item = item.rstrip('\n')

    if item.startswith(COMMIT_ID):
        # Get the commit hash. startswith avoids matching commit
        # messages that merely contain the word "commit ".
        commit_id = item[len(COMMIT_ID):]

    elif item.startswith(GIT_AUTHOR):
        # Get the author of the commit.
        author_tmp = item[len(GIT_AUTHOR):]
        # Remove the email address part (with or without a space before '<').
        author = re.sub(' *<.*>', '', author_tmp)

    elif item.startswith(GIT_DATE):
        # Get the commit date and time.
        date = item[len(GIT_DATE):]

    else:
        # Get the file change history.
        file_status = item[0:2]
        if file_status in (STATUS_ADD, STATUS_MOD, STATUS_DEL):
            # Get the file name without the Git status.
            file_name = item[2:]
            # Keep the row in a list for CSV output.
            array_commit_info.append([commit_id, author, date, file_name])

print(array_commit_info)

# Output in CSV format.
file_path = './output/'
# Create the output directory if it does not exist yet.
os.makedirs(file_path, exist_ok=True)

output_filename = os.path.join(file_path, 'git_output.csv')
# newline='' lets the csv module control line endings itself.
with open(output_filename, 'w', newline='') as f:
    writer = csv.writer(f)

    # Write the header row.
    writer.writerow(['COMMIT_ID', 'AUTHOR', 'DATE', 'COMMIT_FILE_NAME'])
    for line_data in array_commit_info:
        # Write one row per changed file.
        writer.writerow(line_data)
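
The parsing step of the script can also be wrapped in a function that takes an iterable of lines, which makes it easy to try without an input file. This is only a sketch under my own assumptions: the name `parse_git_log` is hypothetical, prefixes are matched with `startswith`, and the email address in the sample is a placeholder.

```python
import re

COMMIT_ID = 'commit '
GIT_AUTHOR = 'Author: '
GIT_DATE = 'Date:   '
# Status letter followed by a tab: added, modified, deleted.
FILE_STATUSES = ('A\t', 'M\t', 'D\t')


def parse_git_log(lines):
    """Return [commit_id, author, date, file_name] rows from git log lines."""
    rows = []
    commit_id = author = date = ''
    for raw in lines:
        item = raw.rstrip('\n')
        if item.startswith(COMMIT_ID):
            commit_id = item[len(COMMIT_ID):]
        elif item.startswith(GIT_AUTHOR):
            # Drop the '<email>' part, with or without a space before it.
            author = re.sub(r' *<.*>', '', item[len(GIT_AUTHOR):])
        elif item.startswith(GIT_DATE):
            date = item[len(GIT_DATE):]
        elif item[:2] in FILE_STATUSES:
            # One output row per changed file.
            rows.append([commit_id, author, date, item[2:]])
    return rows


# Sample lines modeled on git.log; the email address is a placeholder.
sample = [
    'commit f36da445d06d2db7b4f08a508be835f5464ded\n',
    'Author: nomori <nomori@example.com>\n',
    'Date:   2020/10/10 23:50:29\n',
    '    first commit.\n',
    'A\t.gitignore\n',
    'A\tperse_git_log.py\n',
]
for row in parse_git_log(sample):
    print(row)
```

Other status lines, such as renames, are skipped here just as in the script.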

Output example

git_output.csv


COMMIT_ID,AUTHOR,DATE,COMMIT_FILE_NAME
f36da445d06d2db7b4f08a508be835f5464ded,nomori,2020/10/10 23:50:29,.gitignore
f36da445d06d2db7b4f08a508be835f5464ded,nomori,2020/10/10 23:50:29,perse_git_log.py
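
As a quick check, the generated CSV can be read back and grouped by commit. This is a minimal sketch, assuming the file starts with the header row that the script writes; the function name `files_by_commit` is hypothetical, and an in-memory string stands in for `./output/git_output.csv`.

```python
import csv
import io
from collections import defaultdict


def files_by_commit(csv_file):
    """Map each COMMIT_ID to the list of file names recorded for it."""
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        grouped[row['COMMIT_ID']].append(row['COMMIT_FILE_NAME'])
    return dict(grouped)


# In-memory stand-in for git_output.csv, including the header row.
sample_csv = io.StringIO(
    'COMMIT_ID,AUTHOR,DATE,COMMIT_FILE_NAME\n'
    'f36da445d06d2db7b4f08a508be835f5464ded,nomori,2020/10/10 23:50:29,.gitignore\n'
    'f36da445d06d2db7b4f08a508be835f5464ded,nomori,2020/10/10 23:50:29,perse_git_log.py\n'
)
print(files_by_commit(sample_csv))
```

For the real file, the same function can be passed a handle opened with `open('./output/git_output.csv', newline='')`.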
