Use Python and the Gmail API to get the subject and body of target emails in your mailbox and save each one to a file.
- Create a Google Cloud Platform project and enable the Gmail API.
- Obtain the credentials needed to use the Gmail API.
The following article is helpful for both steps. https://qiita.com/muuuuuwa/items/822c6cffedb9b3c27e21
Python 3.9
Required libraries:
- google-api-python-client
- google-auth-httplib2
- google-auth-oauthlib
The code used in this article can be downloaded from GitHub. https://github.com/kirinnsan/backup-gmail A Dockerfile is also included, so if you can use Docker, you don't need to install the libraries with pip install.
Place the created credentials in the same directory as the code, with the file name client_id.json. The authentication flow is implemented in the InstalledAppFlow class, which offers two methods: **run_console**, where the user opens the given authentication URL, obtains an authorization code, and pastes it into the console, and **run_local_server**, which authenticates using a local web server.
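This time, authentication is performed using the run_console method. Note that recent releases of google-auth-oauthlib have removed run_console, so with those releases use run_local_server instead.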
If the first authentication succeeds, a token.pickle file containing the access token and refresh token is created in the directory. Subsequent runs authenticate using this file.
auth.py
import pickle
import os.path

from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from google.auth.exceptions import GoogleAuthError


def authenticate(scope):
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        try:
            if creds and creds.expired and creds.refresh_token:
                creds.refresh(Request())
            else:
                flow = InstalledAppFlow.from_client_secrets_file(
                    'client_id.json', scope)
                creds = flow.run_console()
        except GoogleAuthError as err:
            print(f'action=authenticate error={err}')
            raise
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)
    return creds
The client uses the Gmail API to implement two methods: one that gets a list of emails, and one that gets the subject and body of a given email. When listing emails, you can specify the maximum number of results and a search query.
client.py
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

import util


class ApiClient(object):

    def __init__(self, credential):
        self.service = build('gmail', 'v1', credentials=credential)

    def get_mail_list(self, limit, query):
        # Call the Gmail API
        try:
            results = self.service.users().messages().list(
                userId='me', maxResults=limit, q=query).execute()
        except HttpError as err:
            print(f'action=get_mail_list error={err}')
            raise

        messages = results.get('messages', [])
        return messages

    def get_subject_message(self, id):
        # Call the Gmail API
        try:
            res = self.service.users().messages().get(userId='me', id=id).execute()
        except HttpError as err:
            print(f'action=get_message error={err}')
            raise

        result = {}
        payload = res['payload']

        # Use an empty subject when the message has no Subject header
        subjects = [d.get('value') for d in payload['headers'] if d.get('name') == 'Subject']
        result['subject'] = subjects[0] if subjects else ''

        # Single-part message (such as text/plain): the body is on the payload
        if 'data' in payload['body']:
            b64_message = payload['body']['data']
        # Multipart message: take the body of the first part
        elif payload.get('parts'):
            b64_message = payload['parts'][0]['body'].get('data', '')
        # No body data found
        else:
            b64_message = ''

        message = util.base64_decode(b64_message)
        result['message'] = message
        return result
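Real messages are often nested (for example, multipart/mixed containing multipart/alternative), so the first part does not always hold the text. The following is a minimal sketch of a recursive search, assuming the payload layout returned by messages.get (mimeType, headers, body.data, parts). The helper names get_header and find_body_data and the sample payload are hypothetical and are not part of the repository.

```python
import base64


def get_header(payload, name, default=''):
    """Return the value of the first header called `name`, or `default`."""
    for header in payload.get('headers', []):
        if header.get('name', '').lower() == name.lower():
            return header.get('value', default)
    return default


def find_body_data(payload, mime_type='text/plain'):
    """Depth-first search for the first part of `mime_type` that has body data."""
    if payload.get('mimeType') == mime_type and payload.get('body', {}).get('data'):
        return payload['body']['data']
    for part in payload.get('parts', []):
        data = find_body_data(part, mime_type)
        if data:
            return data
    return None


def _encode(text):
    # Used only to build the sample: base64url-encode a string.
    return base64.urlsafe_b64encode(text.encode('utf-8')).decode('ascii')


# Hand-built sample shaped like a nested multipart message (illustrative data).
sample_payload = {
    'mimeType': 'multipart/mixed',
    'headers': [{'name': 'Subject', 'value': 'Hello'}],
    'body': {'size': 0},
    'parts': [
        {
            'mimeType': 'multipart/alternative',
            'body': {'size': 0},
            'parts': [
                {'mimeType': 'text/plain', 'body': {'data': _encode('plain body')}},
                {'mimeType': 'text/html', 'body': {'data': _encode('<p>html body</p>')}},
            ],
        },
    ],
}

subject = get_header(sample_payload, 'Subject', default='(no subject)')
data = find_body_data(sample_payload)
body = base64.urlsafe_b64decode(data).decode('utf-8')
print(subject, body)
```

Passing mime_type='text/html' to find_body_data would return the HTML part instead, which may be useful for messages that have no plain-text alternative.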
The following code decodes the base64-encoded body and saves the retrieved message. Each file is saved in the specified directory as **Email Subject.txt**.
util.py
import base64
import os


def base64_decode(b64_message):
    # Restore any missing '=' padding before decoding URL-safe base64
    message = base64.urlsafe_b64decode(
        b64_message + '=' * (-len(b64_message) % 4)).decode(encoding='utf-8')
    return message


def save_file(base_dir, result):
    os.makedirs(base_dir, exist_ok=True)
    file_name = os.path.join(base_dir, result['subject'] + '.txt')
    # Write as UTF-8 so the result does not depend on the platform default
    with open(file_name, mode='w', encoding='utf-8') as f:
        f.write(result['message'])
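The padding expression in base64_decode means the function also works when the data arrives without its trailing = characters. A small round-trip check using only the standard library illustrates this; the sample string is arbitrary.

```python
import base64


def base64_decode(b64_message):
    # Same logic as util.py: restore any missing '=' padding, then decode.
    padding = '=' * (-len(b64_message) % 4)
    return base64.urlsafe_b64decode(b64_message + padding).decode('utf-8')


original = 'Hello, Gmail!'
encoded = base64.urlsafe_b64encode(original.encode('utf-8')).decode('ascii')
stripped = encoded.rstrip('=')  # simulate a value that arrived without padding

decoded_padded = base64_decode(encoded)
decoded_stripped = base64_decode(stripped)
print(decoded_padded == decoded_stripped == original)
```

The expression -len(s) % 4 gives the number of characters needed to reach the next multiple of four, so an already padded string gets nothing appended.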
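The following is the source code of the execution part. The processing flow is: authenticate, build the search query, get the list of matching emails, then get the subject and body of each email and save them to a file.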
main.py
import auth
from client import ApiClient
import util

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']
# Number of emails retrieved
MAIL_COUNTS = 5
# Search criteria
SEARCH_CRITERIA = {
    'from': "[email protected]",
    'to': "",
    'subject': "Email subject"
}
BASE_DIR = 'mail_box'


def build_search_criteria(query_dict):
    query_string = ''
    for key, value in query_dict.items():
        if value:
            query_string += key + ':' + value + ' '
    return query_string


def main():
    creds = auth.authenticate(SCOPES)
    query = build_search_criteria(SEARCH_CRITERIA)
    client = ApiClient(creds)
    messages = client.get_mail_list(MAIL_COUNTS, query)
    if not messages:
        print('No message list.')
    else:
        for message in messages:
            message_id = message['id']
            # get subject and message
            result = client.get_subject_message(message_id)
            # save file
            util.save_file(BASE_DIR, result)


if __name__ == '__main__':
    main()
This time, the maximum number of emails retrieved is 5, the sender is **[email protected]**, and the subject is **Email subject**. To specify the sender, use the form **from:[email protected]** or **from:Hanako**. To specify the subject, use the form **subject:subject**. The following official page describes the search operators that can be used with Gmail. https://support.google.com/mail/answer/7190
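One thing to watch for: build_search_criteria joins the key and value directly, so a multi-word value such as Email subject produces subject:Email subject, where the second word is probably treated as an ordinary keyword rather than part of the subject condition. The following is a sketch of one way to handle this, assuming the parentheses grouping described on the Gmail search operators page; this variant is an illustration and not the code in the repository.

```python
def build_search_criteria(query_dict):
    """Join non-empty criteria into a Gmail search string."""
    terms = []
    for key, value in query_dict.items():
        if not value:
            continue  # skip criteria that were left empty
        if ' ' in value:
            # Group multi-word values so every word stays tied to the operator.
            value = f'({value})'
        terms.append(f'{key}:{value}')
    return ' '.join(terms)


query = build_search_criteria({
    'from': 'Hanako',
    'to': '',
    'subject': 'Email subject',
})
print(query)  # from:Hanako subject:(Email subject)
```

Using a list and ' '.join also avoids the trailing space that string concatenation leaves at the end of the query.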
The retrieved emails are saved in the mail_box directory.
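Because the subject is used directly as the file name, a subject containing characters such as / or : can make save_file fail or write to an unexpected path. The following is a minimal sketch of a sanitizing step; the helper name safe_file_name, the replacement character, and the length limit are assumptions for illustration, not part of the repository.

```python
import os
import re


def safe_file_name(subject, max_length=100):
    """Turn an email subject into a string that is safe to use as a file name."""
    # Replace path separators, control whitespace, and characters Windows rejects.
    name = re.sub(r'[\\/:*?"<>|\r\n\t]', '_', subject).strip()
    # Fall back to a fixed name when nothing usable is left.
    return (name or 'no_subject')[:max_length]


def save_file(base_dir, result):
    os.makedirs(base_dir, exist_ok=True)
    file_name = os.path.join(base_dir, safe_file_name(result['subject']) + '.txt')
    with open(file_name, mode='w', encoding='utf-8') as f:
        f.write(result['message'])
    return file_name
```

Note that two emails with the same subject still map to the same file, so the later one overwrites the earlier one; appending the message ID to the name would be one way around that.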
Run the app.
python3 main.py
When executed, the console prints an authentication URL and asks you to open it, so open the URL in a browser.
When you open the URL, a warning screen appears. Click Advanced, then click the "Go to ... (unsafe)" link.
Click Allow.
Click Allow again on the next screen.
The authorization code is displayed. Copy it and paste it at the Enter the authorization code prompt in the console.
If the authentication is successful, subsequent processing is performed and the mail is saved.
The official reference also has a sample Python app that you can refer to. https://developers.google.com/gmail/api/quickstart/python
I was able to specify search conditions and save the target emails. This time I only saved the subject and body of received emails, but the Gmail API provides various other methods, so it should be possible to do much more with it.