Get mail from Gmail and label it with Python3

I wanted to pull emails with a specific label from Gmail using Python, so here are my notes.

Environment

Login and label

Connecting and selecting a label is simple.

import imaplib, email, email.header, dateutil.parser
email_default_encoding = 'iso-2022-jp'

def main():
    gmail = imaplib.IMAP4_SSL("imap.gmail.com")
    gmail.login("user","password")
    gmail.select('INBOX') #Select the inbox
    gmail.select('register') #Or select a label; only the most recent select() stays in effect
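select() reports failure through its return value instead of raising, so a mistyped label name is easy to miss. The following is only a sketch of one way to check it; `select_label` is a hypothetical helper name of mine, not part of imaplib, and `conn` is assumed to be a logged-in connection like `gmail` above.

```python
def select_label(conn, label="register"):
    """Select a mailbox/label and return its message count.

    conn is assumed to be a logged-in imaplib.IMAP4_SSL object.
    select() returns a (status, data) pair; on success data[0]
    holds the number of messages in the mailbox as bytes.
    """
    typ, data = conn.select(label)
    if typ != "OK":
        raise RuntimeError(f"could not select {label!r}: {data}")
    return int(data[0])

# Usage (needs a real connection):
#   count = select_label(gmail, "register")
```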

Get email

Use .search() to get the mail. Specify ALL to get every message, or UNSEEN to get only the unread ones. For the other search keys, be sure to look at the IMAP4 specification, although it is in English: INTERNET MESSAGE ACCESS PROTOCOL - VERSION 4rev1
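As an illustration of combining search keys, here is a small sketch that builds a criteria string from UNSEEN, SINCE and FROM, all of which are defined in the IMAP4rev1 specification. The helper names `imap_date` and `build_criteria` are my own and hypothetical. The month names are hard-coded because IMAP expects English abbreviations regardless of locale.

```python
from datetime import date

# IMAP needs English month abbreviations; strftime("%b") follows the locale.
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def imap_date(d):
    """Format a date as dd-Mon-yyyy for SINCE/BEFORE search keys."""
    return f"{d.day:02d}-{_MONTHS[d.month - 1]}-{d.year}"

def build_criteria(unseen=False, since=None, sender=None):
    """Combine a few search keys into one criteria string (ALL if none given)."""
    keys = []
    if unseen:
        keys.append("UNSEEN")
    if since is not None:
        keys.append(f"SINCE {imap_date(since)}")
    if sender:
        keys.append(f'FROM "{sender}"')
    return "(" + " ".join(keys or ["ALL"]) + ")"

# Usage (needs a real connection):
#   typ, [data] = gmail.search(None, build_criteria(unseen=True))
```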

    typ, [data] = gmail.search(None, "(UNSEEN)")
    #typ, [data] = gmail.search(None, "(ALL)")
    
    #Verification (data is bytes, and empty when nothing matched)
    if typ == "OK":
        if data:
            print("New Mail")
        else:
            print("None")
            
    #Process the list of message numbers
    for num in data.split():
        ###Processing for each email###
        pass
    
    #Clean up
    gmail.close()
    gmail.logout()

The middle part just checks whether any mail was found. After that comes the processing for each mail, followed by the cleanup. From here on, we will fill in the per-mail processing to get the sender, subject, and body.
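To make the shape of the search result concrete: search() returns a status and a list holding one space-separated bytes string of message numbers. A minimal sketch, with `message_numbers` as a hypothetical helper name, could look like this:

```python
def message_numbers(typ, data):
    """Return the message numbers from a search() result as a list of bytes.

    An error status or an empty result both give an empty list.
    """
    if typ != "OK" or not data or not data[0]:
        return []
    return data[0].split()

# Usage (needs a real connection):
#   typ, data = gmail.search(None, "(UNSEEN)")
#   for num in message_numbers(typ, data):
#       ...
```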

Get email content

Getting the character encoding and parsing

.search() only returns the ids of the matching emails, so use .fetch() with an id to access the whole email. First you need to find the character encoding of the email. After parsing the mail with email.message_from_string, look at the Subject header: if it specifies a character encoding, use that, otherwise fall back to the default, iso-2022-jp. The mail is then decoded and parsed a second time with that encoding. I suspect there is a better way to write this part ...

    for num in data.split():
        ###Processing for each email###
        result, d = gmail.fetch(num, "(RFC822)")
        raw_email = d[0][1]

        #First parse: only used to find the character encoding
        msg = email.message_from_string(raw_email.decode('utf-8'))
        msg_encoding = email.header.decode_header(msg.get('Subject'))[0][1] or email_default_encoding
        #Second parse: decode with that encoding and prepare for analysis
        msg = email.message_from_string(raw_email.decode(msg_encoding))

        print(msg.keys())

You can check which headers are available with msg.keys().

['Delivered-To', 'Received', 'X-Received', 'Return-Path', 'Received', 'Received-SPF', 'Authentication-Results', 'DKIM-Signature', 'Subject', 'From', 'To', 'Errors-To', 'MIME-Version', 'Date', 'X-Mailer', 'X-Priority', 'Content-Type', 'Message-ID', 'X-Antivirus', 'X-Antivirus-Status']
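On the question of a better way to write the parsing step: one alternative I am aware of is email.message_from_bytes from the standard library, which parses the fetched bytes directly, so the raw mail does not have to be decoded first. The sketch below is not what the code above does; `parse_fetched` is a hypothetical helper name.

```python
import email

def parse_fetched(fetch_data):
    """Parse the data returned by gmail.fetch(num, "(RFC822)").

    fetch_data[0][1] holds the raw message as bytes, the same
    value the main code stores in raw_email.
    """
    raw_email = fetch_data[0][1]
    return email.message_from_bytes(raw_email)

# Usage (needs a real connection):
#   result, d = gmail.fetch(num, "(RFC822)")
#   msg = parse_fetched(d)
#   print(msg.keys())
```

Headers still have to be decoded with email.header.decode_header afterwards, since the parser leaves encoded words in headers as they are.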

Get sender / title

Next, get the sender. This part gave me some trouble.

        fromObj = email.header.decode_header(msg.get('From'))
        addr = ""
        for f in fromObj:
            if isinstance(f[0],bytes):
                addr += f[0].decode(msg_encoding)
            else:
                addr += f[0]
        print(addr)

If you parse a From header with a Japanese display name, such as "Sender <[email protected]>", the result looks like this:

fromObj[0][0] : b'xxxxxxxxx' ... the encoded "Sender" display name
fromObj[0][1] : 'iso-2022-jp' ... character encoding
fromObj[1][0] : b'[email protected]' ... address part
fromObj[1][1] : None ... no character encoding because it is alphanumeric only

On the other hand, a header with no Japanese in it, such as "Sashidashi <[email protected]>", is not split up:

fromObj[0][0] : 'Sashidashi<[email protected]>'
fromObj[0][1] : None

and the value is a str rather than bytes. That is why the code loops over the parts and checks the type of each one.

I did the same for the subject and it worked.

        subject = email.header.decode_header(msg.get('Subject'))
        title = ""
        for sub in subject:
            if isinstance(sub[0],bytes):
                title += sub[0].decode(msg_encoding)
            else:
                title += sub[0]
        print(title)

Note that for some emails the sender is not decoded correctly when it contains Japanese. I will cover that in another article.
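One possible cause, and this is my assumption rather than something I have confirmed, is that decode_header returns a separate character encoding for each part, while the loops above decode every part with the single msg_encoding taken from the subject. A sketch that uses the per-part encoding instead could look like this; `decode_mime_header` and its `fallback` parameter are hypothetical names.

```python
from email.header import decode_header

def decode_mime_header(value, fallback="iso-2022-jp"):
    """Decode a header value, using each part's own charset when it has one."""
    if value is None:
        return ""
    text = ""
    for fragment, charset in decode_header(value):
        if isinstance(fragment, bytes):
            try:
                text += fragment.decode(charset or fallback)
            except (LookupError, UnicodeDecodeError):
                # Unknown or mislabeled charset: fall back and keep going.
                text += fragment.decode(fallback, errors="replace")
        else:
            # No encoded words in the header: decode_header returns str as is.
            text += fragment
    return text

# Usage:
#   addr = decode_mime_header(msg.get('From'))
#   title = decode_mime_header(msg.get('Subject'))
```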

Get date / body

Get the date and change its format. dateutil makes it easy to convert the Date header into a format such as yyyy/MM/dd HH:mm:ss.

        date = dateutil.parser.parse(msg.get('Date')).strftime("%Y/%m/%d %H:%M:%S")
        print(date)
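If you would rather not depend on dateutil, the standard library's email.utils.parsedate_to_datetime also parses the Date header. This is only an alternative sketch; `format_mail_date` is a hypothetical helper name.

```python
from email.utils import parsedate_to_datetime

def format_mail_date(value, fmt="%Y/%m/%d %H:%M:%S"):
    """Parse an RFC 2822 style Date header and format it with strftime."""
    return parsedate_to_datetime(value).strftime(fmt)

# Usage:
#   date = format_mail_date(msg.get('Date'))
```

The time is formatted in the timezone given by the header itself, not converted to local time.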

Getting the body also needs a branch. You can get it with .get_payload(), but a mail sent in HTML format contains both a plain-text part and an HTML part, so the code takes out only the text/plain one.

        body = ""
        if msg.is_multipart():
            for payload in msg.get_payload():
                if payload.get_content_type() == "text/plain":
                    body = payload.get_payload()
        else:
            if msg.get_content_type() == "text/plain":
                body = msg.get_payload()
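One caveat: get_payload() without arguments returns the payload as it appears in the mail, so a base64 or quoted-printable body comes back still encoded. A sketch that asks for the decoded bytes and then applies the part's charset might look like this; `extract_plain_text` and `fallback_charset` are hypothetical names, and msg.walk() is used so that nested multipart mails are covered too.

```python
import email

def extract_plain_text(msg, fallback_charset="iso-2022-jp"):
    """Return the first text/plain part of a message as str ("" if none)."""
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        # decode=True undoes base64 / quoted-printable and returns bytes.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or fallback_charset
        return payload.decode(charset, errors="replace")
    return ""

# Usage:
#   body = extract_plain_text(msg)
```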

Setting and deleting labels

I struggled here because I did not know how to add and remove labels.

        #Mark as unread (remove the \Seen flag)
        gmail.store(num, '-FLAGS','\\SEEN')

        #Add label
        gmail.store(num, '+X-GM-LABELS','added')
        #Remove label
        gmail.store(num, '-X-GM-LABELS','added')

It can be done by specifying "X-GM-LABELS" in .store(): prefix it with "+" to add a label and with "-" to remove one. Source: Gmail IMAP Extensions
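The label 'added' above is a single word. As far as I understand, a label name containing spaces has to be sent as an IMAP quoted string, and I have not verified this against every case. The sketch below only handles ASCII names and user-created labels; `quote_label` is a hypothetical helper name.

```python
def quote_label(name):
    """Wrap an ASCII label name in an IMAP quoted string.

    Backslashes and double quotes are escaped as the quoted-string
    syntax requires. Non-ASCII names need extra encoding that this
    sketch does not attempt.
    """
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

# Usage (needs a real connection):
#   gmail.store(num, '+X-GM-LABELS', quote_label('my label'))
```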

References

Including those mentioned above:
INTERNET MESSAGE ACCESS PROTOCOL - VERSION 4rev1
Gmail IMAP Extensions
IMAP4 (Internet Mail Access Protocol version 4) - Part 1
IMAP4 (Internet Mail Access Protocol version 4) - Part 2
email - Package for Email and MIME Processing

In the end, this is hard to do unless you understand IMAP4 properly.
