[Python] Let LINE notify you of the ranking of search results on your site on a daily basis.

Thing you want to do

I want LINE to be notified of my ranking when my site searches Google for "a certain keyword".

Create an environment

This time, we will build it on Windows 10 home. Create a virtual environment and install the required libraries.

Create a virtual environment

Create a dedicated virtual environment so that your personal computer does not get messed up with the library. Please refer to the following for how to make it. Create a virtual environment for Python

After creating the virtual environment, activate it.

command prompt


Scripts\activate

Library installation

Install the library for web scraping and the library for regular execution on heroku.

command prompt


pip install requests
pip install BeautifulSoup4
pip install apscheduler

Issue tokens for LINE cooperation

Log in to LINE below and issue an access token. LINE Notify - Google Chrome 2020_06_16 14_02_59.png Select the token name displayed at the time of notification and the talk room to send to. This time, I will send it to myself, so I selected "Receive notifications from LINE Notify 1: 1" and issued it. Make sure to make a note of the issued tokens. (I can't see it anymore when I close it) LINE Notify - Google Chrome 2020_06_16 14_06_59.png

This completes the cooperation. LINE Notify - Google Chrome 2020_06_16 14_10_55.png

Program creation

Create a Python file. Since it will be created at the command prompt, move the path to the location of the created virtual environment.

command prompt


type nul > main_proc.py
type nul > clock.py

After deploying to heroku, it will be executed regularly, so create clock.py that describes the schedule information.

The actual coding is as follows. This post is the content of the LINE notification, so I will omit the explanation of the code.

main_proc.py


import requests
from bs4 import BeautifulSoup as bs
import os

line_notify_token = os.environ['LINE_NOTIFY_TOKEN']

def main_proc():
    mes = 'Out of service or unprocessed'
    targeturl = 'https://sentreseau.com/'
    targetur2 = 'http://sentreseau.com/'

    #Request header
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"}
    list_keyword = 'power platform Fukuoka'

    url = 'https://www.google.co.jp/search?num=100&q={}'.format(list_keyword)

    #Connect
    response = requests.get(url, headers=headers)

    try:
        #Check HTTP status code
        response.raise_for_status()
    except:
        mes = 'I couldn't get it today, sorry.'

        
    #Parse the retrieved HTML
    soup = bs(response.content, 'html.parser')
        
    #Get search result titles and links
    ret_link = soup.select('.r > a')
    mes = url
    for i in range(len(ret_link)):
        #Get only the link and remove the extra part
        url_txt = ret_link[i].get('href').replace('/url?q=','')
        
        if (targeturl in url_txt) or (targetur2 in url_txt):
            mes = '「{}As a result of searching with ""{}The ranking is{}It was a place.'.format(list_keyword, targeturl, i + 1)
            break
        
    
    #Notify LINE
    line_notify(mes)

#Notification function to LINE
def line_notify(message):
    line_notify_api = 'https://notify-api.line.me/api/notify'
    payload = {'message': message}
    headers = {'Authorization': 'Bearer ' + line_notify_token}
    requests.post(line_notify_api, data=payload, headers=headers)


if __name__ == "__main__":
    main_proc()

clock.py


import os,main_proc
from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

if __name__ == "__main__":
    #07 every day:Run to 30
    sched.add_job(main_proc.main_proc, 'cron', hour=7, minute=30)
    sched.start()
    #Run every minute
    #sched.add_job(main_proc.main_proc, 'interval', minutes=1)

Deploy to heroku

The following site will be helpful for creating heroku. http://www.dcom-web.co.jp/technology/heroku1/

Log in to heroku

When you execute the following command, the login browser will start, so log in.

heroku login

Heroku _ Login - Google Chrome 2020_06_16 15_31_28.png

Create an app

heroku create app name

Add buildpack

Add buildpack to run Python on heroku.

heroku buildpacks:add heroku/python -a app name

Change the time zone.

The default is UTC, so set it to Asia / tokyo.

heroku config:add TZ=Asia/Tokyo -a app name

Store the LINE access token in an environment variable

It is not good to write the access token directly in the code, so store it in the environment variable.

heroku config:set LINE_NOTIFY_TOKEN=Access token-a app name

Check if it is set properly with the following command.

heroku config -a app name

Create a file to run on heroku

command prompt


pip freeze > requirements.txt

Deploy to heroku

(First time only) Create an initial file for git. A ".git" folder is created in the working folder.

git init

(First time only) Create a remote repository.

heroku git:remote -a app name

Add your changes to the index.

git add .
git add filename

Commit

git commit -m "Write if you have a comment"

Deploy on heroku.

git push heroku master

Let's check the operation after deployment.

command prompt


heroku run python clock.py

It's working properly. Apparently, when we search for "power platform Fukuoka", it seems to be 4th at the time of posting.

Start periodic execution

After deploying to heroku, we were able to confirm the operation, so start clock so that clock.py coded for periodic execution will be executed periodically.

command prompt


heroku ps:scale clock=1

If it says "Scaling dynos ... done, now running clock at 1: Free", it's OK.

That's it, every morning at 7:30, LINE will be notified of the ranking. Congratulations, congratulations.

PS

By the way, to stop the clock running on heroku, do the following:

command prompt


heroku ps:scale clock=0

Recommended Posts

[Python] Let LINE notify you of the ranking of search results on your site on a daily basis.
Use AWS lambda to scrape the news and notify LINE of updates on a regular basis [python]
You search commandlinefu on the command line
[Example of Python improvement] I learned the basics of Python on a free site in 2 weeks.
A Python script that allows you to check the status of the server from your browser
Stream Twitter search results like a certain video site [python]
Google search for the last line of the file in Python
[Python] LINE notification of the latest information using Twitter automatic search
Get the number of readers of a treatise on Mendeley in Python
[Python] Create a script that uses FeedParser and LINE Notify to notify LINE of the latest information on the new coronavirus of the Ministry of Health, Labor and Welfare.
[Python] Explore the characteristics of the titles of the top sites in Google search results
If you want a singleton in python, think of the module as a singleton
Have Alexa run Python to give you a sense of the future
Receive a list of the results of parallel processing in Python with starmap
Allow Slack to notify you of the end of a time-consuming program process
Create a python environment on your Mac
Search the maze with the python A * algorithm
[python] [meta] Is the type of python a type?
In search of the fastest FizzBuzz in Python
The story of blackjack A processing (python)
[Python] A progress bar on the terminal
Various ways to read the last line of a csv file in Python
Let Python measure the average score of a page using the PageSpeed Insights API
Creating a LINE BOT to notify you of additional AtCoder contests using AWS
Periodically notify the processing status of Raspberry Pi with python → Google Spreadsheet → LINE
How many types of Python do you have on your macOS? I had 401 types.
Separately install a version of Python that is not pre-installed on your Mac
Visualize the timeline of the number of issues on GitHub assigned to you in Python
[Python] A program that calculates the difference between the total numbers on the diagonal line.
Draw a line / scatter plot on the CSV file (2 columns) with python matplotlib