I built a list site for Kindle Prime Reading using Scrapy and GitHub Actions

Background

While staying home, I signed up for Amazon Prime, but I rarely use it for anything except buying rice and drinks. Just the other day, I started using the benefit called Prime Reading. I wanted to see which books I could read, but checking page by page is tedious, so I built a list and search site using Scrapy.

The site is here: https://kpr.gimo.me/

What I used

- Scrapy (fetching and parsing HTML)
- DataTables (displaying the data as a table)
- GitHub Pages (hosting the site)
- GitHub Actions (automation)

Development flow

Scrapy

With Scrapy, you write a Spider that defines how the data is fetched and extracted. The full code is here: https://github.com/masakichi/KindleSpider/blob/master/KindleSpider/spiders/PrimeReading.py

Once the Spider is written, the command scrapy crawl PrimeReading -o public/output.json fetches all the books in about a minute.
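When the output file ends in .json, Scrapy's feed export writes a single JSON array of item objects. A minimal sketch for checking the crawl result before publishing it could look like this. The helper name summarize_feed and the sample values are hypothetical and are not part of the linked repository.

```python
import json


def summarize_feed(text):
    """Parse a Scrapy JSON feed and return (item_count, sorted field names)."""
    items = json.loads(text)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of items")
    fields = sorted({key for item in items for key in item})
    return len(items), fields


# Hypothetical two-item feed; the real data lives in public/output.json.
sample = (
    '[{"asin": "B000000000", "title": "Example A"},'
    ' {"asin": "B000000001", "title": "Example B"}]'
)
count, fields = summarize_feed(sample)
print(f"{count} books, fields: {', '.join(fields)}")
```

To check a real crawl, pass the contents of public/output.json to summarize_feed instead of the sample string. Note that -o appends to an existing file, which leaves a .json feed unparseable, so delete the old file before crawling again locally.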

Write a minimal index.html

The data collected by Scrapy has to be rendered as HTML. Fortunately, the jQuery plugin DataTables makes this easy: about 20 lines of code give you a polished table with sorting and search built in.

$('#prime-reading').DataTable({
    "paging": false,
    "order": [[4, 'desc']],
    "ajax": { "url": "./output.json", "dataSrc": "", "cache": true },
    "language": {
        "url": "./Japanese.json"
    },
    "columns": [
        { "data": "asin", "visible": false },
        { "data": "title", "render": function (data, type, row) { return `<div><a class="title" data-image="${row.cover}" href="https://www.amazon.co.jp/dp/${row.asin}/" target="_blank">${data}</a></div>` }, "width": "40%" },
        { "data": "author" },
        { "data": "star" },
        { "data": "rating_count" },
        { "data": "price" },
        { "data": "publish_date" },
        { "data": "cover", "visible": false },
    ]
});
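The "columns" entries above name the keys that each record in output.json is expected to carry, and "order": [[4, 'desc']] sorts by the column at index 4, rating_count, in descending order. The following Python sketch mirrors both ideas on the data side. The helper names and sample records are hypothetical, and it assumes rating_count is stored as a number.

```python
# Keys copied from the "columns" definition above, in the same order.
COLUMNS = [
    "asin", "title", "author", "star",
    "rating_count", "price", "publish_date", "cover",
]


def missing_keys(record):
    """Return the column keys that a record does not provide."""
    return [key for key in COLUMNS if key not in record]


def default_order(records):
    """Mimic "order": [[4, 'desc']]: sort by column index 4, descending."""
    sort_key = COLUMNS[4]  # "rating_count"
    return sorted(records, key=lambda record: record[sort_key], reverse=True)


# Hypothetical records, trimmed to the fields this demo needs.
records = [
    {"asin": "B000000000", "title": "Example A", "rating_count": 12},
    {"asin": "B000000001", "title": "Example B", "rating_count": 340},
]
print([r["title"] for r in default_order(records)])  # ['Example B', 'Example A']
print(missing_keys(records[0]))  # keys the trimmed sample lacks
```

Running missing_keys over every record after a crawl is a cheap way to notice when the scraped page layout has changed and a field has stopped being extracted.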

Publish to GitHub Pages

With the index.html and output.json above, the site is ready to publish on GitHub Pages. How to publish is well documented online, so I will skip the details here.

Automate everything with GitHub Actions

If you define the workflow in YAML as shown below, the crawl, extraction, and site deployment run automatically whenever code is pushed to master and every day at 00:00 UTC (9:00 a.m. Japan time).

name: publish to gh-pages

on:
  push:
    branches:
      - master
  schedule:
    - cron: "0 0 * * *"

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      - uses: dschep/install-pipenv-action@v1
      - run: pipenv install
      - run: TZ='Asia/Tokyo' date --iso-8601="minutes" > public/update_time.txt
      - run: pipenv run scrapy crawl PrimeReading -o public/output.json
      - name: Deploy to GitHub Pages
        if: success()
        uses: crazy-max/ghaction-github-pages@v2
        with:
          target_branch: gh-pages
          build_dir: public
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
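The cron expression "0 0 * * *" is evaluated in UTC, and the workflow writes a Japan-time timestamp to public/update_time.txt. As a rough check of both, this Python sketch converts a scheduled UTC run time to Japan time and formats it in a shape similar to what date --iso-8601="minutes" prints. The function name and the example date are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Japan Standard Time is a fixed UTC+9 offset with no daylight saving time.
JST = timezone(timedelta(hours=9), "JST")


def update_time_text(moment):
    """Format an aware datetime as a minute-precision ISO 8601 string in JST."""
    return moment.astimezone(JST).isoformat(timespec="minutes")


# An arbitrary day on which the schedule fires at 00:00 UTC.
scheduled = datetime(2020, 5, 1, 0, 0, tzinfo=timezone.utc)
print(update_time_text(scheduled))  # 2020-05-01T09:00+09:00
```

The printed value confirms that a run scheduled for midnight UTC corresponds to 9:00 a.m. in Japan on the same day.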

Impressions

- Prime Reading seems to include a lot of magazines.
- GitHub Actions is convenient, and getting 2,000 free minutes a month is great.
- Eiji Yoshikawa's Sangokushi can now be read for free on Prime Reading.
