While staying at home, I signed up for Amazon Prime, but I rarely use it except to buy rice and drinks. Just the other day, I started using one of its benefits, Prime Reading. I wanted to see which books are available, but checking page by page is tedious, so I built a list / search site using Scrapy.
The site is here: https://kpr.gimo.me/
- Scrapy (fetching and parsing HTML)
- DataTables (displaying the data in a table)
- GitHub Pages (hosting the site)
- GitHub Actions (automation)
With Scrapy, you write a Spider that defines how the data is fetched and extracted. The details are here: https://github.com/masakichi/KindleSpider/blob/master/KindleSpider/spiders/PrimeReading.py
Once the spider is written, you can fetch all the books in about a minute with the command scrapy crawl PrimeReading -o public/output.json.
To display the data acquired by Scrapy, it has to be rendered as HTML. Fortunately, this is easy with the jQuery plugin DataTables: about 20 lines of code give you a full-featured table with sorting and search.
$('#prime-reading').DataTable({
  "paging": false,
  // Default sort: column index 4 (rating_count), descending
  "order": [[4, 'desc']],
  "ajax": { "url": "./output.json", "dataSrc": "", "cache": true },
  "language": {
    "url": "./Japanese.json"
  },
  "columns": [
    { "data": "asin", "visible": false },
    {
      "data": "title",
      // Render the title as a link to the Amazon product page
      "render": function (data, type, row) {
        return `<div><a class="title" data-image="${row.cover}" href="https://www.amazon.co.jp/dp/${row.asin}/" target="_blank">${data}</a></div>`;
      },
      "width": "40%"
    },
    { "data": "author" },
    { "data": "star" },
    { "data": "rating_count" },
    { "data": "price" },
    { "data": "publish_date" },
    { "data": "cover", "visible": false },
  ]
});
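The table only works if every record in output.json carries the fields named in the column definitions above. As a minimal sketch (my own check, not part of the original repository), the following Python snippet reports records that are missing any of those fields and rebuilds the product URL the same way the render function does. The function names and the sample records are hypothetical; only the field names and the URL pattern come from the DataTables configuration.

```python
import json

# Field names taken from the "columns" list of the DataTables configuration.
EXPECTED_FIELDS = (
    "asin", "title", "author", "star",
    "rating_count", "price", "publish_date", "cover",
)


def load_feed(path="public/output.json"):
    """Load the JSON array that `scrapy crawl ... -o public/output.json` writes."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def missing_fields(records):
    """Return {record index: [missing field names]} for incomplete records."""
    problems = {}
    for index, record in enumerate(records):
        missing = [name for name in EXPECTED_FIELDS if name not in record]
        if missing:
            problems[index] = missing
    return problems


def product_url(asin):
    """Build the same link the title column renders."""
    return f"https://www.amazon.co.jp/dp/{asin}/"


# Hypothetical sample records, for illustration only.
sample = [
    {
        "asin": "B000000000", "title": "Example Book", "author": "Someone",
        "star": "4.5", "rating_count": "120", "price": "500",
        "publish_date": "2020/1/1", "cover": "https://example.com/cover.jpg",
    },
    {"asin": "B000000001", "title": "Incomplete Book"},
]

print(missing_fields(sample))
print(product_url(sample[0]["asin"]))
```

Against a real crawl, you would call missing_fields(load_feed()) after running the scrapy command; an empty result means every row has what the table needs.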
With the index.html and output.json above, you can publish the site on GitHub Pages. How to publish is covered widely online, so I will omit it here.
If you define a workflow in YAML as shown below, the data is crawled and the site is rebuilt and deployed automatically whenever code is pushed, and also every day at 00:00 UTC (9:00 a.m. Japan time).
name: publish to gh-pages
on:
  push:
    branches:
      - master
  schedule:
    - cron: "0 0 * * *"
jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      - uses: dschep/install-pipenv-action@v1
      - run: pipenv install
      - run: TZ='Asia/Tokyo' date --iso-8601="minutes" > public/update_time.txt
      - run: pipenv run scrapy crawl PrimeReading -o public/output.json
      - name: Deploy to GitHub Pages
        if: success()
        uses: crazy-max/ghaction-github-pages@v2
        with:
          target_branch: gh-pages
          build_dir: public
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
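The cron schedule in the workflow is written in UTC, while update_time.txt is written in Japan time, so it is easy to mix the two up. Here is a small Python sketch (my own check, not something the workflow runs) that converts the scheduled time to JST and produces a minute-precision ISO 8601 timestamp similar to what the date step writes. The names JST, to_jst and update_time_stamp are hypothetical, and a fixed +09:00 offset is assumed because Japan does not observe daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+9 offset is enough, since Japan has no DST.
JST = timezone(timedelta(hours=9), "JST")


def to_jst(moment):
    """Convert a timezone-aware datetime to Japan Standard Time."""
    return moment.astimezone(JST)


def update_time_stamp(moment):
    """Format a moment in JST with minute precision, e.g. 2020-05-01T09:00+09:00."""
    return to_jst(moment).isoformat(timespec="minutes")


# The cron entry "0 0 * * *" fires at 00:00 UTC; the date below is arbitrary.
scheduled = datetime(2020, 5, 1, 0, 0, tzinfo=timezone.utc)
print(to_jst(scheduled).strftime("%H:%M"))
print(update_time_stamp(scheduled))
```

This confirms that a run scheduled for 00:00 UTC lands at 09:00 in Japan, which matches the time mentioned above.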
- It seems that there are many magazines in Prime Reading.
- GitHub Actions is convenient, and it is great that it is free for 2000 minutes a month.
- Eiji Yoshikawa's Sangokushi can now be read for free on Prime Reading.