The overall goal is to move the scraping program from the previous articles to the cloud and run it automatically. As a first step, this article puts a test program on the cloud and gets it running normally.
(1) Get scraping of the desired data working locally.
(2) Link the locally scraped results to Google Spreadsheet.
(3) Run it automatically with cron locally.
(4) Try free automatic execution on a cloud server. (Google Compute Engine)
(4)-1 Put the test program on the cloud and run it normally on Cloud Shell. ← This article
(4)-2 Add the scraping program to the repository and run it normally on Cloud Shell.
(4)-3 Create a Compute Engine VM instance and have it run the scraping automatically.
(5) Try free automatic execution on the cloud without a server. (Maybe Cloud Functions + Cloud Scheduler)
(1) Create a git repository on GCP using git (GitHub account required)
(2) Create a clone locally
(3) Add the program you want to upload to GCP to the local repository and commit
(4) Push to master on GCP
If you do not have the Google Cloud SDK installed, install it. Make sure the gcloud command is configured for the desired project. (For a new project, set the project with the gcloud init command.)
zsh
16:03:04 [~] % gcloud config list
[core]
account = [email protected]
disable_usage_reporting = False
project = my-hoge-app
Your active configuration is: [default]
Create a new repository in Cloud Source Repositories.
zsh
16:41:59 [~] %
16:42:00 [~] % gcloud source repos create gce-cron-test
Created [gce-cron-test].
WARNING: You may be billed for this repository. See https://cloud.google.com/source-repositories/docs/pricing for details.
This creates an empty repository in the target project.
Clone the repository you created in Cloud Source Repositories locally.
zsh
16:44:10 [~] %
16:44:10 [~] % gcloud source repos clone gce-cron-test
Cloning into '/Users/hoge/gce-cron-test'...
warning: You appear to have cloned an empty repository.
Project [my-hoge-app] repository [gce-cron-test] was cloned to [/Users/hoge/gce-cron-test].
The py file has now been placed in the local repository. (The .git directory shows that it is a git repository.)
zsh
16:46:15 [~] %
16:46:15 [~] % cd gce-cron-test
16:46:44 [~/gce-cron-test] % ls -la
total 8
drwxr-xr-x 4 hoge staff 128 9 23 16:45 .
drwxr-xr-x+ 45 hoge staff 1440 9 23 16:45 ..
drwxr-xr-x 9 hoge staff 288 9 23 16:45 .git
-rw-r--r-- 1 hoge staff 146 9 21 15:29 cron-test.py
Add the file to the index with the git add command, then commit it to your local repository with the git commit command.
zsh
16:47:21 [~/gce-cron-test] %
16:47:21 [~/gce-cron-test] % git add .
16:48:03 [~/gce-cron-test] %
16:48:04 [~/gce-cron-test] % git commit -m "Add cron-test to Cloud Source Repositories"
[master (root-commit) 938ea70] Add cron-test to Cloud Source Repositories
1 file changed, 5 insertions(+)
create mode 100644 cron-test.py
Push to master (Cloud Source Repositories).
zsh
16:50:15 [~/gce-cron-test] %
16:50:15 [~/gce-cron-test] % git push origin master
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Delta compression using up to 4 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 349 bytes | 116.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To https://source.developers.google.com/p/my-hoge-app/r/gce-cron-test
* [new branch] master -> master
You can confirm that the push to master succeeded, along with the commit message.
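The create, clone, add, commit, and push steps above can be collected into one small helper. This is only a sketch: the function names `build_steps` and `run_steps` are hypothetical, the commands are the ones typed by hand in this article, and it assumes the gcloud and git commands are already configured as described earlier. By default it only prints the commands (dry run) instead of executing them.

```python
import subprocess
from typing import List, Optional, Tuple

# One step = (command as an argument list, working directory or None)
Step = Tuple[List[str], Optional[str]]


def build_steps(repo: str, message: str) -> List[Step]:
    """Return the commands used in this article, in order, without running them."""
    return [
        (["gcloud", "source", "repos", "create", repo], None),
        (["gcloud", "source", "repos", "clone", repo], None),
        # The program file must be copied into the clone before "git add".
        (["git", "add", "."], repo),
        (["git", "commit", "-m", message], repo),
        (["git", "push", "origin", "master"], repo),
    ]


def run_steps(steps: List[Step], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv, cwd in steps:
        prefix = "[dry-run] " if dry_run else ""
        print(prefix + " ".join(argv))
        if not dry_run:
            # check=True stops at the first command that fails
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    run_steps(build_steps("gce-cron-test",
                          "Add cron-test to Cloud Source Repositories"))
```

Keeping the command list separate from the code that executes it makes it easy to review what would run before anything touches the project.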
Let's test it on Cloud Shell on GCP.
Select the desired project and launch Cloud Shell.
The terminal will start.
Clone the git repository from master, just as you did locally.
bash
cloudshell:09/25/20 02:59:00 ~ $ gcloud source repos clone gce-cron-test
Cloning into '/home/hoge/gce-cron-test'...
remote: Total 3 (delta 0), reused 3 (delta 0)
Unpacking objects: 100% (3/3), done.
Project [my-xxx-app] repository [gce-cron-test] was cloned to [/home/hoge/gce-cron-test].
It was cloned.
bash
cloudshell:09/25/20 03:01:49 ~ $ cd gce-cron-test
cloudshell:09/25/20 03:02:09 ~/gce-cron-test $ ls -la
total 20
drwxr-xr-x 3 hoge hoge 4096 Sep 23 10:59 .
drwxr-xr-x 13 hoge rvm 4096 Sep 23 11:18 ..
-rw-r--r-- 1 hoge hoge 146 Sep 23 09:03 cron-test.py
drwxr-xr-x 8 hoge hoge 4096 Sep 23 09:03 .git
Check the Python path and version. In this environment, Python 3.8.5 is pre-installed via pyenv.
bash
cloudshell:09/25/20 03:02:21 ~/gce-cron-test $ which python
/home/hoge/.pyenv/shims/python
cloudshell:09/25/20 03:02:42 ~/gce-cron-test $ python -V
Python 3.8.5
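The same check can also be made from inside Python, which may be handy later because cron usually runs with a different PATH than an interactive shell. The following is a minimal sketch; the function name `describe_python` is hypothetical and is not part of the test program.

```python
import platform
import shutil
import sys


def describe_python() -> dict:
    """Collect the interpreter path and version of the running Python."""
    return {
        # Full path of the interpreter executing this script
        "executable": sys.executable,
        # Version string such as "3.8.5"
        "version": platform.python_version(),
        # What "python" resolves to on the current PATH (None if not found)
        "python_on_path": shutil.which("python"),
    }


if __name__ == "__main__":
    for key, value in describe_python().items():
        print(f"{key}: {value}")
```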
As shown below, the script runs normally on Cloud Shell.
bash
cloudshell:09/25/20 03:02:50 ~/gce-cron-test $ python cron-test.py
2020/09/25 03:03:11 cron works!
cloudshell:09/25/20 03:03:12 ~/gce-cron-test $
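The contents of cron-test.py are not shown in this article. As an assumption only, a script that prints a line in the same format as the output above could look like the following; the function name `build_message` is hypothetical.

```python
from datetime import datetime


def build_message(now: datetime) -> str:
    """Build a timestamped line in the form 'YYYY/MM/DD HH:MM:SS cron works!'."""
    return f"{now.strftime('%Y/%m/%d %H:%M:%S')} cron works!"


if __name__ == "__main__":
    # Print the current local time so each run leaves a distinguishable line
    print(build_message(datetime.now()))
```

Printing a timestamp makes it easy to tell from a log file whether a scheduled run actually happened and when.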
However, crontab did not work. Cloud Shell seems to be an environment that only accepts interactive commands. Next time, I will add the scraping program to the repository and run it normally on Cloud Shell.
Cloud Shell is a development environment available on Google Cloud: a kind of virtual machine environment with a 5 GB disk, which also provides a Theia-based code editor.
You can also edit hidden files with the editor.
bash
$ cloudshell edit $HOME/.bashrc
You can also download files.
bash
$ cloudshell download $HOME/.bashrc
[CloudShell] https://cloud.google.com/shell/?hl=ja