- Track some text content over time to see whether it has changed since the last check.
- If it has changed, use the GitLab API to git commit & push the content to GitLab.
- Implemented in Python.
- This article uses EtherCalc as the content to track for updates.
Reference: GitLab Docs > API Docs > API resources
Environment: Ubuntu 20.04
$ python3 --version
Python 3.8.2
Let's get started. We use a self-hosted GitLab as the git server. For this article, I set it up locally with Docker. In the terminal, move to a suitable directory and run the following.
git clone https://github.com/sameersbn/docker-gitlab
cd docker-gitlab
docker-compose up -d
After waiting a while for the containers to start, running "docker-compose ps" showed the following.
Name                         Command                          State          Ports
----------------------------------------------------------------------------------------------------------------------------------
docker-gitlab_gitlab_1       /sbin/entrypoint.sh app:start    Up (healthy)   0.0.0.0:10022->22/tcp, 443/tcp, 0.0.0.0:10080->80/tcp
docker-gitlab_postgresql_1   /sbin/entrypoint.sh              Up             5432/tcp
docker-gitlab_redis_1        docker-entrypoint.sh --log ...   Up             6379/tcp
Since the GitLab container's HTTP port is 10080, open http://localhost:10080/ in a browser. When the password screen appears, change the root password.
On the next screen, create a development user (any username will do).
After logging in as the development user, select "Create a project".
I named the project "ethercalc_backup", but any name will do. Press the "Create project" button to create the project.
The project has now been created. The Project ID is displayed below the project name (3 in this article). You will use this ID later in the Python code.
You will later call the GitLab API from the Python code, which requires an access token, so create one now. Open the pull-down menu at the upper right of the browser and select "Settings".
Select "Access Tokens" from the menu on the left.
Enter any name in Name, check api under Scopes, and press the "Create Personal access token" button.
The access token has now been created. Copy it to the clipboard and save it. This article uses "6f8YXyrZ1SCSADHTJ2L9" as the access token.
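Before moving on, it can be useful to confirm that the token and Project ID work together. The following is a minimal sketch, assuming the values used in this article (port 10080, Project ID 3, and the token above). The helper name build_project_request is hypothetical and only for illustration. It builds a request for the project's metadata and sends the token in the PRIVATE-TOKEN header. The network call is left commented out so the block runs even when GitLab is not up.

```python
import urllib.request


def build_project_request(base_uri, project_id, private_token):
    """Build (but do not send) a GET request for one project's metadata."""
    uri = f"{base_uri}api/v4/projects/{project_id}"
    # GitLab reads the personal access token from the PRIVATE-TOKEN header
    return urllib.request.Request(uri, headers={"PRIVATE-TOKEN": private_token})


# Values taken from this article; replace them with your own
request = build_project_request("http://localhost:10080/", 3, "6f8YXyrZ1SCSADHTJ2L9")
print(request.full_url)

# To send the request against a running GitLab, uncomment the lines below.
# An HTTP 401 error would suggest the token is wrong, and 404 the Project ID.
# import json
# with urllib.request.urlopen(request) as res:
#     print(json.loads(res.read())["name"])
```

If the commented-out call prints the project name, the token and Project ID are ready to be used from the Python code.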
As the source of text content, this article uses EtherCalc. Like GitLab, it is deployed locally with Docker. In the terminal, move to a suitable directory separate from the GitLab one, create docker-compose.yml with the same contents as https://github.com/audreyt/ethercalc/blob/master/docker-compose.yml, and start the containers.
wget https://raw.githubusercontent.com/audreyt/ethercalc/master/docker-compose.yml
docker-compose up -d
After waiting a while for the containers to start, running "docker-compose ps" showed the following.
Name                           Command                          State   Ports
--------------------------------------------------------------------------------------------
docker-ethercalc_ethercalc_1   sh -c REDIS_HOST=$REDIS_PO ...   Up      0.0.0.0:80->8000/tcp
docker-ethercalc_redis_1       docker-entrypoint.sh redis ...   Up      6379/tcp
Since the EtherCalc container's HTTP port is 80, try opening http://localhost/ in a browser.
Create two EtherCalc sheets for testing. Open a text editor, create the new files foo.sc and bar.sc, and save them.
editor foo.sc
foo.sc
socialcalc:version:1.0
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=SocialCalcSpreadsheetControlSave
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

# SocialCalc Spreadsheet Control Save
version:1.0
part:sheet
part:edit
part:audit
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

version:1.5
cell:A1:t:foo1
cell:A2:t:foo2
sheet:c:1:r:2:tvf:1
valueformat:1:text-wiki
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

version:1.0
rowpane:0:1:1
colpane:0:1:1
ecell:A1
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

--SocialCalcSpreadsheetControlSave--
editor bar.sc
bar.sc
socialcalc:version:1.0
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=SocialCalcSpreadsheetControlSave
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

# SocialCalc Spreadsheet Control Save
version:1.0
part:sheet
part:edit
part:audit
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

version:1.5
cell:A1:t:bar1
cell:A2:t:bar2
sheet:c:1:r:2:tvf:1
valueformat:1:text-wiki
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

version:1.0
rowpane:0:1:1
colpane:0:1:1
ecell:A1
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

--SocialCalcSpreadsheetControlSave--
The foo.sc and bar.sc above are text files in SocialCalc format that can be imported into EtherCalc. Exporting and importing in SocialCalc format has the advantage that the sheet formatting (appearance) is preserved as well. CSV files can also be imported, but the sheet formatting is then lost.
Do the following to import.
curl -X PUT -H 'Content-Type: text/x-socialcalc' --data-binary @foo.sc http://localhost/_/foo
curl -X PUT -H 'Content-Type: text/x-socialcalc' --data-binary @bar.sc http://localhost/_/bar
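The same import can be done from Python instead of curl. The following is a minimal sketch using only the standard library. The helper name build_import_request is hypothetical, and the one-line SocialCalc text passed to it is a placeholder rather than a complete sheet. It mirrors the curl options above: a PUT request to /_/<sheet name> with the Content-Type text/x-socialcalc.

```python
import urllib.request


def build_import_request(base_uri, sheet_name, socialcalc_text):
    """Build a PUT request that uploads SocialCalc text to /_/<sheet_name>."""
    return urllib.request.Request(
        f"{base_uri}_/{sheet_name}",
        data=socialcalc_text.encode("utf-8"),
        headers={"Content-Type": "text/x-socialcalc"},
        method="PUT",
    )


# Placeholder body for illustration; in practice pass the contents of foo.sc
request = build_import_request("http://localhost/", "foo", "socialcalc:version:1.0\n")
print(request.get_method(), request.full_url)

# To send it to a running EtherCalc, uncomment the lines below.
# with open("foo.sc", encoding="utf-8") as f:
#     request = build_import_request("http://localhost/", "foo", f.read())
# with urllib.request.urlopen(request) as res:
#     print(res.status)
```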
Open http://localhost/foo and http://localhost/bar in your browser. If the cells contain the imported data (foo1 and foo2, or bar1 and bar2), the import succeeded.
We will download the files in SocialCalc and CSV formats from these URLs and manage them in GitLab.
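For a sheet at http://localhost/foo, the SocialCalc export lives at http://localhost/_/foo and the CSV export at http://localhost/foo.csv. The following sketch derives both from the sheet URL with urllib.parse. The function name export_uris is hypothetical, and this is one possible way to build the URLs rather than the method used in the script later in this article.

```python
from urllib.parse import urlsplit, urlunsplit


def export_uris(sheet_uri):
    """Return (sheet name, SocialCalc export URI, CSV export URI) for a sheet URI."""
    parts = urlsplit(sheet_uri)
    # Split "/foo" into the directory part ("") and the sheet name ("foo")
    directory, _, name = parts.path.rpartition("/")
    socialcalc_uri = urlunsplit(parts._replace(path=f"{directory}/_/{name}"))
    csv_uri = urlunsplit(parts._replace(path=f"{parts.path}.csv"))
    return name, socialcalc_uri, csv_uri


for uri in ["http://localhost/foo", "http://localhost/bar"]:
    print(export_uris(uri))
# ('foo', 'http://localhost/_/foo', 'http://localhost/foo.csv')
# ('bar', 'http://localhost/_/bar', 'http://localhost/bar.csv')
```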
Please forgive the rough, unpolished code.
The processing logic is as follows. For each of the foo and bar sheets:
- Download the files from EtherCalc and GitLab (in both SocialCalc and CSV formats).
- If the file does not exist in GitLab, add it to the git repository as a new file.
- If the file exists in GitLab, compare the EtherCalc and GitLab versions and, if they differ, update the git repository.
- git commit & push using the GitLab API.
- Then, using the GitLab API, run git diff and output the result to the log.
- Backups are stored in a directory called ethercalc under the git repository.
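The create / update / skip decision in the steps above can be sketched as a pure function with no network access. The names decide_action and build_action are hypothetical, and None is used here to represent "the file does not exist in GitLab". The returned dictionary has the keys action, file_path, and content, which is the shape of one entry in the actions list sent with the commit.

```python
def decide_action(content_ethercalc, content_gitlab):
    """Return "create", "update", or None (nothing to commit).

    content_gitlab is None when the file does not exist in GitLab yet.
    """
    if content_gitlab is None:
        return "create"
    if content_ethercalc != content_gitlab:
        return "update"
    return None


def build_action(file_path, content_ethercalc, content_gitlab):
    """Return one commit action dict, or None when the contents are identical."""
    action = decide_action(content_ethercalc, content_gitlab)
    if action is None:
        return None
    return {"action": action, "file_path": file_path, "content": content_ethercalc}


print(build_action("ethercalc/foo.csv", "foo1\nfoo2\n", None))
print(build_action("ethercalc/foo.csv", "foo1\nHello\n", "foo1\nfoo2\n"))
print(build_action("ethercalc/foo.csv", "foo1\nfoo2\n", "foo1\nfoo2\n"))
```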
Below is the Python code. The code uses a variable called logger, but please note that the code around logging has been omitted.
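Because the logging code is omitted, a logger has to be defined before the script can run. The following is a minimal sketch of one possible setup using the standard logging module. The logger name and message format are assumptions and not the author's original configuration.

```python
import logging

# Assumed setup: print INFO and above to the console with a timestamp
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
logger = logging.getLogger("ethercalc_backup")
logger.setLevel(logging.INFO)

logger.info("logger is ready")
```

Placing these lines near the top of ethercalc_backup.py, after the imports, would make the logger.info calls in the script print to the terminal. The logger.debug calls stay hidden unless the level is lowered to logging.DEBUG.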
ethercalc_backup.py
import time
import datetime
import urllib.request
import urllib.parse
import urllib.error
import json
import pprint
import re
import base64

# URLs of the EtherCalc sheets to manage in git
ethercalc_uris = [ "http://localhost/foo", "http://localhost/bar" ]

# GitLab settings
gitlab_base_uri = "http://localhost:10080/"
# Backup destination directory in the git repository
gitlab_backup_directory = "ethercalc"
gitlab_private_token = "6f8YXyrZ1SCSADHTJ2L9"
gitlab_project_id = 3

# Current timestamp, used in commit messages
str_now = datetime.datetime.today().strftime("%Y%m%d_%H%M%S")

# Newline, kept in a variable for use inside f-strings
LF = '\n'

def get_gitlab_file(private_token, file_path):
    """
    Get one file from the GitLab repository

    Parameters
    ----------
    private_token : str
        Access token for the GitLab API
    file_path : str
        File path from the top of the git repository

    Returns
    -------
    dict
        Response from GitLab (an empty dict if the file does not exist)
    """
    # https://docs.gitlab.com/ee/api/repository_files.html
    gitlab_uri = f"{gitlab_base_uri}api/v4/projects/{gitlab_project_id}/repository/files/{urllib.parse.quote(file_path, safe='')}?ref=master"
    logger.info(f"gitlab_uri={gitlab_uri}")
    headers = {
        "PRIVATE-TOKEN": private_token
    }
    request = urllib.request.Request(gitlab_uri, headers=headers)
    try:
        with urllib.request.urlopen(request) as res:
            res_files = json.loads(res.read())
    except urllib.error.HTTPError as ee:
        if ee.code == 404:
            # The file does not exist in the repository yet
            return {}
        else:
            raise
    else:
        # logger.debug(f"gitlab res_files={LF}{pprint.pformat(res_files)}")
        return res_files

def compare_ethercalc_and_gitlab(actions, ethercalc_uri, git_filename):
    """
    Get the file from both EtherCalc and the GitLab repository, compare them,
    and append an action to the actions list if they differ

    Parameters
    ----------
    actions : list
        List of actions later passed to GitLab's commits API
    ethercalc_uri : str
        EtherCalc URI
    git_filename : str
        Filename in the git repository

    Returns
    -------
    None
    """
    logger.info(f"ethercalc URL={ethercalc_uri}")
    # Download from EtherCalc
    request = urllib.request.Request(ethercalc_uri)
    with urllib.request.urlopen(request) as res:
        content_ethercalc = res.read().decode("utf-8")
    # logger.debug(f"content_ethercalc={LF}{content_ethercalc}")
    # Download from GitLab
    action_str = ""
    file_path = f"{gitlab_backup_directory}/{git_filename}"
    res_gitlab_file = get_gitlab_file(gitlab_private_token, file_path)
    try:
        content_gitlab = base64.b64decode(res_gitlab_file["content"]).decode("utf-8")
    except KeyError:
        # The file is not in GitLab yet, so create it later with git commit & push
        action_str = "create"
    else:
        # logger.debug(f"content_gitlab={LF}{content_gitlab}")
        # Compare the files downloaded from EtherCalc and GitLab
        if content_ethercalc == content_gitlab:
            logger.info("content_ethercalc == content_gitlab")
        else:
            logger.info("content_ethercalc != content_gitlab")
            # The contents differ, so git commit & push later
            action_str = "update"
    # Register in actions when the action is create or update
    if 0 < len(action_str):
        action = {
            "action": action_str,
            "file_path": file_path,
            "content": content_ethercalc
        }
        actions.append(action)

def main():
    # Process each URL in ethercalc_uris
    actions = list()
    count_commit = 0
    re_compile = re.compile(r".*/(.*?)$")
    for index, ethercalc_uri in enumerate(ethercalc_uris):
        basename, = re_compile.match(ethercalc_uri).groups()  # Extract the sheet name ("foo" or "bar")
        # Insert "_/" before the last occurrence of the sheet name: http://localhost/foo -> http://localhost/_/foo
        socialcalc_uri = ethercalc_uri[::-1].replace(basename[::-1], basename[::-1] + "/_", 1)[::-1]
        csv_uri = ethercalc_uri + ".csv"
        logger.info(f"[{index}] {basename}")
        # Download from EtherCalc and GitLab in SocialCalc format and compare the file contents
        time.sleep(0.5)  # Sleep briefly so as not to overload the server
        compare_ethercalc_and_gitlab(actions, socialcalc_uri, f"{basename}.sc")
        # Download from EtherCalc and GitLab in CSV format and compare the file contents
        time.sleep(0.5)  # Sleep briefly so as not to overload the server
        compare_ethercalc_and_gitlab(actions, csv_uri, f"{basename}.csv")
        if len(actions) == 0:
            # Do not git commit if the EtherCalc and GitLab file contents are identical
            continue
        # git commit & push
        # https://docs.gitlab.com/ee/api/commits.html
        gitlab_uri = f"{gitlab_base_uri}api/v4/projects/{gitlab_project_id}/repository/commits"
        commit_message = f"backup {str_now} {basename}"
        logger.info(f'git commit -m "{commit_message}"')
        headers = {
            "PRIVATE-TOKEN": gitlab_private_token,
            "Content-Type": "application/json"
        }
        payload = {
            "branch": "master",
            "commit_message": commit_message,
            "actions": actions
        }
        logger.debug(f"payload={LF}{pprint.pformat(payload)}")
        # The HTTP method is a Request argument, not a header
        request = urllib.request.Request(gitlab_uri, json.dumps(payload).encode("utf-8"), headers=headers, method="POST")
        with urllib.request.urlopen(request) as res:
            res_commit = json.loads(res.read())
        logger.debug(f"gitlab res_commit={LF}{pprint.pformat(res_commit)}")
        # git diff and output the result to the log
        # https://docs.gitlab.com/ee/api/commits.html
        gitlab_uri = f"{gitlab_base_uri}api/v4/projects/{gitlab_project_id}/repository/commits/{res_commit['id']}/diff"
        logger.info(f"git diff ( {res_commit['id']} )")
        headers = {
            "PRIVATE-TOKEN": gitlab_private_token,
        }
        request = urllib.request.Request(gitlab_uri, headers=headers)
        with urllib.request.urlopen(request) as res:
            res_diff = json.loads(res.read())
        logger.info(f"gitlab res_diff={LF}{pprint.pformat(res_diff)}")
        count_commit += 1
        actions = list()
    logger.info(f"{count_commit} git commit{'' if count_commit == 1 else 's'}")

if __name__ == '__main__':
    try:
        main()
    except Exception as ee:
        logger.exception(ee)
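Two details of the GitLab file lookup in the script are easy to miss: the file path has to be URL-encoded including the slash, and the file content comes back base64-encoded. The following is a small offline sketch of both steps. The helper names and the sample response are made up for illustration. A real response contains more fields, and the CSV text here is not actual EtherCalc output.

```python
import base64
import urllib.parse


def encode_file_path(file_path):
    """URL-encode a repository path; safe="" makes "/" become %2F as well."""
    return urllib.parse.quote(file_path, safe="")


def decode_file_content(res_gitlab_file):
    """Return the decoded file text, or None if the response has no content."""
    if "content" not in res_gitlab_file:
        return None
    return base64.b64decode(res_gitlab_file["content"]).decode("utf-8")


# Hypothetical, trimmed-down response used only for this illustration
sample_response = {
    "file_path": "ethercalc/foo.csv",
    "encoding": "base64",
    "content": base64.b64encode("foo1\nfoo2\n".encode("utf-8")).decode("ascii"),
}

print(encode_file_path("ethercalc/foo.csv"))  # ethercalc%2Ffoo.csv
print(repr(decode_file_content(sample_response)))  # 'foo1\nfoo2\n'
print(decode_file_content({}))  # None
```

Returning None for a missing file is a design choice of this sketch. The script above instead returns an empty dict on HTTP 404 and relies on the resulting KeyError to detect a missing file.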
For the first run, start with the GitLab repository empty. Run the following in your terminal:
python3 ethercalc_backup.py
The last line of the output is as follows.
2 git commits
Check the project in the GitLab UI. It now has 2 commits, and a new ethercalc directory has been created.
Inside the ethercalc directory there are two commits, one for foo and one for bar, and each commit created two new files: one in SocialCalc format and one in CSV format.
You can check the contents by clicking a file name.
For the second run, I changed only the foo sheet in EtherCalc, adding "Hello" and a few other values, and then ran the Python code again.
Run the following in your terminal:
python3 ethercalc_backup.py
The last line of the output is as follows.
1 git commit
Check the project in the GitLab UI. It now has 3 commits.
Clicking "3 Commits" shows that a new commit for foo has been added, while nothing has been added for bar.
Clicking the new foo commit shows the difference from the previous commit.
For the third run, I ran the Python code without changing anything in EtherCalc.
python3 ethercalc_backup.py
The last line of the output is as follows.
0 git commits
That's all.