Someone posted at https://qiita.com/ozaki25/items/33a57ad7eea55822c764, and since the topic came up, I wrote a Python version in Slack on the spot. I was then asked for an explanation (probably), so here is a quick write-up.
The coding style is just my personal taste, so if there is a better way to write it, please let me know.
A brief explanation of the Python code
Environment: Ubuntu 18.04 + Python 3.7.6.
$ cat /proc/version
Linux version 4.15.0-111-generic (buildd@lcy01-amd64-011) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #112-Ubuntu SMP Thu Jul 9 20:32:34 UTC 2020
$ python3 -V
Python 3.7.6
import os
import re
from glob import glob
p_file = glob("./*.csv")
_ = [ os.rename(file, re.sub(r'^.*_([0-9]{1,3}).*(\.csv)$', r'\1\2', file)) for file in p_file ]
For some reason, the code I pasted in Slack was converted into something strange :nerd:, so please treat the code above as the correct version.
It ended up like this because the original poster insisted on a "one-liner". Of course Python can also be run from the CLI, so a true one-liner is possible, but I didn't see the point, so I stopped here.
Is creating the test data a one-liner too? That feels like a bit of a cheat.
!touch aaa_1.csv;touch aaa_2.csv;touch aaa_4.csv;touch aaa_9.csv;touch aaa_10.csv;touch aaa_99.csv;touch aaa_100.csv;touch aaa_999.csv;touch aaa_1000.csv
Now for the explanation.
import os
import re
from glob import glob
I used os.rename to rename the files and re (regular expressions) to rewrite the file names.
I used glob to get the list of files. These days I mostly use the similar pathlib, so I no longer use join.
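For comparison, here is a minimal sketch of the same file listing with pathlib. It assumes that "join" above means os.path.join; the helper names list_csv_files and target_path are made up for illustration and are not part of the original code.

```python
from pathlib import Path


def list_csv_files(directory="."):
    """Return the CSV files directly under `directory`, sorted by name."""
    return sorted(Path(directory).glob("*.csv"))


def target_path(directory, name):
    """Build a path with the / operator instead of os.path.join."""
    return Path(directory) / name
```

Path.glob returns Path objects rather than strings, so the name without the directory is available as p.name.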
p_file = glob("./*.csv")
glob returns the list of files. I didn't want to pick up extra files, but glob doesn't support matching on multiple patterns, so for the time being I crudely fetch everything with a .csv extension. ... I got stuck on this for a while.
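If picking up extra files is a concern, one option is to filter the glob result with a regular expression afterwards. This is only a sketch: the TARGET pattern and the select_targets helper are assumptions for illustration, not something the original code does.

```python
import re
from glob import glob

# Hypothetical filter: names ending in "_" + 1 to 3 digits + ".csv".
TARGET = re.compile(r'_[0-9]{1,3}\.csv$')


def select_targets(paths):
    """Keep only the paths whose name ends in _<1-3 digits>.csv."""
    return [p for p in paths if TARGET.search(p)]


# Fetch every CSV first, then narrow the list down.
p_file = select_targets(glob("./*.csv"))
```

With this filter, a name such as aaa_1000.csv or notes.csv is dropped before any rename is attempted.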
_ = [ os.rename(file, re.sub(r'^.*_([0-9]{1,3}).*(\.csv)$', r'\1\2', file)) for file in p_file ]
Yes, this is hard to read unless you are used to it. I used a list comprehension to reduce the number of lines (although comprehensions have other benefits too).
Written without a comprehension, it looks like this.
for file in p_file:
    # Build the new name from the captured digits and the extension
    re_file = re.sub(r'^.*_([0-9]{1,3}).*(\.csv)$', r'\1\2', file)
    os.rename(file, re_file)
Assigning the result to _ has no real meaning. Without it, the list of return values is printed to the console and looks ugly, so this is a last-resort measure to keep it from appearing.
[None, None, None, None, None, None, None, None, None, None]
What even is this...
Last is the regular expression. The part containing 1 to 3 digits and the extension part are captured and reassembled (a standard match + replace). Narrowing the files down with this regular expression alone may be a bit lax.
r'^.*_([0-9]{1,3}).*(\.csv)$' ⇛ r'\1\2'
Using the r prefix (raw string notation) is recommended, because the pattern can be written without extra escaping.
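To see what the substitution actually produces, here is a small check on a few of the sample names. The new_name helper is only a wrapper for illustration, and STRICT_PATTERN is a hypothetical tighter variant, not the pattern used in the code above.

```python
import re

# The pattern from the article: up to 3 digits after the last "_",
# then anything, then the .csv extension.
PATTERN = r'^.*_([0-9]{1,3}).*(\.csv)$'

# Hypothetical stricter variant: the digits must be directly followed by .csv.
STRICT_PATTERN = r'^.*_([0-9]{1,3})(\.csv)$'


def new_name(path, pattern=PATTERN):
    """Return the renamed path; the input is returned unchanged if nothing matches."""
    return re.sub(pattern, r'\1\2', path)


for name in ["./aaa_1.csv", "./aaa_100.csv", "./aaa_1000.csv"]:
    print(name, "->", new_name(name), "/", new_name(name, STRICT_PATTERN))

# A raw string keeps the backslashes; without the r prefix, '\1' would be
# read as the control character chr(1) instead of a group reference.
print(r'\1\2' == '\\1\\2')
```

With the original pattern, ./aaa_100.csv and ./aaa_1000.csv both map to 100.csv, because the fourth digit is swallowed by the trailing .* part. The stricter variant leaves the four-digit name unchanged.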
I'm a little concerned about things like the read permission of the files, but I'm omitting that this time. When running unattended, however, it is better to include basic checks for existence and read/write permission.
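A rough sketch of such a pre-check could look like the following. The can_rename helper is hypothetical, and it assumes POSIX-style permissions, where renaming a file requires write access to its directory.

```python
import os


def can_rename(path):
    """Rough pre-check: the file exists, is readable, and its directory is writable."""
    directory = os.path.dirname(path) or "."
    return (
        os.path.isfile(path)
        and os.access(path, os.R_OK)
        and os.access(directory, os.W_OK | os.X_OK)
    )
```

A check like this can still be out of date by the time the rename runs, so it complements error handling rather than replacing it.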
os.rename may fail, so you should also wrap it in try ~ except. This time I skipped it because I prioritized speed (that's my excuse :umbrella2:).
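A minimal sketch of the try ~ except version is shown below. The safe_rename helper and its print messages are assumptions for illustration. It also skips the rename when the destination already exists, because os.rename on POSIX silently replaces an existing file.

```python
import os


def safe_rename(src, dst):
    """Rename src to dst; return True on success, False if skipped or failed."""
    if os.path.exists(dst):
        # Avoid overwriting an existing file (also covers src == dst).
        print(f"skip {src}: {dst} already exists")
        return False
    try:
        os.rename(src, dst)
    except OSError as exc:
        # Covers a missing source, permission errors, and so on.
        print(f"failed to rename {src}: {exc}")
        return False
    return True
```

In the loop, os.rename(file, re_file) would simply become safe_rename(file, re_file).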
In the code we use at work, CSV files are read and loaded into a DataFrame, and we create and use a class that wraps the existence check and the read/write checks.
I was recently transferred, and as I start to see the whole picture of the project I'm in charge of, I'm feeling the pressure... Writing Python code is a relief (just kidding :sweat_smile:).