Background

The question came up of whether someone could do https://qiita.com/ozaki25/items/33a57ad7eea55822c764 in Python, so I dashed off a version in Slack. I was then (probably) asked for an explanation, so here is a quick write-up.

The coding style is just my personal taste, so if there is a better way to write it, please let me know.

Things to do

A brief description of the Python code

Environment

Ubuntu 18.04 + Python 3.7.6.

$ cat /proc/version
Linux version 4.15.0-111-generic (buildd@lcy01-amd64-011) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #112-Ubuntu SMP Thu Jul 9 20:32:34 UTC 2020

$ python3 -V
Python 3.7.6

Posted code

import os
import re
from glob import glob
p_file = glob("./*.csv")
_ = [ os.rename(file, re.sub(r'^.*_([0-9]{1,3}).*(\.csv)$', r'\1\2', file)) for file in p_file ]

For some reason, the code I pasted into Slack was mangled into something strange :nerd:. Please treat the version shown here as the correct one.

It ended up in this shape because the author of the original entry insisted on a "one-liner". Python can of course be run from the command line too, so a true one-liner is possible, but I didn't see the point, so I stopped here.

The test data, on the other hand, is created with a one-liner (of sorts), which feels a little like cheating.

!touch aaa_1.csv;touch aaa_2.csv;touch aaa_4.csv;touch aaa_9.csv;touch aaa_10.csv;touch aaa_99.csv;touch aaa_100.csv;touch aaa_999.csv;touch aaa_1000.csv

Now for the explanation.

import os
import re
from glob import glob

I used os.rename to rename the files and the re module for the regular-expression replacement of the file names.

I used glob to get the list of files. These days I mostly use the similar pathlib, so I no longer use join.
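
For reference, here is a minimal sketch of the same rename written with pathlib instead of glob and os.rename. The function name rename_csv_files and its directory argument are hypothetical, introduced only for illustration. The regular expression is the one from the posted code, applied here to the bare file name.

```python
import re
from pathlib import Path

# Same pattern as the posted code; applied to the bare file name here.
PATTERN = re.compile(r'^.*_([0-9]{1,3}).*(\.csv)$')


def rename_csv_files(directory):
    """Rename e.g. aaa_12.csv to 12.csv inside `directory` (hypothetical helper)."""
    renamed = []
    # sorted() builds the full list up front, so renames do not affect the iteration.
    for path in sorted(Path(directory).glob("*.csv")):
        new_name = PATTERN.sub(r'\1\2', path.name)
        if new_name != path.name:
            # with_name() keeps the parent directory, so no join is needed.
            path.rename(path.with_name(new_name))
            renamed.append(new_name)
    return renamed
```

Calling rename_csv_files(".") would process the current directory, much like the posted code does.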

p_file = glob("./*.csv")

glob returns the list of files. I didn't want to pick up extra files, but glob patterns cannot express a repetition such as "1 to 3 digits", so for the time being I crudely fetch every file with the .csv extension. I got a little stuck on this.
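
As far as I know, glob follows the same wildcard rules as the standard fnmatch module (*, ?, and [seq] character classes), so a character class can at least require a digit right after the underscore, even though a count like {1,3} cannot be written. Below is a minimal sketch that uses fnmatch directly so it runs without touching the file system. The sample file names are made up for illustration.

```python
from fnmatch import fnmatchcase

names = ["aaa_1.csv", "aaa_100.csv", "aaa_1000.csv", "notes.csv", "aaa_x.csv"]

# [0-9] requires one digit right after an underscore; * allows anything after it.
pattern = "*_[0-9]*.csv"
matched = [name for name in names if fnmatchcase(name, pattern)]
print(matched)
```

The same pattern could be passed to glob as glob("./*_[0-9]*.csv"). Four-digit names such as aaa_1000.csv still match, so this only narrows the list a little.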

_ = [ os.rename(file, re.sub(r'^.*_([0-9]{1,3}).*(\.csv)$', r'\1\2', file)) for file in p_file ]

Yes, this is hard to read unless you are used to it. I used a list comprehension to reduce the number of lines (although comprehensions have other benefits as well).

Written without a comprehension, it looks like this.

for file in p_file:
  # Build the new name from the captured digits and the extension
  re_file = re.sub(r'^.*_([0-9]{1,3}).*(\.csv)$', r'\1\2', file)
  os.rename(file, re_file)

Assigning the result to _ has no real meaning. Without it, the list of return values is echoed to the console, which is ugly, so this is a makeshift measure to keep the following from appearing.

[None, None, None, None, None, None, None, None, None, None]

What even is this...

Last is the regular expression. The part containing 1 to 3 digits and the extension part are captured and reassembled (a standard match plus replacement). The filtering done by the regular expression may be a little lax.

r'^.*_([0-9]{1,3}).*(\.csv)$' ⇛ r'\1\2'
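
To make the behaviour concrete, here is a small sketch that applies the same substitution to a few made-up file names without renaming anything. The variable names are only for illustration. The last two samples show how lax the pattern is: a four-digit number is cut down to its first three digits, and a name without an underscore comes back unchanged.

```python
import re

pattern = r'^.*_([0-9]{1,3}).*(\.csv)$'

samples = ["./aaa_1.csv", "./aaa_99.csv", "./aaa_100.csv", "./aaa_1000.csv", "./notes.csv"]
# Map each original name to the name re.sub would produce.
results = {name: re.sub(pattern, r'\1\2', name) for name in samples}

for name, new_name in results.items():
    print(name, "->", new_name)
```

Here aaa_100.csv and aaa_1000.csv both map to 100.csv, so on Linux one rename would silently overwrite the other.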

Using the r prefix (raw string) is recommended because the pattern can be written without extra escaping.
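
A short sketch of why the prefix matters for the replacement string in particular. The variable names are only for illustration.

```python
# Without the r prefix, '\1' is an octal escape (the character with code 1),
# not a backslash followed by the digit 1.
plain = '\1'
raw = r'\1'
escaped = '\\1'

print(len(plain), len(raw))  # the two strings have different lengths
print(raw == escaped)        # the raw string equals the manually escaped one
```

So without r, the group reference would have to be written as '\\1\\2'.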

Detour

File permissions and the like are a concern, but I'm omitting them this time. When running unattended, however, it is better to include basic checks for existence and for read/write permission.
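
A minimal sketch of what such a pre-check might look like, using only os.path.isfile and os.access from the standard library. The function name can_rename is hypothetical, and which permissions to require is my assumption, not something taken from the posted code.

```python
import os


def can_rename(path):
    """Hypothetical pre-check: the file exists and the permissions look sufficient."""
    if not os.path.isfile(path):
        return False
    parent = os.path.dirname(os.path.abspath(path))
    # Renaming changes directory entries, so the directory must be writable.
    return os.access(path, os.R_OK) and os.access(parent, os.W_OK | os.X_OK)
```

A check like this can go stale between the check and the rename, so it complements error handling rather than replacing it.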

os.rename can fail, so you should also wrap it in try / except. This time I skipped that because I prioritized speed (excuse :umbrella2:).

In the code we use at work, CSV files are read into a DataFrame, and we create and use a class that wraps the existence check and the read/write checks.
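
A minimal sketch of that error handling. The wrapper name safe_rename and the choice to print and return False are assumptions for illustration; a real script might log or re-raise instead.

```python
import os


def safe_rename(src, dst):
    """Hypothetical wrapper: return True on success, False if os.rename fails."""
    try:
        os.rename(src, dst)
    except OSError as exc:
        # FileNotFoundError, PermissionError, etc. are all subclasses of OSError.
        print(f"rename failed: {src} -> {dst}: {exc}")
        return False
    return True
```

In the loop, os.rename(file, re_file) would then become safe_rename(file, re_file), so one bad file does not stop the whole run.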

In closing

I was recently transferred, and the more I see of the whole picture of the project I'm in charge of, the more pressure I feel... Writing Python code is a relief (just kidding :sweat_smile:).
