For work, I needed to convert UTF-8 files with a BOM into UTF-8 files without a BOM, so I wrote a script that converts them all at once.
It turned out to be unexpectedly handy, so I am publishing it. See the comments in the source for what each step does.
convert.py
#---BOM removal tool
import sys, os, codecs, shutil

#---Get command line arguments
args = sys.argv
#---Get current path
current_path = os.getcwd()
#---Set the name of the temporary work file
tempfile = os.path.join(current_path, 'tempfile')
#---Counter for the number of files processed
conv_count = 0

print("BOM removal script")

#---If there is no argument, print an error and exit
if len(args) == 1:
    print("There are no arguments.")
    sys.exit(1)
else:
    #---Get the first argument
    filepath = args[1]

#---Check that the path given as the argument exists
if os.path.exists(filepath):
    #---Loop over the directory tree with os.walk
    for (current, subfolders, files) in os.walk(filepath):
        #---Loop over the list of files found (files),
        #---taking each entry as fileName
        for fileName in files:
            #---Build the path of the file to be processed
            target_path = os.path.join(current, fileName)
            #---Conversion from UTF-8 with BOM to UTF-8 without BOM
            #---Open the file to be processed in read mode as UTF-8 with BOM
            with codecs.open(target_path, 'r', 'utf_8_sig') as r:
                #---Open the temporary file in write mode as UTF-8 without BOM
                with codecs.open(tempfile, 'w', 'utf-8') as w:
                    #---Read the file line by line (assigned to line)
                    for line in r:
                        #---Write each line to the temporary file
                        w.write(line)
            #---File replacement
            #---Copy the temporary file over the file to be processed
            shutil.copyfile(tempfile, target_path)
            #---Delete the temporary file
            os.remove(tempfile)
            #---Count up the number of files processed
            conv_count += 1
else:
    #---If the specified path does not exist, print an error and exit
    print("The specified folder does not exist.")
    sys.exit(1)

#---End message
print("Converted the files under " + filepath + " to UTF-8 without BOM.")
print('Number of converted files: {}'.format(conv_count))
sys.exit(0)
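The script uses only the Python standard library, so it is intended to run on Windows, Mac, and Linux, provided paths are joined with os.path.join rather than a hard-coded backslash. Suppose you save the script under the file name convert.py.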
・ For Windows
> py convert.py dir_path
・ For Mac and Linux
> python3 convert.py dir_path
Specify dir_path as the full path.
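This script recursively scans everything under the specified folder and processes every file it finds. Because each file is read as UTF-8, a binary file or a file in another encoding will raise UnicodeDecodeError.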
One improvement that would make it more convenient is to target only a specific type of file: pass a file name pattern as an argument and process only the files that match.
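The file-type targeting idea could look roughly like the sketch below. It is not part of the original script: the function name find_target_files and the default pattern "*.txt" are assumptions made for illustration. It walks the tree with os.walk as before, but keeps only the file names that match a shell-style pattern.

```python
import fnmatch
import os


def find_target_files(root, pattern="*.txt"):
    """Return the sorted paths under root whose file name matches pattern.

    Hypothetical helper: the name and the default pattern are
    illustrative and do not come from the original script.
    """
    matches = []
    for current, subfolders, files in os.walk(root):
        # fnmatch.filter keeps only the names matching the pattern
        for file_name in fnmatch.filter(files, pattern):
            matches.append(os.path.join(current, file_name))
    return sorted(matches)
```

The pattern could be taken from a second command line argument, and the conversion loop would then iterate over the returned paths instead of over every file.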
I expect there are smarter ways to remove a BOM, but Python lets you write this kind of processing with only basic instructions, so it really is a convenient language.
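As one possible example of a smarter way, the BOM can be removed at the byte level without decoding the text at all. This is a minimal sketch, not the method used in the script above; the function name strip_utf8_bom is an assumption. It reads the whole file into memory, so it suits files of moderate size.

```python
import codecs


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file at path, in place.

    Returns True if a BOM was found and removed, False otherwise.
    Hypothetical helper: the name is illustrative only.
    """
    with open(path, "rb") as f:
        data = f.read()
    # codecs.BOM_UTF8 is the three-byte sequence EF BB BF
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True
```

Because the content is never decoded, files without a BOM are left untouched, including binary files and files in other encodings, and no temporary file is needed.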