I'm from an older generation, so I'm basically in the "if it doesn't exist, build it" camp.
This time, as the title suggests, I'm sorting images. I have no idea how many files there are: photos taken with digital cameras and smartphones since around 2000, downloaded wallpapers, screenshots, and so on. Snapshot-style backups made at various points are scattered across several HDDs, so everything is in a very chaotic state. Well, you could say that storing things with distribution and redundancy is a kind of risk hedge.
- HDD1: up to 2000 images
- HDD2: up to 2500 images
- HDD3: up to 3000 images
There are a lot of duplicates, and leftovers from an earlier half-finished cleanup survive as files named like filename(1).jpg. So I decided to remove the duplicates, consolidate everything onto a new HDD, and upload it to Amazon Photos.
The part that computes the CRC32 may be reusable later, so I put it in a separate module.
mycrc32.py
import binascii
import glob
import os


# Get the CRC32 of a file
def get_crc32(file):
    with open(file, "rb") as f:
        barray = f.read()
    return binascii.crc32(barray, 0)


# Write {file name, CRC32, file size} for every file in srcFolder to info.csv
def output_info(srcFolder):
    infofile = os.path.join(srcFolder, "info.csv")
    # glob already returns paths that include srcFolder, so use them as-is.
    # info.csv itself is skipped so that it does not list itself.
    files = [f for f in glob.glob(os.path.join(srcFolder, "*.*"))
             if os.path.isfile(f) and f != infofile]
    # Opening with "w" replaces any info.csv left over from a previous run
    with open(infofile, "w") as f:
        for file in files:
            crc32 = get_crc32(file)
            f.write(f"{os.path.basename(file)},{hex(crc32)},{os.path.getsize(file)}\n")
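The module above only records a CRC32 and size per file; it does not remove duplicates by itself. As a minimal sketch of how those two values could be used to find duplicate candidates, the following groups files by (size, CRC32). The names `crc32_of_file` and `find_duplicates` are hypothetical helpers introduced for illustration, not part of the original scripts, and the chunked reading is my own assumption to avoid loading large files whole.

```python
import os
import zlib
from collections import defaultdict


def crc32_of_file(path, chunk_size=1 << 20):
    """CRC32 of a file, read in chunks so large files are not loaded whole."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)  # continue from the running CRC
    return crc


def find_duplicates(paths):
    """Group paths whose (size, CRC32) match; return only groups of 2 or more."""
    groups = defaultdict(list)
    for path in paths:
        key = (os.path.getsize(path), crc32_of_file(path))
        groups[key].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

CRC32 is only 32 bits wide, so two different files can share a value. Before deleting anything, it is probably safer to confirm each group with a byte-by-byte comparison such as `filecmp.cmp(a, b, shallow=False)`.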
This is the main script.
PictureSorter.py
import datetime
import glob
import os
import shutil
import sys

from PIL import Image
from PIL.ExifTags import TAGS

import mycrc32


# Get Exif information as a {tag name: value} dict
def get_exif_of_image(file):
    with Image.open(file) as img:
        try:
            exif = img._getexif()  # private Pillow helper; not every image class has it
        except AttributeError:
            return {}
    exifTable = {}
    if exif is not None:
        for key in exif.keys():
            tag = TAGS.get(key, key)
            exifTable[tag] = exif[key]
    return exifTable


# Get the last modified date and time of the file
def get_last_write_time(file):
    t = os.path.getmtime(file)
    return datetime.datetime.fromtimestamp(t)


# Get the destination subfolder name (YYYYMM)
def get_destination_folder(file):
    # Prefer the Exif shooting date; fall back to the file's modification time
    t = get_exif_of_image(file).get('DateTimeOriginal')
    if t is not None:
        t = datetime.datetime.strptime(t, '%Y:%m:%d %H:%M:%S')
    else:
        t = get_last_write_time(file)
    return t.strftime("%Y%m")


# Decide on a unique file name when the name is already taken
def ensure_filename(dfull):
    base, ext = os.path.splitext(dfull)
    newName = dfull
    i = 0
    while os.path.isfile(newName):
        i += 1
        newName = f"{base}({i}){ext}"
    return newName


if __name__ == "__main__":
    def get_ext(file):
        return os.path.splitext(os.path.basename(file))[1].lower()

    # Check the argument count before reading sys.argv[1] and sys.argv[2]
    if len(sys.argv) != 3:
        print("PictureSorter.py srcFolder dstFolder")
        x = input()
        sys.exit(1)
    _SOURCE_FOLDER = sys.argv[1]
    _DESTINATION_FOLDER = sys.argv[2]
    print(f"srcFolder={_SOURCE_FOLDER}")
    print(f"dstFolder={_DESTINATION_FOLDER}")
    print("Press Enter to go!")
    x = input()

    # recursive=True has no effect without "**" in the pattern,
    # so only the top level of the source folder is scanned
    files = glob.glob(os.path.join(_SOURCE_FOLDER, "*.*"))
    dfols = set()
    for file in filter(lambda file: get_ext(file) in [".jpg", ".png"], files):
        dstfol = get_destination_folder(file)
        dfol = os.path.join(_DESTINATION_FOLDER, dstfol)
        os.makedirs(dfol, exist_ok=True)
        # Move the file (losing data is scary, so copy first and delete the source later)
        dfull = os.path.join(dfol, os.path.basename(file))
        efull = ensure_filename(dfull)
        print(f"{file} -> {efull}")
        shutil.copy2(file, efull)  # If you want to move, use move instead of copy2
        dfols.add(dfol)
    # File name / CRC32 / file size output, once per destination folder
    for dfol in sorted(dfols):
        mycrc32.output_info(dfol)
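One thing to watch in get_destination_folder: it assumes that DateTimeOriginal, when present, always parses. Some cameras and editing tools write placeholder or padded values, and strptime raises ValueError on those, which would stop the whole run. The following is a minimal sketch of a more forgiving conversion. `month_folder` and `EXIF_DATE_FORMAT` are hypothetical names for illustration, and the caller is assumed to pass the file's modification time as `fallback`.

```python
import datetime

EXIF_DATE_FORMAT = "%Y:%m:%d %H:%M:%S"


def month_folder(exif_date, fallback):
    """Return YYYYMM from an Exif date string, or from fallback if it cannot be parsed."""
    try:
        # Strip NUL padding and spaces that some writers leave around the value
        cleaned = str(exif_date).strip("\x00 ")
        t = datetime.datetime.strptime(cleaned, EXIF_DATE_FORMAT)
    except ValueError:
        # Missing, empty, or placeholder dates fall back to the given datetime
        t = fallback
    return t.strftime("%Y%m")
```

In the script this could be called as `month_folder(exifTable.get('DateTimeOriginal'), get_last_write_time(file))`, so that one bad file no longer aborts the sort.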
This does what I set out to do, but how does it hold up in practice? Everyday photos taken with a smartphone (so-called family photos) and game screenshots should not end up in the same folder just because they share a date. A folder per category is needed one level up. Once the category is decided, this code covers the remaining requirement of moving files into the dated subfolders beneath it.
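As a rough sketch of that layout, the destination path could be built as root/category/YYYYMM/filename. `destination_path` is a hypothetical helper, the category names are only examples, and how the category gets chosen is left open here.

```python
import os


def destination_path(root, category, month, filename):
    """Build root/category/YYYYMM/filename; only the base name of filename is kept."""
    return os.path.join(root, category, month, os.path.basename(filename))


# Example with made-up category names: the same month, but separate trees
family = destination_path("dst", "family", "202008", "src/IMG_0001.jpg")
games = destination_path("dst", "screenshots", "202008", "src/shot.png")
```

With this shape, the existing per-month logic stays unchanged and the category simply becomes one more path component in front of it.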
In that case, tedious as it is, the sensible approach is probably to select the destination category on a GUI form and then drag and drop the image files to be moved while looking at their thumbnails in Explorer. Letting machine learning decide the destination category would be cool, but with my current knowledge that is out of reach (sweat).
So I did a little research, but there are various options, from calling the Windows API directly to an implementation with wxPython, plus the question of what works in 3.9.0, and it looks like I won't finish during the holidays (I'm actually doing this over my summer vacation). I'll write just the GUI part in C# and make do with that (I'll omit the C# side, since there isn't much to it).
There is probably a more Pythonic way to write this code, but I'll leave it here. Recently the feel of my membrane keyboard had started to bother me, so I switched to a mechanical keyboard (brown switches) for the first time in several years. It's fun to hit the keys!