I meant to write day 4 as a direct continuation of day 3 of learning Python, but it was getting long, so I split it out into a separate day-4 post. The main topic this time is doing with Python's subprocess what I have been doing with PHP's exec. I think it will be useful for applications where Python drives Linux commands. It is close to batch processing, with the added benefit that everything runs from Python.
People who study (research) programming languages seriously write systematically organized articles, and I think they are all wonderful. I use computers and programming languages as tools, so I look things up and shape them as the need arises, which means this post does not follow the usual flow. Once you start building something you will need pieces like these, so please pick out and use only the parts that help you.
Because I need to check the format of files downloaded from the web, in PHP I ran the following, checked the result, and added the matching file extension. Many programs still decide what a file is by looking at its extension, so this step is needed here. It will probably become unnecessary in the future, but for now it is still required.
file.php
exec('file -i -b data/default.xlsx', $out, $ret);
echo $out[0];
//The return value is as follows
// application/vnd.ms-excel; charset=binary
file.py
import subprocess
# Same command as the PHP version, passed as a list of arguments
args = ['file', '-i', '-b', 'default.xlsx']
proc = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# stdout is bytes, so decode it before searching
string = proc.stdout.decode('utf8')
none = 'no'
if 'excel' in string:
    print(string)
else:
    print(none)
# The output is as follows
# application/vnd.ms-excel; charset=binary
As expected, PHP and Python produce the same output.
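Since the goal was to add an extension based on what file reports, that step might be sketched as follows. The names parse_mime, detect_mime and extension_for, and the small EXTENSIONS table, are assumptions made for illustration and are not part of the original script; the table only covers the MIME string shown above plus PDF.

```python
import subprocess

# Hypothetical lookup table: extend it with the MIME types you actually receive.
EXTENSIONS = {
    'application/vnd.ms-excel': '.xls',
    'application/pdf': '.pdf',
}

def parse_mime(output):
    """Split 'type/subtype; charset=x' (the output of file -i -b) into (mime, charset)."""
    mime, _, rest = output.strip().partition(';')
    charset = rest.partition('=')[2].strip()
    return mime.strip(), charset

def detect_mime(path):
    """Run `file -i -b path`; raises CalledProcessError if the command fails."""
    proc = subprocess.run(['file', '-i', '-b', path],
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          check=True)
    return parse_mime(proc.stdout.decode('utf8'))

def extension_for(mime):
    """Return the extension for a MIME type, or None if it is not in the table."""
    return EXTENSIONS.get(mime)
```

A call such as extension_for(detect_mime('default.xlsx')[0]) would then give the extension to append, or None for a type the table does not know.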
I was going to do this one with subprocess as well, but I came across an approach that installs pdftotext on Linux and uses it as a Python module, which made me hesitate. The method is described in the article "Convert PDF to text with pdftotext". The code in that article looked complicated to me, but on reflection that is because it even builds a display screen. I will keep it as a reference for later and simply use subprocess here. As before, let's compare it with the PHP code.
pdftotext.php
$command ="pdftotext -layout -nopgbrk data/*.pdf";
shell_exec($command);
pdftotext.py
import subprocess
args = ['pdftotext', '-layout', '-nopgbrk', 'data/*.pdf']  # wildcard is passed literally: does not convert
args = ['pdftotext', '-layout', '-nopgbrk', 'data/sjd23d_mn.pdf']  # explicit file name: converts
proc = subprocess.run(args)
It did not work right away. The PHP version goes through a shell, which expands data/*.pdf into real file names before pdftotext runs. When subprocess.run is given a list of arguments, no shell is involved, so the asterisk reaches pdftotext as a literal character and nothing is converted unless an actual file name is specified. Writing the string with single or double quotes makes no difference in Python. It looks like a trivial detail of how subprocess takes its parameters, but this difference is important. (This is a note to myself.)
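To process every PDF in a directory from Python, one option is to expand the wildcard with the standard glob module and run pdftotext once per file, since subprocess.run with a list of arguments does not expand wildcards itself. This is only a sketch under that assumption; the helper names pdftotext_commands and convert_all are made up for illustration.

```python
import glob
import subprocess

def pdftotext_commands(pattern):
    """Expand the wildcard in Python and build one pdftotext command per PDF."""
    return [['pdftotext', '-layout', '-nopgbrk', pdf]
            for pdf in sorted(glob.glob(pattern))]

def convert_all(pattern):
    """Run pdftotext for every matching file and return the exit codes."""
    return [subprocess.run(args).returncode
            for args in pdftotext_commands(pattern)]
```

Calling convert_all('data/*.pdf') would then behave much like the PHP one-liner, with the difference that each file gets its own pdftotext process.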
pdftoppm.php
$command ="pdftoppm -jpeg data/*.pdf data/";
shell_exec($command);
pdftoppm.py
import subprocess
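args = ['pdftoppm', '-jpeg', 'data/*.pdf', 'data/']  # wildcard: does not convert
args = ['pdftoppm', '-jpeg', 'data/sjd23d_mn.pdf', 'data/']  # explicit file name: converts
proc = subprocess.run(args)
When converting PDF to images, nothing is converted unless the file name is specified. This differs from PHP because shell_exec runs the command through a shell, which expands the wildcard, while subprocess.run with a list passes it to pdftoppm as-is. There may be a way to import pdftoppm into Python and use it directly, but I will leave that for another time.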
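pdftoppm takes the output file root as its last argument, so when looping over several PDFs it may help to derive a separate root from each file name. The sketch below assumes that layout; pdftoppm_command and convert_pdfs_to_jpeg are hypothetical helpers, not something from the original script.

```python
import subprocess
from pathlib import Path

def pdftoppm_command(pdf_path, out_dir):
    """Build a pdftoppm call whose output root is out_dir/<PDF name without extension>."""
    pdf = Path(pdf_path)
    root = Path(out_dir) / pdf.stem
    return ['pdftoppm', '-jpeg', str(pdf), str(root)]

def convert_pdfs_to_jpeg(data_dir):
    """Convert every PDF directly under data_dir, one pdftoppm run per file."""
    for pdf in sorted(Path(data_dir).glob('*.pdf')):
        subprocess.run(pdftoppm_command(pdf, data_dir))
```

With this, data/sjd23d_mn.pdf would produce image files whose names start with data/sjd23d_mn followed by a page number, so images from different PDFs do not share the same prefix.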
unar.php
$command = 'unar -f data/data/selenium-master.zip -D -o data';
shell_exec($command);
unar.py
import subprocess
args = ['unar','-f','data/selenium-master.zip','-D','-o','data/']
#If you need a password
args = ['unar','-f','-p','passw','data/selenium-master.zip','-D','-o','data/']
proc = subprocess.run(args)
Decompressing the archive works fine this way.
5. selenium
This one does not look easy, so I will come back to it later.
This was absolutely necessary in order to clean up the files that unar extracted. First, here is how I do it in PHP. I am going to do the same in Python, but there seem to be several ways, so I will look at them from here.
delete.php
foreach (glob($relative.'/data/*.*') as $file) {
    // Delete plain files first (skip directories whose names contain a dot)
    if (is_file($file)) {
        unlink($file);
    }
}
foreach (glob($relative.'/data/*') as $file) {
    // Whatever is left is either a file without an extension or a directory
    if (is_dir($file)) {
        remove_directory($file);
    } else {
        unlink($file);
    }
}
function remove_directory($dir) {
    $files = array_diff(scandir($dir), array('.', '..'));
    foreach ($files as $file) {
        // Separate processing by file or directory
        if (is_dir("$dir/$file")) {
            // If it is a directory, call the same function again
            remove_directory("$dir/$file");
        } else {
            // Delete file
            unlink("$dir/$file");
            //echo "File:" . $dir . "/" . $file . "Delete\n";
        }
    }
    // Delete the specified directory
    return rmdir($dir);
}
- In Python, you can delete a directory together with its contents in one shot by specifying the directory name.
delete.py
import pathlib
import shutil
p = pathlib.Path('selenium-master')
shutil.rmtree(p)
- Delete all files and directories inside a specified directory at once with Python (well, this is one of the cleaner ways)
all_del.py
import shutil
import glob
import os
# Put the list of files and folders in the directory into my_list.
my_list = glob.glob('./data/*')
# Walk through my_list to the end.
for value in my_list:
    # Use a different delete call depending on whether it is a file or a folder.
    if os.path.isfile(value):
        os.remove(value)
    elif os.path.isdir(value):
        shutil.rmtree(value)
I could not find another site that publishes this method and code. You can also adapt it to delete only files with a particular extension by changing the glob pattern. Above all, I wanted to delete the decompressed files. If there is a cleaner method, please let me know.
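As one possible variation, the same idea can be written with pathlib and a pattern argument, so that a single function handles both "delete everything" and "delete only one extension". This is a sketch, not a definitive implementation; the function name clear_directory and its return value are assumptions for illustration.

```python
import shutil
from pathlib import Path

def clear_directory(directory, pattern='*'):
    """Delete entries directly under directory that match pattern.

    Returns the number of entries removed. The directory itself is kept.
    """
    removed = 0
    # Collect the matches first so nothing is deleted while still iterating.
    for entry in list(Path(directory).glob(pattern)):
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # real directory: remove it with its contents
        else:
            entry.unlink()         # file or symbolic link
        removed += 1
    return removed
```

For example, clear_directory('./data', '*.pdf') would remove only the PDF files, while clear_directory('./data') would empty the directory, including the folder left behind by unar.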
PHP provides a function for sanitizing as standard, but in Python it seems to be rarely mentioned.
sanitizing.php
$wfull = htmlspecialchars($wtarget, ENT_QUOTES);
There seems to be a way to do this in Python as well, but I would like to look into it a little more.
sanitizing.py
import html
# cgi.escape was removed in Python 3.8; html.escape is the standard-library replacement
inlist = 'https://www.yahoo.co.jp/'
transform = html.escape(inlist)
print(transform)
# https://www.yahoo.co.jp/
inlist = '"><script>alert(document.cookie);</script>'
transform = html.escape(inlist)
print(transform)
# So sanitizing can be handled this way
# &quot;&gt;&lt;script&gt;alert(document.cookie);&lt;/script&gt;
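For reference, the closest standard-library counterpart to htmlspecialchars($wtarget, ENT_QUOTES) on current Python versions appears to be html.escape, which escapes both kinds of quotes when quote=True (the default). The sketch below is illustrative only; the wrapper name sanitize is an assumption.

```python
import html

def sanitize(text):
    """Escape &, <, > and both quote characters, similar in spirit to ENT_QUOTES."""
    return html.escape(text, quote=True)

print(sanitize("<a href='x'>Tom & Jerry</a>"))
# &lt;a href=&#x27;x&#x27;&gt;Tom &amp; Jerry&lt;/a&gt;
```

The single quote comes out as &#x27; here, whereas PHP writes &#039; by default; both refer to the same character.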