I wanted to try something new, preferably something I could also use at work, so I decided to try reformatting our oplog.
"Oplog" is our product's term for the operation log: a CSV file that records the date and time, the operator, the function that was operated, and other details. I sometimes receive one when something goes wrong.
Pursuing your own happiness is the best motivation. For now, I would be happy just to sort the log by date and time, so I decided to start at my own pace.
This post belongs to our Advent calendars that embody "Develop fun!": Works Human Intelligence Advent Calendar 2020 and Works Human Intelligence #2 Advent Calendar 2020.
The candidate tools are roughly these:
https://qiita.com/noborus/items/f253961cca6f4465f20c https://golang.org/
https://qiita.com/ysdyt/items/9ccca82fc5b504e7913a
https://www.delftstack.com/ja/howto/java/how-to-sort-objects-in-arraylist-by-date-in-java/
That's all for the first day.
https://www.python.org/downloads/ I installed Python 3.9 for the time being.
https://qiita.com/AI_Academy/items/b97b2178b4d10abe0adb I don't really want to spend 5 hours on it, but let's take a look.
test.py
#This line is a comment. This line will not be executed.
print("Hello, Python")
#This line is a comment. This line will not be executed.
#This line is a comment. This line will not be executed.
I got an error when I created test.py and ran it
at the command prompt:
C:\workspace\Python\playground>test.py
SyntaxError: Non-UTF-8 code starting with '\x82' in file C:\Users\works\Desktop\workspace\Python\playground\test.py on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
A beginner's mistake? I resaved the file as UTF-8, and Hello, Python was printed.
Success.
So, where to go from here?
py -m pip install pandas
https://data-flair.training/blogs/install-pandas-on-windows/ I suppose Pandas is installed now?
https://note.nkmk.me/python-pandas-to-csv/ I want to try this next, but before that, what is a good Python editor? → I installed VS Code. https://code.visualstudio.com/
Without thinking too hard, I wrote this:
firstpandas.py
import pandas as pd
It was only one line, but running it produced an error:
RuntimeError: The current Numpy installation ('C:\\Users\\works\\AppData\\Local\\Programs\\Python\\Python39\\lib\\site-packages\\numpy\\__init__.py') fails to pass a sanity check due to a bug in the windows runtime. See this issue for more information: https://tinyurl.com/y3dm3h86
I looked into the cause of this RuntimeError.
More information is at the link below.
https://developercommunity.visualstudio.com/content/problem/1207405/fmod-after-an-update-to-windows-2004-is-causing-a.html
Apparently, using numpy==1.19.3 works?
https://qiita.com/bear_montblanc/items/b4b75dfd77da98076da5 I googled it and, without really knowing which advice to trust, followed this article → The error disappeared. Huh.
Back to https://note.nkmk.me/python-pandas-sort-values-sort-index/ and, following the tutorial:
firstpandas.py
import pandas as pd
df = pd.read_csv('sample_pandas_normal.csv', index_col=0)
print(df)
Then run.
C:\workspaces\playground>firstpandas.py
age state point
name
Alice 24 NY 64
Bob 42 CA 92
Charlie 18 CA 70
Dave 68 TX 70
Ellen 24 CA 88
Frank 30 NY 57
Oh, it worked. Let's sort this. Is it okay to do it like this?
firstpandas.py
import pandas as pd
df = pd.read_csv('sample_pandas_normal.csv', index_col=0)
df.sort_values('age')
print(df)
The result does not change. Coding by feel is no good. sort_values returns a new, sorted DataFrame and leaves df as it is, so the return value has to be kept. To begin with, the official documentation is here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html
Take two. To start with the conclusion, this gave me the result I had in mind.
firstpandas.py
import pandas as pd
df = pd.read_csv('sample_pandas_normal.csv')
print(df)
df_s = df.sort_values('age')
print(df_s)
C:\workspaces\playground>firstpandas.py
name age state point
0 Alice 24 NY 64
1 Bob 42 CA 92
2 Charlie 18 CA 70
3 Dave 68 TX 70
4 Ellen 24 CA 88
5 Frank 30 NY 57
name age state point
2 Charlie 18 CA 70
0 Alice 24 NY 64
4 Ellen 24 CA 88
5 Frank 30 NY 57
1 Bob 42 CA 92
3 Dave 68 TX 70
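The difference between the two attempts comes down to what sort_values returns. Here is a minimal sketch of that behavior, using inline data as a stand-in for sample_pandas_normal.csv. The variable names df_sorted and df_desc are mine, not from the tutorial.

```python
import pandas as pd

# Inline stand-in for the first three rows of sample_pandas_normal.csv.
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [24, 42, 18],
})

# sort_values returns a new DataFrame; df itself keeps its order.
df_sorted = df.sort_values("age")

# Descending order, with a fresh 0..n-1 index instead of the old row labels.
df_desc = df.sort_values("age", ascending=False, ignore_index=True)

print(df["name"].tolist())         # original order is untouched
print(df_sorted["name"].tolist())  # youngest first
print(df_desc["name"].tolist())    # oldest first
```

Passing inplace=True would modify df directly, but assigning the result, as in the script above, is the more common style.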
I should review Python's syntax at some point. Some cheat sheets:
Python3 cheat sheet (basic)
Python cheat sheet basic elements (@IT)
Translated Pandas Official Cheat Sheet
That's it for now.
Next, I modified the CSV a little and added a date column.
sample_pandas_date.csv
name,age,state,point,birthday
Alice,24,NY,64,1996/1/2
Bob,42,CA,92,1978/2/2
Charlie,18,CA,70,2002/3/4
Dave,68,TX,70,1952/1/1
Ellen,24,CA,88,1996/1/5
Frank,30,NY,57,1990/5/15
firstpandas.py
import pandas as pd
df = pd.read_csv('sample_pandas_date.csv')
print(df)
df_s = df.sort_values('birthday')
print(df_s)
The result is below.
C:\workspaces\playground>firstpandas.py
name age state point birthday
0 Alice 24 NY 64 1996/1/2
1 Bob 42 CA 92 1978/2/2
2 Charlie 18 CA 70 2002/3/4
3 Dave 68 TX 70 1952/1/1
4 Ellen 24 CA 88 1996/1/5
5 Frank 30 NY 57 1990/5/15
name age state point birthday
3 Dave 68 TX 70 1952/1/1
1 Bob 42 CA 92 1978/2/2
5 Frank 30 NY 57 1990/5/15
0 Alice 24 NY 64 1996/1/2
4 Ellen 24 CA 88 1996/1/5
2 Charlie 18 CA 70 2002/3/4
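It looks about right. Note that birthday is still read as a plain string here, so this is a string sort that happens to match chronological order for this data.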
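One caveat about sorting a date column read this way: without date parsing, read_csv keeps values such as 1996/1/2 as text, and text is compared character by character. The sketch below uses made-up rows to show where that differs from chronological order; the 1990/10/1 row is my own addition to expose the difference. It parses the column with pd.to_datetime before sorting. The format string assumes the year/month/day layout of the sample file.

```python
import io
import pandas as pd

# Hypothetical data: the 1990/10/1 row is added to expose the difference.
csv_text = """name,birthday
Frank,1990/5/15
Grace,1990/10/1
Dave,1952/1/1
"""
df = pd.read_csv(io.StringIO(csv_text))

# Sorting the raw strings compares them character by character.
by_text = df.sort_values("birthday")["name"].tolist()

# Parse into real datetimes first, then sort chronologically.
df["birthday"] = pd.to_datetime(df["birthday"], format="%Y/%m/%d")
by_date = df.sort_values("birthday")["name"].tolist()

print(by_text)  # October lands before May in the text sort
print(by_date)  # chronological order
```

Zero-padded timestamps such as 2020/11/12 13:19:23 sort the same either way, so this matters mainly for dates written without padding.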
Now for the real operation log. When I fed in the actual CSV, I got an error, and the excitement ended there.
C:\workspaces\playground>firstpandas.py
Traceback (most recent call last):
File "C:\workspaces\playground\firstpandas.py", line 3, in <module>
df = pd.read_csv('oplog20201112.csv')
File "C:\Users\works\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "C:\Users\works\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "C:\Users\works\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "C:\Users\works\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "C:\Users\works\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas\_libs\parsers.pyx", line 537, in pandas._libs.parsers.TextReader.__cinit__
File "pandas\_libs\parsers.pyx", line 740, in pandas._libs.parsers.TextReader._get_header
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x83 in position 0: invalid start byte
In short, reading the CSV failed with:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x83 in position 0: invalid start byte
The CSV looks like this (masked):
oplog20201112.csv
User ID,client name,Windows login ID,Terminal ID,IP Address,MAC Address,Domain name,Login time,Logout time,Login status,action,Function name,Executable file name(Shell),argument(Command line),Execution time,Execution state
"all","client-name","works","client-name","xx.xx.xx.xx","xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx ","123-456","2020/11/12 13:18:56","2020/11/12 13:23:38","success","Program","Job monitoring","Job.exe","-context:*****","2020/11/12 13:19:23","success"
"all","client-name","works","client-name","xx.xx.xx.xx","xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx ","123-456","2020/11/12 13:18:56","2020/11/12 13:23:38","success","Program","Main menu","Companyxx.exe","-cfg","2020/11/12 13:18:56","success"
"all","client-name","works","client-name","xx.xx.xx.xx","xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx ","123-456","2020/11/12 13:18:56","2020/11/12 13:23:38","success","Program","System setting","Maintenance.exe","-context:*****","2020/11/12 13:19:19","success"
"all","client-name","works","client-name","xx.xx.xx.xx","xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx ","123-456","2020/11/12 13:18:56","2020/11/12 13:23:38","success","Batch job","mst13","svc.sh","userid/password","2020/11/12 13:19:32","success"
"all","client-name","works","client-name","xx.xx.xx.xx","xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx ","123-456","2020/11/12 13:18:56","2020/11/12 13:23:38","success","Batch job","mst13","test.sh","userid/password 0 0","2020/11/12 13:19:29","success"
"all","client-name","works","client-name","xx.xx.xx.xx","xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx ","123-456","2020/11/12 13:18:56","2020/11/12 13:23:38","success","Batch job","mst13","test.sh","userid/password 0 0","2020/11/12 13:19:30","success"
"all","client-name","works","client-name","xx.xx.xx.xx","xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx ","123-456","2020/11/12 13:18:56","2020/11/12 13:23:38","success","Batch job","mst13","out.sh","userid/password %JAVA% 0","2020/11/12 13:19:31","success"
"all","client-name","works","client-name","xx.xx.xx.xx","xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx ","123-456","2020/11/12 13:18:56","2020/11/12 13:23:38","success","Program","Job management","quevw.exe","-context:*****","2020/11/12 13:19:20","success"
"all","client-name","works","client-name","xx.xx.xx.xx","xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx ","123-456","2020/11/12 13:18:56","2020/11/12 13:23:38","success","Program","batchjob jobid:498298","test.sh","userid/password 0 0","2020/11/12 13:19:56","success"
"all","client-name","works","client-name","xx.xx.xx.xx","xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx xx-xx-xx-xx-xx-xx ","123-456","2020/11/12 13:18:56","2020/11/12 13:23:38","success","Program","batchjob jobid:498301","svc.sh","userid/password","2020/11/12 13:21:39","success"
Right, the real file contains Japanese, so an encoding has to be specified.
https://techacademy.jp/magazine/21128
This is completely unrelated, but
"Process data with pandas based on the novel coronavirus CSV file (Python)"
also looks interesting.
To confirm, I converted the file to UTF-8 with an editor, and it was read successfully. So, shall I just sort by the date-and-time column as it is?
firstpandas.py
import pandas as pd
df = pd.read_csv('oplog20201112.csv')
print(df)
df_s = df.sort_values('Execution time')
print(df_s)
The rows were sorted by execution time without any trouble. Good. Next, let's save the result as a CSV under a different name.
Back to this page: https://note.nkmk.me/python-pandas-to-csv/
firstpandas.py
import pandas as pd
df = pd.read_csv('oplog20201112.csv')
# print(df)
df_s = df.sort_values('Execution time')
# print(df_s)
df_s.to_csv('out.csv')
The encoding problem from earlier can be handled by passing encoding to read_csv, as described here: https://note.sngklab.jp/?p=435
firstpandas.py
import pandas as pd
df = pd.read_csv('oplog20201112.csv',encoding="SHIFT-JIS")
print(df)
df_s = df.sort_values('Execution time')
print(df_s)
df_s.to_csv('out.csv')
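When it is not known in advance whether a file was saved as UTF-8 or Shift_JIS, one option is to try the encodings in order. The following is only a sketch: read_csv_with_fallback is a hypothetical helper of mine, and the candidate list (utf-8, then cp932) is an assumption about which encodings are likely. UTF-8 goes first because it is strict and fails quickly on Shift_JIS bytes.

```python
import os
import tempfile
import pandas as pd

def read_csv_with_fallback(path, encodings=("utf-8", "cp932")):
    """Hypothetical helper: try each encoding until one decodes the file."""
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc), enc
        except UnicodeDecodeError as err:
            last_error = err  # remember the failure and try the next one
    raise last_error

# Demo: write a small cp932 file whose first header is Japanese.
# "\u30e6\u30fc\u30b6ID" spells "user ID" with katakana.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo_oplog.csv")
with open(demo_path, "w", encoding="cp932", newline="") as f:
    f.write("\u30e6\u30fc\u30b6ID,Execution time\nall,2020/11/12 13:19:23\n")

df, used = read_csv_with_fallback(demo_path)
print(used)                  # the encoding that finally worked
print(len(df.columns))       # number of columns read
```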
Next, how to reorder the columns: https://note.nkmk.me/python-pandas-reindex/
Following that page, it looks like this.
firstpandas.py
import pandas as pd
df = pd.read_csv('oplog20201112.csv',encoding="SHIFT-JIS")
# print(df)
df_s = df.sort_values('Execution time')
# print(df_s)
df_s = df_s.reindex(columns=['Execution time',
                             'Function name',
                             'User ID',
                             'client name',
                             'Windows login ID',
                             'Terminal ID',
                             'Login time',
                             'Logout time'])
df_s.to_csv('out.csv')
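Two details of this step are easy to miss. reindex silently creates any listed column that does not exist, and to_csv writes the row index as an extra first column unless told otherwise. Here is a small sketch with made-up rows; the missing Terminal ID column is deliberate.

```python
import io
import pandas as pd

# Hypothetical miniature of the oplog; 'Terminal ID' is deliberately absent.
df = pd.DataFrame({
    "User ID": ["all", "all"],
    "Function name": ["Job monitoring", "Main menu"],
    "Execution time": ["2020/11/12 13:19:23", "2020/11/12 13:18:56"],
})

wanted = ["Execution time", "Function name", "Terminal ID"]
df_s = df.sort_values("Execution time").reindex(columns=wanted)

# reindex adds the missing column filled with NaN instead of raising an error.
print(df_s["Terminal ID"].isna().all())

# index=False keeps the old row labels out of the output.
buf = io.StringIO()
df_s.to_csv(buf, index=False)
print(buf.getvalue())
```

If the output is meant to be opened in Excel on a Japanese Windows machine, passing an encoding such as cp932 to to_csv may also be worth trying; I have not verified that against the author's environment.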
I also want to make it easier to use. There is a tool called PyInstaller.
PyInstaller # Let's actually create it
Note that the Scripts folder has to be added to PATH first:
set path=C:\Users\works\AppData\Local\Programs\Python\Python39\Scripts;%path%
Then, at the command prompt:
C:\workspaces\playground>pyinstaller firstpandas.py --onefile
67 INFO: PyInstaller: 4.0
67 INFO: Python: 3.9.0
69 INFO: Platform: Windows-10-10.0.19041-SP0
70 INFO: wrote C:\workspaces\playground\firstpandas.spec
(Omission)
The exe was built successfully.
https://news.mynavi.jp/article/python-28/
test.bat
cd /d %~dp0
call firstpandas.exe
With this, the same folder contains a firstpandas.exe that simply reads in.csv and writes out.csv.
The file I actually receive is not named in.csv, so I am thinking of having a *.bat file copy the CSV it receives as an argument to in.csv.
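An alternative to copying the file is to let the script itself take the path as a command-line argument. Here is a rough sketch under my own assumptions: the function name main, the usage text, and the optional second argument are all illustrative. The encoding and sort column follow the script in this post.

```python
import sys
import pandas as pd

def main(argv):
    """Sort the CSV named on the command line; names here are illustrative."""
    if len(argv) < 2:
        print("usage: firstpandas.py INPUT.csv [OUTPUT.csv]")
        return 1
    in_path = argv[1]
    out_path = argv[2] if len(argv) > 2 else "out.csv"
    # Encoding and sort column follow the script in this post.
    df = pd.read_csv(in_path, encoding="SHIFT-JIS")
    # index=False is my choice here; it leaves the row labels out of the file.
    df.sort_values("Execution time").to_csv(out_path, index=False)
    return 0
```

To use it as a script, call sys.exit(main(sys.argv)) at the bottom of the file. A CSV dropped onto a batch file can then be passed straight through as %1.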
By the way, the logs come in several file formats with different columns, so I changed the script to handle them.
How to check if a column exists in Pandas https://stackoverflow.com/questions/24870306/how-to-check-if-a-column-exists-in-pandas
firstpandas.py
import pandas as pd
df = pd.read_csv('oplog20201112.csv',encoding="SHIFT-JIS")
# print(df)
if 'Execution time' in df:
    df_s = df.sort_values('Execution time')
    df_s = df_s.reindex(columns=['Execution time',
                                 'Function name',
                                 'User ID',
                                 'client name',
                                 'Windows login ID',
                                 'Terminal ID',
                                 'Login time',
                                 'Logout time'])
if 'PRC_DATE' in df:
    df_s = df.sort_values('PRC_DATE')
    df_s = df_s.reindex(columns=['PRC_DATE',
                                 'DETAIL1',
                                 'USERID',
                                 'TERM_ID'])
df_s.to_csv('out.csv')
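As the number of layouts grows, the if blocks could be replaced with a small lookup table. This is a sketch of that idea rather than the script I actually used: LAYOUTS and format_oplog are hypothetical names, and the column lists are copied from the two layouts above. Unlike the if version, it stops at the first matching layout. When no layout matches, it raises a clear error instead of failing later because df_s was never assigned.

```python
import pandas as pd

# Sort key -> output columns, taken from the two layouts in this post.
LAYOUTS = {
    "Execution time": ["Execution time", "Function name", "User ID",
                       "client name", "Windows login ID", "Terminal ID",
                       "Login time", "Logout time"],
    "PRC_DATE": ["PRC_DATE", "DETAIL1", "USERID", "TERM_ID"],
}

def format_oplog(df):
    """Hypothetical helper: sort and reorder by the first layout that matches."""
    for sort_key, columns in LAYOUTS.items():
        if sort_key in df.columns:
            return df.sort_values(sort_key).reindex(columns=columns)
    # No layout matched: fail with a clear message instead of a NameError.
    raise ValueError(f"unknown oplog layout: {list(df.columns)}")

# Made-up rows in the second layout, plus one column that should be dropped.
demo = pd.DataFrame({
    "PRC_DATE": ["2020/11/12 13:19:23", "2020/11/12 13:18:56"],
    "DETAIL1": ["b", "a"],
    "USERID": ["u", "u"],
    "TERM_ID": ["t", "t"],
    "EXTRA": [1, 2],
})
print(format_oplog(demo))
```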
Depending on the file, the following error may appear:
UnicodeDecodeError: 'shift_jis' codec can't decode byte 0x87 in position 22224: illegal multibyte sequence
That reminded me of
"Points to note when letting pandas read csv of excel output":
https://minus9d.hatenablog.com/entry/2015/07/30/225841 https://stackoverflow.com/questions/6729016/decoding-shift-jis-illegal-multibyte-sequence
Following these, I changed the encoding once more.
firstpandas.py
import pandas as pd
# df = pd.read_csv('in.csv',encoding="SHIFT-JIS")
df = pd.read_csv('in.csv',encoding="shift_jisx0213")
# print(df)
if 'Execution time' in df:
    df_s = df.sort_values('Execution time')
    df_s = df_s.reindex(columns=['Execution time',
                                 'Function name',
                                 'User ID',
                                 'client name',
                                 'Windows login ID',
                                 'Terminal ID',
                                 'Login time',
                                 'Logout time'])
if 'PRC_DATE' in df:
    df_s = df.sort_values('PRC_DATE')
    df_s = df_s.reindex(columns=['PRC_DATE',
                                 'DETAIL1',
                                 'USERID',
                                 'TERM_ID'])
df_s.to_csv('out.csv')
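The reason the codec name matters is that Python's shift_jis codec covers only the base JIS character set. cp932 (the Microsoft variant that Excel on Japanese Windows typically writes) and shift_jisx0213 each add more characters. The sketch below decodes one sample byte pair with all three. The bytes 0x87 0x40 are my own illustrative choice of a sequence starting with 0x87, not necessarily what was in my file.

```python
# Bytes 0x87 0x40: a two-byte sequence that plain shift_jis does not define.
raw = b"\x87\x40"

results = {}
for codec in ("shift_jis", "cp932", "shift_jisx0213"):
    try:
        results[codec] = raw.decode(codec)
    except UnicodeDecodeError:
        results[codec] = None  # this codec has no mapping for these bytes

for codec, text in results.items():
    # ascii() keeps the output printable on any console encoding.
    print(codec, "->", ascii(text))
```

If shift_jisx0213 still fails on some file, cp932 may be worth trying as well, since the two extensions do not cover exactly the same characters.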
Other reference https://techacademy.jp/magazine/23367
formatter.bat
cd /d %~dp0
copy %1 in.csv
call ofmt.exe
echo "see out.csv!"
pause
That is how it turned out.