Overview / Goals
I can't query the csv file in tabular format because I don't have Excel installed on my portable laptop.
A spreadsheet can be used as a substitute, but you need an internet connection to display it in a spreadsheet, and you don't want to use unnecessary packet communication.
Offline is enough for querying CSV files.
When I googled variously, I found that python's pandas library can also read CSV files and output html.
Create a program that reads a csv file and displays it on a browser for the purpose of studying python.
environment
- Windows10
- python 3.8.1
- VSCode
Things to do, order to do
- Install pandas with pip
- Create a python program
- Display in browser
Install pandas with pip
Run the usual command pip install pandas
in the VS Code terminal.
For some reason an error occurs ... Because it is a character that draws an unexpected error before writing the source, it is expected that such an error will occur.
(1)
ERROR: Could not install packages due to an EnvironmentError: [WinError 5]Access denied.: 'c:\\program files\\python38\\lib\\site-packages\\pip-19.2.3.dist-info\\entry_points.txt'
Consider using the `--user` option or check the permissions.
(2)
WARNING: You are using pip version 19.2.3, however version 19.3.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
(1) ... Access is denied or permissions are displayed, so it's probably an authority.
I gave up the terminal of VSCode, started the command prompt as an administrator and executed the pip command, and it was installed normally.
(2) A warning is issued because the version of pip installed with python3.8.1 is old.
The command displayed in'' was executed, and the pip upgrade was performed before the execution of (1).
Create a python program
Read the CSV file with read_csv () of pandas, convert the CSV data to
with to_html (), and write it to html.
If there is a space in the read CSV file, it will be treated as a missing value and converted to NaN, so do not convert it with na_filter = False
.
The CSV file name is displayed in the title of html.
main.py
import os
import pandas as pd
import sys
def main():
#Get CSV file from argument 1
args = sys.argv
haserror = False
csvfile = ''
#No argument is an error
if 1 == len(args):
print("No CSV file is specified.")
haserror = True
else:
csvfile = args[1]
#Error if the file name is incorrect
if os.path.exists(csvfile) == False:
print("The specified file does not exist.")
haserror = True
#If it is not a CSV file, an error will occur.
if csvfile.endswith(".csv") == False:
print("A file other than the CSV file is specified.")
haserror = True
if haserror == True:
return()
else:
print("Convert CSV file to html.")
#Read CSV file
csvdata = pd.read_csv(csvfile, na_filter=False)
#Read html file
htmldata = ''
with open('templete.html',mode='r',encoding='utf-8') as htmlfile:
htmldata = htmlfile.read()
#Convert CSV file to html
rpdict = { "filename" : os.path.basename(csvfile), "table" : csvdata.to_html() }
htmldata = htmldata.format(**rpdict)
#html file output
with open('csvconverted.html',mode='w',encoding='utf-8') as outputhtml:
outputhtml.write(htmldata)
if __name__== '__main__':
main()
Before conversion html
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title> {filename} </title>
<link rel="stylesheet" type="text/css" href="style.css"/>
</head>
<body>
<div class="csvlist">
{table}
</div>
</body>
</html>
After conversion html
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title> test.csv </title>
<link rel="stylesheet" type="text/css" href="style.css"/>
</head>
<body>
<div class="csvlist">
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>date</th>
<th>Holiday classification</th>
<th>Attendance time</th>
<th>Leave time</th>
<th>working time</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>2019/12/1</td>
<td>Law holiday</td>
<td>09:00</td>
<td>13:00</td>
<td>04:00</td>
</tr>
<tr>
<th>1</th>
<td>2019/12/2</td>
<td></td>
<td>09:00</td>
<td>21:30</td>
<td>11:30</td>
</tr>
<tr>
<th>2</th>
<td>2019/12/3</td>
<td></td>
<td>09:00</td>
<td>21:30</td>
<td>11:30</td>
</tr>
<tr>
<th>3</th>
<td>2019/12/4</td>
<td></td>
<td>09:00</td>
<td>21:00</td>
<td>11:00</td>
</tr>
<tr>
<th>4</th>
<td>2019/12/5</td>
<td></td>
<td>09:00</td>
<td>20:00</td>
<td>10:00</td>
</tr>
<tr>
<th>5</th>
<td>2019/12/6</td>
<td></td>
<td>09:00</td>
<td>20:00</td>
<td>10:00</td>
</tr>
<tr>
<th>6</th>
<td>2019/12/7</td>
<td>Public holiday</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<th>7</th>
<td>2019/12/8</td>
<td>Law holiday</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</div>
</body>
</html>
Looking back
- The contents of the CSV file output to the html file were garbled. Since encoding was not specified in open (), it will be processed with the default (shift-jis) character code. Changed to prevent garbled characters by specifying encoding ='utf-8'.
- KeyError occurred when executing format () for the field that outputs the file name and csv contents in templete.html. Initially, format (key = "") was coded in two places, but if you format one by one, it will be replaced with the specified key item from the target character string, so there must be a key item that has not been replaced. Was a KeyError. It is difficult to use format ... If you declare the replacement target in the dictionary format of key-value format and specify the dictionary for format, you can replace it correctly.
- I was surprised that it was easy to handle because the execution result of read_csv () can be referred to as a two-dimensional array type.
Click here for the product
https://github.com/otowmoyarng/FreeStudyRepo/tree/master/20191228-pythoncsvviewer