When I read an Excel file with pandas.read_excel, some date cells came back as numeric serial numbers because of their cell formatting. I convert those values so that every date can be processed uniformly as datetime.datetime.
- Reference
  - Python - Change column by column from serial number to date in python pandas | teratail
  - 59 days of Excel confused by a phantom day | Nikkei Crosstech (xTECH)
import datetime

import pandas

# date_int: Excel serial number
if date_int < 60:
    # Before 1900-03-01 (serial 60 is Excel's nonexistent 1900-02-29)
    days = date_int - 1
else:
    # 1900-03-01 or later
    days = date_int - 2
date_time = pandas.to_datetime('1900/01/01') + datetime.timedelta(days=days)
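As a quick sanity check of the two offsets, here is a minimal sketch that wraps the same rule in a function and prints a few boundary values. The name serial_to_datetime is hypothetical, and the sketch uses only the standard library instead of pandas.to_datetime so that it runs on its own.

```python
import datetime


def serial_to_datetime(serial: int) -> datetime.datetime:
    """Convert an Excel 1900-system serial number to a datetime (hypothetical helper)."""
    # Serial 1 is 1900-01-01, so subtract 1. From serial 60 onward, subtract
    # one more day to step over Excel's nonexistent 1900-02-29 (serial 60).
    days = serial - 1 if serial < 60 else serial - 2
    return datetime.datetime(1900, 1, 1) + datetime.timedelta(days=days)


for serial in (1, 59, 61, 44118):
    print(serial, serial_to_datetime(serial).strftime('%Y/%m/%d'))
# 1 1900/01/01
# 59 1900/02/28
# 61 1900/03/01
# 44118 2020/10/14
```

Note that under this rule the phantom serial 60 also maps to 1900/02/28, because no real date corresponds to it.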
- Environment
  - macOS Catalina version 10.15.7
  - Microsoft Excel for Mac version 16.42
  - Python 3.8.5
  - pandas 1.1.3
- Reference
  - Process date / time columns with pandas (string conversion, date extraction, etc.) | note.nkmk.me
  - pandas.Timestamp | pandas 1.1.3 documentation
The Excel file to read, Book.xlsx, has the dates in the first column of Sheet1.
import datetime
from typing import Optional

import pandas


def get_datetime(val: object) -> Optional[datetime.datetime]:
    """Get the date.

    :param val: The original value read from the date cell
    :return: The date if val is a date or a serial number, None otherwise
    """
    val_type = type(val)
    # If it is a datetime.datetime, return it as is
    if val_type is datetime.datetime:
        return val
    # pandas.Timestamp inherits from datetime.datetime and seems to work
    # wherever a datetime.datetime does, so return it as is
    if issubclass(val_type, datetime.datetime):
        return val
    # If it is an int, treat it as a serial number and convert it to a date
    if val_type is int:
        if val < 60:
            # Before 1900-03-01
            days = val - 1
        else:
            # 1900-03-01 or later
            days = val - 2
        return pandas.to_datetime('1900/01/01') + datetime.timedelta(days=days)
    return None


if __name__ == '__main__':
    # Load the Excel file with pandas
    sheet = pandas.read_excel('Book.xlsx', sheet_name='Sheet1', header=None)
    for index, row in sheet.iterrows():
        date = get_datetime(row[0])
        if date is not None:
            print(date.strftime('%Y/%m/%d'))
Output
2020/10/14
2020/10/15
2020/10/16
2020/10/17
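When a whole column holds only serial numbers, pandas can also convert it in one call instead of row by row. This is a different technique from the function above, shown as a minimal sketch: pandas.to_datetime with unit='D' and origin='1899-12-30'. The sample values are hypothetical, and the shifted origin is only correct for serial numbers of 61 or more (1900-03-01 and later), because it bakes in the offset for the phantom 1900-02-29.

```python
import pandas

# Hypothetical sample data: serial numbers for 2020/10/14 through 2020/10/17
serials = pandas.Series([44118, 44119, 44120, 44121])

# Counting days from 1899-12-30 absorbs both the 1-based numbering and the
# phantom 1900-02-29, so this assumes every serial number is 61 or more
dates = pandas.to_datetime(serials, unit='D', origin='1899-12-30')

formatted = dates.dt.strftime('%Y/%m/%d').tolist()
print(formatted)
# ['2020/10/14', '2020/10/15', '2020/10/16', '2020/10/17']
```

For columns that mix real dates and serial numbers, as in the file read above, a per-value function is still the simpler fit.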