When analyzing time series data, timestamp columns are often stored in a format such as "2016-12-17 09:59:17". This post shows **how to split a date into year, month, day, hour, and day of the week** so that it can be handled by machine learning.
This time, we will use the following dummy data.
time.py
import pandas as pd
df = pd.read_csv('df.csv')
df.head()
#output
patient Last UpDdated
0 5.0 2020-03-22 10:00:00
1 4.0 2020-03-22 11:00:00
2 6.0 2020-03-22 12:00:00
3 10.0 2020-03-23 10:00:00
4 3.0 2020-03-23 11:00:00
df.info()
#output
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21 entries, 0 to 20
Data columns (total 3 columns):
patient 21 non-null float64
Last UpDdated 21 non-null object
dtypes: float64(2), object(1)
memory usage: 800.0+ bytes
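The `object` dtype above indicates that `read_csv` left the timestamps as plain strings. As a side note, the sketch below shows one possible way to have the column parsed while loading, using the `parse_dates` argument of `pd.read_csv`. The in-memory CSV is a hypothetical stand-in for `df.csv` and only borrows its column names; the rest of this post converts the column after loading instead.

```python
import io

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Hypothetical in-memory CSV standing in for df.csv (column names taken from the post).
csv_text = """patient,Last UpDdated
5.0,2020-03-22 10:00:00
4.0,2020-03-22 11:00:00
10.0,2020-03-23 10:00:00
"""

# Default read: the timestamp column is kept as strings.
raw = pd.read_csv(io.StringIO(csv_text))

# parse_dates asks read_csv to convert the named column to datetimes while loading.
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=["Last UpDdated"])

print(is_datetime64_any_dtype(raw["Last UpDdated"]))     # is the raw column datetime?
print(is_datetime64_any_dtype(parsed["Last UpDdated"]))  # is the parsed column datetime?
```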
We will split the date in the Last UpDdated column.
■ Procedure
**① Convert the column from object type to datetime64[ns] type**
pd.to_datetime(df['Last UpDdated'])
**② Get the year, month, day, hour, and day of the week with .dt.~**
df['Last UpDdated'].dt.month
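The order of the two steps matters because the `.dt` accessor is only available on datetime-like values. The following is a minimal sketch on a small hand-made Series (the variable names `s`, `converted`, and `accessor_worked` are illustrative and not part of the original data) showing that `.dt` fails on strings and works after conversion. Note also that `pd.to_datetime` returns a new object, so its result has to be assigned somewhere.

```python
import pandas as pd

# Timestamps stored as plain strings, as they are right after read_csv.
s = pd.Series(["2020-03-22 10:00:00", "2020-03-23 11:00:00"])

# Step 2 before step 1: .dt is rejected for non-datetime values.
try:
    s.dt.month
    accessor_worked = True
except AttributeError:
    accessor_worked = False

# Step 1: convert, keeping the returned Series (the input is not modified in place).
converted = pd.to_datetime(s)

# Step 2: the .dt accessor is now available.
months = converted.dt.month.tolist()

print(accessor_worked)
print(months)
```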
time.py
df['Last UpDdated'] = pd.to_datetime(df['Last UpDdated']) #Convert type
df.dtypes
#output
patient float64
Last UpDdated datetime64[ns]
dtype: object
#Add column "month"
df['month'] = df['Last UpDdated'].dt.month
#Add column "day"
df['day'] = df['Last UpDdated'].dt.day
#Add column "hour"
df['hour'] = df['Last UpDdated'].dt.hour
#Add column "week" (day of the week: 0 = Monday ... 6 = Sunday)
df['week'] = df['Last UpDdated'].dt.dayofweek
#Drop the original Last UpDdated column
df = df.drop(['Last UpDdated'],axis=1)
df.head()
#output
patient month day hour week
0 5.0 3 22 10 6
1 4.0 3 22 11 6
2 6.0 3 22 12 6
3 10.0 3 23 10 0
4 3.0 3 23 11 0
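The month, day, hour, and day of the week columns have been added based on the values in the Last UpDdated column! The day of the week is an int from 0 to 6, where 0 is Monday and 6 is Sunday, so 2020-03-22 (a Sunday) becomes 6 and 2020-03-23 (a Monday) becomes 0.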
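The code above does not extract the year, and the integer weekday can be hard to read at a glance. As a small sketch on hand-made data (the `features` frame and the `weekday_name` column are illustrative names, not part of the original post), `.dt.year` and `.dt.day_name()` could be used in the same way:

```python
import pandas as pd

# Two sample timestamps taken from the dummy data shown earlier.
ts = pd.to_datetime(pd.Series(["2020-03-22 10:00:00", "2020-03-23 11:00:00"]))

features = pd.DataFrame({
    "year": ts.dt.year,                # calendar year as an integer
    "week": ts.dt.dayofweek,           # 0 = Monday ... 6 = Sunday
    "weekday_name": ts.dt.day_name(),  # English day name by default
})

print(features)
```

For machine learning input the integer `week` column is usually the one to keep; the name column is mainly useful for checking the result by eye.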