For analysis purposes I figured that easy-to-handle artificial data beats wrestling with unreadable data, but the [graph I created last time](https://qiita.com/waka_taka/items/ab2f3b8fc6475d1c1a51#%E5%AE%9F%E8%A1%8C%E7%B5%90%E6%9E%9C-2) turned out to be so unrealistic that I got a bit discouraged.
It was my own fault for generating the data myself, and it really drained my motivation.
So this time I would like to analyze the actual Nikkei 225 (January 4, 2016 to November 8, 2019).
For now, this article keeps the program from the previous installment as it is, only swaps in the new data, and then digs into the details of how the program works.
Study_Code.py
import pandas as pd
import logging
#Added in [Stock price analysis] Learning pandas with fictitious data (003)
from pandas import Series, DataFrame
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from mpl_finance import candlestick_ohlc
#Specifying the log format
# %(asctime)s :A human-readable representation of the time the LogRecord was generated.
# %(funcName)s :The name of the function that contains the logging call
# %(levelname)s :Text logging level for the message
# %(lineno)d :Source line number where the logging call was issued
# %(message)s :The logged message, computed as msg % args
formatter = logging.Formatter('%(asctime)s:%(funcName)s:%(levelname)s:%(lineno)d:\n%(message)s')
#Logger settings(INFO log level)
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
#Handler settings (output file / log level / log format)
handler = logging.FileHandler('info_log.log')
handler.setLevel(logging.INFO)
handler.setFormatter(formatter)
logger.addHandler(handler)
#Read the CSV file (NikkeiAverage.csv), specifying its character encoding
dframe = pd.read_csv('NikkeiAverage.csv', encoding='SJIS', \
header=1, sep='\t')
#Convert to date type
dframe['date'] = pd.to_datetime(dframe['date'])
#Specify date column as index
dframe = dframe.set_index('date')
#Convert the open through close price columns to numeric values (strip the comma separators)
dframe = dframe.apply(lambda x: x.str.replace(',','')).astype(np.float32)
#Change to use logger
logger.info(dframe)
#Output the column names
logger.info(dframe.columns)
#Output only open and close prices
logger.info(dframe[['Open price','closing price']])
#Checking the index
logger.info(dframe.index)
#Type confirmation
logger.info(dframe.dtypes)
#Creating data for plotting
ohlc = zip(mdates.date2num(dframe.index), dframe['Open price'], dframe['closing price'], dframe['High price'], dframe['closing price'])
logger.info(ohlc)
#Create the canvas (figure)
fig = plt.figure()
#Format the X-axis
ax = plt.subplot()
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y/%m/%d'))
#Draw a candlestick chart
candlestick_ohlc(ax, ohlc, width=0.7, colorup='g', colordown='r')
#Save the image
plt.savefig('Candle_Chart.png')
As you would expect from real data, the graph finally looks like the real thing. (The appearance of the chart itself still needs some tweaking, though...)
I wrote the following line almost without thinking, but I don't really understand it, so I'll break it down piece by piece and check what it does.
Confirm_Code.py
ohlc = zip(mdates.date2num(dframe.index), dframe['Open price'], dframe['closing price'], dframe['High price'], dframe['closing price'])
First, I strip out the parts that are not needed for this check and reduce the program to the following.
Study_Code.py
import pandas as pd
import logging
#Added in [Stock price analysis] Learning pandas with fictitious data (003)
from pandas import Series, DataFrame
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from mpl_finance import candlestick_ohlc
#Specifying the log format
# %(asctime)s :A human-readable representation of the time the LogRecord was generated.
# %(funcName)s :The name of the function that contains the logging call
# %(levelname)s :Text logging level for the message
# %(lineno)d :Source line number where the logging call was issued
# %(message)s :The logged message, computed as msg % args
formatter = logging.Formatter('%(asctime)s:%(funcName)s:%(levelname)s:%(lineno)d:\n%(message)s')
#Logger settings(INFO log level)
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
#Handler settings (output file / log level / log format)
handler = logging.FileHandler('info_log.log')
handler.setLevel(logging.INFO)
handler.setFormatter(formatter)
logger.addHandler(handler)
#Read the CSV file (NikkeiAverage.csv), specifying its character encoding
dframe = pd.read_csv('NikkeiAverage.csv', encoding='SJIS', \
header=1, sep='\t')
#Convert to date type
dframe['date'] = pd.to_datetime(dframe['date'])
#Specify date column as index
dframe = dframe.set_index('date')
#Convert the open through close price columns to numeric values (strip the comma separators)
dframe = dframe.apply(lambda x: x.str.replace(',','')).astype(np.float32)
#Creating data for plotting
#ohlc = zip(mdates.date2num(dframe.index), dframe['Open price'], dframe['closing price'], \
# dframe['High price'], dframe['closing price'])
#Check the contents of dframe.index
logger.info(dframe.index)
dframe.index simply holds the index data, as you would expect.
info_log
2019-11-11 23:27:00,953:<module>:INFO:46:
DatetimeIndex(['2016-01-04', '2016-01-05', '2016-01-06', '2016-01-07',
'2016-01-08', '2016-01-12', '2016-01-13', '2016-01-14',
'2016-01-15', '2016-01-18',
...
'2019-10-25', '2019-10-28', '2019-10-29', '2019-10-30',
'2019-10-31', '2019-11-01', '2019-11-05', '2019-11-06',
'2019-11-07', '2019-11-08'],
dtype='datetime64[ns]', name='date', length=942, freq=None)
This is as expected.
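As a side note, here is a minimal sketch of why the index ends up as a DatetimeIndex: pd.to_datetime() turns the string column into datetime64 values, and set_index() then promotes that column to the index. The file name and the toy values below are made up purely for illustration.
Sample_Index.py
import pandas as pd

# Toy data, just to see how set_index() produces a DatetimeIndex
toy = pd.DataFrame({'date': ['2016-01-04', '2016-01-05', '2016-01-06'],
                    'value': [1.0, 2.0, 3.0]})
toy['date'] = pd.to_datetime(toy['date'])  # object -> datetime64[ns]
toy = toy.set_index('date')                # the 'date' column becomes the index

print(type(toy.index))  # <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
print(toy.index)        # DatetimeIndex(['2016-01-04', ...], dtype='datetime64[ns]', name='date', freq=None)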
Next, the contents of mdates.date2num(dframe.index) turned out to be the following numbers.
info_log
2019-11-11 23:31:04,163:<module>:INFO:47:
[735967. 735968. 735969. 735970. 735971. 735975. 735976. 735977. 735978.
735981. 735982. 735983. 735984. 735985. 735988. 735989. 735990. 735991.
(Omitted)
737349. 737350. 737353. 737355. 737356. 737357. 737360. 737361. 737362.
737363. 737364. 737368. 737369. 737370. 737371.]
So each date is apparently converted into a plain floating-point day number (735967. corresponds to 2016-01-04).
I'm not good with datetime-related things in Python... (I'm not great at file I/O either, but date handling is my real weak point...)
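Since I always forget how date2num behaves, here is a tiny sketch with a couple of dates I picked myself. (Note that matplotlib 3.3 and later changed the default epoch to 1970, so on newer versions the absolute numbers differ from the log above.)
Sample_Date2num.py
import datetime
import matplotlib.dates as mdates

# Two sample dates, just to see what date2num() returns
dates = [datetime.date(2016, 1, 4), datetime.date(2016, 1, 5)]

nums = mdates.date2num(dates)    # datetime -> floating-point day numbers
print(nums)                      # e.g. [735967. 735968.] on the matplotlib version used here

# num2date() converts the floats back into datetime objects
print(mdates.num2date(nums[0]))  # 2016-01-04 00:00:00+00:00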
The zip should just pack the date and the open/high/low/close prices into tuples, but I'll check it just in case, since I hardly ever use the zip function in my own programs. A minimal zip() sketch comes first, then the confirmation code.
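Here is that minimal zip() sketch, with made-up numbers, just to remind myself what it yields.
Sample_Zip.py
# Toy parallel sequences (values are made up)
dates  = [735967.0, 735968.0]
opens  = [18818.5, 18398.7]
closes = [18450.9, 18374.0]

ohlc = zip(dates, opens, closes)

for row in ohlc:
    print(row)  # (735967.0, 18818.5, 18450.9) ... one tuple per date

# Note: in Python 3, zip() returns an iterator, so it can only be consumed once.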
Confirmation code
Study_Code.py
import pandas as pd
import logging
#Added in [Stock price analysis] Learning pandas with fictitious data (003)
from pandas import Series, DataFrame
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from mpl_finance import candlestick_ohlc
#Specifying the log format
formatter = logging.Formatter('%(asctime)s:%(funcName)s:%(levelname)s:%(lineno)d:\n%(message)s')
#Logger settings(INFO log level)
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
#Handler settings (output file / log level / log format)
handler = logging.FileHandler('info_log.log')
handler.setLevel(logging.INFO)
handler.setFormatter(formatter)
logger.addHandler(handler)
#Read the CSV file (NikkeiAverage.csv), specifying its character encoding
dframe = pd.read_csv('NikkeiAverage.csv', encoding='SJIS', \
header=1, sep='\t')
#Convert to date type
dframe['date'] = pd.to_datetime(dframe['date'])
#Specify date column as index
dframe = dframe.set_index('date')
#Convert the open through close price columns to numeric values (strip the comma separators)
dframe = dframe.apply(lambda x: x.str.replace(',','')).astype(np.float32)
#Creating data for plotting
ohlc = zip(mdates.date2num(dframe.index), dframe['Open price'], dframe['closing price'], \
dframe['High price'], dframe['closing price'])
#Confirmation of the contents of ohlc
for output_data in ohlc:
logger.info(output_data)
The results were as expected, so I'm satisfied for now.
info_log
2019-11-11 23:48:26,636:<module>:INFO:47:
(735967.0, 18818.580078125, 18450.98046875, 18951.119140625, 18450.98046875)
(735968.0, 18398.759765625, 18374.0, 18547.380859375, 18374.0)
(735969.0, 18410.5703125, 18191.3203125, 18469.380859375, 18191.3203125)
"abridgement"
(737369.0, 23343.509765625, 23303.8203125, 23352.560546875, 23303.8203125)
(737370.0, 23283.140625, 23330.3203125, 23336.0, 23330.3203125)
(737371.0, 23550.0390625, 23391.869140625, 23591.08984375, 23391.869140625)
I've read through the matplotlib sample pages, but I haven't really used the **candlestick_ohlc** function before, so I'd like to check how it behaves with a few small samples.
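As a rough preview of what I have in mind, here is a minimal, self-contained sketch that feeds a few hand-made (time, open, high, low, close) rows to candlestick_ohlc. All the price values and the output file name are invented for illustration.
Sample_Candle.py
import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from mpl_finance import candlestick_ohlc

# Three hand-made (time, open, high, low, close) rows -- purely sample values
quotes = [
    (mdates.date2num(datetime.date(2019, 11, 6)), 100.0, 110.0, 95.0, 105.0),
    (mdates.date2num(datetime.date(2019, 11, 7)), 105.0, 108.0, 99.0, 101.0),
    (mdates.date2num(datetime.date(2019, 11, 8)), 101.0, 112.0, 100.0, 111.0),
]

fig = plt.figure()
ax = plt.subplot()
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y/%m/%d'))

# candlestick_ohlc() expects each row in (time, open, high, low, close) order
candlestick_ohlc(ax, quotes, width=0.7, colorup='g', colordown='r')

plt.savefig('Sample_Candle.png')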
November 11, 2019, 23:54 ... still writing.