Continuing from last time (up to creating the candlestick chart)

For analysis, I thought that artificial data that is easy to analyze would be better than wrestling with hard-to-read data, but the [graph created last time](https://qiita.com/waka_taka/items/ab2f3b8fc6475d1c1a51#%E5%AE%9F%E8%A1%8C%E7%B5%90%E6%9E%9C-2) was so unrealistic that I got a little discouraged.

It is rather silly to lose motivation over data I made up myself.

So this time I would like to analyze the actual Nikkei 225 price movements (January 4, 2016 to November 8, 2019).

For now, this article keeps the program from last time as it is, changes only the data it reads, and looks into the details of the program.

Program up to the last time (repost)

`Study_Code.py`


import pandas as pd
import logging
# Added in "[Stock price analysis] Learning pandas with fictitious data (003)"
from pandas import Series, DataFrame
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from mpl_finance import candlestick_ohlc

# Specify the log format
# %(asctime)s   : Human-readable time at which the LogRecord was created
# %(funcName)s  : Name of the function containing the logging call
# %(levelname)s : Text logging level of the message
# %(lineno)d    : Source line number where the logging call was issued
# %(message)s   : The logged message, computed as msg % args
formatter = logging.Formatter('%(asctime)s:%(funcName)s:%(levelname)s:%(lineno)d:\n%(message)s')

# Logger settings (INFO log level)
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Handler settings (output file / log level / log format)
handler = logging.FileHandler('info_log.log')
handler.setLevel(logging.INFO)
handler.setFormatter(formatter)

logger.addHandler(handler)

# Read the CSV file (NikkeiAverage.csv), specifying its character encoding
dframe = pd.read_csv('NikkeiAverage.csv', encoding='SJIS',
                     header=1, sep='\t')

# Convert to date type
dframe['date'] = pd.to_datetime(dframe['date'])
# Specify the date column as the index
dframe = dframe.set_index('date')

# Strip the thousands separators and convert the price columns to numbers
dframe = dframe.apply(lambda x: x.str.replace(',', '')).astype(np.float32)

# Changed to use the logger
logger.info(dframe)
# Output the column names
logger.info(dframe.columns)
# Output only the open and close prices
logger.info(dframe[['Open price', 'closing price']])
# Check the index
logger.info(dframe.index)
# Check the dtypes
logger.info(dframe.dtypes)


# Create the data for plotting
# (candlestick_ohlc reads each tuple as time, open, high, low, close)
ohlc = zip(mdates.date2num(dframe.index), dframe['Open price'], dframe['closing price'],
           dframe['High price'], dframe['closing price'])
logger.info(ohlc)

# Create the canvas
fig = plt.figure()

# Format the X-axis
ax = plt.subplot()
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y/%m/%d'))

# Draw the candlestick chart
candlestick_ohlc(ax, ohlc, width=0.7, colorup='g', colordown='r')

# Save the image
plt.savefig('Candle_Chart.png')
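
As a side note, the comma stripping and date conversion in the program above could probably also be done while reading the file. The following is only a minimal sketch of that alternative, not what the program above does. It assumes a small in-memory sample (the two rows are hypothetical stand-ins built from values that appear in this article's log, not the real NikkeiAverage.csv) and relies on the `thousands`, `parse_dates`, and `index_col` parameters of `read_csv`.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for NikkeiAverage.csv
# (tab-separated, prices written with thousands separators).
sample_csv = (
    "date\tOpen price\tHigh price\tclosing price\n"
    "2016/01/04\t18,818.58\t18,951.12\t18,450.98\n"
    "2016/01/05\t18,398.76\t18,547.38\t18,374.00\n"
)

# thousands=',' drops the separators while parsing; parse_dates and
# index_col build the DatetimeIndex in the same call.
dframe = pd.read_csv(
    io.StringIO(sample_csv),
    sep="\t",
    thousands=",",
    parse_dates=["date"],
    index_col="date",
)

print(dframe.dtypes)
print(dframe.index)
```

With this approach the price columns come out as float64, which would also avoid the float32 rounding visible in the log (for example 18818.580078125 instead of 18818.58).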

Execution result (graph when reading the Nikkei average)

As you would expect, the chart now looks like the real thing. (The appearance of the chart itself still needs some work, though ...)

About the code that creates the plot data

I wrote it without much thought, but I do not understand the following part well, so I will take it apart piece by piece and check it.

`Confirm_Code.py`


ohlc = zip(mdates.date2num(dframe.index), dframe['Open price'], dframe['closing price'], dframe['High

Check the contents of dframe.index and mdates.date2num(dframe.index)

First, I stripped out the parts that are not needed for this check and wrote the following code.

`Study_Code.py`


import pandas as pd
import logging
# Added in "[Stock price analysis] Learning pandas with fictitious data (003)"
from pandas import Series, DataFrame
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from mpl_finance import candlestick_ohlc

# Specify the log format
# %(asctime)s   : Human-readable time at which the LogRecord was created
# %(funcName)s  : Name of the function containing the logging call
# %(levelname)s : Text logging level of the message
# %(lineno)d    : Source line number where the logging call was issued
# %(message)s   : The logged message, computed as msg % args
formatter = logging.Formatter('%(asctime)s:%(funcName)s:%(levelname)s:%(lineno)d:\n%(message)s')

# Logger settings (INFO log level)
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Handler settings (output file / log level / log format)
handler = logging.FileHandler('info_log.log')
handler.setLevel(logging.INFO)
handler.setFormatter(formatter)

logger.addHandler(handler)

# Read the CSV file (NikkeiAverage.csv), specifying its character encoding
dframe = pd.read_csv('NikkeiAverage.csv', encoding='SJIS',
                     header=1, sep='\t')

# Convert to date type
dframe['date'] = pd.to_datetime(dframe['date'])
# Specify the date column as the index
dframe = dframe.set_index('date')

# Strip the thousands separators and convert the price columns to numbers
dframe = dframe.apply(lambda x: x.str.replace(',', '')).astype(np.float32)

# Create the data for plotting (commented out for this check)
#ohlc = zip(mdates.date2num(dframe.index), dframe['Open price'], dframe['closing price'],
#           dframe['High price'], dframe['closing price'])

# Check the contents of dframe.index
logger.info(dframe.index)

Execution result

As you would expect, dframe.index simply holds the index data.

`info_log`


2019-11-11 23:27:00,953:<module>:INFO:46:
DatetimeIndex(['2016-01-04', '2016-01-05', '2016-01-06', '2016-01-07',
               '2016-01-08', '2016-01-12', '2016-01-13', '2016-01-14',
               '2016-01-15', '2016-01-18',
               ...
               '2019-10-25', '2019-10-28', '2019-10-29', '2019-10-30',
               '2019-10-31', '2019-11-01', '2019-11-05', '2019-11-06',
               '2019-11-07', '2019-11-08'],
              dtype='datetime64[ns]', name='date', length=942, freq=None)

This is as expected.

Next, mdates.date2num(dframe.index) returned the following numbers.

`info_log`


2019-11-11 23:31:04,163:<module>:INFO:47:
[735967. 735968. 735969. 735970. 735971. 735975. 735976. 735977. 735978.
 735981. 735982. 735983. 735984. 735985. 735988. 735989. 735990. 735991.
(Omitted)
 737349. 737350. 737353. 737355. 737356. 737357. 737360. 737361. 737362.
 737363. 737364. 737368. 737369. 737370. 737371.]

In other words:

'2016-01-04' is converted to the number 735967
'2016-01-05' is converted to the number 735968
'2016-01-06' is converted to the number 735969
︙
'2019-11-07' is converted to the number 737370
'2019-11-08' is converted to the number 737371

Is that what this means ...?
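
These numbers line up with Python's proleptic Gregorian ordinal, in which 0001-01-01 is day 1. That is how mdates.date2num counted days in the matplotlib versions of that time. As far as I know, matplotlib 3.3 and later default to an epoch of 1970-01-01, so the same call would return much smaller numbers there. A minimal check using only the standard library (it does not call matplotlib at all):

```python
from datetime import date

# Proleptic Gregorian ordinal: 0001-01-01 is day 1.
first_day = date(2016, 1, 4).toordinal()
last_day = date(2019, 11, 8).toordinal()

print(first_day)                  # 735967
print(last_day)                   # 737371

# Going the other way: which date does 737370 stand for?
print(date.fromordinal(737370))   # 2019-11-07
```

So the first and last values in the log correspond to the first and last dates of the index.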

I'm not good at datetime handling in Python ... (I'm not good at file I/O either, and date processing is no better ...)

Check the contents of the ohlc object

ohlc should simply hold tuples of the date, open price, high price, low price, and close price, but I will check it just in case. I never use the zip function in my own programs.
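
Since I rarely use zip, here is a small sketch of what it does, using plain lists with a few hypothetical sample values instead of the pandas columns. It also shows one point that is easy to miss: in Python 3, zip returns a one-shot iterator, so logging the zip object itself only shows its repr, and once something has looped over it there is nothing left for a second pass.

```python
# Hypothetical three-day sample; the real program zips pandas columns instead.
dates = [735967.0, 735968.0, 735969.0]
opens = [18818.58, 18398.76, 18410.57]
closes = [18450.98, 18374.0, 18191.32]

# zip pairs up the i-th element of every input into one tuple.
rows = zip(dates, opens, closes)

first_pass = list(rows)    # consumes the iterator
second_pass = list(rows)   # the iterator is already exhausted

print(first_pass[0])   # (735967.0, 18818.58, 18450.98)
print(second_pass)     # []
```

If the tuples are needed more than once (for example, for logging and then for plotting), wrapping the zip in list() first is one way to keep them around.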

Confirmation code

`Study_Code.py`


import pandas as pd
import logging
# Added in "[Stock price analysis] Learning pandas with fictitious data (003)"
from pandas import Series, DataFrame
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from mpl_finance import candlestick_ohlc

# Logger settings (INFO log level)
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Handler settings (output file / log level)
handler = logging.FileHandler('info_log.log')
handler.setLevel(logging.INFO)

logger.addHandler(handler)

# Read the CSV file (NikkeiAverage.csv), specifying its character encoding
dframe = pd.read_csv('NikkeiAverage.csv', encoding='SJIS',
                     header=1, sep='\t')

# Convert to date type
dframe['date'] = pd.to_datetime(dframe['date'])
# Specify the date column as the index
dframe = dframe.set_index('date')

# Strip the thousands separators and convert the price columns to numbers
dframe = dframe.apply(lambda x: x.str.replace(',', '')).astype(np.float32)

# Create the data for plotting
ohlc = zip(mdates.date2num(dframe.index), dframe['Open price'], dframe['closing price'],
           dframe['High price'], dframe['closing price'])

# Check the contents of ohlc, one tuple per log entry
for output_data in ohlc:
    logger.info(output_data)

Execution result

The output was what I expected, so I was satisfied for the time being.

`info_log`


2019-11-11 23:48:26,636:<module>:INFO:47:
(735967.0, 18818.580078125, 18450.98046875, 18951.119140625, 18450.98046875)
(735968.0, 18398.759765625, 18374.0, 18547.380859375, 18374.0)
(735969.0, 18410.5703125, 18191.3203125, 18469.380859375, 18191.3203125)

(Omitted)

(737369.0, 23343.509765625, 23303.8203125, 23352.560546875, 23303.8203125)
(737370.0, 23283.140625, 23330.3203125, 23336.0, 23330.3203125)
(737371.0, 23550.0390625, 23391.869140625, 23591.08984375, 23391.869140625)
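
One thing worth double-checking in this output: candlestick_ohlc reads each tuple as (time, open, high, low, close), so in a correct tuple the low must not exceed the open or close, and the high must not be below them. The sketch below applies that rule to the first logged tuple. The helper name `looks_like_ohlc` is hypothetical, and the second row is made-up data used only to show a passing case.

```python
def looks_like_ohlc(row):
    """Return True if row is consistent with (time, open, high, low, close)."""
    _, open_, high, low, close = row
    return low <= min(open_, close) and max(open_, close) <= high


# First tuple from the log above, exactly as logged.
logged = (735967.0, 18818.580078125, 18450.98046875, 18951.119140625, 18450.98046875)

# Made-up row in the expected order, for comparison only.
made_up = (1.0, 100.0, 110.0, 90.0, 105.0)

print(looks_like_ohlc(logged))    # False
print(looks_like_ohlc(made_up))   # True
```

The logged tuple fails the check because its second and fourth values are both the closing price and its third value is the high, which matches the column order passed to zip (open, close, high, close). If the CSV has a low-price column, passing open, high, low, close in that order would probably be the fix, but I have not confirmed the column name here.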

Draw a few simple candlestick charts and check how it behaves

I've read through the matplotlib sample page, but I haven't used the **candlestick_ohlc function** much, so I'll check how it behaves with a few samples.

November 11, 2019 23:54 ... still writing

[Stock price analysis] Learn pandas with Nikkei 225 (004: Change read data to Nikkei 225)
