Create a program that converts 1-minute FX data into bars of any timeframe (for example, 1-hour bars).
The program reads the 1-minute data, converts it to the chosen timeframe, and writes the result to CSV (the example in this article uses 1-hour bars).
The 1-minute data was downloaded from GMO Click Securities. If you have an account, you can download it even with a deposit balance of 0, which I am very grateful for. (* At other brokers, the data is either unavailable unless your deposit exceeds a certain amount, or only daily data is offered.)
This time, I downloaded the USD/JPY data from January 2007 to September 2020.
When you unzip the downloaded data, it has the following folder structure.
/<currency name>_yyyymm/yyyymm/
Example: /USDJPY_202008/202008/
The daily CSV files (<currency name>_yyyymmdd.csv) are stored in that folder.
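The layout above can be walked with glob. The following is only a minimal sketch: the function name `list_daily_csvs` and the `root` argument (the folder that holds the unzipped `<currency name>_yyyymm` folders) are hypothetical names introduced for illustration.

```python
import glob
import os


def list_daily_csvs(root, currency):
    """Return the daily CSV paths for one currency, sorted by file name."""
    # Matches <root>/<currency>_yyyymm/yyyymm/<currency>_yyyymmdd.csv
    pattern = os.path.join(root, currency + "_*", "*", currency + "_*.csv")
    # The file names embed yyyymmdd, so sorting by base name is chronological.
    # glob itself does not guarantee any particular order.
    return sorted(glob.glob(pattern), key=os.path.basename)
```

Sorting explicitly matters here because the bars are later concatenated in the order the files are read.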
I plan to use the CSV data downloaded this time in future programs, so I will save it in the following tree structure.
csv/<currency name>/<currency name>_yyyymm/yyyymm/<currency name>_yyyymmdd.csv
With such a tree structure, the data stays easy to handle even if I download more currencies in the future. The program created this time goes in a folder called "Make_OHCL", which I will create at the same level as the csv folder.
In 2007, the CSV has the following field structure.
However, since around 2016, the number of fields has increased as follows, because spreads (trading fees) are now taken into account.
It's a bit annoying, but the program needs to support both CSV layouts.
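Both layouts can be reduced to one shape before processing. This is a minimal sketch, assuming (as the program in this article does) that the first five columns are date, open, high, low, and close in both layouts. The function name `first_five_columns` is hypothetical, and the sample rows, including the number of extra columns, are made-up values rather than real quotes.

```python
import numpy as np


def first_five_columns(rows):
    """Keep the first five columns and drop any extra ones."""
    arr = np.asarray(rows, dtype=object)
    # Older files already have five columns; newer ones have more.
    return arr[:, :5] if arr.shape[1] > 5 else arr


# Made-up sample rows for illustration only.
old_row = [["2007-01-02 07:00", "118.9", "119.0", "118.8", "118.9"]]
new_row = [["2016-01-04 07:00", "120.2", "120.3", "120.1", "120.2",
            "120.3", "120.4", "120.2", "120.3"]]
print(first_five_columns(old_row).shape)  # (1, 5)
print(first_five_columns(new_row).shape)  # (1, 5)
```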
Data processing uses NumPy instead of pandas for speed.
from copy import copy
import glob
import numpy as np

def make_ohlc(ashi, arr=None):
    """
    Function Description: Creates OHLC bars for the specified timeframe and returns an array.
    ashi: Timeframe after conversion, in minutes. If it is 60, it is 1 hour.
    arr: A 1-minute csv file converted to an array.
    """
    # If the CSV has 6 or more columns, keep only the 1st to 5th columns.
    if arr.shape[1] > 5:
        arr = arr[:, 0:5]
    arr = np.c_[arr, np.zeros((len(arr), 4))]  # Append 4 columns for the new open/high/low/close
    # Bars are grouped by row position: every `ashi` rows form one bar.
    for i in range(0, len(arr), ashi):
        try:
            arr[i, 5] = arr[i, 1]  # Open price
            max_tmp = arr[i:i+ashi, 2].astype(float)  # High prices in the period (np.float was removed in NumPy 1.24)
            arr[i, 6] = max_tmp.max()  # High price
            min_tmp = arr[i:i+ashi, 3].astype(float)  # Low prices in the period
            arr[i, 7] = min_tmp.min()  # Low price
            arr[i, 8] = arr[i+ashi-1, 4]  # Closing price
        except IndexError:
            # Incomplete trailing bar: its close stays 0 and the row is dropped below.
            pass
    arr = np.delete(arr, [1, 2, 3, 4], axis=1)  # Delete the 2nd to 5th columns because they are no longer needed
    arr = arr[arr[:, 4] != 0]  # Keep only rows where a bar was written (close is not 0)
    return arr

currency = 'USDJPY'  # Currency pair name
ashi = 60  # Bar length in minutes (60 gives 1-hour bars)
arr = None  # Initialize arr
csv_dir = '../csv/' + currency + '/'  # csv/<currency name>/ folder
# List of csv/<currency name>/<currency name>_yyyymm folders.
# glob does not guarantee order, so sort to keep the months chronological.
dir_list = sorted(glob.glob(csv_dir + '*'))
for i in range(len(dir_list)):
    # Path list of the daily csv files, sorted so the days are chronological
    file_list = sorted(glob.glob(dir_list[i] + '/' + dir_list[i][-6:] + '/*'))
    for j in range(len(file_list)):
        pre_arr = copy(arr)  # Save the previous arr to pre_arr
        csv_arr = np.loadtxt(file_list[j], delimiter=",", skiprows=1, dtype='object')  # Load csv into array
        arr = make_ohlc(ashi, csv_arr)  # Convert to the target timeframe
        if pre_arr is not None:
            # Concatenate the previous arr and the converted arr
            arr = np.vstack([pre_arr, arr])
filename = currency + '_ashi=' + str(ashi) + '.csv'
np.savetxt(filename, arr, delimiter=",", header="Date,Open,High,Low,Close", fmt="%s")  # Save to CSV
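The grouping inside make_ohlc can also be expressed without a Python loop. The following is only a sketch of that idea, not the method used in this article: it swaps the loop for a NumPy reshape, works on the four price columns only (the date column is left out), and the function name `ohlc_by_reshape` is hypothetical.

```python
import numpy as np


def ohlc_by_reshape(prices, ashi):
    """Aggregate an (n, 4) array of 1-minute open/high/low/close rows
    into bars of `ashi` rows each."""
    prices = np.asarray(prices, dtype=float)
    n_bars = len(prices) // ashi  # an incomplete trailing bar is dropped
    # Shape (n_bars, ashi, 4): one block of `ashi` minutes per bar.
    blocks = prices[: n_bars * ashi].reshape(n_bars, ashi, 4)
    return np.column_stack([
        blocks[:, 0, 0],              # open of the first minute
        blocks[:, :, 1].max(axis=1),  # highest high in the block
        blocks[:, :, 2].min(axis=1),  # lowest low in the block
        blocks[:, -1, 3],             # close of the last minute
    ])
```

Like make_ohlc, this groups rows by position rather than by timestamp, so it gives correct bars only when no minutes are missing and the data starts on a bar boundary.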
This time the CSV files were spread across multiple directories, so the code became a little longer because of the concatenation and related processing. If you already have the data in a single CSV file, the following code is enough.
import numpy as np

def make_ohlc(ashi, arr=None):
    """
    Function Description: Creates OHLC bars for the specified timeframe and returns an array.
    ashi: Timeframe after conversion, in minutes. If it is 60, it is 1 hour.
    arr: A 1-minute csv file converted to an array.
    """
    # If the CSV has 6 or more columns, keep only the 1st to 5th columns.
    if arr.shape[1] > 5:
        arr = arr[:, 0:5]
    arr = np.c_[arr, np.zeros((len(arr), 4))]  # Append 4 columns for the new open/high/low/close
    # Bars are grouped by row position: every `ashi` rows form one bar.
    for i in range(0, len(arr), ashi):
        try:
            arr[i, 5] = arr[i, 1]  # Open price
            max_tmp = arr[i:i+ashi, 2].astype(float)  # High prices in the period (np.float was removed in NumPy 1.24)
            arr[i, 6] = max_tmp.max()  # High price
            min_tmp = arr[i:i+ashi, 3].astype(float)  # Low prices in the period
            arr[i, 7] = min_tmp.min()  # Low price
            arr[i, 8] = arr[i+ashi-1, 4]  # Closing price
        except IndexError:
            # Incomplete trailing bar: its close stays 0 and the row is dropped below.
            pass
    arr = np.delete(arr, [1, 2, 3, 4], axis=1)  # Delete the 2nd to 5th columns because they are no longer needed
    arr = arr[arr[:, 4] != 0]  # Keep only rows where a bar was written (close is not 0)
    return arr

currency = 'USDJPY'  # Currency pair name
ashi = 60  # Bar length in minutes (60 gives 1-hour bars)
csv_arr = np.loadtxt('<csv file path>', delimiter=",", skiprows=1, dtype='object')  # Load csv into array (replace the placeholder with your file path)
arr = make_ohlc(ashi, csv_arr)  # Convert to the target timeframe
filename = currency + '_ashi=' + str(ashi) + '.csv'
np.savetxt(filename, arr, delimiter=",", header="Date,Open,High,Low,Close", fmt="%s")  # Save to CSV
Confirm that the output CSV file contains 1-hour bars.
If you found this helpful, please click LGTM. It encourages me to keep updating.
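Beyond checking the file by eye, a few basic consistency checks can be scripted. This is a minimal sketch, assuming the file was written by np.savetxt as in the code above (date in the first column, then open, high, low, close) and that the source 1-minute rows are themselves consistent. The function name `check_ohlc_csv` is hypothetical.

```python
import numpy as np


def check_ohlc_csv(path):
    """Run basic sanity checks on an OHLC CSV and return the number of bars."""
    # np.savetxt prefixes the header line with "# " by default, and
    # np.loadtxt skips lines starting with "#", so only data rows are read.
    rows = np.loadtxt(path, delimiter=",", dtype=str, ndmin=2)
    o, h, l, c = (rows[:, k].astype(float) for k in range(1, 5))
    # The high must not be below the open or close, and the low must not be above them.
    assert np.all(h >= np.maximum(o, c)), "high is below open or close"
    assert np.all(l <= np.minimum(o, c)), "low is above open or close"
    return len(rows)
```

For example, `check_ohlc_csv('USDJPY_ashi=60.csv')` would return the number of bars in the file produced above, or raise an AssertionError if a row is inconsistent.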
The following article introduces how to create a chart image from a CSV file. https://qiita.com/sw1394/items/b2a86cfc663d89915e28