Examination of exchange rate forecasting method using deep learning and wavelet transform

Introduction

This method has not produced good results yet. I am trying out ideas as a hobby, so this article will not be useful if you are looking for a tool you can use right away. Please keep that in mind. In the previous two articles, I used the spectrogram, a technique from audio analysis, to turn exchange rate data (USD/JPY) into images and trained a CNN on them. The result was a crushing defeat: the accuracy on the test data did not improve.

[Article before last] Examination of exchange rate forecasting method using Deep Learning and Spectrogram
[Previous article] Examination of exchange rate forecasting method using Deep Learning and Spectrogram -Part 2-

While thinking about what to do next, I came across an article claiming that the wavelet transform suits financial data better than the FFT. So this time I examined whether combining the wavelet transform with a CNN can predict whether the exchange rate will be higher or lower 30 minutes later.

What is wavelet transform?

Figure 1 shows a schematic diagram of the wavelet transform. The FFT used in the previous articles expresses a complex waveform as a sum of sine waves that extend infinitely in time. The wavelet transform, by contrast, expresses a complex waveform as a sum of localized waves (wavelets) that are shifted in time and stretched or compressed in scale. The FFT is good at analyzing stationary signals, while the wavelet transform is better suited to irregular, non-stationary waveforms.

Figure 1. Schematic diagram of wavelet transform. Source: https://www.slideshare.net/ryosuketachibana12/ss-42388444
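To make the difference concrete, here is a small sketch of my own (it is not part of the analysis code, and the test signal is made up). It places the same short burst at two different times and compares the FFT magnitudes. A circular time shift only changes the phase of the FFT, so the magnitude spectrum cannot tell when the burst happened.

```python
import numpy as np

n = 512
t = np.arange(n)

def burst(center, width=8.0, period=16.0):
    """A short oscillation localized around `center` (hypothetical test signal)."""
    envelope = np.exp(-0.5 * ((t - center) / width) ** 2)
    return envelope * np.sin(2 * np.pi * t / period)

early = burst(center=100)
late = np.roll(early, 300)  # the same burst, moved 300 samples later (circular shift)

# A circular time shift changes only the phase of the FFT, not its magnitude.
mag_early = np.abs(np.fft.fft(early))
mag_late = np.abs(np.fft.fft(late))

same_spectrum = np.allclose(mag_early, mag_late)
same_signal = np.allclose(early, late)
print("same magnitude spectrum:", same_spectrum)
print("same signal:", same_signal)
```

The two signals differ, yet their magnitude spectra match. A transform built from localized waves keeps the time information that is lost here.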

A map of the wavelet coefficient strength at each shift (time) and each scale (which corresponds to frequency) is called a scalogram. Figure 2 is a scalogram created from the wavelet transform of y = sin(πx/16). In this way, the wavelet transform can turn an arbitrary waveform into an image.

Figure 2. Scalogram example, y = sin(πx/16)
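The idea behind a scalogram can be sketched with plain NumPy. This is a simplified illustration and not the PyWavelets implementation used in the code at the end of this article: the functions gaussian_derivative and simple_cwt, the truncation of the wavelet at four scale widths, and the 1/sqrt(scale) normalization are all my own assumptions. For each scale, the signal is correlated with a stretched copy of a Gaussian-derivative wavelet.

```python
import numpy as np

def gaussian_derivative(u):
    """First derivative of a Gaussian (up to sign and scaling): an odd, localized wave."""
    return -u * np.exp(-0.5 * u ** 2)

def simple_cwt(signal, scales):
    """Return wavelet coefficients with shape (len(scales), len(signal))."""
    rows = []
    for s in scales:
        half = int(np.ceil(4 * s))              # the wavelet is close to 0 beyond 4 scale widths
        u = np.arange(-half, half + 1) / s      # time axis stretched by the scale
        kernel = gaussian_derivative(u) / np.sqrt(s)
        # Reversing the kernel turns np.convolve into a correlation with the wavelet.
        rows.append(np.convolve(signal, kernel[::-1], mode="same"))
    return np.array(rows)

x = np.arange(512)
y = np.sin(np.pi * x / 16)
scales = np.arange(1, 33)

coef = simple_cwt(y, scales)
scalogram = np.abs(coef)                        # strength at each (scale, time)

# Away from the edges, find the scale that responds most strongly to the sine wave.
interior = scalogram[:, 128:384]
best_scale = scales[np.argmax(interior.mean(axis=1))]
print(scalogram.shape, best_scale)
```

For a pure sine wave, the strength concentrates in a narrow band of scales and stays roughly constant over time, which gives a horizontal stripe when the array is drawn as an image.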

There are two types of wavelet transform, the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT); this time I used the CWT. Wavelets come in various shapes, but for now I used a Gaussian wavelet (gaus1 in the code below).

Create a scalogram from currency exchange data

I created scalograms from the closing prices of 5-minute USD/JPY bars. From several years of data, I repeatedly cut out 24 hours of data (288 bars) and created one scalogram from each piece. One sample consists of one scalogram and the direction of the price movement (up or down) 30 minutes after the last bar in the scalogram. There was one problem here: the scalogram is distorted at its boundaries (edges). This happens because near a boundary the wavelet extends past the end of the data, where there are no values to transform.

Figure 3. Distortion at the scalogram boundary
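The slicing and labelling described above can be sketched as follows. This is a simplified stand-in for the code in the appendix: make_windows is a name I made up for illustration, and the prices are synthetic rather than real USD/JPY data. Each window holds 288 closing prices, and its label is 1 if the close 6 bars (30 minutes) after the last bar is higher than the last bar, otherwise 0.

```python
import numpy as np

def make_windows(close, height=288, horizon=6):
    """Cut non-overlapping windows of `height` bars and label each one.

    The label is 1 if the close `horizon` bars after the last bar of the
    window is higher than that last bar, otherwise 0.
    """
    windows, labels = [], []
    start = 0
    while start + height + horizon <= len(close):
        last = start + height - 1
        windows.append(close[start:start + height])
        labels.append(int(close[last + horizon] > close[last]))
        start += height
    return np.array(windows), np.array(labels)

rng = np.random.default_rng(0)
# Synthetic random-walk prices, used only to exercise the function.
close = 110.0 + np.cumsum(rng.normal(0, 0.01, size=1000))

windows, labels = make_windows(close)
print(windows.shape, labels.shape)
```

Each row of windows would then be turned into one scalogram, and labels supplies the matching up/down target.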

To remove the distortion, I attached a left-right mirrored copy of the raw data to both ends of the raw data. After the wavelet transform, I extracted only the central part that corresponds to the raw data. This is a common way to reduce the distortion, but it has its critics because it amounts to adding fictitious data.

Figure 4. How to remove distortion at the boundary
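The mirroring and cropping steps look like this in isolation. This is a minimal sketch with made-up helper names (mirror_pad, crop_center) and a toy four-sample window; the identity stands in for the wavelet transform so that only the padding logic is shown.

```python
import numpy as np

def mirror_pad(window):
    """Attach a reversed copy of the window to both ends: [reversed | window | reversed]."""
    mirrored = window[::-1]
    return np.concatenate([mirrored, window, mirrored])

def crop_center(transformed, height):
    """Keep only the middle `height` samples along the time axis (axis 0)."""
    return transformed[height:2 * height]

window = np.array([1.0, 2.0, 4.0, 7.0])
padded = mirror_pad(window)
print(padded)

# The same padding is available in NumPy as the "symmetric" mode of np.pad.
assert np.array_equal(padded, np.pad(window, len(window), mode="symmetric"))

# After a transform that keeps the time axis length, the center is cut back out.
# The identity stands in for the wavelet transform here.
restored = crop_center(padded, len(window))
assert np.array_equal(restored, window)
```

Because the padded series is continuous at both joins, the wavelet sees no jump at the original edges, and the distorted region is pushed out into the padding that gets discarded.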

CNN structure and learning flow

I also changed the training flow a little. Up to the last article, I trained on the past 10 years of data and verified the accuracy on the most recent 3 months. This time, after training on the past 10 years, I continued training on the past 5 years, and then shortened the training period to 2 years and finally 1 year. The reasoning is that the most recent price movements should carry more weight when predicting the future. I also extended the test period to 5 months. The test data is not used for training; in other words, it is unseen data for the AI.

Figure 5. Learning flow
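The staged schedule can be written as a simple loop. The sketch below is only an outline under my own assumptions: run_schedule and fake_train are hypothetical names, and fake_train merely records its arguments instead of training anything. The file names are the ones listed in the appendix, and 10000 iterations per stage matches the setting in the code.

```python
def run_schedule(stages, train_one_stage, iters_per_stage=10000):
    """Train on each stage in order, continuing from the weights of the previous stage.

    stages          : list of (label, training_file) pairs, longest period first
    train_one_stage : callable(training_file, iters) that updates the shared model
    Returns a list of (label, first_global_step, last_global_step).
    """
    history = []
    global_step = 0
    for label, training_file in stages:
        train_one_stage(training_file, iters_per_stage)
        history.append((label, global_step, global_step + iters_per_stage))
        global_step += iters_per_stage
    return history

stages = [
    ("10 years", "USDJPY_20070301_20170228_5min.csv"),
    ("5 years", "USDJPY_20120301_20170228_5min.csv"),
    ("2 years", "USDJPY_20150301_20170228_5min.csv"),
    ("1 year", "USDJPY_20160301_20170228_5min.csv"),
]

calls = []
def fake_train(training_file, iters):
    # Stand-in for the real training loop; it only records what it was asked to do.
    calls.append((training_file, iters))

history = run_schedule(stages, fake_train)
for label, first, last in history:
    print(label, first, last)
```

The model is never re-initialized between stages, so each shorter period fine-tunes the weights learned on the longer one.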

Figure 6 shows the structure of the CNN used this time.

Figure 6. Structure of CNN used this time

Calculation result

I tried it expecting it to work this time, but the result is shown in Figure 7. Once again, the accuracy on the test data did not improve. Incidentally, the accuracy on the training data drops at iterations = 20000 and 30000, which coincides with the points where the training data period is switched.

Figure 7. Calculation result

Conclusion

I think one reason it does not work is that every scalogram covers the same fixed length of time. This time, every scalogram was created from 24 hours of waveform data, whereas people who actually trade change the length of the period they look at as needed. Recently I have become interested in game theory and started studying it, so I will take a break from currency analysis for a while.

Yu-Nie

Appendix

The data used for the analysis can be downloaded from the following.

Training data
USDJPY_20070301_20170228_5min.csv
USDJPY_20120301_20170228_5min.csv
USDJPY_20150301_20170228_5min.csv
USDJPY_20160301_20170228_5min.csv

Test data
USDJPY_20170301_20170731_5min.csv

Below is the code used for the analysis.

Jack_for_qiita_TF_version.py



# 20170821
# y.izumi

import tensorflow as tf
import numpy as np
import scalogram2 as sca
import time

"""Functions that perform parameter initialization, convolution operations, and pooling operations"""
#=============================================================================================================================================
#Weight initialization function
def weight_variable(shape, stddev=1e-4): # default stddev = 1e-4
    initial = tf.truncated_normal(shape, stddev=stddev)
    return tf.Variable(initial)
#Bias initialization function
def bias_variable(shape):
    initial = tf.constant(0.0, shape=shape)
    return tf.Variable(initial)
#Convolution operation
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="SAME")
# pooling
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")
#=============================================================================================================================================

"""Functions that perform learning"""
#=============================================================================================================================================
def train(x_train, t_train, x_test, t_test, iters, acc_list, num_data_each_conf, acc_each_conf, total_cal_time, train_step, train_batch_size, test_batch_size):
    """
    x_train  :Training data
    t_train  :Training labels, one-hot
    x_test   :Test data
    t_test   :Test labels, one-hot
    iters    :Number of training iterations
    acc_list :List that records accuracy and loss over the course of training
    num_data_each_conf :List that records the number of samples in each confidence bin
    acc_each_conf      :List that records the accuracy in each confidence bin
    total_cal_time     :Total calculation time
    train_step         :Training op
    train_batch_size   :Batch size of training data
    test_batch_size    :Batch size of test data
    """
    train_size = x_train.shape[0] #Number of training data
    test_size = x_test.shape[0]   #Number of test data
    start_time = time.time()
    
    iters = iters + 1    
    for step in range(iters):
        batch_mask = np.random.choice(train_size, train_batch_size)
        tr_batch_xs = x_train[batch_mask]
        tr_batch_ys = t_train[batch_mask]

        #Confirmation of accuracy during learning
        if step%100 == 0:
            
            cal_time = time.time() - start_time #Calculation time count
            total_cal_time += cal_time
            
            # train
            train_accuracy = accuracy.eval(feed_dict={x: tr_batch_xs, y_: tr_batch_ys, keep_prob: 1.0})
            train_loss = cross_entropy.eval(feed_dict={x: tr_batch_xs, y_: tr_batch_ys, keep_prob: 1.0})
            
            # test
            # use all data
            test_accuracy = accuracy.eval(feed_dict={x: x_test, y_: t_test, keep_prob: 1.0})
            test_loss = cross_entropy.eval(feed_dict={x: x_test, y_: t_test, keep_prob: 1.0})
            
            # use test batch
            # batch_mask = np.random.choice(test_size, test_batch_size)
            # te_batch_xs = x_test[batch_mask]
            # te_batch_ys = t_test[batch_mask]
            # test_accuracy = accuracy.eval(feed_dict={x: te_batch_xs, y_: te_batch_ys, keep_prob: 1.0})
            # test_loss = cross_entropy.eval(feed_dict={x: te_batch_xs, y_: te_batch_ys, keep_prob: 1.0})        

            print("calculation time %d sec, step %d, training accuracy %g, training loss %g, test accuracy %g, test loss %g"%(cal_time, step, train_accuracy, train_loss, test_accuracy, test_loss))
            acc_list.append([step, train_accuracy, test_accuracy, train_loss, test_loss])
            
            AI_prediction = y_conv.eval(feed_dict={x: x_test, y_: t_test, keep_prob: 1.0}) #AI prediction results
            # print("AI_prediction.shape " + str(AI_prediction.shape)) # for debag
            # print("AI_prediction.type" + str(type(AI_prediction)))
            
            AI_correct_prediction = correct_prediction.eval(feed_dict={x: x_test, y_: t_test, keep_prob: 1.0}) #Correct answer:TRUE,Incorrect answer:FALSE
            # print("AI_prediction.shape " + str(AI_prediction.shape)) # for debag
            # print("AI_prediction.type" + str(type(AI_prediction)))
            AI_correct_prediction_int = AI_correct_prediction.astype(int) #Correct answer:1,Incorrect answer:0 (np.int was removed from recent NumPy)
            
            #Calculate the number of samples and the accuracy for each confidence bin
            # Confidence from 50% to 60% inclusive (or from 40% inclusive to below 50%)
            a = AI_prediction[:,0] >= 0.5
            b = AI_prediction[:,0] <= 0.6
            # print("a " + str(a)) # for debag
            # print("a.shape " + str(a.shape))
            cnf_50to60 = np.logical_and(a, b)
            # print("cnf_50to60 " + str(cnf_50to60)) # for debag
            # print("cnf_50to60.shape " + str(cnf_50to60.shape))
            
            a = AI_prediction[:,0] >= 0.4
            b = AI_prediction[:,0] < 0.5
            cnf_40to50 = np.logical_and(a, b)
            
            cnf_50to60 = np.logical_or(cnf_50to60, cnf_40to50)
            cnf_50to60_int = cnf_50to60.astype(int)
            # print("cnf_50to60_int " + str(cnf_50to60)) # for debag
            # print("cnf_50to60.shape " + str(cnf_50to60.shape))
            
            correct_prediction_50to60 = np.logical_and(cnf_50to60, AI_correct_prediction)
            correct_prediction_50to60_int = correct_prediction_50to60.astype(int)
            
            sum_50to60 = np.sum(cnf_50to60_int)                             #Number of samples with confidence from 50% to 60%
            acc_50to60 = np.sum(correct_prediction_50to60_int) / sum_50to60 #Accuracy for confidence from 50% to 60% (nan if the bin is empty)
            
            # Confidence above 60% up to 70% inclusive (or from 30% inclusive to below 40%)
            a = AI_prediction[:,0] > 0.6
            b = AI_prediction[:,0] <= 0.7
            cnf_60to70 = np.logical_and(a, b)
            
            a = AI_prediction[:,0] >= 0.3
            b = AI_prediction[:,0] < 0.4
            cnf_30to40 = np.logical_and(a, b)
            
            cnf_60to70 = np.logical_or(cnf_60to70, cnf_30to40)
            cnf_60to70_int = cnf_60to70.astype(int)
            
            correct_prediction_60to70 = np.logical_and(cnf_60to70, AI_correct_prediction)
            correct_prediction_60to70_int = correct_prediction_60to70.astype(int)
            
            sum_60to70 = np.sum(cnf_60to70_int)
            acc_60to70 = np.sum(correct_prediction_60to70_int) / sum_60to70
            
            # Confidence above 70% up to 80% inclusive (or from 20% inclusive to below 30%)
            a = AI_prediction[:,0] > 0.7
            b = AI_prediction[:,0] <= 0.8
            cnf_70to80 = np.logical_and(a, b)
            
            a = AI_prediction[:,0] >= 0.2
            b = AI_prediction[:,0] < 0.3
            cnf_20to30 = np.logical_and(a, b)
            
            cnf_70to80 = np.logical_or(cnf_70to80, cnf_20to30)
            cnf_70to80_int = cnf_70to80.astype(int)
            
            correct_prediction_70to80 = np.logical_and(cnf_70to80, AI_correct_prediction)
            correct_prediction_70to80_int = correct_prediction_70to80.astype(int)
            
            sum_70to80 = np.sum(cnf_70to80_int)
            acc_70to80 = np.sum(correct_prediction_70to80_int) / sum_70to80
            
            # Confidence above 80% up to 90% inclusive (or from 10% inclusive to below 20%)
            a = AI_prediction[:,0] > 0.8
            b = AI_prediction[:,0] <= 0.9
            cnf_80to90 = np.logical_and(a, b)
            
            a = AI_prediction[:,0] >= 0.1
            b = AI_prediction[:,0] < 0.2
            cnf_10to20 = np.logical_and(a, b)
            
            cnf_80to90 = np.logical_or(cnf_80to90, cnf_10to20)
            cnf_80to90_int = cnf_80to90.astype(int)
            
            correct_prediction_80to90 = np.logical_and(cnf_80to90, AI_correct_prediction)
            correct_prediction_80to90_int = correct_prediction_80to90.astype(int)
            
            sum_80to90 = np.sum(cnf_80to90_int)
            acc_80to90 = np.sum(correct_prediction_80to90_int) / sum_80to90
            
            # Confidence above 90% up to 100% inclusive (or from 0% inclusive to below 10%)
            a = AI_prediction[:,0] > 0.9
            b = AI_prediction[:,0] <= 1.0
            cnf_90to100 = np.logical_and(a, b)
            
            a = AI_prediction[:,0] >= 0
            b = AI_prediction[:,0] < 0.1
            cnf_0to10 = np.logical_and(a, b)
            
            cnf_90to100 = np.logical_or(cnf_90to100, cnf_0to10)
            cnf_90to100_int = cnf_90to100.astype(int)
            
            correct_prediction_90to100 = np.logical_and(cnf_90to100, AI_correct_prediction)
            correct_prediction_90to100_int = correct_prediction_90to100.astype(int)
            
            sum_90to100 = np.sum(cnf_90to100_int)
            acc_90to100 = np.sum(correct_prediction_90to100_int) / sum_90to100
            
            print("Number of data of each confidence 50to60:%g, 60to70:%g, 70to80:%g, 80to90:%g, 90to100:%g "%(sum_50to60, sum_60to70, sum_70to80, sum_80to90, sum_90to100))
            print("Accuracy rate of each confidence  50to60:%g, 60to70:%g, 70to80:%g, 80to90:%g, 90to100:%g "%(acc_50to60, acc_60to70, acc_70to80, acc_80to90, acc_90to100))
            print("")
            
            num_data_each_conf.append([step, sum_50to60, sum_60to70, sum_70to80, sum_80to90, sum_90to100])
            acc_each_conf.append([step, acc_50to60, acc_60to70, acc_70to80, acc_80to90, acc_90to100])
            
            #Exporting files for tensorboard
            result = sess.run(merged, feed_dict={x:tr_batch_xs, y_: tr_batch_ys, keep_prob: 1.0})
            writer.add_summary(result, step)
            
            start_time = time.time()

        #Execution of learning
        train_step.run(feed_dict={x: tr_batch_xs, y_: tr_batch_ys, keep_prob: 0.5})
            
    return acc_list, num_data_each_conf, acc_each_conf, total_cal_time
#==============================================================================================================================================

"""Functions that create scalograms and labels"""
#==============================================================================================================================================
def make_scalogram(train_file_name, test_file_name, scales, wavelet, height, width, predict_time_inc, ch_flag, save_flag, over_lap_inc):
    """
    train_file_name :File name of training data
    test_file_name  :File name of test data
    scales  :Scales to use, given as a numpy array. The scale corresponds to the frequency of the wavelet used for analysis: a large scale means low frequency, a small scale means high frequency
    wavelet :Wavelet name, one of the following
              'gaus1', 'gaus2', 'gaus3', 'gaus4', 'gaus5', 'gaus6', 'gaus7', 'gaus8', 'mexh', 'morl'
    height  :Image height, num of time lines
    width   :Image width,  num of freq lines
    predict_time_inc :Prediction horizon, in bars, for the price movement
    ch_flag      :Number of channels to use, ch_flag=1:close, ch_flag=5:start, high, low, close, volume
    save_flag    : save_flag=1 :Save the CWT coefficients as a csv file, save_flag=0 :Do not save the CWT coefficients as a csv file
    over_lap_inc :Increment of the CWT start time, in bars
    """
    #Creating scalograms and labels
    # train
    x_train, t_train, freq_train = sca.merge_scalogram(train_file_name, scales, wavelet, height, width, predict_time_inc, ch_flag, save_flag, over_lap_inc)
    # x_train, t_train, freq_train = sca.merge_scalogram(test_file_name, scales, wavelet, height, width, predict_time_inc, ch_flag, save_flag, over_lap_inc) # for debag
    # test
    x_test, t_test, freq_test = sca.merge_scalogram(test_file_name, scales, wavelet, height, width, predict_time_inc, ch_flag, save_flag, over_lap_inc)
    print("x_train shape " + str(x_train.shape))
    print("t_train shape " + str(t_train.shape))
    print("x_test shape " + str(x_test.shape))
    print("t_test shape " + str(t_test.shape))
    print("frequency " + str(freq_test))
    
    #Swap dimensions for tensorflow
    x_train = x_train.transpose(0, 2, 3, 1) # (num_data, ch, height(time_lines), width(freq_lines)) ⇒ (num_data, height(time_lines), width(freq_lines), ch)
    x_test = x_test.transpose(0, 2, 3, 1)

    train_size = x_train.shape[0]   #Number of training data
    test_size = x_test.shape[0]     #Number of test data

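    # labels to one-hot
    t_train_onehot = np.zeros((train_size, 2))
    t_test_onehot = np.zeros((test_size, 2))
    t_train_onehot[np.arange(train_size), np.asarray(t_train).astype(int)] = 1 # cast so that boolean labels act as column indices, not as a mask
    t_test_onehot[np.arange(test_size), np.asarray(t_test).astype(int)] = 1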
    t_train = t_train_onehot
    t_test = t_test_onehot

    # print("t train shape onehot" + str(t_train.shape)) # for debag
    # print("t test shape onehot" + str(t_test.shape))
    
    return x_train, t_train, x_test, t_test
#==============================================================================================================================================

"""Scalogram creation conditions"""
#=============================================================================================================================================
predict_time_inc = 6                      #Prediction horizon in bars (6 bars x 5 min = 30 min)
height = 288                              #Image height, num of time lines (288 bars x 5 min = 24 h)
width = 128                               #Image width,  num of freq lines
ch_flag = 1                               #Number of channels to use, ch_flag=1:close, ch_flag=5:start, high, low, close, volume
input_dim = (ch_flag, height, width)      # channel = (1, 5), height(time_lines), width(freq_lines)
save_flag = 0                             # save_flag=1 :Save the CWT coefficients as a csv file, save_flag=0 :Do not save the CWT coefficients as a csv file
scales = np.linspace(0.2,80,width)        #Scales to use, given as a numpy array. A large scale means low frequency, a small scale means high frequency
# scales = np.arange(1,129)
wavelet = "gaus1"                         #Wavelet name, 'gaus1', 'gaus2', 'gaus3', 'gaus4', 'gaus5', 'gaus6', 'gaus7', 'gaus8', 'mexh', 'morl'
over_lap_inc = 72                         #Increment of the CWT start time, in bars
#==============================================================================================================================================

"""Build CNN"""
#==============================================================================================================================================
x  = tf.placeholder(tf.float32, [None, input_dim[1], input_dim[2], input_dim[0]]) # (num_data, height(time), width(freq_lines), ch)
y_ = tf.placeholder(tf.float32, [None, 2]) # (num_data, num_label)
print("input shape ", str(x.get_shape()))

with tf.variable_scope("conv1") as scope:
    W_conv1 = weight_variable([5, 5, input_dim[0], 16])
    b_conv1 = bias_variable([16])
    h_conv1 = tf.nn.relu(conv2d(x, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)
    print("conv1 shape ", str(h_pool1.get_shape()))

with tf.variable_scope("conv2") as scope:
    W_conv2 = weight_variable([5, 5, 16, 32])
    b_conv2 = bias_variable([32])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)
    print("conv2 shape ", str(h_pool2.get_shape()))
    h_pool2_height = int(h_pool2.get_shape()[1])
    h_pool2_width = int(h_pool2.get_shape()[2])

with tf.variable_scope("fc1") as scope:
    W_fc1 = weight_variable([h_pool2_height*h_pool2_width*32, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, h_pool2_height*h_pool2_width*32])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    print("fc1 shape ", str(h_fc1.get_shape()))
    keep_prob = tf.placeholder(tf.float32)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

with tf.variable_scope("fc2") as scope:
    W_fc2 = weight_variable([1024, 2])
    b_fc2 = bias_variable([2])
    y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
    print("output shape ", str(y_conv.get_shape()))

#Visualize parameters with tensorboard
W_conv1 = tf.summary.histogram("W_conv1", W_conv1)
b_conv1 = tf.summary.histogram("b_conv1", b_conv1)
W_conv2 = tf.summary.histogram("W_conv2", W_conv2)
b_conv2 = tf.summary.histogram("b_conv2", b_conv2)
W_fc1 = tf.summary.histogram("W_fc1", W_fc1)
b_fc1 = tf.summary.histogram("b_fc1", b_fc1)
W_fc2 = tf.summary.histogram("W_fc2", W_fc2)
b_fc2 = tf.summary.histogram("b_fc2", b_fc2)
#==============================================================================================================================================

"""Specifying the error function"""
#==============================================================================================================================================
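# y_conv is already a softmax output, so the cross entropy is computed from it directly.
# (tf.nn.softmax_cross_entropy_with_logits expects raw logits and would apply softmax a second time.)
cross_entropy = -tf.reduce_sum(y_ * tf.log(tf.clip_by_value(y_conv, 1e-10, 1.0)))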
loss_summary = tf.summary.scalar("loss", cross_entropy) # for tensorboard
#==============================================================================================================================================

"""Specify optimizer"""
#==============================================================================================================================================
optimizer = tf.train.AdamOptimizer(1e-4)
train_step = optimizer.minimize(cross_entropy)

#Visualize the gradient with a tensorboard
grads = optimizer.compute_gradients(cross_entropy) # list of (gradient, variable) pairs
dW_conv1 = tf.summary.histogram("dW_conv1", grads[0][0]) # for tensorboard, [0] selects the gradient of each pair
db_conv1 = tf.summary.histogram("db_conv1", grads[1][0])
dW_conv2 = tf.summary.histogram("dW_conv2", grads[2][0])
db_conv2 = tf.summary.histogram("db_conv2", grads[3][0])
dW_fc1 = tf.summary.histogram("dW_fc1", grads[4][0])
db_fc1 = tf.summary.histogram("db_fc1", grads[5][0])
dW_fc2 = tf.summary.histogram("dW_fc2", grads[6][0])
db_fc2 = tf.summary.histogram("db_fc2", grads[7][0])

# for i in range(8): # for debag
#     print(grads[i])
#==============================================================================================================================================

"""Parameters for accuracy verification"""
#==============================================================================================================================================
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
accuracy_summary = tf.summary.scalar("accuracy", accuracy) # for tensorboard
#==============================================================================================================================================

"""Execution of learning"""
#==============================================================================================================================================
acc_list = []            #List to save the accuracy rate and the progress of the error
num_data_each_conf = []  #List that records the number of samples in each confidence bin
acc_each_conf = []       #List that records the accuracy in each confidence bin
start_time = time.time() #Calculation time count
total_cal_time = 0
iters = 10000            #Number of trainings for each training data
train_batch_size = 100   #Learning batch size
test_batch_size = 100    #Test batch size

with tf.Session() as sess:
    saver = tf.train.Saver()
    sess.run(tf.global_variables_initializer())

    #Exporting files for tensorboard
    merged = tf.summary.merge_all()
    writer = tf.summary.FileWriter(r"temp_result", sess.graph)

    print("learning term = 10year")
    train_file_name = "USDJPY_20070301_20170228_5min.csv" #Exchange data file name, train
    # train_file_name = "USDJPY_20170301_20170731_5min.csv" # for debag
    test_file_name = "USDJPY_20170301_20170731_5min.csv"  #Exchange data file name, test
    #Creating a scalogram
    x_train, t_train, x_test, t_test = make_scalogram(train_file_name, test_file_name, scales, wavelet, height, width, predict_time_inc, ch_flag, save_flag, over_lap_inc)
    #Execution of learning
    acc_list, num_data_each_conf, acc_each_conf, total_cal_time = train(x_train, t_train, x_test, t_test, iters, acc_list, num_data_each_conf, acc_each_conf, total_cal_time, train_step, train_batch_size, test_batch_size)

    print("learning term = 5year")
    train_file_name = "USDJPY_20120301_20170228_5min.csv" #Exchange data file name, train
    # train_file_name = "USDJPY_20170301_20170731_5min.csv" # for debag
    test_file_name = "USDJPY_20170301_20170731_5min.csv"  #Exchange data file name, test
    #Creating a scalogram
    x_train, t_train, x_test, t_test = make_scalogram(train_file_name, test_file_name, scales, wavelet, height, width, predict_time_inc, ch_flag, save_flag, over_lap_inc)
    #Execution of learning
    acc_list, num_data_each_conf, acc_each_conf, total_cal_time = train(x_train, t_train, x_test, t_test, iters, acc_list, num_data_each_conf, acc_each_conf, total_cal_time, train_step, train_batch_size, test_batch_size)

    print("learning term = 2year")
    train_file_name = "USDJPY_20150301_20170228_5min.csv" #Exchange data file name, train
    # train_file_name = "USDJPY_20170301_20170731_5min.csv" # for debag
    test_file_name = "USDJPY_20170301_20170731_5min.csv"  #Exchange data file name, test
    #Creating a scalogram
    x_train, t_train, x_test, t_test = make_scalogram(train_file_name, test_file_name, scales, wavelet, height, width, predict_time_inc, ch_flag, save_flag, over_lap_inc)
    #Execution of learning
    acc_list, num_data_each_conf, acc_each_conf, total_cal_time = train(x_train, t_train, x_test, t_test, iters, acc_list, num_data_each_conf, acc_each_conf, total_cal_time, train_step, train_batch_size, test_batch_size)

    print("learning term = 1year")
    train_file_name = "USDJPY_20160301_20170228_5min.csv" #Exchange data file name, train
    # train_file_name = "USDJPY_20170301_20170731_5min.csv" # for debag
    test_file_name = "USDJPY_20170301_20170731_5min.csv"  #Exchange data file name, test
    #Creating a scalogram
    x_train, t_train, x_test, t_test = make_scalogram(train_file_name, test_file_name, scales, wavelet, height, width, predict_time_inc, ch_flag, save_flag, over_lap_inc)
    #Execution of learning
    acc_list, num_data_each_conf, acc_each_conf, total_cal_time = train(x_train, t_train, x_test, t_test, iters, acc_list, num_data_each_conf, acc_each_conf, total_cal_time, train_step, train_batch_size, test_batch_size)
    
    #Final accuracy rate for test data
    # use all data
    print("test accuracy %g"%accuracy.eval(feed_dict={x: x_test, y_: t_test, keep_prob: 1.0}))
    
    # use test batch
    # batch_mask = np.random.choice(test_size, test_batch_size)
    # te_batch_xs = x_test[batch_mask]
    # te_batch_ys = t_test[batch_mask]
    # test_accuracy = accuracy.eval(feed_dict={x: te_batch_xs, y_: te_batch_ys, keep_prob: 1.0})
    
    print("total calculation time %g sec"%total_cal_time)
    
    np.savetxt("temp_result/acc_list.csv", acc_list, delimiter = ",")                                 #Write out the history of accuracy and loss
    np.savetxt("temp_result/number_of_data_each_confidence.csv", num_data_each_conf, delimiter = ",") #Write out the number of samples in each confidence bin
    np.savetxt("temp_result/accuracy_rate_of_each_confidence.csv", acc_each_conf, delimiter = ",")    #Write out the accuracy in each confidence bin
    saver.save(sess, "temp_result/spectrogram_model.ckpt")                                            #Save the final parameters
#==============================================================================================================================================

scalogram2.py



# -*- coding: utf-8 -*-
"""
Created on Tue Jul 25 11:24:50 2017

@author: izumiy
"""

import pywt
import numpy as np
import matplotlib.pyplot as plt

def create_scalogram_1(time_series, scales, wavelet, predict_time_inc, save_flag, ch_flag, height, width):
    """
    A function that performs a continuous wavelet transform
    Uses the closing price
    time_series      :Exchange rate data, closing price
    scales           :Scales to use, given as a numpy array. The scale corresponds to the frequency of the wavelet used for analysis: a large scale means low frequency, a small scale means high frequency
    wavelet          :Wavelet name, one of the following
     'gaus1', 'gaus2', 'gaus3', 'gaus4', 'gaus5', 'gaus6', 'gaus7', 'gaus8', 'mexh', 'morl'
    predict_time_inc :Prediction horizon, in bars, for the price movement
    save_flag        : save_flag=1 :Save the CWT coefficients as a csv file, save_flag=0 :Do not save the CWT coefficients as a csv file
    ch_flag          :Number of channels to use, ch_flag=1 : close
    height           :Image height, num of time lines
    width            :Image width, num of freq lines
    """
    
    """Reading exchange time series data"""
    num_series_data = time_series.shape[0] #Get the number of data
    print("number of the series data : " + str(num_series_data))
    close = time_series

    """Performing continuous wavelet transform"""
    # https://pywavelets.readthedocs.io/en/latest/ref/cwt.html
    print("carry out cwt...")
    time_start = 0
    time_end = time_start + height
    scalogram = np.empty((0, ch_flag, height, width))
    
    # hammingWindow = np.hamming(height)    #Hamming window
    # hanningWindow = np.hanning(height)    #Hanning window
    # blackmanWindow = np.blackman(height)  #Blackman window
    # bartlettWindow = np.bartlett(height)  #Bartlett window

    while(time_end <= num_series_data - predict_time_inc):
        # print("time start " + str(time_start)) for debag
        temp_close = close[time_start:time_end]

        #With window function
        # temp_close = temp_close * hammingWindow

        #Mirror padding: add a reversed copy of the window before and after it
        mirror_temp_close = temp_close[::-1]
        x = np.append(mirror_temp_close, temp_close)
        temp_close = np.append(x, mirror_temp_close)
        
        temp_cwt_close, freq_close = pywt.cwt(temp_close, scales, wavelet)        #Performing continuous wavelet transform
        temp_cwt_close = temp_cwt_close.T                                         #Transposed CWT(freq, time) ⇒ CWT(time, freq)
        
        #Mirror padding: keep only the central part, which corresponds to the original window
        temp_cwt_close = temp_cwt_close[height:2*height,:]
        
        temp_cwt_close = np.reshape(temp_cwt_close, (-1, ch_flag, height, width)) # num_data, ch, height(time), width(freq)
        # print("temp_cwt_close_shape " + str(temp_cwt_close.shape)) # for debug
        scalogram = np.append(scalogram, temp_cwt_close, axis=0)
        # print("scalogram_shape " + str(scalogram.shape)) # for debug
        time_start = time_end
        time_end = time_start + height
    
    """Creating a label"""
    print("make label...")
    
    #Vectorized comparison of two shifted copies of the series
    last_time = num_series_data - predict_time_inc
    current_close = close[:last_time]
    predict_close = close[predict_time_inc:]
    label_array = predict_close > current_close #True if the price is higher predict_time_inc steps later
    # print(label_array[:30]) # for debug
            
    """
    #Equivalent while-loop version (slow)
    label_array = np.array([])
    print(label_array)
    time_start = 0
    time_predict = time_start + predict_time_inc
    
    while(time_predict < num_series_data):
        if close[time_start] >= close[time_predict]:
            label = 0 #Go down
        else:
            label = 1 #Go up
            
        label_array = np.append(label_array, label)
        time_start = time_start + 1
        time_predict = time_start + predict_time_inc
    # print(label_array[:30]) # for debug
    """
    
    """label_array(time): truncate so that the number of time steps is divisible by height"""
    raw_num_shift = label_array.shape[0]
    num_shift = (raw_num_shift // height) * height
    label_array = label_array[0:num_shift]
    
    """Extract the label for each scalogram (label at the last time step of each window)"""
    col = height - 1
    label_array = np.reshape(label_array, (-1, height))
    label_array = label_array[:, col]
      
    """File output"""
    if save_flag == 1:
        print("output the files")
        save_cwt_close = np.reshape(scalogram, (-1, width))
        np.savetxt("scalogram.csv", save_cwt_close, delimiter = ",")
        np.savetxt("label.csv", label_array.T, delimiter = ",")
        
    print("CWT is done")
    return scalogram, label_array, freq_close

def create_scalogram_5(time_series, scales, wavelet, predict_time_inc, save_flag, ch_flag, height, width):
    """
    A function that performs a continuous wavelet transform
    Uses open price, high price, low price, closing price and volume
    time_series      :Currency data, columns are open price, high price, low price, closing price, volume
    scales           :Scales to use, given as a numpy array. The scale corresponds to the frequency of the wavelet used for analysis: a large scale means a low frequency, a small scale a high frequency
    wavelet          :Wavelet name, one of the following
     'gaus1', 'gaus2', 'gaus3', 'gaus4', 'gaus5', 'gaus6', 'gaus7', 'gaus8', 'mexh', 'morl'
    predict_time_inc :Number of time steps ahead at which the price movement is predicted
    save_flag        :save_flag=1 :Save the CWT coefficients as a csv file, save_flag=0 :Do not save the CWT coefficients
    ch_flag          :Number of channels to use, ch_flag=5 : start, high, low, close, volume
    height           :Image height, number of time steps
    width            :Image width, number of frequency lines (scales)
    """
    
    """Reading exchange time series data"""
    num_series_data = time_series.shape[0] #Get the number of data
    print("number of the series data : " + str(num_series_data))
    start = time_series[:,0]
    high = time_series[:,1]
    low = time_series[:,2]
    close = time_series[:,3]
    volume = time_series[:,4]

    """Performing continuous wavelet transform"""
    # https://pywavelets.readthedocs.io/en/latest/ref/cwt.html
    print("carry out cwt...")
    time_start = 0
    time_end = time_start + height
    scalogram = np.empty((0, ch_flag, height, width))
    
    while(time_end <= num_series_data - predict_time_inc):
        # print("time start " + str(time_start)) # for debug
        temp_start = start[time_start:time_end]
        temp_high = high[time_start:time_end]
        temp_low = low[time_start:time_end]
        temp_close = close[time_start:time_end]
        temp_volume = volume[time_start:time_end]

        temp_cwt_start, freq_start = pywt.cwt(temp_start, scales, wavelet)        #Performing continuous wavelet transform
        temp_cwt_high, freq_high = pywt.cwt(temp_high, scales, wavelet)
        temp_cwt_low, freq_low = pywt.cwt(temp_low, scales, wavelet)
        temp_cwt_close, freq_close = pywt.cwt(temp_close, scales, wavelet)
        temp_cwt_volume, freq_volume = pywt.cwt(temp_volume, scales, wavelet)
        
        temp_cwt_start = temp_cwt_start.T                                         #Transposed CWT(freq, time) ⇒ CWT(time, freq)
        temp_cwt_high = temp_cwt_high.T
        temp_cwt_low = temp_cwt_low.T
        temp_cwt_close = temp_cwt_close.T
        temp_cwt_volume = temp_cwt_volume.T
        
        temp_cwt_start = np.reshape(temp_cwt_start, (-1, 1, height, width)) # num_data, ch, height(time), width(freq)
        temp_cwt_high = np.reshape(temp_cwt_high, (-1, 1, height, width))
        temp_cwt_low = np.reshape(temp_cwt_low, (-1, 1, height, width))
        temp_cwt_close = np.reshape(temp_cwt_close, (-1, 1, height, width))
        temp_cwt_volume = np.reshape(temp_cwt_volume, (-1, 1, height, width))
        # print("temp_cwt_close_shape " + str(temp_cwt_close.shape)) # for debug
        
        #Stack the five channels along axis 1: (1, 5, height, width)
        temp_cwt_start = np.concatenate([temp_cwt_start, temp_cwt_high, temp_cwt_low, temp_cwt_close, temp_cwt_volume], axis=1)
        # print("temp_cwt_start_shape " + str(temp_cwt_start.shape)) # for debug
        
        scalogram = np.append(scalogram, temp_cwt_start, axis=0)
        # print("scalogram_shape " + str(scalogram.shape)) # for debug
        time_start = time_end
        time_end = time_start + height
    
    """Creating a label"""
    print("make label...")
    
    #Vectorized comparison of two shifted copies of the series
    last_time = num_series_data - predict_time_inc
    current_close = close[:last_time]
    predict_close = close[predict_time_inc:]
    label_array = predict_close > current_close #True if the price is higher predict_time_inc steps later
    # print(label_array[:30]) # for debug
            
    """
    #Equivalent while-loop version (slow)
    label_array = np.array([])
    print(label_array)
    time_start = 0
    time_predict = time_start + predict_time_inc
    
    while(time_predict < num_series_data):
        if close[time_start] >= close[time_predict]:
            label = 0 #Go down
        else:
            label = 1 #Go up
            
        label_array = np.append(label_array, label)
        time_start = time_start + 1
        time_predict = time_start + predict_time_inc
    # print(label_array[:30]) # for debug
    """
    
    """label_array(time): truncate so that the number of time steps is divisible by height"""
    raw_num_shift = label_array.shape[0]
    num_shift = (raw_num_shift // height) * height
    label_array = label_array[0:num_shift]
    
    """Extract the label for each scalogram (label at the last time step of each window)"""
    col = height - 1
    label_array = np.reshape(label_array, (-1, height))
    label_array = label_array[:, col]
      
    """File output"""
    if save_flag == 1:
        print("output the files")
        save_cwt_close = np.reshape(scalogram, (-1, width))
        np.savetxt("scalogram.csv", save_cwt_close, delimiter = ",")
        np.savetxt("label.csv", label_array.T, delimiter = ",")
        
    print("CWT is done")
    return scalogram, label_array, freq_close
    
def CWT_1(time_series, scales, wavelet, predict_time_inc, save_flag):
    """
    A function that performs a continuous wavelet transform
    Uses the closing price
    time_series      :Currency data, closing price
    scales           :Scales to use, given as a numpy array. The scale corresponds to the frequency of the wavelet used for analysis: a large scale means a low frequency, a small scale a high frequency
    wavelet          :Wavelet name, one of the following
     'gaus1', 'gaus2', 'gaus3', 'gaus4', 'gaus5', 'gaus6', 'gaus7', 'gaus8', 'mexh', 'morl'
    predict_time_inc :Number of time steps ahead at which the price movement is predicted
    save_flag        :save_flag=1 :Save the CWT coefficients as a csv file, save_flag=0 :Do not save the CWT coefficients
    """
    
    """Reading exchange time series data"""
    num_series_data = time_series.shape[0] #Get the number of data
    print("number of the series data : " + str(num_series_data))
    close = time_series

    """Performing continuous wavelet transform"""
    # https://pywavelets.readthedocs.io/en/latest/ref/cwt.html
    print("carry out cwt...")
    cwt_close, freq_close = pywt.cwt(close, scales, wavelet)
    
    #Transposed CWT(freq, time) ⇒ CWT(time, freq)
    cwt_close = cwt_close.T
    
    """Creating a label"""
    print("make label...")
    
    #Vectorized comparison of two shifted copies of the series
    last_time = num_series_data - predict_time_inc
    current_close = close[:last_time]
    predict_close = close[predict_time_inc:]
    label_array = predict_close > current_close #True if the price is higher predict_time_inc steps later
    # print(label_array[:30]) # for debug
    
    """
    #Equivalent while-loop version (slow)
    label_array = np.array([])
    print(label_array)
    time_start = 0
    time_predict = time_start + predict_time_inc
    
    while(time_predict < num_series_data):
        if close[time_start] >= close[time_predict]:
            label = 0 #Go down
        else:
            label = 1 #Go up
            
        label_array = np.append(label_array, label)
        time_start = time_start + 1
        time_predict = time_start + predict_time_inc
    # print(label_array[:30]) # for debug
    """
      
    """File output"""
    if save_flag == 1:
        print("output the files")
        np.savetxt("CWT_close.csv", cwt_close, delimiter = ",")
        np.savetxt("label.csv", label_array.T, delimiter = ",")
        
    print("CWT is done")
    return [cwt_close], label_array, freq_close

def merge_CWT_1(cwt_list, label_array, height, width):
    """
    Cut the closing-price CWT result into scalogram images and pick the label for each image
    cwt_list    :CWT result list
    label_array :Numpy array containing labels
    height      :Image height, number of time steps
    width       :Image width, number of frequency lines (scales)
    """
    print("merge CWT")
    
    cwt_close = cwt_list[0]  #Closing price CWT(time, freq)
    
    """CWT(time, freq): truncate so that the number of time steps is divisible by height"""
    #label_array is predict_time_inc shorter than the CWT result, so use the shorter length for both
    raw_num_shift = min(cwt_close.shape[0], label_array.shape[0])
    num_shift = (raw_num_shift // height) * height
    cwt_close = cwt_close[0:num_shift]
    label_array = label_array[0:num_shift]
    
    """Shape change, (number of data, channel, height(time), width(freq))"""
    cwt_close = np.reshape(cwt_close, (-1, 1, height, width))
    
    """Extract the label for each scalogram (label at the last time step of each window)"""
    col = height - 1
    label_array = np.reshape(label_array, (-1, height))
    label_array = label_array[:, col]

    return cwt_close, label_array

def CWT_2(time_series, scales, wavelet, predict_time_inc, save_flag):
    """
    A function that performs a continuous wavelet transform
    Uses the closing price and volume
    time_series      :Currency data, columns are closing price, volume
    scales           :Scales to use, given as a numpy array. The scale corresponds to the frequency of the wavelet used for analysis: a large scale means a low frequency, a small scale a high frequency
    wavelet          :Wavelet name, one of the following
     'gaus1', 'gaus2', 'gaus3', 'gaus4', 'gaus5', 'gaus6', 'gaus7', 'gaus8', 'mexh', 'morl'
    predict_time_inc :Number of time steps ahead at which the price movement is predicted
    save_flag        :save_flag=1 :Save the CWT coefficients as a csv file, save_flag=0 :Do not save the CWT coefficients
    """
    
    """Reading exchange time series data"""
    num_series_data = time_series.shape[0] #Get the number of data
    print("number of the series data : " + str(num_series_data))
    close = time_series[:,0]
    volume = time_series[:,1]

    """Performing continuous wavelet transform"""
    # https://pywavelets.readthedocs.io/en/latest/ref/cwt.html
    print("carry out cwt...")
    cwt_close, freq_close = pywt.cwt(close, scales, wavelet)
    cwt_volume, freq_volume = pywt.cwt(volume, scales, wavelet)
    
    #Transposed CWT(freq, time) ⇒ CWT(time, freq)
    cwt_close = cwt_close.T
    cwt_volume = cwt_volume.T
    
    """Creating a label"""
    print("make label...")
    
    #Vectorized comparison of two shifted copies of the series
    last_time = num_series_data - predict_time_inc
    current_close = close[:last_time]
    predict_close = close[predict_time_inc:]
    label_array = predict_close > current_close #True if the price is higher predict_time_inc steps later
    # print(label_array[:30]) # for debug
    
    """
    #Equivalent while-loop version (slow)
    label_array = np.array([])
    print(label_array)
    time_start = 0
    time_predict = time_start + predict_time_inc
    
    while(time_predict < num_series_data):
        if close[time_start] >= close[time_predict]:
            label = 0 #Go down
        else:
            label = 1 #Go up
            
        label_array = np.append(label_array, label)
        time_start = time_start + 1
        time_predict = time_start + predict_time_inc
    # print(label_array[:30]) # for debug
    """
        
    """File output"""
    if save_flag == 1:
        print("output the files")
        np.savetxt("CWT_close.csv", cwt_close, delimiter = ",")
        np.savetxt("CWT_volume.csv", cwt_volume, delimiter = ",")
        np.savetxt("label.csv", label_array.T, delimiter = ",")
        
    print("CWT is done")
    return [cwt_close, cwt_volume], label_array, freq_close

def merge_CWT_2(cwt_list, label_array, height, width):
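    """
    Cut the closing-price and volume CWT results into 2-channel scalogram images and pick the label for each image
    cwt_list    :CWT result list
    label_array :Numpy array containing labels
    height      :Image height, number of time steps
    width       :Image width, number of frequency lines (scales)
    """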
    print("merge CWT")
    
    cwt_close = cwt_list[0]  #Closing price CWT(time, freq)
    cwt_volume = cwt_list[1] #Volume
    
    """CWT(time, freq): truncate so that the number of time steps is divisible by height"""
    #label_array is predict_time_inc shorter than the CWT result, so use the shorter length for both
    raw_num_shift = min(cwt_close.shape[0], label_array.shape[0])
    num_shift = (raw_num_shift // height) * height
    cwt_close = cwt_close[0:num_shift]
    cwt_volume = cwt_volume[0:num_shift]
    label_array = label_array[0:num_shift]
    
    """Shape change, (number of data, channel, height(time), width(freq))"""
    cwt_close = np.reshape(cwt_close, (-1, 1, height, width))
    cwt_volume = np.reshape(cwt_volume, (-1, 1, height, width))
    
    """Merge the two channels along axis 1"""
    cwt_close = np.append(cwt_close, cwt_volume, axis=1)
    
    """Extract the label for each scalogram (label at the last time step of each window)"""
    col = height - 1
    label_array = np.reshape(label_array, (-1, height))
    label_array = label_array[:, col]

    return cwt_close, label_array

def CWT_5(time_series, scales, wavelet, predict_time_inc, save_flag):
    """
    A function that performs a continuous wavelet transform
    Uses open price, high price, low price, closing price and volume
    time_series      :Currency data, columns are open price, high price, low price, closing price, volume
    scales           :Scales to use, given as a numpy array. The scale corresponds to the frequency of the wavelet used for analysis: a large scale means a low frequency, a small scale a high frequency
    wavelet          :Wavelet name, one of the following
     'gaus1', 'gaus2', 'gaus3', 'gaus4', 'gaus5', 'gaus6', 'gaus7', 'gaus8', 'mexh', 'morl'
    predict_time_inc :Number of time steps ahead at which the price movement is predicted
    save_flag        :save_flag=1 :Save the CWT coefficients as a csv file, save_flag=0 :Do not save the CWT coefficients
    """
    
    """Reading exchange time series data"""
    num_series_data = time_series.shape[0] #Get the number of data
    print("number of the series data : " + str(num_series_data))
    start = time_series[:,0]
    high = time_series[:,1]
    low = time_series[:,2]
    close = time_series[:,3]
    volume = time_series[:,4]

    """Performing continuous wavelet transform"""
    # https://pywavelets.readthedocs.io/en/latest/ref/cwt.html
    print("carry out cwt...")
    cwt_start, freq_start = pywt.cwt(start, scales, wavelet)
    cwt_high, freq_high = pywt.cwt(high, scales, wavelet)
    cwt_low, freq_low = pywt.cwt(low, scales, wavelet)
    cwt_close, freq_close = pywt.cwt(close, scales, wavelet)
    cwt_volume, freq_volume = pywt.cwt(volume, scales, wavelet)
    
    #Transposed CWT(freq, time) ⇒ CWT(time, freq)
    cwt_start = cwt_start.T
    cwt_high = cwt_high.T
    cwt_low = cwt_low.T
    cwt_close = cwt_close.T
    cwt_volume = cwt_volume.T
    
    """Creating a label"""
    print("make label...")
    
    #Vectorized comparison of two shifted copies of the series
    last_time = num_series_data - predict_time_inc
    current_close = close[:last_time]
    predict_close = close[predict_time_inc:]
    label_array = predict_close > current_close #True if the price is higher predict_time_inc steps later
    # print(label_array.dtype) >>> bool
    
    """
    #Equivalent while-loop version (slow)
    label_array = np.array([])
    print(label_array)
    time_start = 0
    time_predict = time_start + predict_time_inc
    
    while(time_predict < num_series_data):
        if close[time_start] >= close[time_predict]:
            label = 0 #Go down
        else:
            label = 1 #Go up
            
        label_array = np.append(label_array, label)
        time_start = time_start + 1
        time_predict = time_start + predict_time_inc
    # print(label_array[:30]) # for debug
    """
          
    """File output"""
    if save_flag == 1:
        print("output the files")
        np.savetxt("CWT_start.csv", cwt_start, delimiter = ",")
        np.savetxt("CWT_high.csv", cwt_high, delimiter = ",")
        np.savetxt("CWT_low.csv", cwt_low, delimiter = ",")
        np.savetxt("CWT_close.csv", cwt_close, delimiter = ",")
        np.savetxt("CWT_volume.csv", cwt_volume, delimiter = ",")
        np.savetxt("label.csv", label_array.T, delimiter = ",")
        
    print("CWT is done")
    return [cwt_start, cwt_high, cwt_low, cwt_close, cwt_volume], label_array, freq_close

def merge_CWT_5(cwt_list, label_array, height, width):
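    """
    Cut the five CWT results into 5-channel scalogram images and pick the label for each image
    cwt_list    :CWT result list
    label_array :Numpy array containing labels
    height      :Image height, number of time steps
    width       :Image width, number of frequency lines (scales)
    """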
    print("merge CWT")
    
    cwt_start = cwt_list[0]  #Open price
    cwt_high = cwt_list[1]   #High price
    cwt_low = cwt_list[2]    #Low price
    cwt_close = cwt_list[3]  #Closing price CWT(time, freq)
    cwt_volume = cwt_list[4] #Volume
    
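    """CWT(time, freq): truncate so that the number of time steps is divisible by height"""
    #label_array is predict_time_inc shorter than the CWT result, so use the shorter length for both
    raw_num_shift = min(cwt_close.shape[0], label_array.shape[0])
    num_shift = (raw_num_shift // height) * height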
    cwt_start = cwt_start[0:num_shift]
    cwt_high = cwt_high[0:num_shift]
    cwt_low = cwt_low[0:num_shift]
    cwt_close = cwt_close[0:num_shift]
    cwt_volume = cwt_volume[0:num_shift]
    label_array = label_array[0:num_shift]
    
    """Shape change, (The number of data,Channel,height(time),width(freq))"""
    cwt_start = np.reshape(cwt_start, (-1, 1, height, width))
    cwt_high = np.reshape(cwt_high, (-1, 1, height, width))
    cwt_low = np.reshape(cwt_low, (-1, 1, height, width))
    cwt_close = np.reshape(cwt_close, (-1, 1, height, width))
    cwt_volume = np.reshape(cwt_volume, (-1, 1, height, width))
    
    """Merge the five channels along axis 1"""
    cwt_start = np.concatenate([cwt_start, cwt_high, cwt_low, cwt_close, cwt_volume], axis=1)
    
    """Extract the label for each scalogram (label at the last time step of each window)"""
    col = height - 1
    label_array = np.reshape(label_array, (-1, height))
    label_array = label_array[:, col]
    # print(label_array.dtype) >>> bool

    return cwt_start, label_array
    
def make_scalogram(input_file_name, scales, wavelet, height, width, predict_time_inc, ch_flag, save_flag, over_lap_inc):
    """
    input_file_name  :Exchange data file name
    scales           :Scales to use, given as a numpy array. The scale corresponds to the frequency of the wavelet used for analysis: a large scale means a low frequency, a small scale a high frequency
    wavelet          :Wavelet name, one of the following
     'gaus1', 'gaus2', 'gaus3', 'gaus4', 'gaus5', 'gaus6', 'gaus7', 'gaus8', 'mexh', 'morl'
    predict_time_inc :Number of time steps ahead at which the price movement is predicted
    height           :Image height, number of time steps
    width            :Image width, number of frequency lines (scales)
    ch_flag          :Number of channels to use, ch_flag=1:close, ch_flag=2:close and volume, ch_flag=5:start, high, low, close, volume
    save_flag        :save_flag=1 :Save the CWT coefficients as a csv file, save_flag=0 :Do not save the CWT coefficients
    over_lap_inc     :Increment of the CWT start time between passes
    """

    scalogram = np.empty((0, ch_flag, height, width)) #Array to store all scalograms and labels
    label = np.array([])
    over_lap_start = 0
    over_lap_end = int((height - 1) / over_lap_inc) * over_lap_inc + 1
    
    if ch_flag==1:
        
        print("reading the input file...")    
        time_series = np.loadtxt(input_file_name, delimiter = ",", usecols = (5,), skiprows = 1) #Get the closing price as a numpy array
        
        for i in range(over_lap_start, over_lap_end, over_lap_inc):
            print("over_lap_start " + str(i))
            temp_time_series = time_series[i:] #Change the start time of CWT
            cwt_list, label_array, freq = CWT_1(temp_time_series, scales, wavelet, predict_time_inc, save_flag) #Run CWT
            temp_scalogram, temp_label = merge_CWT_1(cwt_list, label_array, height, width)                      #Creating a scalogram
            
            scalogram = np.append(scalogram, temp_scalogram, axis=0) #Combine all scalograms and labels into one array
            label = np.append(label, temp_label)
        
        print("scalogram_shape " + str(scalogram.shape))
        print("label shape " + str(label.shape))
        print("frequency " + str(freq))
        
    elif ch_flag==2:
        
        print("reading the input file...")    
        time_series = np.loadtxt(input_file_name, delimiter = ",", usecols = (5,6), skiprows = 1) #Get the closing price and volume as a numpy array
        
        for i in range(over_lap_start, over_lap_end, over_lap_inc):
            print("over_lap_start " + str(i))
            temp_time_series = time_series[i:] #Change the start time of CWT
            cwt_list, label_array, freq = CWT_2(temp_time_series, scales, wavelet, predict_time_inc, save_flag) #Run CWT
            temp_scalogram, temp_label = merge_CWT_2(cwt_list, label_array, height, width)                      #Creating a scalogram
            
            scalogram = np.append(scalogram, temp_scalogram, axis=0) #Combine all scalograms and labels into one array
            label = np.append(label, temp_label)
        
        print("scalogram_shape " + str(scalogram.shape))
        print("label shape " + str(label.shape))
        print("frequency " + str(freq))
        
    elif ch_flag==5:
        
        print("reading the input file...")    
        time_series = np.loadtxt(input_file_name, delimiter = ",", usecols = (2,3,4,5,6), skiprows = 1) #Get open price, high price, low price, closing price and volume as a numpy array
        
        for i in range(over_lap_start, over_lap_end, over_lap_inc):
            print("over_lap_start " + str(i))
            temp_time_series = time_series[i:] #Change the start time of CWT
            cwt_list, label_array, freq = CWT_5(temp_time_series, scales, wavelet, predict_time_inc, save_flag) #Run CWT
            temp_scalogram, temp_label = merge_CWT_5(cwt_list, label_array, height, width)                      #Creating a scalogram
            
            scalogram = np.append(scalogram, temp_scalogram, axis=0) #Combine all scalograms and labels into one array
            label = np.append(label, temp_label)
            # print(temp_label.dtype) >>> bool
            # print(label.dtype)      >>> float64
        
        print("scalogram_shape " + str(scalogram.shape))
        print("label shape " + str(label.shape))
        print("frequency " + str(freq))
    
    label = label.astype(int) #np.int was removed in NumPy 1.24, the builtin int is equivalent
    return scalogram, label

def merge_scalogram(input_file_name, scales, wavelet, height, width, predict_time_inc, ch_flag, save_flag, over_lap_inc):
    """
    input_file_name  :Exchange data file name
    scales           :Scales to use, given as a numpy array. The scale corresponds to the frequency of the wavelet used for analysis: a large scale means a low frequency, a small scale a high frequency
    wavelet          :Wavelet name, one of the following
     'gaus1', 'gaus2', 'gaus3', 'gaus4', 'gaus5', 'gaus6', 'gaus7', 'gaus8', 'mexh', 'morl'
    predict_time_inc :Number of time steps ahead at which the price movement is predicted
    height           :Image height, number of time steps
    width            :Image width, number of frequency lines (scales)
    ch_flag          :Number of channels to use, ch_flag=1:close, ch_flag=5:start, high, low, close, volume (other values are not handled by this function)
    save_flag        :save_flag=1 :Save the CWT coefficients as a csv file, save_flag=0 :Do not save the CWT coefficients
    over_lap_inc     :Increment of the CWT start time between passes
    """

    scalogram = np.empty((0, ch_flag, height, width)) #Array to store all scalograms and labels
    label = np.array([])
    over_lap_start = 0
    over_lap_end = int((height - 1) / over_lap_inc) * over_lap_inc + 1
    
    if ch_flag==1:
        
        print("reading the input file...")    
        time_series = np.loadtxt(input_file_name, delimiter = ",", usecols = (5,), skiprows = 1) #Get the closing price as a numpy array
        
        for i in range(over_lap_start, over_lap_end, over_lap_inc):
            print("over_lap_start " + str(i))
            temp_time_series = time_series[i:] #Change the start time of CWT
            temp_scalogram, temp_label, freq = create_scalogram_1(temp_time_series, scales, wavelet, predict_time_inc, save_flag, ch_flag, height, width)
            scalogram = np.append(scalogram, temp_scalogram, axis=0) #Combine all scalograms and labels into one array
            label = np.append(label, temp_label)
        
        # print("scalogram_shape " + str(scalogram.shape))
        # print("label shape " + str(label.shape))
        # print("frequency " + str(freq))
        
    elif ch_flag==5:
        
        print("reading the input file...")    
        time_series = np.loadtxt(input_file_name, delimiter = ",", usecols = (2,3,4,5,6), skiprows = 1) #Get open price, high price, low price, closing price and volume as a numpy array
        
        for i in range(over_lap_start, over_lap_end, over_lap_inc):
            print("over_lap_start " + str(i))
            temp_time_series = time_series[i:] #Change the start time of CWT
            temp_scalogram, temp_label, freq = create_scalogram_5(temp_time_series, scales, wavelet, predict_time_inc, save_flag, ch_flag, height, width)
            scalogram = np.append(scalogram, temp_scalogram, axis=0) #Combine all scalograms and labels into one array
            label = np.append(label, temp_label)
        
    label = label.astype(int) #np.int was removed in NumPy 1.24, the builtin int is equivalent
    return scalogram, label, freq
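
The labelling and mirror-padding steps used in the listing above are easy to get wrong by one index, so here is a minimal sketch that isolates them on a toy series. It uses only NumPy and no wavelet transform. The helper names make_labels, window_labels and mirror_pad, and the ten-point price series, are hypothetical and exist only for this illustration; they are not part of the original code.

```python
import numpy as np

def make_labels(close, predict_time_inc):
    """1 if the price predict_time_inc steps ahead is higher than now, else 0."""
    last_time = close.shape[0] - predict_time_inc
    current_close = close[:last_time]
    predict_close = close[predict_time_inc:]
    return (predict_close > current_close).astype(int)

def window_labels(labels, height):
    """Keep the label at the last time step of each non-overlapping window."""
    usable = (labels.shape[0] // height) * height  # drop the incomplete tail
    return labels[:usable].reshape(-1, height)[:, height - 1]

def mirror_pad(window):
    """Place the time-reversed window before and after the window itself."""
    mirrored = window[::-1]
    return np.concatenate([mirrored, window, mirrored])

# Toy closing prices (hypothetical values)
close = np.array([1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

labels = make_labels(close, predict_time_inc=2)
print(labels)       # [1 0 0 0 1 1 1 1]

per_window = window_labels(labels, height=4)
print(per_window)   # [0 1]

padded = mirror_pad(close[:4])
print(padded)       # [2. 3. 2. 1. 1. 2. 3. 2. 2. 3. 2. 1.]

# After transforming the padded signal, only rows height..2*height-1
# correspond to the original window.
print(padded[4:8])  # [1. 2. 3. 2.]
```

Each window of height time steps gets a single label: whether the price predict_time_inc steps after the window's last time step is higher than the price at that last step. The label array is predict_time_inc elements shorter than the price series, which is why it has to be truncated before reshaping.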
