The goal is to predict sin(t + 1) (the next step) from sin(t), and then sin(t + n) (multiple steps ahead).

*** Added on May 27, 2016: ***
I wrote the sequel "I tried to predict by letting RNN learn the sine wave: Hyperparameter adjustment".
I will skip the detailed explanation here. The TensorFlow tutorial and the articles it references should be helpful.
A sine wave with 50 steps per cycle was generated for 100 cycles, for a total of 5,000 steps, and used as training data. Two versions of the training data were prepared: noise-free and noisy.
The training data consists of pairs of sin(t) (the sine value at time t) and sin(t + 1) (the sine value at time t + 1).
For details on how the training data is generated, please refer to the ipynb file (IPython Notebook). (As an aside, I was surprised to see that ipynb files are previewed on GitHub.)
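As a rough illustration of the data layout described above, here is a minimal NumPy sketch. It is not the notebook's actual code: the function name `make_sine_pairs`, the uniform noise distribution, and the noise amplitude of 0.05 are all assumptions made for this example.

```python
import numpy as np


def make_sine_pairs(steps_per_cycle=50, num_cycles=100, noise_amplitude=0.0, seed=0):
    """Return an array of shape (steps, 2) whose rows are [sin(t), sin(t + 1)]."""
    rng = np.random.default_rng(seed)
    total_steps = steps_per_cycle * num_cycles
    # Generate one extra sample so that the last row still has a "next" value.
    t = np.arange(total_steps + 1)
    wave = np.sin(2.0 * np.pi * t / steps_per_cycle)
    if noise_amplitude > 0.0:
        # Assumption: uniform noise. The article does not state the distribution.
        wave = wave + rng.uniform(-noise_amplitude, noise_amplitude, size=wave.shape)
    # Column 0 is the value at time t, column 1 is the value at time t + 1.
    return np.stack([wave[:-1], wave[1:]], axis=1)


clean = make_sine_pairs()
noisy = make_sine_pairs(noise_amplitude=0.05)
print(clean.shape, noisy.shape)
```

Each row is one (input, target) pair, so column 1 of a row equals column 0 of the following row.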
This time, learning and prediction are done in a single script. The source code is shown in the appendix at the end of the article.
The flow of learning and prediction is as follows.

1. Train the network using the training data.
2. Predict sin(t + 1) using the initial data (the beginning of the training data).
3. Predict sin(t + 2) using the predicted sin(t + 1).
4. Repeat step 3.

I used a network structured as "input layer - hidden layer - RNN cell - output layer", with an LSTM as the RNN cell.
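The repeated one-step prediction in the flow above can be sketched independently of TensorFlow. In the sketch below, `predict_next` is a hypothetical stand-in for the trained network: it returns the exact next sample of a pure sine wave using the identity sin(w(t + 1)) = 2 cos(w) sin(wt) - sin(w(t - 1)). The names `predict_next` and `rollout` are assumptions for illustration only.

```python
import numpy as np


def predict_next(window, steps_per_cycle=50):
    """Stand-in for the trained network (assumption): exact next value of a pure sine."""
    w = 2.0 * np.pi / steps_per_cycle
    return 2.0 * np.cos(w) * window[-1] - window[-2]


def rollout(initial_window, num_predictions, predictor):
    """Feed each prediction back into the input window and predict again."""
    window = np.asarray(initial_window, dtype=float).copy()
    outputs = []
    for _ in range(num_predictions):
        next_value = predictor(window)
        outputs.append(next_value)
        # Drop the oldest sample and append the newest prediction.
        window = np.append(window[1:], next_value)
    return np.array(outputs)


t = np.arange(50)
initial = np.sin(2.0 * np.pi * t / 50)
predicted = rollout(initial, 100, predict_next)
print(predicted.shape)
```

With a learned predictor in place of `predict_next`, any error in one step is fed back into the next input, which is one reason the predicted amplitude and frequency can drift over many steps.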
The hyperparameters used for learning and prediction are as follows.
Variable name | meaning | value |
---|---|---|
num_of_input_nodes | Number of nodes in the input layer | 1 node |
num_of_hidden_nodes | Number of nodes in the hidden layer | 2 nodes |
num_of_output_nodes | Number of nodes in the output layer | 1 node |
length_of_sequences | RNN sequence length | 50 steps |
num_of_training_epochs | Number of learning repetitions | 2,000 times |
length_of_initial_sequences | Initial data sequence length | 50 steps |
num_of_prediction_epochs | Number of repetitions of prediction | 100 times |
size_of_mini_batch | Number of samples per mini-batch | 100 samples |
learning_rate | Learning rate | 0.1 |
forget_bias | (I'm not sure) | 1.0 (default value) |
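Regarding `forget_bias` and the "two values per cell" state: as far as I understand the usual formulation of a basic LSTM cell, `forget_bias` is a constant added to the forget gate before the sigmoid, so that the cell tends to keep its memory early in training, and the state consists of the cell state and the output, each with one value per hidden node. The NumPy sketch below illustrates this under those assumptions. It is not TensorFlow's implementation, and the names `lstm_step`, `weights`, and `bias` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, c, h, weights, bias, forget_bias=1.0):
    """One step of a basic LSTM cell (illustrative sketch).

    x: (batch, n_in); c, h: (batch, n_hidden)
    weights: (n_in + n_hidden, 4 * n_hidden); bias: (4 * n_hidden,)
    """
    z = np.concatenate([x, h], axis=1) @ weights + bias
    # Input gate, candidate value, forget gate, output gate.
    i, j, f, o = np.split(z, 4, axis=1)
    # forget_bias shifts the forget gate toward "keep the previous cell state".
    new_c = c * sigmoid(f + forget_bias) + sigmoid(i) * np.tanh(j)
    new_h = np.tanh(new_c) * sigmoid(o)
    return new_c, new_h


n_in, n_hidden = 2, 2  # the RNN cell receives the 2 hidden-layer outputs
x = np.zeros((1, n_in))
c = np.ones((1, n_hidden))
h = np.zeros((1, n_hidden))
weights = np.zeros((n_in + n_hidden, 4 * n_hidden))
bias = np.zeros(4 * n_hidden)

new_c, new_h = lstm_step(x, c, h, weights, bias, forget_bias=1.0)
new_c_no_bias, _ = lstm_step(x, c, h, weights, bias, forget_bias=0.0)
# Cell state and output side by side: num_of_hidden_nodes * 2 values per sample.
state = np.concatenate([new_c, new_h], axis=1)
print(state.shape, new_c, new_c_no_bias)
```

With all weights set to zero, the forget gate retains sigmoid(1.0) of the previous cell state when `forget_bias` is 1.0, compared with exactly half when it is 0.0.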
The figure below plots the prediction results. The legend is as follows.
Without noise, a plausible-looking waveform is output. The overall amplitude is smaller, the peaks are distorted, and the frequency is slightly lower. Please refer to basic/output.ipynb for specific values.
With noise, the amplitude is even smaller and the frequency is slightly higher than without noise. The noise component contained in the training data also appears to have been reduced. See noised/output.ipynb for specific values.
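Observations such as "smaller amplitude" and "slightly lower frequency" can be put into numbers. The sketch below is one possible way to do so, assuming the predicted waveform is available as a 1-D array. It estimates the amplitude from the minimum and maximum, and the period from the spacing of upward zero crossings. The function name is hypothetical, and the demo waveform is synthetic, not the article's actual output.

```python
import numpy as np


def estimate_amplitude_and_period(wave):
    """Estimate amplitude and period (in steps) of a roughly sinusoidal 1-D array."""
    wave = np.asarray(wave, dtype=float)
    amplitude = (wave.max() - wave.min()) / 2.0
    # Indices where the signal goes from negative to non-negative.
    idx = np.flatnonzero((wave[:-1] < 0.0) & (wave[1:] >= 0.0))
    # Linear interpolation gives a sub-step estimate of each crossing position.
    crossings = idx + wave[idx] / (wave[idx] - wave[idx + 1])
    if len(crossings) < 2:
        return amplitude, float("nan")
    period = float(np.mean(np.diff(crossings)))
    return amplitude, period


# Synthetic example (assumption): a sine with amplitude 0.8 and 52 steps per cycle.
demo_wave = 0.8 * np.sin(2.0 * np.pi * np.arange(200) / 52)
amplitude, period = estimate_amplitude_and_period(demo_wave)
print(amplitude, period)
```

Applied to the saved predictions, a period above 50 steps would correspond to a lower frequency than the training data, and a period below 50 to a higher one.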
I would like to try changing the network configuration and hyperparameters to see what kind of prediction results will be obtained.
*** Added on May 27, 2016: ***
I wrote the sequel "I tried to predict by letting RNN learn the sine wave: Hyperparameter adjustment".
The source code for the noise-free version is shown below. Please refer to GitHub for the source code of the noisy version. The noise-free version and the noisy version differ only in the input file name.
rnn.py
import tensorflow as tf
from tensorflow.models.rnn import rnn, rnn_cell
import numpy as np
import random

def make_mini_batch(train_data, size_of_mini_batch, length_of_sequences):
    inputs = np.empty(0)
    outputs = np.empty(0)
    for _ in range(size_of_mini_batch):
        index = random.randint(0, len(train_data) - length_of_sequences)
        part = train_data[index:index + length_of_sequences]
        inputs = np.append(inputs, part[:, 0])
        outputs = np.append(outputs, part[-1, 1])
    inputs = inputs.reshape(-1, length_of_sequences, 1)
    outputs = outputs.reshape(-1, 1)
    return (inputs, outputs)


def make_prediction_initial(train_data, index, length_of_sequences):
    return train_data[index:index + length_of_sequences, 0]

train_data_path = "../train_data/normal.npy"
num_of_input_nodes = 1
num_of_hidden_nodes = 2
num_of_output_nodes = 1
length_of_sequences = 50
num_of_training_epochs = 2000
length_of_initial_sequences = 50
num_of_prediction_epochs = 100
size_of_mini_batch = 100
learning_rate = 0.1
forget_bias = 1.0
print("train_data_path = %s" % train_data_path)
print("num_of_input_nodes = %d" % num_of_input_nodes)
print("num_of_hidden_nodes = %d" % num_of_hidden_nodes)
print("num_of_output_nodes = %d" % num_of_output_nodes)
print("length_of_sequences = %d" % length_of_sequences)
print("num_of_training_epochs = %d" % num_of_training_epochs)
print("length_of_initial_sequences = %d" % length_of_initial_sequences)
print("num_of_prediction_epochs = %d" % num_of_prediction_epochs)
print("size_of_mini_batch = %d" % size_of_mini_batch)
print("learning_rate = %f" % learning_rate)
print("forget_bias = %f" % forget_bias)
train_data = np.load(train_data_path)
print("train_data:", train_data)
#Fix the random number seed.
random.seed(0)
np.random.seed(0)
tf.set_random_seed(0)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
with tf.Graph().as_default():
    input_ph = tf.placeholder(tf.float32, [None, length_of_sequences, num_of_input_nodes], name="input")
    supervisor_ph = tf.placeholder(tf.float32, [None, num_of_output_nodes], name="supervisor")
    istate_ph = tf.placeholder(tf.float32, [None, num_of_hidden_nodes * 2], name="istate") #Requires two values per cell.

    with tf.name_scope("inference") as scope:
        weight1_var = tf.Variable(tf.truncated_normal([num_of_input_nodes, num_of_hidden_nodes], stddev=0.1), name="weight1")
        weight2_var = tf.Variable(tf.truncated_normal([num_of_hidden_nodes, num_of_output_nodes], stddev=0.1), name="weight2")
        bias1_var = tf.Variable(tf.truncated_normal([num_of_hidden_nodes], stddev=0.1), name="bias1")
        bias2_var = tf.Variable(tf.truncated_normal([num_of_output_nodes], stddev=0.1), name="bias2")

        in1 = tf.transpose(input_ph, [1, 0, 2]) # (batch, sequence, data) -> (sequence, batch, data)
        in2 = tf.reshape(in1, [-1, num_of_input_nodes]) # (sequence, batch, data) -> (sequence * batch, data)
        in3 = tf.matmul(in2, weight1_var) + bias1_var
        in4 = tf.split(0, length_of_sequences, in3) # sequence * (batch, data)

        cell = rnn_cell.BasicLSTMCell(num_of_hidden_nodes, forget_bias=forget_bias)
        rnn_output, states_op = rnn.rnn(cell, in4, initial_state=istate_ph)
        # Only the RNN output at the last time step is used for the prediction.
        output_op = tf.matmul(rnn_output[-1], weight2_var) + bias2_var

    with tf.name_scope("loss") as scope:
        square_error = tf.reduce_mean(tf.square(output_op - supervisor_ph))
        loss_op = square_error
        tf.scalar_summary("loss", loss_op)

    with tf.name_scope("training") as scope:
        training_op = optimizer.minimize(loss_op)

    summary_op = tf.merge_all_summaries()
    init = tf.initialize_all_variables()

    with tf.Session() as sess:
        saver = tf.train.Saver()
        summary_writer = tf.train.SummaryWriter("data", graph=sess.graph)
        sess.run(init)

        # Training: each iteration uses a freshly sampled mini-batch.
        for epoch in range(num_of_training_epochs):
            inputs, supervisors = make_mini_batch(train_data, size_of_mini_batch, length_of_sequences)

            train_dict = {
                input_ph: inputs,
                supervisor_ph: supervisors,
                istate_ph: np.zeros((size_of_mini_batch, num_of_hidden_nodes * 2)),
            }
            sess.run(training_op, feed_dict=train_dict)

            if (epoch + 1) % 10 == 0:
                summary_str, train_loss = sess.run([summary_op, loss_op], feed_dict=train_dict)
                summary_writer.add_summary(summary_str, epoch)
                print("train#%d, train loss: %e" % (epoch + 1, train_loss))

        # Prediction: start from the beginning of the training data.
        inputs = make_prediction_initial(train_data, 0, length_of_initial_sequences)
        outputs = np.empty(0)
        states = np.zeros((1, num_of_hidden_nodes * 2)) # Zero initial state for a batch of one.

        print("initial:", inputs)
        np.save("initial.npy", inputs)

        for epoch in range(num_of_prediction_epochs):
            pred_dict = {
                input_ph: inputs.reshape((1, length_of_sequences, 1)),
                istate_ph: states,
            }
            output, states = sess.run([output_op, states_op], feed_dict=pred_dict)
            print("prediction#%d, output: %f" % (epoch + 1, output[0, 0]))

            # Slide the window: drop the oldest value and append the new prediction.
            inputs = np.delete(inputs, 0)
            inputs = np.append(inputs, output)
            outputs = np.append(outputs, output)

        print("outputs:", outputs)
        np.save("output.npy", outputs)

        saver.save(sess, "data/model")