An RNN processes its input one step at a time, holds a state value inside the network, and produces its output once the whole sequence has been fed in, which makes it a natural fit for sequential values that are likely to be correlated, such as time-series data.
Real-world time-series data is very noisy, and the language data used in TensorFlow's tutorials is hard to picture as a "time series". I also couldn't get a grip on the specification of the RNNCell class, so I gave up on it and built an experimental, very simple self-feedback model instead.
Chaos is famous for the butterfly effect: it looks random at first glance, yet it actually follows a definite rule, which makes it useful material for analyzing this kind of phenomenon. As an example I'll use the logistic map, which I personally find the easiest to understand.
It is the super-simple formula below: the value $x(t)$ at time $t$ is derived from the value at the previous time step, $x(t-1)$, and a single parameter $r$.
$$x_t = r x_{t-1}(1 - x_{t-1})$$
Or, written so as to predict the next time step:
$$x_{t+1} = r x_t(1 - x_t)$$
where $0 \le x \le 1$ and $0 \le r \le 4$.
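To make the recurrence concrete, here are the first few iterations in plain Python; $r = 3.91$ and $x_0 = 0.5$ are the same settings used later in this article, and the values in the comment are rounded:

```python
r, x = 3.91, 0.5
for t in range(3):
    x = r * x * (1 - x)
    print(x)  # approximately 0.9775, 0.0860, 0.3073
```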
This formula turns chaotic once $r$ gets past roughly 3.6: from there the values of $x$ start to fluctuate wildly.
Viewed as time-series data, $x$ makes a splendid pseudo-random sequence: line it up next to genuinely random values in chronological order and, if nobody tells you which is which, you can't tell the difference.
The great thing about chaos is that if you scatter-plot $x_t$ against $x_{t+1}$, a beautiful shape (a fractal-like structure) emerges, and the difference from purely random values becomes obvious at a glance. The shape is determined by $r$. Ah, beautiful.
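For reference, a minimal sketch of that scatter plot. It leans on matplotlib, and on the `logistic()` generator defined in generator.py below:

```python
import matplotlib.pyplot as plt

xs = logistic(3.91, 0.5, 5000)      # logistic() is defined in generator.py below
plt.scatter(xs[:-1], xs[1:], s=1)   # x_t on the horizontal axis, x_{t+1} on the vertical
plt.xlabel("$x_t$")
plt.ylabel("$x_{t+1}$")
plt.show()
```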
So that's our beautiful chaos. Since it looks random yet follows a rule, it sounds like it should be easy to predict. Except it isn't. This is exactly why chaos is chaos: the slightest difference in the initial value or the parameter eventually produces completely different values, which makes long stretches of the sequence hard to predict.
Let's do a quick comparison.
generator.py
def logistic(r, t_first, num_steps):
    # iterate x_{t+1} = r * x_t * (1 - x_t), starting from t_first
    array = []
    x = t_first
    for _ in range(num_steps):
        x = r * x * (1 - x)
        array.append(x)
    return array

initial = 0.5
data1 = logistic(3.91, initial, 100)              # reference trajectory
data2 = logistic(3.9100000000001, initial, 100)   # tiny perturbation of r
data3 = logistic(3.91, initial + 0.0000001, 100)  # tiny perturbation of the initial value
Even if the error in the estimated value is below 1e-10 %, prediction becomes hopeless by around 60 data points ahead.
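To put a rough number on "around 60 steps", here is a quick check of my own (it assumes data1 and data2 from generator.py above; the 0.01 threshold is arbitrary):

```python
import numpy as np

d1, d2 = np.array(data1), np.array(data2)
diverged = np.abs(d1 - d2) > 0.01   # "visibly different" threshold, chosen arbitrarily
print(int(np.argmax(diverged)))     # index of the first step where the two runs drift apart
```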
**Conversely, if you run a regression that recovers $r$, some amount of short-term prediction becomes possible.**
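To make that claim concrete before moving on to the network: since $x_{t+1} = r x_t(1-x_t)$ is linear in $r$, a plain least-squares fit over observed pairs already recovers it. This is my own sketch, not part of the original script; `estimate_r` is a hypothetical helper, and it reuses `logistic()` and `data1` from generator.py above.

```python
import numpy as np

def estimate_r(xs):
    # x_{t+1} = r * x_t * (1 - x_t) is linear in r, so least squares recovers r
    xs = np.asarray(xs)
    u = xs[:-1] * (1 - xs[:-1])
    return float(np.dot(xs[1:], u) / np.dot(u, u))

r_hat = estimate_r(data1)                    # data1 from generator.py above
preview = logistic(r_hat, data1[-1], 30)     # roll the map forward with the estimated r
```

With noiseless data this pins $r$ down almost exactly; the point of the network below is that it has to discover the same relationship purely from examples.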
logistic_rnn.py
import os, sys
import numpy as np
import tensorflow as tf

num_steps = 1    # number of inputs; the logistic map only needs the current value
batch_size = 100
epoch_size = 10000
initial = 0.5    # 0 <= x <= 1: seed used to generate the data
L = 0.01         # learning rate
PRE_STEPS = 30   # number of prediction steps: how far ahead to predict
N_HIDDEN = 30    # hidden nodes
'''
generate data
'''
def logistic(r, t_first, num_steps):
    # iterate x_{t+1} = r * x_t * (1 - x_t), starting from t_first
    array = []
    x = t_first
    for _ in range(num_steps):
        x = r * x * (1 - x)
        array.append(x)
    return array
def logmap_iterator(raw_data, batch_size, num_steps, prediction_steps):
    # slice the series into batch_size rows, then yield (input, target) windows
    raw_data = np.array(raw_data)
    data_len = len(raw_data)
    batch_len = data_len // batch_size
    data = np.zeros([batch_size, batch_len])
    for i in range(batch_size):
        data[i] = raw_data[batch_len * i:batch_len * (i + 1)]
    epoch_size = (batch_len - 1) // num_steps
    if epoch_size == 0:
        raise ValueError("epoch_size == 0, decrease batch_size or num_steps")
    for i in range(epoch_size):
        x = data[:, i * num_steps:(i + 1) * num_steps]
        y = data[:, (i + 1) * num_steps:(i + 1) * num_steps + prediction_steps]
        yield (x, y)
raw_data = logistic(3.91, initial, num_steps*batch_size*epoch_size)
'''
mini-RNN Model
'''
x = tf.placeholder("float", [None, num_steps])
y = tf.placeholder("float", [None, PRE_STEPS])

weights = {
    'hidden': tf.get_variable("hidden", shape=[1, N_HIDDEN],
                              initializer=tf.truncated_normal_initializer(stddev=0.1)),
    'out': tf.get_variable("out", shape=[N_HIDDEN, 1],
                           initializer=tf.truncated_normal_initializer(stddev=0.1))
}
biases = {
    'hidden': tf.get_variable("b_hidden", shape=[N_HIDDEN],
                              initializer=tf.truncated_normal_initializer(stddev=0.1)),
    'out': tf.get_variable("b_out", shape=[1],
                           initializer=tf.truncated_normal_initializer(stddev=0.1))
}
def simple_reg(x, _weights, _biases, K=1.0):
    # first step of the map: hidden sigmoid layer -> single output value x_{t+1}
    with tf.variable_scope("weight"):
        h1 = tf.matmul(x, _weights['hidden']) + _biases['hidden']
        h1 = tf.nn.dropout(tf.nn.sigmoid(h1), K)
        o1 = tf.matmul(h1, _weights['out']) + _biases['out']
    # feed the prediction back in as the next input, reusing the same weights
    with tf.variable_scope("weight", reuse=True):
        h2 = tf.matmul(o1, _weights['hidden']) + _biases['hidden']
        h2 = tf.nn.dropout(tf.nn.sigmoid(h2), K)
        o2 = tf.matmul(h2, _weights['out']) + _biases['out']
    o = tf.concat(1, [o1, o2])

    def more_step(predicted_value, o):
        # one more feedback step: previous prediction in, next prediction out
        with tf.variable_scope("weight", reuse=True):
            h = tf.matmul(predicted_value, _weights['hidden']) + _biases['hidden']
            h = tf.nn.dropout(tf.nn.sigmoid(h), K)
            o_v = tf.matmul(h, _weights['out']) + _biases['out']
            o = tf.concat(1, [o, o_v])
        return o, o_v

    o_v = o2
    for i in range(PRE_STEPS - 2):
        o, o_v = more_step(o_v, o)
    return o, o1
o, o1 = simple_reg(x, weights, biases)
z = tf.split(1, PRE_STEPS, y)
z = z[0]  # the loss only compares the first predicted step against the first target step
cost = tf.reduce_sum(tf.square(o1 - z))
optimizer = tf.train.AdamOptimizer(L).minimize(cost)
init = tf.initialize_all_variables()
with tf.Session() as sess:
    saver = tf.train.Saver()
    sess.run(init)
    for step in range(10):
        gen = logmap_iterator(raw_data, batch_size, num_steps, PRE_STEPS)
        for i in range(epoch_size - batch_size):
            s, a = gen.next()
            training = sess.run([optimizer, o, cost], {x: s, y: a})
            if i % 100 == 0:
                print "i", i, "cost", training[2]
                print "initial input", s[0][:5]
                print "pred", training[1][0][:5]
                print "answ", a[0][:5]
            if i > 0 and i % 2000 == 0:
                # halve the learning rate every 2000 iterations
                # (note: the optimizer was built with the initial L, so this alone
                #  does not change the learning rate already baked into the graph)
                L = L / 2
    save_path = saver.save(sess, "dynamic_model.ckpt")
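For completeness, here is a minimal sketch (not in the original script) of how the trained graph could be used for inference, assuming the tensors `x` and `o` and the checkpoint name above are still in scope:

```python
# restore the checkpoint and read off the PRE_STEPS chained predictions for one seed value
with tf.Session() as sess:
    saver = tf.train.Saver()
    saver.restore(sess, "dynamic_model.ckpt")
    seed = [[0.5]]                  # shape [1, num_steps]: a single current value x_t
    preds = sess.run(o, {x: seed})  # o concatenates the predictions x_{t+1} ... x_{t+PRE_STEPS}
    print(preds[0][:5])
```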
The model's overall shape looks like this. (It was my first time touching Illustrator in ages...)
The number of hidden nodes is just a reasonable guess, and the reason for using a sigmoid is simply that its output stays between 0 and 1.
The input is a single number, because nothing other than $x_t$ is needed to obtain $x_{t+1}$. Consequently, no state value is shared inside the network.
(It's a secret that the loss didn't decrease well when I increased the number of inputs.)
I don't think this can officially be called an RNN; it's more of a feedback loop.
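Put in equations (the notation $W$, $b$, $\sigma$ is mine, not from the original code), `simple_reg` just chains the same learned one-step map $f$ a total of PRE_STEPS times, with dropout effectively disabled since $K = 1.0$:

$$
f(u) = W_{\mathrm{out}}\,\sigma(W_{\mathrm{hidden}}\,u + b_{\mathrm{hidden}}) + b_{\mathrm{out}},
\qquad
\hat{x}_{t+1} = f(x_t),\ \ \hat{x}_{t+2} = f(\hat{x}_{t+1}),\ \dots,\ \ \hat{x}_{t+\mathrm{PRE\_STEPS}} = f(\hat{x}_{t+\mathrm{PRE\_STEPS}-1})
$$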
(Strictly speaking, I think it would be more accurate to estimate $r$ from a run of consecutive values $x_1, \dots, x_n$, output $x_{n+1}$, and then feed that $x_{n+1}$ back into the input to produce $x_{n+2}, \dots$, but I couldn't get TensorFlow's RNNCell to work.)
For the time being, I was personally satisfied with the results.
Early in training:
Eventually the loss settles down around 1e-5, and the best predictions look like this.
Once the loss reaches about 1e-6, the model can predict (or rather, it has pinned down $r$ closely enough to track the sequence) up to roughly 35 steps ahead.
I haven't tried this on other chaotic systems such as the Lorenz equations, so I can't say how far it generalizes.