An RNN processes its input one step at a time, holds a state value inside the network, and produces its output once the whole sequence has been fed in, which makes it a natural fit for sequential values that are likely to be correlated, such as time-series data.
Real-world time-series data is very noisy, and the language data used in TensorFlow's tutorials is hard to picture as a "time series". I also couldn't get a grip on the specification of the RNNCell class, so I gave up on it and built an experimental, very simple self-feedback model instead.
Chaos is famous for the butterfly effect: it looks random at first glance, yet it actually follows a definite rule, which makes it useful material for analyzing this kind of phenomenon. As an example I'll use the logistic map, which I personally find the easiest to understand.
It is the super-simple formula below: the value $x(t)$ at time $t$ is derived from the value at the previous time step, $x(t-1)$, and a single parameter $r$.
$$x_t = r x_{t-1}(1 - x_{t-1})$$
Or, written so as to predict the next time step:
$$x_{t+1} = r x_t(1 - x_t)$$
where $0 \le x \le 1$ and $0 \le r \le 4$.
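To make the recurrence concrete, here are the first few iterations in plain Python; $r = 3.91$ and $x_0 = 0.5$ are the same settings used later in this article, and the values in the comment are rounded:

```python
r, x = 3.91, 0.5
for t in range(3):
    x = r * x * (1 - x)
    print(x)  # approximately 0.9775, 0.0860, 0.3073
```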
This formula turns chaotic once $r$ gets past roughly 3.6: from there the values of $x$ start to fluctuate wildly.
Viewed as time-series data, $x$ makes a splendid pseudo-random sequence: line it up next to genuinely random values in chronological order and, if nobody tells you which is which, you can't tell the difference.
The great thing about chaos is that if you scatter-plot $x_t$ against $x_{t+1}$, a beautiful shape (a fractal-like structure) emerges, and the difference from purely random values becomes obvious at a glance. The shape is determined by $r$. Ah, beautiful.
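For reference, a minimal sketch of that scatter plot. It leans on matplotlib, and on the `logistic()` generator defined in generator.py below:

```python
import matplotlib.pyplot as plt

xs = logistic(3.91, 0.5, 5000)      # logistic() is defined in generator.py below
plt.scatter(xs[:-1], xs[1:], s=1)   # x_t on the horizontal axis, x_{t+1} on the vertical
plt.xlabel("$x_t$")
plt.ylabel("$x_{t+1}$")
plt.show()
```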
So that's our beautiful chaos. Since it looks random yet follows a rule, it sounds like it should be easy to predict. Except it isn't. This is exactly why chaos is chaos: the slightest difference in the initial value or the parameter eventually produces completely different values, which makes long stretches of the sequence hard to predict.
Let's do a quick comparison.
generator.py
def logistic(r, t_first, num_steps):
    # iterate x_{t+1} = r * x_t * (1 - x_t), starting from t_first
    array = []
    x = t_first
    for _ in range(num_steps):
        x = r * x * (1 - x)
        array.append(x)
    return array

initial = 0.5
data1 = logistic(3.91, initial, 100)              # reference trajectory
data2 = logistic(3.9100000000001, initial, 100)   # tiny perturbation of r
data3 = logistic(3.91, initial + 0.0000001, 100)  # tiny perturbation of the initial value
Even if the error in the estimated value is below 1e-10 %, prediction becomes hopeless by around 60 data points ahead.
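To put a rough number on "around 60 steps", here is a quick check of my own (it assumes data1 and data2 from generator.py above; the 0.01 threshold is arbitrary):

```python
import numpy as np

d1, d2 = np.array(data1), np.array(data2)
diverged = np.abs(d1 - d2) > 0.01   # "visibly different" threshold, chosen arbitrarily
print(int(np.argmax(diverged)))     # index of the first step where the two runs drift apart
```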
**Conversely, if you run a regression that recovers $r$, some amount of short-term prediction becomes possible.**
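To make that claim concrete before moving on to the network: since $x_{t+1} = r x_t(1-x_t)$ is linear in $r$, a plain least-squares fit over observed pairs already recovers it. This is my own sketch, not part of the original script; `estimate_r` is a hypothetical helper, and it reuses `logistic()` and `data1` from generator.py above.

```python
import numpy as np

def estimate_r(xs):
    # x_{t+1} = r * x_t * (1 - x_t) is linear in r, so least squares recovers r
    xs = np.asarray(xs)
    u = xs[:-1] * (1 - xs[:-1])
    return float(np.dot(xs[1:], u) / np.dot(u, u))

r_hat = estimate_r(data1)                    # data1 from generator.py above
preview = logistic(r_hat, data1[-1], 30)     # roll the map forward with the estimated r
```

With noiseless data this pins $r$ down almost exactly; the point of the network below is that it has to discover the same relationship purely from examples.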
logistic_rnn.py
import os, sys
import numpy as np
import tensorflow as tf

num_steps = 1    # number of inputs; the logistic map only needs the current value
batch_size = 100
epoch_size = 10000
initial = 0.5    # 0 <= x <= 1: seed used to generate the data
L = 0.01         # learning rate
PRE_STEPS = 30   # number of prediction steps: how far ahead to predict
N_HIDDEN = 30    # hidden nodes
'''
generate data
'''
def logistic(r, t_first, num_steps):
    # iterate x_{t+1} = r * x_t * (1 - x_t), starting from t_first
    array = []
    x = t_first
    for _ in range(num_steps):
        x = r * x * (1 - x)
        array.append(x)
    return array
def logmap_iterator(raw_data, batch_size, num_steps, prediction_steps):
    # slice the series into batch_size rows, then yield (input, target) windows
    raw_data = np.array(raw_data)
    data_len = len(raw_data)
    batch_len = data_len // batch_size
    data = np.zeros([batch_size, batch_len])
    for i in range(batch_size):
        data[i] = raw_data[batch_len * i:batch_len * (i + 1)]
    epoch_size = (batch_len - 1) // num_steps
    if epoch_size == 0:
        raise ValueError("epoch_size == 0, decrease batch_size or num_steps")
    for i in range(epoch_size):
        x = data[:, i * num_steps:(i + 1) * num_steps]
        y = data[:, (i + 1) * num_steps:(i + 1) * num_steps + prediction_steps]
        yield (x, y)
raw_data = logistic(3.91, initial, num_steps*batch_size*epoch_size)
'''
mini-RNN Model
'''
x = tf.placeholder("float", [None, num_steps])
y = tf.placeholder("float", [None, PRE_STEPS])

weights = {
    'hidden': tf.get_variable("hidden", shape=[1, N_HIDDEN],
                              initializer=tf.truncated_normal_initializer(stddev=0.1)),
    'out': tf.get_variable("out", shape=[N_HIDDEN, 1],
                           initializer=tf.truncated_normal_initializer(stddev=0.1))
}
biases = {
    'hidden': tf.get_variable("b_hidden", shape=[N_HIDDEN],
                              initializer=tf.truncated_normal_initializer(stddev=0.1)),
    'out': tf.get_variable("b_out", shape=[1],
                           initializer=tf.truncated_normal_initializer(stddev=0.1))
}
def simple_reg(x, _weights, _biases, K=1.0):
    # first step of the map: hidden sigmoid layer -> single output value x_{t+1}
    with tf.variable_scope("weight"):
        h1 = tf.matmul(x, _weights['hidden']) + _biases['hidden']
        h1 = tf.nn.dropout(tf.nn.sigmoid(h1), K)
        o1 = tf.matmul(h1, _weights['out']) + _biases['out']
    # feed the prediction back in as the next input, reusing the same weights
    with tf.variable_scope("weight", reuse=True):
        h2 = tf.matmul(o1, _weights['hidden']) + _biases['hidden']
        h2 = tf.nn.dropout(tf.nn.sigmoid(h2), K)
        o2 = tf.matmul(h2, _weights['out']) + _biases['out']
    o = tf.concat(1, [o1, o2])

    def more_step(predicted_value, o):
        # one more feedback step: previous prediction in, next prediction out
        with tf.variable_scope("weight", reuse=True):
            h = tf.matmul(predicted_value, _weights['hidden']) + _biases['hidden']
            h = tf.nn.dropout(tf.nn.sigmoid(h), K)
            o_v = tf.matmul(h, _weights['out']) + _biases['out']
            o = tf.concat(1, [o, o_v])
        return o, o_v

    o_v = o2
    for i in range(PRE_STEPS - 2):
        o, o_v = more_step(o_v, o)
    return o, o1
o, o1 = simple_reg(x, weights, biases)
z = tf.split(1, PRE_STEPS, y)
z = z[0]  # the loss only compares the first predicted step against the first target step
cost = tf.reduce_sum(tf.square(o1 - z))
optimizer = tf.train.AdamOptimizer(L).minimize(cost)
init = tf.initialize_all_variables()
with tf.Session() as sess:
    saver = tf.train.Saver()
    sess.run(init)
    for step in range(10):
        gen = logmap_iterator(raw_data, batch_size, num_steps, PRE_STEPS)
        for i in range(epoch_size - batch_size):
            s, a = gen.next()
            training = sess.run([optimizer, o, cost], {x: s, y: a})
            if i % 100 == 0:
                print "i", i, "cost", training[2]
                print "initial input", s[0][:5]
                print "pred", training[1][0][:5]
                print "answ", a[0][:5]
            if i > 0 and i % 2000 == 0:
                # halve the learning rate every 2000 iterations
                # (note: the optimizer was built with the initial L, so this alone
                #  does not change the learning rate already baked into the graph)
                L = L / 2
    save_path = saver.save(sess, "dynamic_model.ckpt")
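For completeness, here is a minimal sketch (not in the original script) of how the trained graph could be used for inference, assuming the tensors `x` and `o` and the checkpoint name above are still in scope:

```python
# restore the checkpoint and read off the PRE_STEPS chained predictions for one seed value
with tf.Session() as sess:
    saver = tf.train.Saver()
    saver.restore(sess, "dynamic_model.ckpt")
    seed = [[0.5]]                  # shape [1, num_steps]: a single current value x_t
    preds = sess.run(o, {x: seed})  # o concatenates the predictions x_{t+1} ... x_{t+PRE_STEPS}
    print(preds[0][:5])
```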
The model's overall shape looks like this. (It was my first time touching Illustrator in ages...)
The number of hidden nodes is just a reasonable guess, and the reason for using a sigmoid is simply that its output stays between 0 and 1.
The input is a single number, because nothing other than $x_t$ is needed to obtain $x_{t+1}$. Consequently, no state value is shared inside the network.
(It's a secret that the loss didn't decrease well when I increased the number of inputs.)
I don't think this can officially be called an RNN; it's more of a feedback loop.
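Put in equations (the notation $W$, $b$, $\sigma$ is mine, not from the original code), `simple_reg` just chains the same learned one-step map $f$ a total of PRE_STEPS times, with dropout effectively disabled since $K = 1.0$:

$$
f(u) = W_{\mathrm{out}}\,\sigma(W_{\mathrm{hidden}}\,u + b_{\mathrm{hidden}}) + b_{\mathrm{out}},
\qquad
\hat{x}_{t+1} = f(x_t),\ \ \hat{x}_{t+2} = f(\hat{x}_{t+1}),\ \dots,\ \ \hat{x}_{t+\mathrm{PRE\_STEPS}} = f(\hat{x}_{t+\mathrm{PRE\_STEPS}-1})
$$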
(Strictly speaking, I think it would be more accurate to estimate $r$ from a run of consecutive values $x_1, \dots, x_n$, output $x_{n+1}$, and then feed that $x_{n+1}$ back into the input to produce $x_{n+2}, \dots$, but I couldn't get TensorFlow's RNNCell to work.)
For the time being, I was personally satisfied with the results.
Early in training:
Eventually the loss settles down around 1e-5, and the best predictions look like this.
Once the loss reaches about 1e-6, the model can predict (or rather, it has pinned down $r$ closely enough to track the sequence) up to roughly 35 steps ahead.
I haven't tried this on other chaotic systems such as the Lorenz equations, so I can't say how far it generalizes.