This article is about clipping the individual values held by tensors in TensorFlow.
Specifically, it explains and gives implementation examples for the tf.clip_by_... family of methods.
There are many possible uses, but in the machine-learning computations that are TensorFlow's main purpose (gradient calculations in particular), the variables involved often differ greatly in scale and the calculation does not go well. Clipping and normalization are used in such cases.
In TensorFlow, variables, constants, and the various operations on them are conventionally represented as Op nodes (see the reference article). When run in a session, an Op node holds the set of values (a tensor) produced by its operation and passes it on to the next Op node. In this article, such a set of values after some processing has finished is simply called a **node**.
The basic image: when a node's values meet a condition relative to some fixed reference, suppress them.
tf.clip_by_value
tf.clip_by_value(
t,
clip_value_min,
clip_value_max,
name=None
)
For each value held by the node, values greater than the maximum clip_value_max are changed to clip_value_max, and values smaller than the minimum clip_value_min are changed to clip_value_min.
example1.py
import numpy as np
import tensorflow as tf

p1 = tf.placeholder(tf.int32, 6, name='p1')
p2 = tf.placeholder(tf.float32, 6, name='p2')
clip_value1 = tf.clip_by_value(p1, clip_value_max=2, clip_value_min=-2, name='clip_value1')
clip_value2 = tf.clip_by_value(p2, clip_value_max=2., clip_value_min=-2., name='clip_value2')
num1 = np.linspace(-4, 6, 6)  # [-4, -2, 0, 2, 4, 6]
with tf.Session() as sess:
    print(p1.eval(feed_dict={p1: num1}, session=sess))
    print(p2.eval(feed_dict={p2: num1}, session=sess))
    print(clip_value1.eval(feed_dict={p1: num1}, session=sess))
    print(clip_value2.eval(feed_dict={p2: num1}, session=sess))
console
[-4 -2 0 2 4 6]
[-4. -2. 0. 2. 4. 6.]
[-2 -2 0 2 2 2]
[-2. -2. 0. 2. 2. 2.]
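For reference, the same clipping behavior can be reproduced with plain NumPy's np.clip; a minimal check outside the TensorFlow graph:
# NumPy equivalent of the clip_value1 / clip_value2 results above.
import numpy as np

num1 = np.linspace(-4, 6, 6)                  # [-4. -2.  0.  2.  4.  6.]
print(np.clip(num1, -2, 2))                   # [-2. -2.  0.  2.  2.  2.]
print(np.clip(num1.astype(np.int32), -2, 2))  # int32 version: [-2 -2  0  2  2  2]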
**If the dtype of the node and the dtype of the clip values do not match, an error is raised.**
example1.py
clip_error1 = tf.clip_by_value(p1, clip_value_max=2., clip_value_min=-2., name='clip_error1')  # float clip values on the int32 node p1
print(clip_error1.eval(feed_dict={p1: num1}, session=sess))
console
TypeError: Expected int32 passed to parameter 'y' of op 'Minimum', got 2.0 of type 'float' instead.
tf.clip_by_norm
tf.clip_by_norm(
t,
clip_norm,
axes=None,
name=None
)
If the node's L2 norm is greater than clip_norm, every value is rescaled so that the L2 norm becomes clip_norm (each element is multiplied by clip_norm / L2 norm). If the L2 norm is less than or equal to clip_norm, the values are left unchanged.
example2.py
p3 = tf.placeholder(tf.float32, [2, 3], name='p3')
clip_norm1 = tf.clip_by_norm(p3, clip_norm=4, name='clip_norm1')
clip_norm2 = tf.clip_by_norm(p3, clip_norm=5, name='clip_norm2')
num2 = np.linspace(-2, 3, 6).reshape((2, 3))
with tf.Session() as sess:
    print(p3.eval(feed_dict={p3: num2}, session=sess))
    print(clip_norm1.eval(feed_dict={p3: num2}, session=sess))
    print(clip_norm2.eval(feed_dict={p3: num2}, session=sess))
console
[[-2. -1. 0.]
[ 1. 2. 3.]] #The overall L2 norm is 4.358 ...
[[-1.8353258 -0.9176629 0. ]
[ 0.9176629 1.8353258 2.7529888]]
[[-2. -1. 0.]
[ 1. 2. 3.]]
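The first result can be reproduced by hand: the overall L2 norm (about 4.358) exceeds clip_norm=4, so every element is multiplied by clip_norm / norm, while with clip_norm=5 the norm is already smaller and nothing changes. A minimal check with plain NumPy:
# NumPy check of the clip_by_norm scaling rule.
import numpy as np

num2 = np.linspace(-2, 3, 6).reshape((2, 3))
l2 = np.linalg.norm(num2)                  # about 4.358
print(num2 * 4 / l2)                       # matches clip_norm1 above
print(num2 if l2 <= 5 else num2 * 5 / l2)  # clip_norm=5: unchanged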
You can also specify axes in tf.clip_by_norm.
In that case the L2 norm is computed along the specified axes, and each slice is clipped independently.
example3.py
clip_norm3 = tf.clip_by_norm(p3, clip_norm=3, axes=1, name='clip_norm3')
with tf.Session() as sess:
    print(p3.eval(feed_dict={p3: num2}, session=sess))
    print(clip_norm3.eval(feed_dict={p3: num2}, session=sess))
console
[[-2. -1. 0.] #The L2 norm of row 0 is 2.236 ... (below clip_norm=3, so unchanged)
[ 1. 2. 3.]] #The L2 norm of row 1 is 3.741 ... (above clip_norm=3, so clipped)
[[-2. -1. 0. ]
[ 0.8017837 1.6035674 2.4053512]]
In addition, tf.clip_by_norm raises a TypeError if the node passed to it has a dtype that cannot represent fractional values.
Use a float or complex dtype (float32, float64, complex64, etc.).
tf.clip_by_global_norm
tf.clip_by_global_norm(
t_list,
clip_norm,
use_norm=None,
name=None
)
Unlike tf.clip_by_norm, this takes a **list of nodes** rather than a single node. Passing a node by itself raises a TypeError.
The L2 norm computed over every value of every node in the list is called global_norm. If global_norm is greater than clip_norm, all values in the list are rescaled (multiplied by clip_norm / global_norm) so that the overall L2 norm becomes clip_norm. If it is less than or equal to clip_norm, nothing is changed.
There are two return values: list_clipped, a list containing the **nodes after clipping**, and the computed global_norm.
example4.py
c1 = tf.constant([[0, 1, 2], [3, 4, 5]], dtype=tf.float32, name='c1')
c2 = tf.constant([[-2, -4], [2, 4]], dtype=tf.float32, name='c2')
C = [c1, c2]
clip_global_norm, global_norm = tf.clip_by_global_norm(C, clip_norm=9, name='clip_global_norm')
with tf.Session() as sess:
    for c in C:
        print(c.eval(session=sess))
    print(global_norm.eval(session=sess))
    for cgn in clip_global_norm:
        print(cgn.eval(session=sess))
console
[[0. 1. 2.]
[3. 4. 5.]]
[[-2. -4.]
[ 2. 4.]]
9.746795
[[0. 0.9233805 1.846761 ]
[2.7701416 3.693522 4.6169024]]
[[-1.846761 -3.693522]
[ 1.846761 3.693522]]
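The scaling can again be checked by hand: global_norm is the L2 norm over all values of both tensors, sqrt(55 + 40) = sqrt(95) ≈ 9.7468, which exceeds clip_norm=9, so every element is multiplied by 9 / 9.7468. A minimal NumPy check of that rule:
# NumPy check of the clip_by_global_norm scaling rule.
import numpy as np

a = np.array([[0., 1., 2.], [3., 4., 5.]])
b = np.array([[-2., -4.], [2., 4.]])
global_norm = np.sqrt((a ** 2).sum() + (b ** 2).sum())  # about 9.7468
scale = 9 / global_norm                                 # clip_norm / global_norm
print(a * scale)  # matches the first clipped tensor above
print(b * scale)  # matches the second clipped tensor above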
The tf.clip_by_norm and tf.clip_by_global_norm methods are simple in themselves, but they can be used, for example, for "gradient clipping", a countermeasure against exploding gradients in RNNs.
The following references are helpful:
- Pascanu et al. (2012), On the difficulty of training Recurrent Neural Networks (pdf)
- [Understanding LSTM - with recent trends (RNNs and gradient clipping)](https://qiita.com/t_Signull/items/21b82be280b46f467d1b#rnn%E3%81%A8%E5%8B%BE%E9%85%8D%E3%81%AE%E3%82%AF%E3%83%AA%E3%83%83%E3%83%94%E3%83%B3%E3%82%B0)
After building a model, if back-propagation blows up to inf during training, applying this technique may solve the problem.
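As a rough sketch of what that looks like in TF 1.x code (assuming a scalar loss tensor named loss already exists in the graph; the AdamOptimizer, learning rate, and clip_norm=5.0 are arbitrary choices for illustration):
# Sketch: gradient clipping with tf.clip_by_global_norm (TF 1.x style).
# `loss` is assumed to be a scalar loss tensor already defined in the graph.
params = tf.trainable_variables()
grads = tf.gradients(loss, params)
# Rescale all gradients together so that their global L2 norm is at most 5.0.
clipped_grads, grad_global_norm = tf.clip_by_global_norm(grads, clip_norm=5.0)
optimizer = tf.train.AdamOptimizer(learning_rate=1e-3)
train_op = optimizer.apply_gradients(zip(clipped_grads, params))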
The clipping operations above actually change the values held by the node, but it is also possible to compute the norm only.
tf.norm
tf.norm(
tensor,
ord='euclidean',
axis=None,
keepdims=None,
name=None
)
The parameter `ord` determines the value of p in the Lp norm. For the L∞ norm, specify np.inf.
example5.py
p4 = tf.placeholder(tf.float32, [3, 4], name='p4')
normalize1 = tf.norm(p4, name='normalize1')
normalize2 = tf.norm(p4, ord=1.5, axis=0, name='normalize2')
normalize3 = tf.norm(p4, ord=np.inf, axis=1, name='normalize3')
num3 = np.linspace(-10, 8, 12).reshape((3, 4))
with tf.Session() as sess:
    print(p4.eval(feed_dict={p4: num3}, session=sess))
    print(normalize1.eval(feed_dict={p4: num3}, session=sess))
    print(normalize2.eval(feed_dict={p4: num3}, session=sess))
    print(normalize3.eval(feed_dict={p4: num3}, session=sess))
console
[[-10. -8.363636 -6.7272725 -5.090909 ]
[ -3.4545455 -1.8181819 -0.18181819 1.4545455 ]
[ 3.090909 4.7272725 6.3636365 8. ]]
19.87232
[12.364525 11.0871725 10.408293 10.876119 ]
[10. 3.4545455 8. ]
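For reference, these values can be cross-checked with plain NumPy, since np.linalg.norm accepts the same ord and axis arguments for vector norms along an axis:
# NumPy cross-check of the tf.norm results above.
import numpy as np

num3 = np.linspace(-10, 8, 12).reshape((3, 4))
print(np.linalg.norm(num3))                      # Frobenius norm, about 19.87
print(np.linalg.norm(num3, ord=1.5, axis=0))     # L1.5 norm of each column
print(np.linalg.norm(num3, ord=np.inf, axis=1))  # L-infinity norm of each row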
- TensorFlow > API > TensorFlow Core r2.0 > Python > tf.clip_by_value
- TensorFlow > API > TensorFlow Core r2.0 > Python > tf.clip_by_norm
- TensorFlow > API > TensorFlow Core r2.0 > Python > tf.clip_by_global_norm
- TensorFlow > API > TensorFlow Core r2.0 > Python > tf.norm
Tomorrow is "Hacking the reservation system using Ruby" by @yoshishin. Please continue to enjoy GMO Advent Calendar 2019!