- Text classification using Convolutional Neural Networks (CNN).
- Implemented "Convolutional Neural Networks for Sentence Classification" in Chainer.
- Achieved a higher accuracy rate than the earlier article "[Chainer] Document classification by convolutional neural network".
I implemented [Convolutional Neural Networks for Sentence Classification](http://emnlp2014.org/papers/pdf/EMNLP2014181.pdf) in Chainer.
The paper's author also publishes a Theano implementation on GitHub.
The source code developed for this article is available here: chainer-cnnsc
- The data can be found here. I used "sentence polarity dataset v1.0".
- Direct download
- Install Chainer, scikit-learn, and gensim.
- Download the pre-trained word2vec model (GoogleNews-vectors-negative300.bin.gz).
The input is English text data; obtain it from the download location above. Each line corresponds to one document. The first column is the label, and the second and subsequent columns are the text. Label 0 marks a negative document and label 1 a positive document.
```
[label] [text (space-delimited)]
0 it just didn't mean much to me and played too skewed to ever get a hold on ( or be entertained by ) .
1 culkin , who's in virtually every scene , shines as a young man who uses sarcastic lies like a shield .
...
```
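A minimal sketch of reading this format, assuming one document per line with the label before the first space. The helper names `parse_line` and `load_documents` and the in-memory sample are illustrative and are not taken from chainer-cnnsc.

```python
def parse_line(line):
    """Split one '[label] [text]' line into (int label, list of tokens)."""
    label, text = line.rstrip("\n").split(" ", 1)
    return int(label), text.split(" ")


def load_documents(lines):
    """Parse every non-empty line; returns (labels, token_lists)."""
    labels, docs = [], []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        label, tokens = parse_line(line)
        labels.append(label)
        docs.append(tokens)
    return labels, docs


# The two example lines shown above, used here instead of a file.
sample = [
    "0 it just didn't mean much to me and played too skewed to ever get a hold on ( or be entertained by ) .",
    "1 culkin , who's in virtually every scene , shines as a young man who uses sarcastic lies like a shield .",
]
labels, docs = load_documents(sample)

# The longest document determines the height of the CNN input.
max_sentence_len = max(len(tokens) for tokens in docs)
print(labels, max_sentence_len)
```

For a real file, the same function could be fed an open file handle, since iterating over a file yields its lines.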
This time, I used the model proposed in the paper Convolutional Neural Networks for Sentence Classification. A description of the model can be found in this article:
- Model using convolutional neural network in natural language processing
The program defines multiple filter sizes for convolution and performs a convolution for each one.
The filter sizes are stored as a list in `filter_height`.
In forward propagation, the convolution is performed by looping over the filter sizes, as shown below.
```python
# Loop over each filter size
for i, filter_size in enumerate(self.filter_height):
    # Pass through the convolution layer
    h_conv[i] = F.relu(self[i](x))
    # Pass through the pooling layer
    h_pool[i] = F.max_pooling_2d(h_conv[i], (self.max_sentence_len+1-filter_size))
```
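The pooling window in this loop follows from the convolution output size. With no padding and a stride of 1, a filter of height `filter_size` applied to a sentence of `max_sentence_len` rows produces `max_sentence_len + 1 - filter_size` rows, so pooling over exactly that many rows leaves a single value per filter (max-over-time pooling). The sketch below checks that arithmetic. The filter heights `[3, 4, 5]` are an assumption for illustration, since the article does not list the values it used; 59 is the sentence length reported in the training log.

```python
def conv_output_height(sentence_len, filter_size, pad=0, stride=1):
    """Output height of a convolution along the sentence axis."""
    return (sentence_len + 2 * pad - filter_size) // stride + 1


max_sentence_len = 59          # "height" reported in the training log
filter_height = [3, 4, 5]      # assumed values, for illustration only

for filter_size in filter_height:
    conv_h = conv_output_height(max_sentence_len, filter_size)
    pool_window = max_sentence_len + 1 - filter_size
    # The pooling window equals the feature-map height,
    # so exactly one value per filter survives the pooling step.
    assert conv_h == pool_window
    print(filter_size, conv_h, conv_h // pool_window)
```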
The source code of the network part is shown below.
```python
from chainer import ChainList
import chainer.functions as F
import chainer.links as L


# ChainList is used because the number of links is variable
class CNNSC(ChainList):
    def __init__(self,
                 input_channel,
                 output_channel,
                 filter_height,
                 filter_width,
                 n_label,
                 max_sentence_len):
        # Keep the number of filters, the filter heights, and the maximum
        # sentence length; they are used later in __call__
        self.cnv_num = len(filter_height)
        self.filter_height = filter_height
        self.max_sentence_len = max_sentence_len

        # Add one Convolution layer link per filter height
        # Convolution2D(number of input channels, number of output channels (number of filters per shape), filter shape (as a tuple), padding size)
        link_list = [L.Convolution2D(input_channel, output_channel, (i, filter_width), pad=0) for i in filter_height]
        # Add the fully connected link whose output receives dropout
        link_list += [L.Linear(output_channel * self.cnv_num, output_channel * self.cnv_num)]
        # Add the link to the output layer
        link_list += [L.Linear(output_channel * self.cnv_num, n_label)]

        # Initialize the class with the list of links defined so far
        super(CNNSC, self).__init__(*link_list)

        # Alternatively, the links can be added one at a time with
        #   self.add_link(link)

    def __call__(self, x, train=True):
        # Prepare lists for the per-filter intermediate results
        h_conv = [None for _ in self.filter_height]
        h_pool = [None for _ in self.filter_height]

        # Loop over each filter size
        for i, filter_size in enumerate(self.filter_height):
            # Pass through the convolution layer
            h_conv[i] = F.relu(self[i](x))
            # Pass through the pooling layer
            h_pool[i] = F.max_pooling_2d(h_conv[i], (self.max_sentence_len+1-filter_size))

        # Concatenate the convolution + pooling results
        concat = F.concat(h_pool, axis=2)
        # Apply dropout to the combined result
        h_l1 = F.dropout(F.tanh(self[self.cnv_num+0](concat)), ratio=0.5, train=train)
        # Project the dropout result onto the output layer
        y = self[self.cnv_num+1](h_l1)
        return y
```
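To make the data flow through `__call__` concrete, here is a framework-free sketch of the convolution, max-over-time pooling, and concatenation steps for a single sentence with one output channel per filter height. It is not the Chainer implementation: the toy 4x2 "embedding" matrix, the all-ones filters, and the helper names are made up for illustration, and the fully connected layers and dropout are omitted.

```python
def conv_full_width(sentence, kernel):
    """Slide a (height x width) kernel down the sentence matrix.

    The kernel spans the full embedding width, so the result is a
    1-D list with len(sentence) - len(kernel) + 1 entries.
    """
    height = len(kernel)
    out = []
    for top in range(len(sentence) - height + 1):
        total = 0.0
        for r in range(height):
            for c in range(len(kernel[r])):
                total += sentence[top + r][c] * kernel[r][c]
        out.append(total)
    return out


def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]


def forward_features(sentence, kernels):
    """ReLU(conv), then the max over all positions: one feature per kernel."""
    return [max(relu(conv_full_width(sentence, k))) for k in kernels]


# Toy sentence: 4 words, embedding width 2 (hypothetical numbers).
sentence = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0], [3.0, -1.0]]
# One all-ones kernel per filter height (2 and 3), again hypothetical.
kernels = [[[1.0, 1.0]] * h for h in (2, 3)]

# The returned list plays the role of the concatenated pooling outputs.
features = forward_features(sentence, kernels)
print(features)  # [3.0, 4.0]
```

Because each filter contributes a single pooled value regardless of its height, the concatenated feature vector has a fixed length, which is what lets the network feed it into the `L.Linear` layers.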
In the experiment, the data set was split into training data and test data, and the model was trained for 50 epochs. The accuracy on the test data at the 50th epoch was `accuracy = 0.799437701702`.
In the earlier article, which classified documents with a simpler CNN model, the result was `accuracy = 0.775624996424`, so the accuracy rate improved slightly.
```
input file name: dataset/mr_input.dat
loading word2vec model...
height (max length of sentences): 59
width (size of wordembedding vecteor ): 300
epoch 1 / 50
train mean loss=0.568159639835, accuracy=0.707838237286
test mean loss=0.449375987053, accuracy=0.788191199303
epoch 2 / 50
train mean loss=0.422049582005, accuracy=0.806962668896
test mean loss=0.4778624475, accuracy=0.777881920338
epoch 3 / 50
train mean loss=0.329617649317, accuracy=0.859808206558
test mean loss=0.458206892014, accuracy=0.792877197266
epoch 4 / 50
train mean loss=0.240891501307, accuracy=0.90389829874
test mean loss=0.642955899239, accuracy=0.769447028637
...
epoch 47 / 50
train mean loss=0.000715514877811, accuracy=0.999791562557
test mean loss=0.910120248795, accuracy=0.799437701702
epoch 48 / 50
train mean loss=0.000716249051038, accuracy=0.999791562557
test mean loss=0.904825389385, accuracy=0.801312088966
epoch 49 / 50
train mean loss=0.000753249507397, accuracy=0.999791562557
test mean loss=0.900236129761, accuracy=0.799437701702
epoch 50 / 50
train mean loss=0.000729961204343, accuracy=0.999791562557
test mean loss=0.892229259014, accuracy=0.799437701702
```
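In this log the training accuracy approaches 1.0 while the test loss rises from about 0.45 to about 0.9, which suggests the model overfits even though the test accuracy stays near 0.80. A small sketch of picking the best epoch from such a log follows. The `parse_log` helper is hypothetical, and only the epochs shown in the excerpt above are included, so the elided epochs 5 to 46 are not considered.

```python
import re

LOG_EXCERPT = """\
epoch 1 / 50
train mean loss=0.568159639835, accuracy=0.707838237286
test mean loss=0.449375987053, accuracy=0.788191199303
epoch 2 / 50
train mean loss=0.422049582005, accuracy=0.806962668896
test mean loss=0.4778624475, accuracy=0.777881920338
epoch 3 / 50
train mean loss=0.329617649317, accuracy=0.859808206558
test mean loss=0.458206892014, accuracy=0.792877197266
epoch 4 / 50
train mean loss=0.240891501307, accuracy=0.90389829874
test mean loss=0.642955899239, accuracy=0.769447028637
...
epoch 47 / 50
train mean loss=0.000715514877811, accuracy=0.999791562557
test mean loss=0.910120248795, accuracy=0.799437701702
epoch 48 / 50
train mean loss=0.000716249051038, accuracy=0.999791562557
test mean loss=0.904825389385, accuracy=0.801312088966
epoch 49 / 50
train mean loss=0.000753249507397, accuracy=0.999791562557
test mean loss=0.900236129761, accuracy=0.799437701702
epoch 50 / 50
train mean loss=0.000729961204343, accuracy=0.999791562557
test mean loss=0.892229259014, accuracy=0.799437701702
"""


def parse_log(text):
    """Return {epoch: {'train': (loss, acc), 'test': (loss, acc)}}."""
    results = {}
    epoch = None
    for line in text.splitlines():
        m = re.match(r"epoch (\d+) / \d+", line)
        if m:
            epoch = int(m.group(1))
            results[epoch] = {}
            continue
        m = re.match(r"(train|test) mean loss=([\d.]+), accuracy=([\d.]+)", line)
        if m and epoch is not None:
            results[epoch][m.group(1)] = (float(m.group(2)), float(m.group(3)))
    return results


results = parse_log(LOG_EXCERPT)
# Pick the epoch with the highest test accuracy among those listed.
best_epoch = max(results, key=lambda e: results[e]["test"][1])
print(best_epoch, results[best_epoch]["test"])
```

Among the listed epochs, the highest test accuracy appears at epoch 48 rather than at the final epoch, so keeping the model from the best epoch instead of the last one is one option worth considering.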
The following article also introduces an implementation of text classification using a CNN:
- Text classification using convolutional neural networks (CNN) and Spatial Pyramid Pooling (SPP-net)