I tried using BING with OpenCV as pre-processing for CNN / [I tried using Selective Search as R-CNN](http://qiita.com/Almond/items/7850cf81903fbe2a2c6c)
So far, I have tried R-CNN-style detection using BING and Selective Search. But if you are running a CNN (convolutional neural network) anyway, wouldn't you like the CNN itself to find the object-like regions? Rest assured, it can! So this time we will estimate object positions with a CNN. (* Sklearn-theano's GitHub calls this "Localizing", i.e. position estimation, so I will use that term below.)
Sklearn-theano is a library that makes deep learning easier by abstracting Theano, a framework for deep learning. Keras is a similar library. It is popular and looks pretty good, but this time I will try the lesser-known Sklearn-theano.
As mentioned above, Sklearn-theano makes image classification as well as position estimation easy. In addition, you can easily try trained models of well-known networks such as OverFeat, AlexNet, VGGNet, and GoogLeNet. If you want to solve practical problems using various trained models, why not give Sklearn-theano a try?
https://github.com/sklearn-theano/sklearn-theano

Clone the above repository and install it with the command `python setup.py install`. The following packages are also required, so install any that are missing first.

・ Numpy
・ Scipy
・ Theano
・ Scikit-learn
・ Pillow
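Before running the install, it can help to check which of the required packages are already importable. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is my own and is not part of Sklearn-theano. Note that Scikit-learn is imported as `sklearn` and Pillow as `PIL`.

```python
import importlib.util

# Map the package names listed above to the names used in "import ...".
# Scikit-learn is imported as "sklearn" and Pillow as "PIL".
REQUIRED = {
    "Numpy": "numpy",
    "Scipy": "scipy",
    "Theano": "theano",
    "Scikit-learn": "sklearn",
    "Pillow": "PIL",
}


def missing_packages(required=REQUIRED):
    """Return the display names of packages whose import name cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "none")
```

Anything reported as missing should be installed before running `python setup.py install`.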
This time I tried plot_single_localization.py from the samples. The position of the sloth is estimated in the last frame. Roughly speaking, the method first estimates the points that are likely to belong to the object within each yellow square region, and then takes the cluster of gathered points as the final detection. Please refer to this page for the detailed detection process.
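To make the rough explanation above concrete, here is a toy sketch of the idea in plain Python. It is not Sklearn-theano's implementation. The score grid, window size, stride, and threshold are all assumptions made up for illustration, and the function name `localize` is hypothetical. Each grid cell stands for one square window with a class probability, windows above the threshold contribute a candidate point, and the box around the gathered points is the final detection.

```python
def localize(scores, window, stride, threshold=0.5):
    """Toy localization from a grid of per-window class probabilities.

    scores[r][c] is a hypothetical probability for the square window whose
    top-left corner is at pixel (c * stride, r * stride).
    Returns (points, box), where box is (x_min, y_min, x_max, y_max)
    or None when no window passes the threshold.
    """
    points = []
    for r, row in enumerate(scores):
        for c, p in enumerate(row):
            if p >= threshold:
                # Use the centre of the window as the candidate point.
                points.append((c * stride + window / 2, r * stride + window / 2))
    if not points:
        return points, None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # The box enclosing all gathered points is taken as the detected object.
    return points, (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    # Made-up scores: the object-like windows sit in the lower-right area.
    demo_scores = [
        [0.1, 0.2, 0.1],
        [0.1, 0.9, 0.8],
        [0.0, 0.7, 0.1],
    ]
    pts, box = localize(demo_scores, window=32, stride=16)
    print("points:", pts)
    print("box:", box)
```

A real detector would also have to deal with stray high-scoring windows far from the object, which this sketch ignores.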
By the way, the processing time was about 310 sec... roughly 5 minutes! That is long. The official statement says 352.80 seconds, so there seems to be no mistake.
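If you want to measure the processing time yourself, a small timer like the following is enough. This is a minimal sketch using only the standard library. The name `timed` is my own, and the workload in the example is a placeholder standing in for the actual localization call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results=None):
    """Print the wall-clock time of the enclosed block in seconds and minutes.

    If a dict is passed as results, the elapsed seconds are stored under label.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if results is not None:
            results[label] = elapsed
        print(f"{label}: {elapsed:.2f} sec ({elapsed / 60:.1f} min)")


if __name__ == "__main__":
    # Placeholder workload; replace it with the localization step to time it.
    with timed("placeholder workload"):
        sum(range(1_000_000))
```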
This time, I used a lesser-known library called Sklearn-theano to estimate object positions with a CNN. If you don't mind it taking some time, it may be worth a try. Needless to say, though, it is tough to use when speed is required. **If anyone knows a general object detection method that can run in real time, please let me know.**