Implementing image recognition with Keras turned out to be unexpectedly easy, so on a friend's recommendation I decided to try VGG16 to build a more accurate model. I'm a beginner, so I'll learn as I go. This time, I'll feed an image of an Orin apple to the model and see whether it can be used to identify the variety. This is just a memo.
VGG16 is a 16-layer CNN model trained on the large image dataset "ImageNet". It was announced in 2014 and is one of the well-known pre-trained models used in many studies. Other models trained on ImageNet include AlexNet, GoogLeNet, and ResNet. https://www.y-shinno.com/keras-vgg16/
The following is a reference for how VGG16 compares with AlexNet, GoogLeNet, and ResNet.
(Source: http://thunders1028.hatenablog.com/entry/2017/11/01/035609)
The network from Oxford University's VGG team, which finished second in the 2014 ILSVRC. It is a plain CNN consisting of convolution and pooling layers, essentially a deeper version of AlexNet, with 16 or 19 weight layers (convolution and fully connected layers). These are called VGG16 and VGG19, respectively.
It features a structure in which two to four convolution layers with small filters are stacked in succession, followed by a pooling layer that halves the feature map size. Convolving with several small filters (i.e. deepening the network) seems to extract features better than convolving the image once with a large filter. (I don't fully understand the reason, but perhaps passing through the activation function more times increases the expressiveness?) [2]
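As a rough illustration of this idea, here is a minimal Keras sketch (my own, not from the article) of one VGG-style block: two stacked 3x3 convolutions cover the same 5x5 receptive field as a single 5x5 convolution, but pass through the activation function twice, followed by a pooling layer that halves the size.

```python
# Illustrative sketch of a VGG-style block (not the actual VGG16 definition).
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D

block = Sequential([
    # Two stacked 3x3 convolutions: same receptive field as one 5x5 convolution,
    # but with two non-linearities and fewer parameters.
    Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(224, 224, 3)),
    Conv2D(64, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),  # halves the spatial size, as in VGG blocks
])
block.summary()
```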
GoogLeNet seems to be stronger, but I will try VGG with an emphasis on understandability. (The harder-looking parts will come in later posts.)
Let's get straight to the code. First, install Keras.
vgg16_fluits.py
!pip install keras
Next, import the required libraries. VGG16 is included in Keras. The weights are specified in the third line below.
#Import the model and display the summary
import numpy as np
from keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
model = VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None)
model.summary()
The image evaluated this time is an apple (the Orin variety).
#Image loading
from PIL import Image
#import glob
url = '/content/drive/My Drive/Colab Notebooks/img'
files = url + "/apple_orin.jpg"
image = Image.open(files)
image = image.convert('RGB')
image = image.resize((224, 224))
#Convert the loaded PIL image to a NumPy array
data = np.asarray(image)
#Evaluation
#Add a batch dimension to make the array a four-dimensional tensor
data = np.expand_dims(data, axis=0)
#Predict and output the top 5 classes
preds = model.predict(preprocess_input(data))
results = decode_predictions(preds, top=5)[0]
for result in results:
    print(result)
('n07742313', 'Granny_Smith', 0.9861995)
('n02948072', 'candle', 0.0040857443)
('n07747607', 'orange', 0.001778649)
('n03887697', 'paper_towel', 0.0016588464)
('n07693725', 'bagel', 0.0012920648)
That was the result.
What is "Granny_Smith", the top prediction?
Granny Smith is an apple cultivar. It was developed in Australia in 1868 from a chance seedling by Maria Ann Smith, who gave the variety its name.
That said, the image itself is quite close, so the prediction seems reasonably accurate. ImageNet probably has no data for Orin.
The index, label, and class name for the 1000 ImageNet classes are summarized in the following JSON file, and Granny_Smith is listed there.
https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json
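As a quick way to check this (my own sketch, not from the article), the JSON file can be downloaded and searched for the Granny_Smith entry; the keys are the class indices and the values are (WordNet ID, class name) pairs.

```python
# Sketch: look up Granny_Smith in the ImageNet class index JSON.
# Assumes the URL above is reachable; uses only the standard library.
import json
import urllib.request

URL = 'https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json'
with urllib.request.urlopen(URL) as f:
    class_index = json.load(f)  # {"0": ["n01440764", "tench"], ...}

for idx, (wnid, name) in class_index.items():
    if name == 'Granny_Smith':
        print(idx, wnid, name)  # class index, WordNet ID, class name
```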
Recognizing images well enough to identify the variety requires separate training, so I will tackle that from the next post onward.
This time the goal was just to try it out, so that's fine.
From next time, I will build a model that can identify the variety.
The key points when using the VGG16 model are as follows.
model = VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None)
Argument | Description |
---|---|
include_top | Whether to include the fully connected layers that classify into the 1000 classes.<br>True: included (use this for the original 1000-class classification)<br>False: not included (use this for customization) |
weights | Type of weights.<br>imagenet: weights trained on ImageNet<br>None: random initialization |
input_tensor | Optional tensor to use as the model's input image.<br>Any image data: it is used<br>None: not used |
input_shape | Shape of the input image.<br>Any shape: it is used<br>None: (224, 224, 3) is used |
Next time, I'll set include_top to False and use VGG16 as a feature extractor for fine-tuning.
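As a preview, here is a minimal sketch of that setup (my own, not the article's final code): load VGG16 without the top classifier, freeze the convolutional base, and add a small classification head. The number of classes (2) and the layer sizes are illustrative assumptions.

```python
# Sketch: VGG16 as a feature extractor with a custom classification head.
from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Flatten, Dense

base = VGG16(include_top=False, weights='imagenet', input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False  # freeze the convolutional base for feature extraction

x = Flatten()(base.output)
x = Dense(256, activation='relu')(x)
output = Dense(2, activation='softmax')(x)  # e.g. Orin vs. other varieties (assumed)

model = Model(inputs=base.input, outputs=output)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
```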
Reference (what I'm aiming to do): http://aidiary.hatenablog.com/entry/20170131/1485864665