Image recognition of fruits using VGG16

1. Background

I tried image recognition with Keras out of curiosity and found it unexpectedly easy, so on a friend's recommendation I decided to use VGG16 to build a more accurate model. I'm a beginner, so I'll look things up as I go. This time, I use an image of an Orin apple to check whether the model can be applied to apple varieties. This is just a memo.

2. What is VGG16 in the first place?

VGG16 is a 16-layer CNN model trained on the large image dataset "ImageNet". It was announced in 2014 and is one of the well-known pretrained models used in various studies. Other models trained on ImageNet include AlexNet, GoogLeNet, and ResNet. (Source: https://www.y-shinno.com/keras-vgg16/)

For how VGG compares with AlexNet, GoogLeNet, and ResNet, the following explanation is a useful reference.

(Source: http://thunders1028.hatenablog.com/entry/2017/11/01/035609)

VGG is the network from the University of Oxford's VGG team, which finished second in the 2014 ILSVRC. It is an ordinary CNN made of convolution layers and pooling layers, essentially a deeper version of AlexNet, with 16 or 19 weight layers (convolution layers and fully connected layers). These are called VGG16 and VGG19, respectively.

Its characteristic structure stacks two to four convolution layers with small filters in succession and then halves the spatial size with a pooling layer, repeating this pattern. Convolving several times with small filters (that is, deepening the network) seems to extract features better than convolving the image once with a large filter. (I don't know the reason well, but perhaps the expressiveness increases because the data passes through the activation function more times?) [2]
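As a rough illustration of the small-filter idea quoted above, the sketch below compares stacked 3x3 convolutions with a single large filter that covers the same area. It assumes stride-1 convolutions and the same number of input and output channels; the channel count of 512 and the helper names are chosen only for illustration and do not come from the quoted article.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Side length (in pixels) seen by num_layers stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels):
    """Number of weights (biases ignored) in one conv layer with
    `channels` input channels and `channels` output channels."""
    return kernel * kernel * channels * channels


channels = 512  # illustrative value, not taken from the quoted article
for num_layers in (2, 3):
    field = stacked_receptive_field(num_layers)
    stacked = num_layers * conv_weights(3, channels)
    single = conv_weights(field, channels)
    print(f"{num_layers} stacked 3x3 convs cover {field}x{field} pixels: "
          f"{stacked:,} weights vs {single:,} for one {field}x{field} conv")
```

Under these assumptions, the stacked version covers the same area with fewer weights while applying the activation function more often, which is in line with the guess in the quote.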

GoogLeNet seems to be stronger, but I will try VGG this time, prioritizing ease of understanding. (Things that seem difficult are left for next time onward.)

3. Introducing VGG16 (using Google Colab)

Let's write the code right away. First, install Keras.

vgg16_fluits.py


!pip install keras

Next, import the required libraries. VGG16 is included in Keras. The weights are selected with the weights argument of the VGG16(...) call below.

#Import the model and display the summary
import numpy as np
from keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
model = VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None)
model.summary()
Result of model.summary()

Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_3 (InputLayer)         (None, 224, 224, 3)       0
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0
flatten (Flatten)            (None, 25088)             0
fc1 (Dense)                  (None, 4096)              102764544
fc2 (Dense)                  (None, 4096)              16781312
predictions (Dense)          (None, 1000)              4097000
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________________________
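The parameter counts in this summary can be reproduced by hand. The sketch below assumes every convolution uses a 3x3 filter with one bias per output channel, which is consistent with the first row (1792 = (3*3*3 + 1) * 64). The channel pairs are read from the Output Shape column, and the helper names are my own.

```python
def conv2d_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel for a Conv2D layer."""
    return (kernel * kernel * in_channels + 1) * out_channels


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit for a Dense layer."""
    return (in_units + 1) * out_units


# (input channels, output channels) of the 13 convolution layers, in order
conv_layers = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]
# (input units, output units) of fc1, fc2, and predictions
dense_layers = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]

conv_total = sum(conv2d_params(3, i, o) for i, o in conv_layers)
dense_total = sum(dense_params(i, o) for i, o in dense_layers)
total = conv_total + dense_total
print(f"conv: {conv_total:,}  dense: {dense_total:,}  total: {total:,}")
```

The total matches the "Total params" line, and the split shows that most of the parameters sit in the three fully connected layers rather than in the convolutions.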

The image evaluated this time is an Orin apple (apple_orin.jpg).

# Read the image (assumes Google Drive is already mounted at /content/drive)
from PIL import Image
img_dir = '/content/drive/My Drive/Colab Notebooks/img'
file_path = img_dir + "/apple_orin.jpg"  # no trailing space in the file name
image = Image.open(file_path)
image = image.convert('RGB')
image = image.resize((224, 224))

# Convert the PIL image to a NumPy array of shape (224, 224, 3)
data = np.asarray(image)

# Add a batch dimension to make a four-dimensional tensor of shape (1, 224, 224, 3)
data = np.expand_dims(data, axis=0)

# Predict and print the top 5 classes
preds = model.predict(preprocess_input(data))
results = decode_predictions(preds, top=5)[0]
for result in results:
    print(result)

('n07742313', 'Granny_Smith', 0.9861995)
('n02948072', 'candle', 0.0040857443)
('n07747607', 'orange', 0.001778649)
('n03887697', 'paper_towel', 0.0016588464)
('n07693725', 'bagel', 0.0012920648)

That was the result.

4. Result

So what is the top-ranked "Granny_Smith"?

Granny Smith is an apple cultivar. It originated in Australia in 1868 as a chance seedling raised by Maria Ann Smith, after whom it is named. (Image: commodity-granny-smith.jpg)

Granny Smith looks quite similar to Orin, so the prediction seems reasonably accurate. ImageNet may simply have no data for Orin.

The index, label ID, and class name of each of the 1000 ImageNet classes are summarized in the following JSON file, which includes Granny_Smith (n07742313).

https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json
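To check whether a class such as Orin exists among the 1000 classes, the file can be searched by class name. This is a minimal sketch, assuming the file is a JSON object that maps an index string to a [label ID, class name] pair; the function names are hypothetical helpers, not part of Keras.

```python
import json
from urllib.request import urlopen

CLASS_INDEX_URL = ("https://storage.googleapis.com/download.tensorflow.org"
                   "/data/imagenet_class_index.json")


def load_class_index(url=CLASS_INDEX_URL):
    """Download the JSON file and return it as a dict.

    Assumed layout: {"0": [label_id, class_name], "1": [...], ...}
    """
    with urlopen(url) as response:
        return json.load(response)


def find_class(class_index, name):
    """Return (index, label_id) for a class name, or None if it is absent."""
    for index, (label_id, class_name) in class_index.items():
        if class_name == name:
            return int(index), label_id
    return None


# Usage in Colab (needs network access):
# class_index = load_class_index()
# print(find_class(class_index, "Granny_Smith"))
# print(find_class(class_index, "Orin"))
```

If the second lookup returns None, the pretrained model has no way to output Orin, which would explain why the closest available class was chosen.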

Identifying the variety requires training a model separately, so I will do that next time or later.

This time the goal was just to try VGG16 out, so this result is fine.

From next time onward, I will build a model that can distinguish varieties.

5. Consideration

The key points when using the VGG16 model are as follows.

model = VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None)

Details

Argument      Description
include_top   Whether to include the fully connected layers that classify into 1000 classes.
              True: included (use this for the original 1000-class classification)
              False: not included (use this to customize the model)
weights       Which weights to load.
              'imagenet': weights trained on ImageNet
              None: random initialization
input_tensor  Optional tensor to use as the image input of the model.
              A Keras tensor: it is used as the input
              None: not used
input_shape   Shape of the input image.
              A shape tuple: it is used
              None: (224, 224, 3) is used

Next time, I will set include_top to False and use VGG16 as a feature extractor for fine-tuning.

Reference (what I am aiming for): http://aidiary.hatenablog.com/entry/20170131/1485864665
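As a preview of that idea, here is a minimal sketch of VGG16 used as a frozen feature extractor with a new classification head. The function name, the 256-unit hidden layer, and the number of classes are assumptions for illustration; this is not the model that will be built in the follow-up.

```python
from keras.applications.vgg16 import VGG16
from keras.layers import Dense, Flatten
from keras.models import Model


def build_variety_classifier(num_classes, weights='imagenet'):
    """VGG16 convolutional base (frozen) plus a small new classifier head."""
    # include_top=False drops flatten, fc1, fc2, and predictions
    base = VGG16(include_top=False, weights=weights, input_shape=(224, 224, 3))
    # Freeze the pretrained layers so that only the new head is trained
    for layer in base.layers:
        layer.trainable = False
    x = Flatten()(base.output)                # (None, 7, 7, 512) -> (None, 25088)
    x = Dense(256, activation='relu')(x)      # hidden size is an arbitrary choice
    outputs = Dense(num_classes, activation='softmax')(x)
    return Model(inputs=base.input, outputs=outputs)


# Usage (downloads the ImageNet weights on first run):
# model = build_variety_classifier(num_classes=3)
# model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```

Freezing the base keeps the ImageNet features intact while the new head learns the variety labels; unfreezing the last convolution block afterwards would be one way to fine-tune.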
