In this post, I try running YOLO v3 on Google Colaboratory.
For instructions on setting up YOLO v3 on Google Colaboratory, please refer to the "YOLO setup" section.
**YOLO** is a **real-time object detection system**. It is part of a neural network framework called Darknet. The name YOLO is an acronym for "You only look once". YOLO v3 is version 3 of YOLO, which was the latest version at the time of writing. For details, please refer to the YOLO official page.
The environment used this time is Google Colaboratory. The Python and OpenCV versions are as follows.
import platform
import cv2
print("Python " + platform.python_version())
print("OpenCV " + cv2.__version__)
# Python 3.6.9
# OpenCV 4.1.2
Import the libraries required to display images.
import cv2
import matplotlib.pyplot as plt
%matplotlib inline
import matplotlib
Now let's set up YOLO v3 on Google Colab. We create a working directory and do all the work inside it. This setup is only needed the first time; afterwards, simply work under the existing working directory.
import os
# working_dir is the path of the working directory (define it beforehand)
os.makedirs(working_dir, exist_ok=True)  # does not fail if the directory already exists
os.chdir(working_dir)
Clone darknet.
!git clone https://github.com/pjreddie/darknet
After cloning, move to the darknet directory and execute make.
os.chdir(os.path.join(working_dir, 'darknet'))  # works with or without a trailing slash
!make
After make has finished, download the pretrained model weights.
!wget https://pjreddie.com/media/files/yolov3.weights
This completes the setup of YOLO v3 on Google Colab.
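Since the setup only has to be done once, it can help to check whether it has already been completed before cloning and building again. The following is a minimal sketch, assuming the layout produced by the steps above (the `darknet` executable and `yolov3.weights` both sitting inside the `darknet` directory). The function name `setup_is_done` is hypothetical and not part of darknet.

```python
import os

def setup_is_done(working_dir):
    """Return True if the darknet binary and the weights file already exist."""
    darknet_dir = os.path.join(working_dir, 'darknet')
    binary = os.path.join(darknet_dir, 'darknet')          # produced by make
    weights = os.path.join(darknet_dir, 'yolov3.weights')  # downloaded by wget
    return os.path.isfile(binary) and os.path.isfile(weights)
```

If the function returns True, it should be enough to `os.chdir` into the `darknet` directory and skip the clone, make, and download steps.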
Now, let's run YOLO to detect objects. We use the sample images that are already provided under darknet/data.
!./darknet detect cfg/yolov3.cfg yolov3.weights 'data/dog.jpg'
# layer filters size input output
# 0 conv 32 3 x 3 / 1 608 x 608 x 3 -> 608 x 608 x 32 0.639 BFLOPs
# 1 conv 64 3 x 3 / 2 608 x 608 x 32 -> 304 x 304 x 64 3.407 BFLOPs
# 2 conv 32 1 x 1 / 1 304 x 304 x 64 -> 304 x 304 x 32 0.379 BFLOPs
# 3 conv 64 3 x 3 / 1 304 x 304 x 32 -> 304 x 304 x 64 3.407 BFLOPs
# 4 res 1 304 x 304 x 64 -> 304 x 304 x 64
# 5 conv 128 3 x 3 / 2 304 x 304 x 64 -> 152 x 152 x 128 3.407 BFLOPs
# 6 conv 64 1 x 1 / 1 152 x 152 x 128 -> 152 x 152 x 64 0.379 BFLOPs
# 7 conv 128 3 x 3 / 1 152 x 152 x 64 -> 152 x 152 x 128 3.407 BFLOPs
# 8 res 5 152 x 152 x 128 -> 152 x 152 x 128
# .........
# 97 upsample 2x 38 x 38 x 128 -> 76 x 76 x 128
# 98 route 97 36
# 99 conv 128 1 x 1 / 1 76 x 76 x 384 -> 76 x 76 x 128 0.568 BFLOPs
# 100 conv 256 3 x 3 / 1 76 x 76 x 128 -> 76 x 76 x 256 3.407 BFLOPs
# 101 conv 128 1 x 1 / 1 76 x 76 x 256 -> 76 x 76 x 128 0.379 BFLOPs
# 102 conv 256 3 x 3 / 1 76 x 76 x 128 -> 76 x 76 x 256 3.407 BFLOPs
# 103 conv 128 1 x 1 / 1 76 x 76 x 256 -> 76 x 76 x 128 0.379 BFLOPs
# 104 conv 256 3 x 3 / 1 76 x 76 x 128 -> 76 x 76 x 256 3.407 BFLOPs
# 105 conv 255 1 x 1 / 1 76 x 76 x 256 -> 76 x 76 x 255 0.754 BFLOPs
# 106 yolo
# Loading weights from yolov3.weights...Done!
# data/dog.jpg: Predicted in 22.825540 seconds.
# dog: 100%
# truck: 92%
# bicycle: 99%
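The BFLOPs column in the layer listing can be reproduced by hand. The sketch below is my reading of the log rather than something the article states: it assumes each multiply-add counts as two operations and that the count is taken over the output feature map. The function name `conv_bflops` is hypothetical.

```python
def conv_bflops(out_h, out_w, filters, kernel, in_channels):
    """Approximate billions of operations for one convolutional layer."""
    # Each output value needs kernel * kernel * in_channels multiply-adds,
    # and each multiply-add is counted as two operations.
    ops = 2 * out_h * out_w * filters * kernel * kernel * in_channels
    return ops / 1e9

print(round(conv_bflops(608, 608, 32, 3, 3), 3))    # layer 0   -> 0.639
print(round(conv_bflops(304, 304, 64, 3, 32), 3))   # layer 1   -> 3.407
print(round(conv_bflops(76, 76, 255, 1, 256), 3))   # layer 105 -> 0.754
```

Under these assumptions the values match the three corresponding rows of the log above.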
Object detection is complete. Let's display the images and check the result. The image with the detection results drawn on it is darknet/predictions.jpg.
# cv2.imread returns images in BGR channel order, while matplotlib expects RGB
img_in = cv2.imread('data/dog.jpg')
img_out = cv2.imread('predictions.jpg')
plt.figure(figsize=[20,10])
# [:,:,::-1] reverses the channel axis (BGR -> RGB) before display
plt.subplot(121);plt.imshow(img_in[:,:,::-1]);plt.axis('off')
plt.subplot(122);plt.imshow(img_out[:,:,::-1]);plt.axis('off')
A "dog", a "bicycle", and a "truck" were detected.
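The detected labels can also be read programmatically instead of by eye. The following is a minimal sketch, assuming the text printed by the detect command has been captured into a string and that each detection is printed as `label: NN%` on its own line, as in the log above. The function name `parse_detections` is hypothetical.

```python
import re

# Matches lines such as "dog: 100%" or "traffic light: 87%"
DETECTION_LINE = re.compile(r'^([A-Za-z][A-Za-z ]*): (\d+)%$')

def parse_detections(output_text):
    """Extract (label, confidence_percent) pairs from darknet's text output."""
    detections = []
    for line in output_text.splitlines():
        match = DETECTION_LINE.match(line.strip())
        if match:
            detections.append((match.group(1), int(match.group(2))))
    return detections

sample = """Loading weights from yolov3.weights...Done!
data/dog.jpg: Predicted in 22.825540 seconds.
dog: 100%
truck: 92%
bicycle: 99%"""
print(parse_detections(sample))
```

Lines such as the "Predicted in ... seconds" message are skipped because they do not end in a percentage.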
Let's check other images as well.
!./darknet detect cfg/yolov3.cfg yolov3.weights 'data/horses.jpg'
img_in = cv2.imread('data/horses.jpg')
img_out = cv2.imread('predictions.jpg')
plt.figure(figsize=[20,10])
plt.subplot(121);plt.imshow(img_in[:,:,::-1]);plt.axis('off')
plt.subplot(122);plt.imshow(img_out[:,:,::-1]);plt.axis('off')
!./darknet detect cfg/yolov3.cfg yolov3.weights 'data/person.jpg'
img_in = cv2.imread('data/person.jpg')
img_out = cv2.imread('predictions.jpg')
plt.figure(figsize=[20,10])
plt.subplot(121);plt.imshow(img_in[:,:,::-1]);plt.axis('off')
plt.subplot(122);plt.imshow(img_out[:,:,::-1]);plt.axis('off')
!./darknet detect cfg/yolov3.cfg yolov3.weights 'data/kite.jpg'
img_in = cv2.imread('data/kite.jpg')
img_out = cv2.imread('predictions.jpg')
plt.figure(figsize=[20,10])
plt.subplot(121);plt.imshow(img_in[:,:,::-1]);plt.axis('off')
plt.subplot(122);plt.imshow(img_out[:,:,::-1]);plt.axis('off')
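The three cells above differ only in the image path, so the command can be built by a small helper and run in a loop. This is a sketch rather than the article's method: the helper name `build_detect_command` is hypothetical, and it only assembles the same arguments that are passed to `./darknet detect` above.

```python
def build_detect_command(image_path, cfg='cfg/yolov3.cfg', weights='yolov3.weights'):
    """Build the argument list for one darknet detection run."""
    return ['./darknet', 'detect', cfg, weights, image_path]

sample_images = ['data/horses.jpg', 'data/person.jpg', 'data/kite.jpg']
commands = [build_detect_command(path) for path in sample_images]
for command in commands:
    # Print the command line that would be executed for each image
    print(' '.join(command))
```

Each list could be passed to `subprocess.run(command, check=True)` from inside the darknet directory. Because every run writes its result to predictions.jpg, the result image should be read or copied before the next run starts.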
In this post, I ran YOLO v3 on Google Colaboratory and performed object detection on the sample images provided with darknet. It would be interesting to prepare various other images and run detection on them as well.
- YOLO official page
- [Practice with YOLO v3] Introduction to object detection by deep learning (Udemy)