Introduction

In this article, I run YOLO v3 on Google Colaboratory.

For instructions on setting up YOLO v3 on Google Colaboratory, see the "YOLO setup" section.

What is YOLO v3?

**YOLO** is a **real-time object detection system**. It is part of a neural network framework called Darknet. The name YOLO is an acronym for "You Only Look Once". YOLO v3 is version 3 of YOLO, which was the latest version at the time of writing. For details, please refer to the official YOLO page.

YOLO v3 on Google Colab

Environment

The environment used here is Google Colaboratory. The Python and OpenCV versions are as follows.

import platform
import cv2

print("Python " + platform.python_version())
print("OpenCV " + cv2.__version__)
# Python 3.6.9
# OpenCV 4.1.2

Preparation

Import the libraries required to display images.

import cv2
import matplotlib.pyplot as plt
%matplotlib inline

YOLO setup

Now let's set up YOLO v3 on Google Colab. We will create a working directory and work inside it. Note that this setup only needs to be done once; on later runs, simply change into the working directory and continue from there.

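import os

# working_dir is the path of the working directory; define it before running this cell
os.makedirs(working_dir, exist_ok=True)  # does not fail if the directory already exists
os.chdir(working_dir)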

Clone darknet.

!git clone https://github.com/pjreddie/darknet

After cloning, move to the darknet directory and execute make.

os.chdir(os.path.join(working_dir, 'darknet'))  # works with or without a trailing slash in working_dir
!make
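
The plain make above uses the Makefile as cloned. The darknet Makefile begins with build switches such as GPU=0, CUDNN=0, and OPENCV=0, so the default build runs on the CPU. The following is a minimal sketch, assuming a Colab GPU runtime, of changing such switches before running make. The helper name set_makefile_flags is hypothetical, and whether a GPU build actually compiles depends on the CUDA version installed on the runtime.

```python
import re


def set_makefile_flags(makefile_text, switches):
    """Return makefile_text with `NAME=<digits>` switch lines set to new values.

    Only lines that consist of exactly NAME=<digits> are rewritten; all other
    lines are left untouched. A name that is not found raises KeyError so that
    typos are noticed instead of being silently ignored.
    """
    for name, value in switches.items():
        pattern = re.compile(rf"^{re.escape(name)}=\d+[ \t]*$", flags=re.MULTILINE)
        makefile_text, count = pattern.subn(f"{name}={value}", makefile_text)
        if count == 0:
            raise KeyError(f"switch {name} not found in Makefile")
    return makefile_text


# Usage inside the darknet directory, before running make:
# with open("Makefile") as f:
#     text = f.read()
# with open("Makefile", "w") as f:
#     f.write(set_makefile_flags(text, {"GPU": 1}))
```

Keeping the rewrite as a pure text function makes it easy to check on a small sample before touching the real Makefile.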

After make finishes, download the pretrained model weights.

!wget https://pjreddie.com/media/files/yolov3.weights

This completes the setup of YOLO v3 on Google Colab.
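
Since the setup only has to be done once, it can help to check which steps have already been completed before repeating them. The following is a minimal sketch, assuming the layout produced by the steps above (a darknet directory that contains the compiled darknet binary, cfg/yolov3.cfg, and yolov3.weights). The function name missing_setup_files is hypothetical.

```python
import os


def missing_setup_files(darknet_dir):
    """Return the setup artifacts that are not yet present in darknet_dir."""
    expected = [
        "darknet",          # binary produced by make
        "cfg/yolov3.cfg",   # network definition from the cloned repository
        "yolov3.weights",   # pretrained weights fetched with wget
    ]
    return [
        name for name in expected
        if not os.path.isfile(os.path.join(darknet_dir, name))
    ]


# Usage: an empty list means clone, make, and the download can be skipped.
# missing = missing_setup_files(os.path.join(working_dir, "darknet"))
# print("setup complete" if not missing else f"still missing: {missing}")
```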

Let's run YOLO

Now let's run YOLO to detect objects. We use the sample images that come with darknet, which are located under darknet/data.

!./darknet detect cfg/yolov3.cfg yolov3.weights 'data/dog.jpg'

# layer     filters    size              input                output
#     0 conv     32  3 x 3 / 1   608 x 608 x   3   ->   608 x 608 x  32  0.639 BFLOPs
#     1 conv     64  3 x 3 / 2   608 x 608 x  32   ->   304 x 304 x  64  3.407 BFLOPs
#     2 conv     32  1 x 1 / 1   304 x 304 x  64   ->   304 x 304 x  32  0.379 BFLOPs
#     3 conv     64  3 x 3 / 1   304 x 304 x  32   ->   304 x 304 x  64  3.407 BFLOPs
#     4 res    1                 304 x 304 x  64   ->   304 x 304 x  64
#     5 conv    128  3 x 3 / 2   304 x 304 x  64   ->   152 x 152 x 128  3.407 BFLOPs
#     6 conv     64  1 x 1 / 1   152 x 152 x 128   ->   152 x 152 x  64  0.379 BFLOPs
#     7 conv    128  3 x 3 / 1   152 x 152 x  64   ->   152 x 152 x 128  3.407 BFLOPs
#     8 res    5                 152 x 152 x 128   ->   152 x 152 x 128
# .........
#    97 upsample            2x    38 x  38 x 128   ->    76 x  76 x 128
#    98 route  97 36
#    99 conv    128  1 x 1 / 1    76 x  76 x 384   ->    76 x  76 x 128  0.568 BFLOPs
#   100 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
#   101 conv    128  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 128  0.379 BFLOPs
#   102 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
#   103 conv    128  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 128  0.379 BFLOPs
#   104 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
#   105 conv    255  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 255  0.754 BFLOPs
#   106 yolo
# Loading weights from yolov3.weights...Done!
# data/dog.jpg: Predicted in 22.825540 seconds.
# dog: 100%
# truck: 92%
# bicycle: 99%
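
The last lines of this output list each detected class together with its confidence. If the results are needed in Python rather than only in the rendered image, they can be extracted from the captured text. The following is a minimal sketch, assuming the output format shown above (lines of the form "label: NN%"). parse_detections is a hypothetical helper and not part of darknet.

```python
import re

# Matches result lines such as "dog: 100%". The timing line
# "data/dog.jpg: Predicted in ... seconds." does not end in a percentage,
# so it is skipped.
_DETECTION = re.compile(r"^(?P<label>[^:]+):\s*(?P<pct>\d+)%\s*$")


def parse_detections(output_text):
    """Return (label, confidence) pairs, with confidence as a 0-1 fraction."""
    detections = []
    for line in output_text.splitlines():
        match = _DETECTION.match(line.strip())
        if match:
            detections.append((match["label"].strip(), int(match["pct"]) / 100))
    return detections


sample = """Loading weights from yolov3.weights...Done!
data/dog.jpg: Predicted in 22.825540 seconds.
dog: 100%
truck: 92%
bicycle: 99%"""
print(parse_detections(sample))
```

One way to obtain the text in a notebook is subprocess.run with stdout=subprocess.PIPE and universal_newlines=True, reading the stdout attribute of the result.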

Object detection is complete. Let's display the images and check the result. The image with the detection results drawn on it is saved as darknet/predictions.jpg.

img_in = cv2.imread('data/dog.jpg')
img_out = cv2.imread('predictions.jpg')
plt.figure(figsize=[20,10])
# cv2.imread returns BGR; [:,:,::-1] reverses the channel axis to RGB for matplotlib
plt.subplot(121);plt.imshow(img_in[:,:,::-1]);plt.axis('off')
plt.subplot(122);plt.imshow(img_out[:,:,::-1]);plt.axis('off')

A dog, a bicycle, and a truck were detected, matching the labels printed in the output above.

Let's check other images as well.

!./darknet detect cfg/yolov3.cfg yolov3.weights 'data/horses.jpg'

img_in = cv2.imread('data/horses.jpg')
img_out = cv2.imread('predictions.jpg')
plt.figure(figsize=[20,10])
plt.subplot(121);plt.imshow(img_in[:,:,::-1]);plt.axis('off')
plt.subplot(122);plt.imshow(img_out[:,:,::-1]);plt.axis('off')

!./darknet detect cfg/yolov3.cfg yolov3.weights 'data/person.jpg'

img_in = cv2.imread('data/person.jpg')
img_out = cv2.imread('predictions.jpg')
plt.figure(figsize=[20,10])
plt.subplot(121);plt.imshow(img_in[:,:,::-1]);plt.axis('off')
plt.subplot(122);plt.imshow(img_out[:,:,::-1]);plt.axis('off')

!./darknet detect cfg/yolov3.cfg yolov3.weights 'data/kite.jpg'

img_in = cv2.imread('data/kite.jpg')
img_out = cv2.imread('predictions.jpg')
plt.figure(figsize=[20,10])
plt.subplot(121);plt.imshow(img_in[:,:,::-1]);plt.axis('off')
plt.subplot(122);plt.imshow(img_out[:,:,::-1]);plt.axis('off')
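
The three cells above repeat the same two steps for each image. The following is a minimal sketch of folding them into one helper, assuming the notebook's current directory is darknet and that cv2 and matplotlib are available as in the Preparation step. detect_command and detect_and_show are hypothetical names.

```python
import subprocess


def detect_command(image_path, cfg="cfg/yolov3.cfg", weights="yolov3.weights"):
    """Build the same darknet command line used in the cells above."""
    return ["./darknet", "detect", cfg, weights, image_path]


def detect_and_show(image_path):
    """Run detection on one image and show input and result side by side."""
    # Imported here so detect_command can be used without OpenCV installed.
    import cv2
    import matplotlib.pyplot as plt

    subprocess.run(detect_command(image_path), check=True)
    img_in = cv2.imread(image_path)
    img_out = cv2.imread("predictions.jpg")  # darknet overwrites this on each run
    plt.figure(figsize=[20, 10])
    for position, img in ((121, img_in), (122, img_out)):
        plt.subplot(position)
        plt.imshow(img[:, :, ::-1])  # OpenCV loads BGR; matplotlib expects RGB
        plt.axis("off")
    plt.show()


# Usage, from inside the darknet directory:
# for name in ["horses", "person", "kite"]:
#     detect_and_show(f"data/{name}.jpg")
```

Because predictions.jpg is overwritten on every run, the result has to be read back before the next image is processed, which the helper does.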

Summary

In this article, I ran YOLO v3 on Google Colaboratory and performed object detection on the sample images that come with darknet. It would be interesting to try object detection on a variety of your own images as well.

References

- YOLO official page
- [Practice with YOLO v3] Introduction to object detection by deep learning (Udemy)
