I got a Raspberry Pi 3B+ and a picamera for university classes. Since I had some free time, I decided to make the Raspberry Pi classify objects with deep learning. Rather than classifying photos taken in advance, it classifies the objects in the real-time picamera image and displays the results in a nice way.
It may only be student-level work, but I hope parts of it are helpful.
I decided to build a function on the Raspberry Pi that **places several personal belongings in the fixed field of view of the picamera, classifies them in real time, and displays the result**.
Specifically, objects are extracted by **background subtraction** (a method that extracts the parts that changed from a background image) and classified by deep learning with **PyTorch** (a framework similar to Keras and TensorFlow).
**(Note: YOLO, SSD, and the like are not covered!)**
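To make the background-subtraction idea concrete, here is a minimal sketch in plain Python on tiny grayscale images stored as nested lists. The function name `subtract_background` and the sample pixel values are made up for illustration; the scripts in this article do the same thing on camera frames with `cv2.absdiff` and `cv2.threshold`.

```python
def subtract_background(background, frame, threshold=20):
    """Return a binary mask: 255 where the frame differs from the background
    by more than `threshold` gray levels, 0 elsewhere."""
    mask = []
    for back_row, frame_row in zip(background, frame):
        mask.append([
            255 if abs(f - b) > threshold else 0
            for b, f in zip(back_row, frame_row)
        ])
    return mask


# Hypothetical 3x4 grayscale images (0-255). An "object" appears in the middle.
background = [
    [100, 100, 100, 100],
    [100, 100, 100, 100],
    [100, 100, 100, 100],
]
frame = [
    [102, 100,  99, 100],   # small changes: sensor noise
    [100, 180, 175, 100],   # large changes: the object
    [100, 100, 101, 100],
]

mask = subtract_background(background, frame)
for row in mask:
    print(row)
```

Only the two middle pixels exceed the threshold, so only they end up white in the mask. The threshold plays the same role as `GRAY_THR` in the scripts: too low and noise is detected, too high and faint objects are missed.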
I implemented it in the following steps.
Because the Raspberry Pi is slow, I **trained on my own PC and ran classification on the Raspberry Pi using the resulting parameter file**. That is why I installed PyTorch on both my PC and the Raspberry Pi.
The whole process is described below. Points where I struggled are marked with **[⚠Note]**.
Prepare the execution environment on the PC and the Raspberry Pi.
The package versions differ between the PC and the Raspberry Pi, but that is not a problem. The PC is only used for training.
**Torchvision** is a library used together with PyTorch for **image preprocessing and dataset creation**.
Raspberry Pi 3 Model B+ (Raspbian Stretch)
I used the **Raspberry Pi Camera Module V2** as the camera. I also installed VNC Viewer on my PC and operated the Raspberry Pi over an **SSH connection**.
Install the above versions of the packages on each computer. I will omit the details, but I referred to the linked sites.
PyTorch / Torchvision
On the **PC**, install by choosing your environment on the official PyTorch site.
**[⚠Note]** CUDA builds only work with NVIDIA GPUs, so if your machine has, say, Intel graphics, select **CUDA → None** (the CPU is used instead).
On the **Raspberry Pi**, I built from source, following "PyTorch v1.3.0 in Raspberry Pi 3" and "PyTorch Deep Learning Framework in Raspberry Pi". Thank you for these references.
**[⚠Note]** Specify the version when cloning, e.g. git clone ~~~ -b v1.3.0.
**[⚠Note]** With PyTorch 1.4.0, the build stopped at about 80% with a fatal error saying that immintrin.h does not exist. A mystery. (2020/3/20) (immintrin.h is an x86 SIMD intrinsics header, so it would not be expected on the Pi's ARM toolchain.)
OpenCV
Install it on the **Raspberry Pi** by following "Installing OpenCV 3 on Raspberry Pi + Python 3 as easily as possible".
Both take a few hours to build...
After a lot of trial and error, I finally got the Python scripts written.
I created the image data of the personal belongings used for training. This assumes the picamera is connected to the Raspberry Pi and **fixed so that it does not move**.
After rotating the screen as needed with the "r" key, press "p" to shoot the background with no objects in view. Then place the belongings you want to photograph and press "p" again: background subtraction is performed and the area inside each **green frame** is saved.
This time I classify three categories, **"phone", "watch", and "wallet"**, so I only photograph those three items.
take_photo.py
# coding: utf-8
import cv2
from datetime import datetime
import picamera
import picamera.array

MIN_LEN = 50      # Minimum side length of an object detection frame (pixels)
GRAY_THR = 20     # Threshold on the gray-level change
CUT_MODE = True   # True: crop and save each detected object, False: save the whole image as is

def imshow_rect(img, contour, minlen=0):
    """
    Draw a rectangular frame around every detected object in the image.
    Arguments:
        img: camera image
        contour: contours
        minlen: size threshold (frames with a side shorter than this are skipped)
    """
    for pt in contour:
        x, y, w, h = cv2.boundingRect(pt)
        # Skip the frame if either side is too short, as the docstring describes
        if w < minlen or h < minlen:
            continue
        cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)
    cv2.imshow('Preview', img)

def save_cutimg(img, contour, minlen=0):
    """
    Crop and save every detected object in the image.
    Arguments:
        Same as imshow_rect.
    Returns:
        Number of saved images, or -1 if nothing was detected.
    """
    # Use the current date and time as the file name
    dt = datetime.now()
    f_name = '{}.jpg'.format(dt.strftime('%y%m%d%H%M%S'))
    imgs_cut = []
    for pt in contour:
        x, y, w, h = cv2.boundingRect(pt)
        if w < minlen or h < minlen:
            continue
        imgs_cut.append(img[y:y+h, x:x+w])
    # Save the cropped objects (with a numeric suffix if there are several)
    if not imgs_cut:
        return -1
    if len(imgs_cut) > 1:
        for i in range(len(imgs_cut)):
            cv2.imwrite(f_name[:-4]+'_'+str(i+1)+f_name[-4:], imgs_cut[i])
    else:
        cv2.imwrite(f_name, imgs_cut[0])
    return len(imgs_cut)

def save_img(img):
    """
    Save the captured image as is.
    Arguments:
        img: camera image
    """
    dt = datetime.now()
    fname = '{}.jpg'.format(dt.strftime('%y%m%d%H%M%S'))
    cv2.imwrite(fname, img)

def take_photo():
    """
    Shoot the background -> shoot the objects and save them
    Key input:
        "p": take a picture
        "q": quit
        "r": rotate the screen (while shooting the background)
        "i": start over from the beginning (while shooting objects)
    """
    cnt = 0
    # Start picamera
    with picamera.PiCamera() as camera:
        camera.resolution = (480, 480)   # Resolution
        camera.rotation = 0              # Camera rotation angle (degrees)
        # Start streaming
        with picamera.array.PiRGBArray(camera) as stream:
            print('Set background ... ', end='', flush=True)
            # First shoot the background
            while True:
                # Capture and display a streaming frame
                camera.capture(stream, 'bgr', use_video_port=True)
                cv2.imshow('Preview', stream.array)
                wkey = cv2.waitKey(5) & 0xFF   # Read key input
                # Reset the stream so the next capture starts from the beginning
                stream.seek(0)
                stream.truncate()
                if wkey == ord('q'):
                    cv2.destroyAllWindows()
                    return print()
                elif wkey == ord('r'):
                    camera.rotation += 90
                elif wkey == ord('p'):
                    camera.exposure_mode = 'off'   # Lock the exposure gains
                    save_img(stream.array)
                    # Convert to grayscale and use as the background image
                    back_gray = cv2.cvtColor(stream.array,
                                             cv2.COLOR_BGR2GRAY)
                    print('done')
                    break
            # With the background set, shoot objects without moving the camera
            print('Take photos!')
            while True:
                camera.capture(stream, 'bgr', use_video_port=True)
                # Convert the current frame to grayscale
                stream_gray = cv2.cvtColor(stream.array,
                                           cv2.COLOR_BGR2GRAY)
                # Take the absolute difference and binarize it to make a mask
                diff = cv2.absdiff(stream_gray, back_gray)
                mask = cv2.threshold(diff, GRAY_THR, 255,
                                     cv2.THRESH_BINARY)[1]
                cv2.imshow('mask', mask)
                # Find the contours of the detected objects
                # ([-2] selects the contours in both OpenCV 3.x and 4.x)
                contour = cv2.findContours(mask,
                                           cv2.RETR_EXTERNAL,
                                           cv2.CHAIN_APPROX_SIMPLE)[-2]
                # Draw a frame around every detected object and display it
                stream_arr = stream.array.copy()
                imshow_rect(stream_arr, contour, MIN_LEN)
                wkey = cv2.waitKey(5) & 0xFF
                stream.seek(0)
                stream.truncate()
                if wkey == ord('q'):
                    cv2.destroyAllWindows()
                    return
                elif wkey == ord('i'):
                    break
                elif wkey == ord('p'):
                    if CUT_MODE:
                        num = save_cutimg(stream.array, contour, MIN_LEN)
                        if num > 0:
                            cnt += num
                            print('  Captured: {} (sum: {})'.format(num, cnt))
                    else:
                        save_img(stream.array)
                        cnt += 1
                        print('  Captured: 1 (sum: {})'.format(cnt))
    # Reached only via "i": the camera is closed here, so start over
    print('Initialized')
    take_photo()


if __name__ == '__main__':
    take_photo()
Then I just keep taking pictures. The region inside each green frame is cropped and saved as a separate image.
**[⚠Note] With too few photos, the network does not learn well.** I took more than 50 photos per class for the training data, but that may still be too few... For now, random perturbations are applied during training to augment the data.
Put the photos in a folder and **move them to your PC** with Slack or similar (semi-manual). Then **sort the photos of each item into the folder structure below**.
image_data
├─train
│  ├─phone
│  │      191227013419.jpg
│  │      191227013424.jpg
│  │      :
│  ├─wallet
│  │      191227013300.jpg
│  │      191227013308.jpg
│  │      :
│  └─watch
│          191227013345.jpg
│          191227013351.jpg
│          :
└─val
    ├─phone
    │      191227013441.jpg
    │      191227013448.jpg
    │      :
    ├─wallet
    │      191227013323.jpg
    │      191227013327.jpg
    │      :
    └─watch
            191227013355.jpg
            191227013400.jpg
            :
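The sorting into train and val folders was done by hand here, but it could be scripted. The following is only a sketch under assumptions: the helper name `split_into_train_val`, the 20% validation ratio, and the idea of keeping one source folder per class are all illustrative, not part of the original workflow.

```python
import random
import shutil
from pathlib import Path


def split_into_train_val(src_dir, dst_dir, class_name, val_ratio=0.2, seed=0):
    """Copy the .jpg files in src_dir into dst_dir/train/<class_name> and
    dst_dir/val/<class_name>. Returns (number of train files, number of val files)."""
    files = sorted(Path(src_dir).glob('*.jpg'))
    random.Random(seed).shuffle(files)      # reproducible shuffle
    num_val = int(len(files) * val_ratio)
    splits = {'val': files[:num_val], 'train': files[num_val:]}
    for split, split_files in splits.items():
        out_dir = Path(dst_dir) / split / class_name
        out_dir.mkdir(parents=True, exist_ok=True)
        for f in split_files:
            shutil.copy2(f, out_dir / f.name)
    return len(splits['train']), len(splits['val'])
```

For example, `split_into_train_val('photos_phone', 'image_data', 'phone')` would fill `image_data/train/phone` and `image_data/val/phone`, where `photos_phone` is a hypothetical folder holding the raw phone photos.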
Build a network and train it with the images above.
When run, the script reads the images from the folders above and starts training, then outputs a progress file, a plot of the loss and accuracy curves, and the final parameter file.
In writing it, I referred to the "PyTorch Neural Network Implementation Handbook" (Shuwa System).
Even if you interrupt with "Ctrl + C", the training progress up to that point is saved as **"train_process.ckpt"**, and training resumes from there the next time you run it. It is fine to change the hyperparameters along the way.
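The interrupt-and-resume flow can be sketched without PyTorch. The stand-in below uses `pickle` in place of `torch.save` / `torch.load` and a fake loss value, so the function `run_training` and its `stop_after` parameter (which simulates pressing Ctrl + C) are purely illustrative.

```python
import os
import pickle


def run_training(ckpt_path, num_epochs, stop_after=None):
    """Toy training loop that resumes from ckpt_path if it exists.
    stop_after simulates pressing Ctrl+C after that many epochs in this run."""
    checkpoint = {'epoch': -1, 'loss_list': []}
    if os.path.isfile(ckpt_path):
        with open(ckpt_path, 'rb') as f:
            checkpoint = pickle.load(f)
    done_this_run = 0
    try:
        for epoch in range(checkpoint['epoch'] + 1, num_epochs):
            if stop_after is not None and done_this_run >= stop_after:
                raise KeyboardInterrupt   # simulated Ctrl+C
            checkpoint['loss_list'].append(1.0 / (epoch + 1))   # fake loss
            checkpoint['epoch'] = epoch
            done_this_run += 1
    except KeyboardInterrupt:
        pass
    finally:
        # Progress is saved whether the loop finished or was interrupted
        with open(ckpt_path, 'wb') as f:
            pickle.dump(checkpoint, f)
    return checkpoint
```

The key point is that the last completed epoch is stored together with the history, so the next run starts its loop at `epoch + 1` instead of 0.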
By the way, torchvision's **ImageFolder** creates a dataset that uses the name of each folder containing photos as the class name. Easy!! The photos in the train folder are used for training, and the photos in the val folder for evaluation.
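ImageFolder assigns class indices to the folder names in sorted order. The sketch below reproduces just that mapping with the standard library so it can be tried without torchvision; `folder_to_class_index` is a hypothetical helper, not a torchvision function.

```python
import os
import tempfile


def folder_to_class_index(root):
    """Map each sub-folder name of root to an index, in sorted order."""
    classes = sorted(e.name for e in os.scandir(root) if e.is_dir())
    return {name: idx for idx, name in enumerate(classes)}


# Build a throwaway folder layout like image_data/train and inspect the mapping
with tempfile.TemporaryDirectory() as root:
    for name in ['watch', 'phone', 'wallet']:
        os.makedirs(os.path.join(root, name))
    class_to_idx = folder_to_class_index(root)

print(class_to_idx)
```

This ordering matters later: the display names used at classification time have to be listed in the same order (phone, wallet, watch) for the labels to match.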
train_net.py
# coding: utf-8
import os
import re
import torch
import torch.nn as nn
import torch.optim as optim
import torch.utils.data
from torchvision import datasets, transforms
import matplotlib.pyplot as plt

DATA_DIR = 'image_data'               # Image folder name
CKPT_PROCESS = 'train_process.ckpt'   # File name for saving training progress
CKPT_NET = 'trained_net.ckpt'         # File name for the trained parameters
NUM_CLASSES = 3                       # Number of classes
NUM_EPOCHS = 100                      # Number of training epochs
# Hyperparameters that are changed often
LEARNING_RATE = 0.001   # Learning rate
MOMENTUM = 0.5          # Momentum
checkpoint = {}   # Variable for saving progress

# Image transform definitions (with data augmentation for training)
# The Resize size determines the input size of the first Linear layer in the classifier
train_transforms = transforms.Compose([
    transforms.Resize((112, 112)),   # Resize
    transforms.RandomRotation(30),   # Rotate randomly
    transforms.Grayscale(),          # Convert to grayscale
    transforms.ToTensor(),           # Convert to a tensor
    transforms.Normalize(mean=[0.5], std=[0.5])   # Normalize (values chosen arbitrarily)
])
val_transforms = transforms.Compose([
    transforms.Resize((112, 112)),
    transforms.Grayscale(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])
# Create the datasets
train_dataset = datasets.ImageFolder(
    root=os.path.join(DATA_DIR, 'train'),
    transform=train_transforms
)
val_dataset = datasets.ImageFolder(
    root=os.path.join(DATA_DIR, 'val'),
    transform=val_transforms
)
# Mini-batch loaders
train_loader = torch.utils.data.DataLoader(
    dataset=train_dataset,
    batch_size=10,   # Batch size during training
    shuffle=True     # Shuffle the training data
)
val_loader = torch.utils.data.DataLoader(
    dataset=val_dataset,
    batch_size=10,
    shuffle=True
)

class NeuralNet(nn.Module):
    """Network definition. Inherits from nn.Module"""
    def __init__(self, num_classes):
        super(NeuralNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(8, 16, kernel_size=5, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(400, 200),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(200, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x

def main():
    """Load saved progress -> train (-> save progress) -> plot the results"""
    global checkpoint
    print('[Settings]')
    # Device settings
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    # Network, loss function, and optimizer settings
    net = NeuralNet(NUM_CLASSES).to(device)
    criterion = nn.CrossEntropyLoss()   # Loss function
    optimizer = optim.SGD(              # Optimization algorithm
        net.parameters(),
        lr=LEARNING_RATE,
        momentum=MOMENTUM,
        weight_decay=5e-4
    )
    # Show the settings
    # print('  Device               :', device)
    # print('  Dataset Class-Index  :', train_dataset.class_to_idx)
    # print('  Network Model        :', re.findall('(.*)\(', str(net))[0])
    # print('  Criterion            :', re.findall('(.*)\(', str(criterion))[0])
    # print('  Optimizer            :', re.findall('(.*)\(', str(optimizer))[0])
    # print('    -Learning Rate     :', LEARNING_RATE)
    # print('    -Momentum          :', MOMENTUM)
    t_loss_list = []
    t_acc_list = []
    v_loss_list = []
    v_acc_list = []
    epoch_pre = -1
    # Load the saved training progress, if any
    if os.path.isfile(CKPT_PROCESS):
        checkpoint = torch.load(CKPT_PROCESS)
        net.load_state_dict(checkpoint['net'])
        optimizer.load_state_dict(checkpoint['optimizer'])
        t_loss_list = checkpoint['t_loss_list']
        t_acc_list = checkpoint['t_acc_list']
        v_loss_list = checkpoint['v_loss_list']
        v_acc_list = checkpoint['v_acc_list']
        epoch_pre = checkpoint['epoch']
        print("Progress until last time = {}/{} epochs"\
              .format(epoch_pre+1, NUM_EPOCHS))
    print('[Main process]')
    for epoch in range(epoch_pre+1, NUM_EPOCHS):
        t_loss, t_acc, v_loss, v_acc = 0, 0, 0, 0
        # Training ------------------------------------------------------
        net.train()   # Training mode
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            outputs = net(images)
            loss = criterion(outputs, labels)
            # criterion returns the batch mean, so weight it by the batch size
            t_loss += loss.item() * images.size(0)
            t_acc += (outputs.max(1)[1] == labels).sum().item()
            loss.backward()
            optimizer.step()
        avg_t_loss = t_loss / len(train_loader.dataset)
        avg_t_acc = t_acc / len(train_loader.dataset)
        # Evaluation ----------------------------------------------------
        net.eval()   # Evaluation mode
        with torch.no_grad():   # Disable gradient computation
            for images, labels in val_loader:
                images, labels = images.to(device), labels.to(device)
                outputs = net(images)
                loss = criterion(outputs, labels)
                v_loss += loss.item() * images.size(0)
                v_acc += (outputs.max(1)[1] == labels).sum().item()
        avg_v_loss = v_loss / len(val_loader.dataset)
        avg_v_acc = v_acc / len(val_loader.dataset)
        # ---------------------------------------------------------------
        print('\rEpoch [{}/{}] | Train [loss:{:.3f}, acc:{:.3f}] | Val [loss:{:.3f}, acc:{:.3f}]'\
              .format(epoch+1, NUM_EPOCHS, avg_t_loss, avg_t_acc, avg_v_loss, avg_v_acc), end='')
        # Record loss and accuracy
        t_loss_list.append(avg_t_loss)
        t_acc_list.append(avg_t_acc)
        v_loss_list.append(avg_v_loss)
        v_acc_list.append(avg_v_acc)
        # Update the progress to be saved
        checkpoint['net'] = net.state_dict()
        checkpoint['optimizer'] = optimizer.state_dict()
        checkpoint['t_loss_list'] = t_loss_list
        checkpoint['t_acc_list'] = t_acc_list
        checkpoint['v_loss_list'] = v_loss_list
        checkpoint['v_acc_list'] = v_acc_list
        checkpoint['epoch'] = epoch
    graph()
    save_process()
    save_net()

def save_process():
    """Save the training progress"""
    global checkpoint
    if not checkpoint:
        return
    torch.save(checkpoint, CKPT_PROCESS)


def save_net():
    """Save only the network parameters"""
    global checkpoint
    if not checkpoint:
        return
    torch.save(checkpoint['net'], CKPT_NET)

def graph():
    """Plot loss and accuracy"""
    global checkpoint
    if not checkpoint:
        return
    t_loss_list = checkpoint['t_loss_list']
    t_acc_list = checkpoint['t_acc_list']
    v_loss_list = checkpoint['v_loss_list']
    v_acc_list = checkpoint['v_acc_list']
    plt.figure(figsize=(10, 4))
    plt.subplot(1, 2, 1)
    plt.plot(range(len(t_loss_list)), t_loss_list,
             color='blue', linestyle='-', label='t_loss')
    plt.plot(range(len(v_loss_list)), v_loss_list,
             color='green', linestyle='--', label='v_loss')
    plt.legend()
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.title('Training and validation loss')
    plt.grid()
    plt.subplot(1, 2, 2)
    plt.plot(range(len(t_acc_list)), t_acc_list,
             color='blue', linestyle='-', label='t_acc')
    plt.plot(range(len(v_acc_list)), v_acc_list,
             color='green', linestyle='--', label='v_acc')
    plt.legend()
    plt.xlabel('epoch')
    plt.ylabel('acc')
    plt.title('Training and validation accuracy')
    plt.grid()
    plt.show()

if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        # On Ctrl+C, plot and save the progress made so far
        print()
        graph()
        save_process()
**[⚠Note] Keep the network modestly sized.**
If you increase the number of layers and nodes too much, you get the error DefaultCPUAllocator: can't allocate memory: you tried to allocate 685198800 bytes.
Presumably, when classifying on the Raspberry Pi later, a huge number of parameters would eat up the memory...
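The size of the network defined above can be checked by hand. The sketch below applies the standard output-size formula for convolution and pooling layers to a 112 x 112 input, which shows where the 400 in `nn.Linear(400, 200)` comes from, and then counts the parameters. The helper names `conv_out` and `feature_size` are made up for this example.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_size(side):
    """Side length after the `features` part of NeuralNet for a square input."""
    side = conv_out(side, kernel=11, stride=4, padding=2)   # Conv2d(1, 8, ...)
    side = conv_out(side, kernel=2, stride=2)               # MaxPool2d
    side = conv_out(side, kernel=5, stride=1, padding=1)    # Conv2d(8, 16, ...)
    side = conv_out(side, kernel=2, stride=2)               # MaxPool2d
    return side


side = feature_size(112)
flat = 16 * side * side          # 16 output channels
print(side, flat)

# Parameter count: weights + biases of each layer
params = (
    (1 * 11 * 11 * 8 + 8)        # first Conv2d
    + (8 * 5 * 5 * 16 + 16)      # second Conv2d
    + (flat * 200 + 200)         # first Linear
    + (200 * 3 + 3)              # second Linear
)
print(params)
```

The feature map ends up 5 x 5 with 16 channels, so the flattened size is 400, and the whole network has 84,995 parameters (roughly 340 KB as 32-bit floats). Almost all of them sit in the first Linear layer, so changing the Resize size or the channel counts changes the memory footprint quickly.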
Here is how training progressed. Loss is on the left and accuracy on the right. The blue line is the training data and the green dashed line is the validation data. The validation accuracy is about 72%. There is room for improvement...
When training finishes, a **"trained_net.ckpt"** file containing only the trained parameters is created, so **send it to the Raspberry Pi** with Slack or similar.
The goal is to classify the objects in the camera image in real time and display them in a nice way.
First the background is captured. Then the background is subtracted from each frame, the objects that emerge are cropped, and they are turned into a 4D tensor batch through the defined preprocessing. The whole batch is passed through the network, the outputs are converted into per-class probabilities, and the class (object name) with the highest probability is overlaid on the window.
The script loads the "trained_net.ckpt" created earlier.
**[⚠Note] Unless you cap the batch size (the number of objects detected at once), the Raspberry Pi may freeze when it tries to process a large number of detected regions at once.**
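The last two steps (turning network outputs into probabilities, picking the most likely class) and the cap on the batch size can be illustrated without PyTorch. This is a plain-Python sketch: the `softmax` and `classify` functions and the sample output values are invented for the example, while `OBJ_NAMES` and `CONTOUR_COUNT_MAX` mirror the constants in the script.

```python
import math

OBJ_NAMES = ['Phone', 'Wallet', 'Watch']   # same order as in the script
CONTOUR_COUNT_MAX = 3                      # upper limit on the batch size


def softmax(logits):
    """Convert raw network outputs into probabilities that sum to 1."""
    peak = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(batch_logits):
    """Return (name, probability) for each row, for at most CONTOUR_COUNT_MAX rows."""
    results = []
    for logits in batch_logits[:CONTOUR_COUNT_MAX]:
        probs = softmax(logits)
        idx = max(range(len(probs)), key=probs.__getitem__)
        results.append((OBJ_NAMES[idx], probs[idx]))
    return results


# Hypothetical raw outputs for four detected regions; the fourth is dropped
outputs = [[2.0, 0.5, 0.1], [0.0, 0.0, 3.0], [0.2, 1.5, 0.3], [9.0, 0.0, 0.0]]
for name, prob in classify(outputs):
    print('{}:{:.1f}%'.format(name, prob * 100))
```

Slicing to `CONTOUR_COUNT_MAX` before any heavy work is what keeps the per-frame cost bounded, at the price of silently ignoring any extra regions.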
realtime_classification.py
# coding: utf-8
import os
from PIL import Image
from time import sleep
import cv2
import picamera
import picamera.array
import torch
# Running "export OMP_NUM_THREADS=1" (or 2 or 3) in the pytorch directory beforehand is required (the default is 4)
# The number of parallel threads can be checked with "print(torch.__config__.parallel_info())"
import torch.nn as nn
import torch.nn.functional as F   # Needed for F.softmax in judge_what
import torch.utils
from torchvision import transforms

CKPT_NET = 'trained_net.ckpt'              # Trained parameter file
OBJ_NAMES = ['Phone', 'Wallet', 'Watch']   # Display name of each class (in ImageFolder's sorted folder order)
MIN_LEN = 50
GRAY_THR = 20
CONTOUR_COUNT_MAX = 3          # Upper limit on the batch size (number of objects detected at once)
SHOW_COLOR = (255, 191, 0)     # Frame color (B, G, R)
NUM_CLASSES = 3
PIXEL_LEN = 112                # Side length after resizing
CHANNELS = 1                   # Number of color channels (BGR: 3, grayscale: 1)

# Image transform definition
# The Resize size determines the input size of the first Linear layer in the classifier
data_transforms = transforms.Compose([
    transforms.Resize((PIXEL_LEN, PIXEL_LEN)),
    transforms.Grayscale(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

class NeuralNet(nn.Module):
    """Network definition. Must be the same as the one used for training"""
    def __init__(self, num_classes):
        super(NeuralNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(8, 16, kernel_size=5, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(400, 200),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(200, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x

def detect_obj(back, target):
    """
    Detect objects with OpenCV background subtraction and return them as a tuple.
    Arguments:
        back: background image (color)
        target: image to subtract the background from (color)
    Returns:
        (tuple of cropped color PIL images, list of bounding rectangles)
    """
    print('Detecting objects ...')
    # Convert to grayscale
    b_gray = cv2.cvtColor(back, cv2.COLOR_BGR2GRAY)
    t_gray = cv2.cvtColor(target, cv2.COLOR_BGR2GRAY)
    # Calculate the difference
    diff = cv2.absdiff(t_gray, b_gray)
    # Threshold the difference into a mask and extract the object contours
    # The contours are at index [0] in OpenCV 4.2.0 and [1] in 3.4.7; [-2] works for both
    mask = cv2.threshold(diff, GRAY_THR, 255, cv2.THRESH_BINARY)[1]
    cv2.imshow('mask', mask)
    contour = cv2.findContours(mask,
                               cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)[-2]
    # Keep the (x, y, w, h) of changed regions above a certain width and height, up to the batch limit
    pt_list = list(filter(
        lambda x: x[2] > MIN_LEN and x[3] > MIN_LEN,
        [cv2.boundingRect(pt) for pt in contour]
    ))[:CONTOUR_COUNT_MAX]
    # Crop the frame at each position and convert to a tuple of PIL images
    # (BGR -> RGB so the channel order matches the JPEGs used for training)
    obj_images = tuple(map(
        lambda x: Image.fromarray(cv2.cvtColor(
            target[x[1]:x[1]+x[3], x[0]:x[0]+x[2]], cv2.COLOR_BGR2RGB)),
        pt_list
    ))
    return (obj_images, pt_list)

def batch_maker(tuple_images, transform):
    """
    Transform a tuple of PIL images into a tensor batch that the network can process.
    Arguments:
        tuple_images: tuple of PIL images
        transform: torchvision image transform definition
    """
    return torch.cat([transform(img) for img
                      in tuple_images]).view(-1, CHANNELS, PIXEL_LEN, PIXEL_LEN)

def judge_what(img, probs_list, pos_list):
    """
    Determine each object from its class probabilities, draw a frame and the name
    at its position, and return the (class index, probability) pairs.
    Arguments:
        img: image to draw on
        probs_list: 2D array of network outputs (batch format)
        pos_list: 2D array of positions (batch format)
    """
    print('Judging objects ...')
    # Convert to a list of (index, highest probability) pairs
    ip_list = list(map(lambda x: max(enumerate(x), key = lambda y:y[1]),
                       F.softmax(probs_list, dim=-1)))  # <- fixed on 4/30
    # Convert each index to an object name and draw the name and confidence at the object's position
    for (idx, prob), pos in zip(ip_list, pos_list):
        cv2.rectangle(img, (pos[0], pos[1]), (pos[0]+pos[2], pos[1]+pos[3]), SHOW_COLOR, 2)
        cv2.putText(img, '%s:%.1f%%'%(OBJ_NAMES[idx], prob*100), (pos[0]+5, pos[1]+20),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, SHOW_COLOR, thickness=2)
    return ip_list

def realtime_classify():
    """Load the trained model -> capture frames -> classify -> overlay the results on the image"""
    # Device settings
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    # Network settings
    net = NeuralNet(NUM_CLASSES).to(device)
    # Load the trained parameters
    if os.path.isfile(CKPT_NET):
        # map_location lets parameters saved on another device load here
        checkpoint = torch.load(CKPT_NET, map_location=device)
        net.load_state_dict(checkpoint)
    else:
        raise FileNotFoundError('No trained network file: {}'.format(CKPT_NET))
    # Evaluation mode
    net.eval()
    # Start picamera
    with picamera.PiCamera() as camera:
        camera.resolution = (480, 480)
        # Start streaming
        with picamera.array.PiRGBArray(camera) as stream:
            print('Setting background ...')
            sleep(2)
            camera.exposure_mode = 'off'   # Lock the exposure gains
            camera.capture(stream, 'bgr', use_video_port=True)
            # Set as the background
            img_back = stream.array
            stream.seek(0)
            stream.truncate()
            print('Start!')
            with torch.no_grad():
                while True:
                    camera.capture(stream, 'bgr', use_video_port=True)
                    # Each new frame is compared against the background
                    img_target = stream.array
                    # Detect objects and their positions
                    obj_imgs, positions = detect_obj(img_back, img_target)
                    if obj_imgs:
                        # Convert the detected objects to the network input format
                        obj_batch = batch_maker(obj_imgs, data_transforms)
                        # Classification
                        outputs = net(obj_batch.to(device))
                        # Judgment
                        result = judge_what(img_target, outputs, positions)
                        print('  Result:', result)
                    # Display
                    cv2.imshow('detection', img_target)
                    if cv2.waitKey(200) == ord('q'):
                        cv2.destroyAllWindows()
                        return
                    stream.seek(0)
                    stream.truncate()

if __name__ == "__main__":
    try:
        realtime_classify()
    except KeyboardInterrupt:
        cv2.destroyAllWindows()
Bring "trained_net.ckpt" to the Raspberry Pi and run the script in the same directory. The name of each detected object and its confidence are displayed.
As for the result... classification is accurate from the moment an object is placed, and I'm satisfied!!
**[⚠Note] It is recommended to change the number of cores used for execution (default 4).**
Running flat out on all 4 cores carries a high risk of freezing.
In the pytorch directory, run `export OMP_NUM_THREADS=2` (to use 2 cores). You can check the number of cores with `print(torch.__config__.parallel_info())`. However, the setting is discarded when the shell is closed, so to make it persistent, add `export OMP_NUM_THREADS=2` below the `... ~ fi` at the bottom of **".profile"** in /home/pi and reboot.
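As an alternative sketch (an assumption, not what was done in this article), the thread count can also be limited from inside a Python script. `OMP_NUM_THREADS` has to be set before PyTorch is first imported, and `torch.set_num_threads` caps PyTorch's own intra-op threads. The `try`/`except` is only there so the snippet also runs on a machine without PyTorch.

```python
import os

# Must be set before the first `import torch` to take effect in this process
os.environ['OMP_NUM_THREADS'] = '2'

try:
    import torch
except ImportError:        # lets the sketch run on a machine without PyTorch
    torch = None

if torch is not None:
    torch.set_num_threads(2)                  # also cap PyTorch's intra-op threads
    print(torch.__config__.parallel_info())   # shows the thread settings in use
```

Setting the variable in the script keeps the limit next to the code that needs it, whereas the `.profile` approach applies to every program started from the shell.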
I was able to do what I set out to do! (Sorry about the poor readability...) If you use OpenCV's face detection, it seems this could readily be applied to very simple face recognition.
Originally I intended to implement SSD, but creating a dataset with location information looked difficult, and I gave up because I could not resolve an error that occurred when I tried to train with sample data...
Unlike SSD, this background-subtraction approach has the drawback that overlapping objects cannot be separated and are judged to be a single object.
It was a good learning experience.