I want to detect images of cats from Instagram

1. Introduction

This is ATcat, an Aidemy trainee. Do you use Instagram? I often use it to look at pictures of cats, but when I search for them, images of other things get mixed into the results, as shown in the figure below. I cannot change the system on Instagram's side, and I do not have time to build a dedicated application, so I made a system that takes images labeled "cat" on Instagram and extracts only those that actually show a cat.

(Figure: notcat.png)

2. Object detection

This time I used object detection to extract the cat images, so I will briefly explain the technique. Image recognition identifies only what appears in an image, based on features of the whole image. Object detection goes further and identifies both what appears and where it is: for each object in the image, it specifies the class of the object and its position, which is represented by a rectangle called a bounding box. There is also a technique called semantic segmentation, which is more complex because it classifies every pixel.

I used Google's pre-trained models this time because building and training a model from scratch takes a huge amount of time: you have to prepare a dataset, spend time on training, decide on an appropriate number of classes, and so on. Using pre-trained models is also very common in industry.
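As a minimal sketch of what a single detection looks like in code, assume a detector returns a class name, a confidence score, and a bounding box in normalized [ymin, xmin, ymax, xmax] coordinates (the layout used by the code later in this article). The helper name `to_pixel_box` and the sample values are hypothetical and only for illustration.

```python
def to_pixel_box(box, img_width, img_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel (x1, y1, x2, y2)."""
    ymin, xmin, ymax, xmax = box
    return (int(xmin * img_width), int(ymin * img_height),
            int(xmax * img_width), int(ymax * img_height))

# Hypothetical detection for a 640x480 image (made-up values)
detection = {"class": "Cat", "score": 0.92, "box": [0.25, 0.5, 0.75, 1.0]}
pixel_box = to_pixel_box(detection["box"], img_width=640, img_height=480)
print(detection["class"], detection["score"], pixel_box)
```

Image recognition would return only the class and score; the box is what object detection adds.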

3. Advance preparation

First, to collect cat images from Instagram, I decided to gather images posted with cat-related hashtags such as #cat. For this I used a command-line tool called Instagram Scraper.

pip install instagram-scraper

Install it with pip as shown above. Instagram Scraper can retrieve the posts of specific users, as well as images and videos posted with specified hashtags. This time, I ran it as follows.

insta.sh


#!/bin/sh
instagram_login_user='' # Your username
instagram_login_pass='' # Your password

target_tag='cat' # Tag to be scraped

# Comments cannot follow a line-continuation backslash, so the options are explained here:
#   --media-types  data type to get
#   --maximum      maximum number of items to retrieve
#   --latest       only get media newer than the last scrape
instagram-scraper \
 --login_user "$instagram_login_user" \
 --login_pass "$instagram_login_pass" \
 --tag "$target_tag" \
 --media-types image \
 --maximum 100 \
 --latest

I set the number of images to retrieve to 200.

I was able to get images like these.

(Figure: mosaiccat.png)
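Before running detection, it can help to confirm how many images were actually downloaded. The sketch below assumes the images sit in a single folder; the folder name `cat` and the helper `list_images` are assumptions for illustration, not something the scraper guarantees.

```python
import glob
import os

def list_images(folder, extensions=(".jpg", ".jpeg", ".png")):
    """Return the sorted file names in `folder` whose extension looks like an image."""
    names = []
    for path in glob.glob(os.path.join(folder, "*")):
        if os.path.isfile(path) and path.lower().endswith(extensions):
            names.append(os.path.basename(path))
    return sorted(names)

# "cat" is an assumed download folder; adjust it to your environment.
files = list_images("cat")
print(f"{len(files)} images found")
```

If the folder does not exist, the list is simply empty, so the count also works as a quick sanity check of the path.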

4. Implementation

Next, object detection is used to decide whether each acquired image shows a cat. Here I ran Google's pre-trained Faster R-CNN and SSD models on Google Colaboratory through TensorFlow Hub.

I implemented this with reference to the following site: https://qiita.com/code0327/items/3b23fd5002b373dc8ae8

The flow is to load a pre-trained model through TensorFlow Hub and run object detection on the cat images acquired from Instagram. An image showing the detection result is then output only when a cat is detected.

First, import the libraries and select the pre-trained model.


# For running inference on the TF-Hub module.
import tensorflow as tf
import tensorflow_hub as hub
import os 
import glob
import time
import numpy as np
import matplotlib.patheffects as pe 
import matplotlib.pyplot as plt
import tempfile
from six.moves.urllib.request import urlopen
from six import BytesIO
from PIL import Image
from PIL import ImageColor
from PIL import ImageDraw
from PIL import ImageFont
from PIL import ImageOps

# Select SSD or Faster R-CNN
#module_handle = 'https://tfhub.dev/google/openimages_v4/ssd/mobilenet_v2/1' 
module_handle = 'https://tfhub.dev/google/faster_rcnn/openimages_v4/inception_resnet_v2/1'

detector = hub.load(module_handle).signatures['default']

The function that draws the detection results on an image is as follows.


def showImage(img, r, imgfile, min_score=0.1):
  fig = plt.figure(dpi=150,figsize=(8,8))
  ax = plt.gca()
  ax.tick_params(axis='both', which='both', left=False, 
                 labelleft=False, bottom=False, labelbottom=False)
  ax.imshow(img)

  decode = np.frompyfunc( lambda p : p.decode("ascii"), 1, 1)

  boxes =       r['detection_boxes']
  scores =      r['detection_scores']
  class_names = decode( r['detection_class_entities'] )

  n = np.count_nonzero(scores >= min_score)

  # Prepare a color for each entry in class_names
  class_set = np.unique(class_names[:n])
  colors = dict()
  cmap = plt.get_cmap('tab10')
  for i, v in enumerate(class_set):
    colors[v] =cmap(i)

  # Draw rectangles, starting from the one with the lowest score
  img_w = img.shape[1]
  img_h = img.shape[0]
  for i in reversed(range(n)):
    text = f'{class_names[i]} {100*scores[i]:.0f}%'
    color = colors[class_names[i]]
    y1, x1, y2, x2 = tuple(boxes[i])
    y1, y2 = y1*img_h, y2*img_h
    x1, x2 = x1*img_w, x2*img_w

    # Frame
    rect = plt.Rectangle(xy=(x1, y1), width=(x2-x1), height=(y2-y1),
                         fill=False, edgecolor=color, joinstyle='round',
                         clip_on=False, zorder=8+(n-i) )
    ax.add_patch( rect )

    # Tag: text
    t = ax.text(x1+img_w/200, y1-img_h/300, text, va='bottom', fontsize=6, color=color,zorder=8+(n-i))
    t.set_path_effects([pe.Stroke(linewidth=1.5,foreground='white'), pe.Normal()])
    fig.canvas.draw()
    renderer = fig.canvas.get_renderer()  # separate name so the argument r is not overwritten
    coords = ax.transData.inverted().transform(t.get_window_extent(renderer=renderer))
    tag_w = abs(coords[0,0]-coords[1,0])+img_w/100
    tag_h = abs(coords[0,1]-coords[1,1])+img_h/120

    # Tag: background
    bg = plt.Rectangle(xy=(x1, y1-tag_h), width=tag_w, height=tag_h,
                       edgecolor=color, facecolor=color,
                       joinstyle='round', clip_on=False, zorder=8+(n-i))
    ax.add_patch( bg )
  # Save (create the output folder first if it does not exist)
  os.makedirs('/content/save', exist_ok=True)
  plt.savefig('/content/save/'+imgfile)
  plt.close()

Detections with a confidence score of min_score or higher are localized by enclosing them in a rectangle.
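The thresholding step can also be sketched on its own. The example below uses made-up lists shaped like the detector output (scores, class names, boxes) and tests each score individually, so it does not rely on the scores being sorted; `filter_detections` is a hypothetical helper name, not part of the code above.

```python
def filter_detections(scores, class_names, boxes, min_score=0.1):
    """Return (class_name, score, box) tuples whose score is at least min_score."""
    return [(name, score, box)
            for score, name, box in zip(scores, class_names, boxes)
            if score >= min_score]

# Made-up detector output for illustration
scores = [0.90, 0.05, 0.40]
class_names = ["Cat", "Chair", "Dog"]
boxes = [[0.1, 0.1, 0.5, 0.5],
         [0.0, 0.0, 0.1, 0.1],
         [0.2, 0.3, 0.6, 0.9]]

kept = filter_detections(scores, class_names, boxes, min_score=0.1)
for name, score, box in kept:
    print(f"{name} {100 * score:.0f}% {box}")
```

Raising min_score trades missed cats for fewer false detections, so it is worth trying a few values on your own images.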

Finally, define the function that runs the detection.


import time
import numpy as np
import PIL.Image as Image

def run_detector(detector, path,img_file):
  # Load an image and convert it to a format that can be input to the detector
  img = Image.open(path+img_file) # Pillow(PIL)
  if img.mode != 'RGB': # e.g. RGBA or grayscale images
    img = img.convert('RGB')
  converted_img = img.copy()
  converted_img = converted_img.resize((227,227),Image.LANCZOS) # Reduce to input size
  converted_img = np.array(converted_img, dtype=np.float32)     # Convert to np.array
  converted_img = converted_img / 255. # Normalize to 0.0 - 1.0
  converted_img = converted_img.reshape([1,227,227,3])
  converted_img = tf.constant(converted_img)

  t1 = time.time()
  result = detector(converted_img) #General object detection (main body)
  t2 = time.time()
  print(f'Detection time: {t2-t1:.3f} seconds' )

  #Preparing to output the result as text
  r = {key:value.numpy() for key,value in result.items()}
  boxes =       r['detection_boxes']
  scores =      r['detection_scores']
  decode = np.frompyfunc( lambda p : p.decode('ascii'), 1, 1)
  class_names = decode( r['detection_class_entities'] )

  # Print the n results whose score is 0.25 or higher
  print('Detected objects')
  n = np.count_nonzero(scores >= 0.25 )
  cat_found = False
  for i in range(n):
    y1, x1, y2, x2 = tuple(boxes[i])
    x1, x2 = int(x1*img.width), int(x2*img.width)
    y1, y2 = int(y1*img.height),int(y2*img.height)
    t = f'{class_names[i]:10} {100*scores[i]:3.0f}%  '
    t += f'({x1:>4},{y1:>4}) - ({x2:>4},{y2:>4})'
    print(t)
    # Compare the class name exactly; a substring test would also match
    # any other class name that merely contains "Cat"
    if class_names[i] == 'Cat':
      cat_found = True
  # Output the image once when a cat is detected
  if cat_found:
    showImage(np.array(img), r, img_file,min_score=0.25) # Overlay the detection result on the image
  return t2-t1

This time I want to output an image only when a cat is detected, so the image is output when the "Cat" class is detected.
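The article does not show how run_detector is called for every image, so here is one possible driver loop, offered as a sketch. It takes the detection function as an argument (`detect_fn`, a hypothetical parameter) so that it can be tried with a dummy function; in the notebook you would pass something like `lambda name: run_detector(detector, path, name)`.

```python
def average_detection_time(image_files, detect_fn):
    """Call detect_fn(file_name) for each file and return the mean of the returned times."""
    times = [detect_fn(name) for name in image_files]
    return sum(times) / len(times) if times else 0.0

# Dummy stand-in for run_detector that pretends each detection took 0.5 seconds
def fake_detect(file_name):
    return 0.5

mean_time = average_detection_time(["a.jpg", "b.jpg"], fake_detect)
print(f"Average detection time: {mean_time:.2f} seconds")
```

Because run_detector returns the elapsed time, the same loop yields the average detection time per model.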

5. Result

Faster R-CNN detected cats in 73 of the 100 images and output them. Here is an example that was detected by both models.

<img width="340" alt="SSD result" src="https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/689144/66be9ed0-b85c-9f6f-64d2-d384179cf23f.jpeg" title="SSD result"><img width="340" alt="Faster R-CNN result" src="https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/689144/3b5b960a-26e5-90aa-3187-ac236aefcb60.jpeg" title="Faster R-CNN result">

In this figure, the left side is the SSD result and the right side is the Faster R-CNN result. The average detection time was 0.23 seconds for SSD and 1.30 seconds for Faster R-CNN. SSD detected cats in 74 images. Although the two counts are close, surprisingly many images were detected by only one of the two models, which shows that each detection method has images it handles well and images it handles poorly. Both results contained almost no images other than cats, so both can be said to have succeeded in picking up only images of cats.

The following image is an example that was output even though it is not a cat.

(Image: example of a non-cat that was detected)

When I looked through the list I thought it was a cat, but when I looked closely, it was a dog. Among the images detected as cats, there was also an unusual one in which a cat in a drawing was detected.

(Image: example of a drawn cat that was detected)

I thought it was quite interesting that even a drawn cat can be detected. Distinguishing a drawn cat from a real cat, however, would require training for that distinction, so setting up the classes seems difficult.
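To see which images each model picked up, the saved file names can be compared as sets. This is a sketch with made-up file names; `compare_detections` is a hypothetical helper, and in practice the two lists would come from the output folder of each run.

```python
def compare_detections(ssd_files, frcnn_files):
    """Group file names by which model detected a cat in them."""
    ssd, frcnn = set(ssd_files), set(frcnn_files)
    return {
        "both": sorted(ssd & frcnn),
        "ssd_only": sorted(ssd - frcnn),
        "faster_rcnn_only": sorted(frcnn - ssd),
        "either": sorted(ssd | frcnn),
    }

# Made-up file names for illustration
summary = compare_detections(["a.jpg", "b.jpg"], ["b.jpg", "c.jpg"])
for group, names in summary.items():
    print(group, names)
```

The "either" group is the union of both results, which is one simple way to reduce the images missed by a single model.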

Summary

I was able to detect images of cats while picking up almost no other images. However, each detection method missed some images, so in the future I would like to combine the results of both, try object detection with the now-popular DETR and YOLOv5, and use semantic segmentation to build a system that extracts only the cat region of an image. Thank you for reading to the end!

References

https://qiita.com/code0327/items/3b23fd5002b373dc8ae8
https://github.com/arc298/instagram-scraper
https://githubja.com/rarcega/instagram-scraper
