Hello.
I don't want adult content showing up when my crawler collects images and videos. (What, were you hoping it would?) The Google Cloud Vision API, which was a hot topic a while back, apparently has a feature for detecting such harmful images.
Here is a note of what I did to try out harmful image detection with the free trial of the Google Cloud Vision API. The other features of the Cloud Vision API, such as image recognition, text detection, and face detection, can be used in the same way as shown below.
The Google Cloud Vision API is used in the same way as a typical web API, following the steps below.

1. Get an API key
2. Create a request (a JSON file describing the images and the features you want)
3. Send the request
4. Read the response (JSON)

Each of these is briefly summarized below, in order.
Get the API key by referring to the following site.
Summary of how to use Cloud Vision API (with sample code)
The points that need particular attention are also covered on that page.
This time I used Python (or rather, Python is the only language I can use). The requests module is used to send the request, so install it with pip if you don't already have it.
```bash
$ pip install requests
```
To create a request, the generatejson.py script published in the official Cloud Vision API tutorial is handy. You give it a text file describing the requests you want to send, and it outputs a JSON file that bundles that information together. The format of the input text file is as follows.
input_file.txt

```text
# <image path> <function number>:<number of results>
filepath_to_image1.jpg 4:10
filepath_to_image2.png 1:10 6:10
```
On each line, separated by single-byte spaces, first write the path of the image, then the number of the function you want to use and the number of results to return, joined by a colon. (Don't actually write the comment line shown in the example above.) The function numbers (as of May 19, 2016) correspond to the table below.
Function name | Description | number |
---|---|---|
FACE_DETECTION | Face detection | 1 |
LANDMARK_DETECTION | Landmark detection | 2 |
LOGO_DETECTION | Logo detection | 3 |
LABEL_DETECTION | Object detection / recognition | 4 |
TEXT_DETECTION | Text detection in images | 5 |
SAFE_SEARCH_DETECTION | Hazardous image detection | 6 |
For example, specifying "4:10" runs object recognition (LABEL_DETECTION) and returns the top 10 estimated labels. Even for functions like harmful image detection, where the number of results is not really meaningful, I still wrote ":X" with some number; the result was the same whether X was 1 or 10.
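Incidentally, if you only need harmful image detection, the JSON that ultimately gets sent to the images:annotate endpoint is simple enough to build by hand. Below is a minimal sketch I wrote myself (the file names are just placeholders, not something generatejson.py produces):

```python
import base64
import json

# Read the image and base64-encode it, since the Vision API expects
# inline image content as base64 text.
with open('filepath_to_image1.jpg', 'rb') as f:
    content = base64.b64encode(f.read()).decode('utf-8')

# 'SAFE_SEARCH_DETECTION' corresponds to function number 6 in the table above.
# maxResults is not really meaningful for safe search (see above), so 1 is fine.
request = {
    'requests': [
        {
            'image': {'content': content},
            'features': [{'type': 'SAFE_SEARCH_DETECTION', 'maxResults': 1}],
        }
    ]
}

with open('vision.json', 'w') as f:
    json.dump(request, f)
```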
Create a json file using generatejson.py as follows.
```bash
python generatejson.py -i <inputfile> -o <outputfile>
# ex.) python generatejson.py -i input_file.txt -o vision.json
```
Specify the text file created earlier after the -i option and the name of the JSON file to create after the -o option. Once you have the JSON file, send it. Sending looks like this.
```python
$ python
...
>>> import requests
>>> data = open('/path/to/json', 'rb').read()
>>> response = requests.post(url='https://vision.googleapis.com/v1/images:annotate?key=<API-key>',
...                          data=data,
...                          headers={'Content-Type': 'application/json'})
>>> print(response.text)
```

Below is an example of the response:

```json
{
  "responses": [
    {
      "safeSearchAnnotation": {
        "adult": "VERY_UNLIKELY",
        "spoof": "VERY_UNLIKELY",
        "medical": "VERY_UNLIKELY",
        "violence": "VERY_UNLIKELY"
      }
    }
  ]
}
```
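If the key is wrong or the API is not enabled for the project, the response will not contain safeSearchAnnotation. As a rough sanity check (my own addition, reusing the response variable from the snippet above):

```python
# `response` is the requests.Response object from the snippet above.
if response.status_code != 200:
    # The body usually explains what went wrong (bad key, API not enabled, ...).
    print('Request failed with status', response.status_code)
    print(response.text)
else:
    result = response.json()
    for i, r in enumerate(result.get('responses', [])):
        # Individual images can also fail; such entries carry an "error" object
        # instead of an annotation.
        if 'error' in r:
            print('image', i, 'failed:', r['error'])
```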
In the "
Here, only the case of harmful image detection is summarized. I think it is easy to get the contents of the response with the json module as shown below.
```python
$ python
...
>>> # assume the variable `response` still holds the response from above
>>> import json
>>> response_json = json.loads(response.text)
>>> # get the "adult" likelihood of the first image
>>> print(response_json["responses"][0]["safeSearchAnnotation"]["adult"])
VERY_UNLIKELY
```
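Coming back to the crawler use case from the beginning, here is a small sketch of how one might check every crawled image at once. It assumes the responses come back in the same order as the lines of the input text file (the paths are just the placeholders from the earlier example), and reuses the response variable from above:

```python
import json

# Paths in the same order as the lines of the input text file given to generatejson.py.
image_paths = ['filepath_to_image1.jpg', 'filepath_to_image2.png']

result = json.loads(response.text)
for path, r in zip(image_paths, result['responses']):
    annotation = r.get('safeSearchAnnotation', {})
    # Print the "adult" likelihood for each image so harmful ones can be weeded out.
    print(path, annotation.get('adult', 'UNKNOWN'))
```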
There are four viewpoints for judging harmful images: "adult", "medical", "spoof", and "violence". The meaning of each is as shown in the table below.
Perspective | Description |
---|---|
adult | Is it an adult (pornographic) image? |
spoof | Is it some kind of doctored / spoofed image? |
medical | Is it a medical image (internal organs, etc.)? |
violence | Is it a violent or gory image? |
Each likelihood is returned in five levels; from strongest to weakest they are "VERY_LIKELY", "LIKELY", "POSSIBLE", "UNLIKELY", and "VERY_UNLIKELY". There also seems to be a value called "UNKNOWN"; perhaps that is the label used when the judgement could not be made properly? (That value never came back during this trial.)
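Since these levels are just strings, comparing them against a threshold is easiest with a small mapping. A sketch of my own (the numeric values and the decision to put "UNKNOWN" at the bottom are arbitrary choices, not part of the API):

```python
# Ordinal values for the likelihood strings, weakest to strongest (my own mapping).
LIKELIHOOD_ORDER = {
    'UNKNOWN': 0,
    'VERY_UNLIKELY': 1,
    'UNLIKELY': 2,
    'POSSIBLE': 3,
    'LIKELY': 4,
    'VERY_LIKELY': 5,
}

def is_harmful(annotation, threshold='POSSIBLE'):
    """Return True if any of the four categories reaches the threshold."""
    limit = LIKELIHOOD_ORDER[threshold]
    return any(
        LIKELIHOOD_ORDER.get(annotation.get(key, 'UNKNOWN'), 0) >= limit
        for key in ('adult', 'spoof', 'medical', 'violence')
    )

# Example with the response shown earlier (all VERY_UNLIKELY) -> False
print(is_harmful({'adult': 'VERY_UNLIKELY', 'spoof': 'VERY_UNLIKELY',
                  'medical': 'VERY_UNLIKELY', 'violence': 'VERY_UNLIKELY'}))
```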
For reading the results of the other functions, this page is also helpful: "Summary of how to use Cloud Vision API (with sample code)".
If I pasted the actual results here this post would surely get deleted, so I will just give my impression: the accuracy felt quite high.
If you are interested, please try it on your own treasured collection.
That is a quick summary of the steps for trying out harmful image detection with the free trial of the Google Cloud Vision API.
I think this service is a great way to get a taste of the latest technology for people who do not know much about machine learning, or who do but lack resources such as training data and machines. If you are interested, please give the free trial a go.
It's a lot of fun!
References:
- Google Cloud Vision API
- Cloud Vision API Requests and Responses
- Summary of how to use Cloud Vision API (with sample code)
- Machine learning for harmful images? Try using the Cloud Vision API