Trying to classify food photos with the Google Cloud Vision API

This is the Day 12 article of the Fujitsu Systems Web Technology Advent Calendar. (The usual disclaimer) The content of this article is my own opinion and does not represent the organization to which I belong.

Introduction

This article summarizes the minimum steps needed to use Google's image recognition API, the "Cloud Vision API". At the end, I use it to try classifying some food photos.

Digression

My personal hobby is meshitero [^ 1]: I post food photos on my LINE timeline to make people think "I'm hungry". I wondered whether an image recognition API could help me take more appetizing pictures and raise the quality of my meshitero, so this time I started by trying the Google Cloud Vision API.

What I used this time

Preparation for API use

Registration for Google Cloud Platform

First, sign up for Google Cloud Platform, which is required to use the Cloud Vision API.

Click "Start for free" on the following site to begin registration. You can sign up with your Google account.

Cloud Computing Service | Google Cloud

Creating a project

When registration is complete, you will see a screen like the one below, and a default project will already have been created. You can create a new project from the area marked with the red frame. The API can also be used in the default project, but here I created a "Meshitero" project.

コンソールトップ.PNG

Enable Cloud Vision API

If you enter "Vision API" in the search form, the Cloud Vision API appears in the results, so click it. Select "Enable" on the next screen.

コンソールトップ_検索.png

Vision_API_トップ.PNG

Digression: if you select "Try this API", you can try a demo directly on the screen.

When enabled, you will see a screen like this.

vision_API_on_top.PNG

Issuance of API key

This time I will call the API from Python, so create credentials for it. Click "Create Credentials" on the Cloud Vision API screen to move to the credentials page under "APIs and Services". If you select "API Key" from the "Create Credentials" pull-down, an API key is issued.

APIキー.PNG

APIキー作成.PNG

Restrict the key to prevent unauthorized use. Clicking "Restrict Keys" on the screen above takes you to a screen where you can set restrictions. Set them as needed. This time, I restricted usage by IP address and limited the APIs the key can call to the Cloud Vision API.

キーを制限.PNG

This completes the settings for using the API.

API call

Installation of Anaconda

To call the API from Python, I installed Anaconda by referring to the following.

Installation of Anaconda (Windows version)

Creating the source

Write the source code by referring to the API reference.

Creation of request data to be passed to API

The image data apparently needs to be passed as a base64-encoded string. In "features", specify the type of image analysis and the maximum number of results to return. This time I specify label detection (LABEL_DETECTION).

The API expects the request body in JSON format, so the request is serialized to JSON as well.

img_request = []
with open(filename, 'rb') as f:
    # The API expects the image as a base64-encoded string
    ctxt = b64encode(f.read()).decode()
    img_request.append({
            'image': {'content': ctxt},
            'features': [{
                'type': 'LABEL_DETECTION',
                'maxResults': 10
            }]
    })
# Serialize the request body to JSON bytes
request_data = json.dumps({"requests": img_request}).encode()
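
Because "requests" is a list, the same body can in principle carry more than one image. The following is only a minimal sketch of that idea, not part of the original script: the helper name `build_request_body` and its parameters are hypothetical, and it takes raw bytes instead of a file path so it can be tried without a real photo.

```python
import json
from base64 import b64encode


def build_request_body(images, max_results=10):
    """Build the JSON body for images:annotate from a list of raw image bytes.

    `build_request_body` is a hypothetical helper introduced for this sketch.
    """
    requests_list = []
    for raw in images:
        requests_list.append({
            # The API expects the image bytes as a base64-encoded string.
            'image': {'content': b64encode(raw).decode('ascii')},
            'features': [{
                'type': 'LABEL_DETECTION',
                'maxResults': max_results,
            }],
        })
    # Serialize to JSON and encode to bytes, as in the snippet above.
    return json.dumps({'requests': requests_list}).encode('utf-8')


# Example with dummy bytes instead of a real photo.
body = build_request_body([b'fake image bytes'], max_results=5)
print(json.loads(body)['requests'][0]['features'])
```

Keeping the body construction in one function makes it easy to inspect the JSON before anything is sent to the API.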

API call part

Send the API request with the request data created in the previous step and the API key.

API_URL = 'https://vision.googleapis.com/v1/images:annotate'

response = requests.post(API_URL,
                         data=request_data,
                         params={'key': api_key},
                         headers={'Content-Type': 'application/json'})

Result output

Print the data returned from the API.

# Each element of 'responses' corresponds to one image in the request
for resp in response.json()['responses']:
    print(json.dumps(resp, indent=2))
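
If the full JSON is more than you need, the labels can be reduced to description and score pairs. This is a minimal sketch under the assumption that the response has the `responses` / `labelAnnotations` shape shown in the execution results in this article; the function name `summarize_labels` and the `min_score` threshold are hypothetical additions.

```python
def summarize_labels(response_json, min_score=0.0):
    """Return (description, score) pairs from an images:annotate response.

    `summarize_labels` and `min_score` are hypothetical names for this sketch.
    """
    rows = []
    for resp in response_json.get('responses', []):
        for label in resp.get('labelAnnotations', []):
            if label['score'] >= min_score:
                rows.append((label['description'], label['score']))
    return rows


# Trimmed sample using two labels from the oyster result in this article.
sample = {
    'responses': [{
        'labelAnnotations': [
            {'mid': '/m/0_cp5', 'description': 'Oyster', 'score': 0.9910632},
            {'mid': '/m/0ffhy', 'description': 'Clam', 'score': 0.6364975},
        ]
    }]
}

for description, score in summarize_labels(sample, min_score=0.8):
    print(f'{description}: {score:.3f}')
```

In the real script, `response.json()` would be passed in place of `sample`.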

Run

Run the script, passing it the API key issued by GCP and the path to an image. I tried it with several food photos.

$ python Meshitero.py [API key] [Image path]

Whole source (Meshitero.py)

Meshitero.py


from base64 import b64encode
from sys import argv
import json
import requests

API_URL = 'https://vision.googleapis.com/v1/images:annotate'

if __name__ == '__main__':
    api_key = argv[1]
    filename = argv[2]
    
    img_request = []
    with open(filename, 'rb') as f:
        ctxt = b64encode(f.read()).decode()
        img_request.append({
                'image': {'content': ctxt},
                'features': [{
                    'type': 'LABEL_DETECTION',
                    'maxResults': 10
                }]
        })

    request_data = json.dumps({"requests": img_request }).encode()
    
    response = requests.post(API_URL,
                            data=request_data,
                            params={'key': api_key},
                            headers={'Content-Type': 'application/json'})

    if response.status_code != 200 or response.json().get('error'):
        print(response.text)
    else:
        # Each element of 'responses' corresponds to one image in the request
        for resp in response.json()['responses']:
            print(json.dumps(resp, indent=2))

DSC_1113.PNG
Execution result (oyster)
 {
    "labelAnnotations": [
      {
        "mid": "/m/0_cp5",
        "description": "Oyster",
        "score": 0.9910632,
        "topicality": 0.9910632
      },
      {
        "mid": "/m/02wbm",
        "description": "Food",
        "score": 0.9903261,
        "topicality": 0.9903261
      },
      {
        "mid": "/m/06nwz",
        "description": "Seafood",
        "score": 0.9609892,
        "topicality": 0.9609892
      },
      {
        "mid": "/m/01cqy9",
        "description": "Bivalve",
        "score": 0.9138548,
        "topicality": 0.9138548
      },
      {
        "mid": "/m/02q08p0",
        "description": "Dish",
        "score": 0.8472096,
        "topicality": 0.8472096
      },
      {
        "mid": "/m/01ykh",
        "description": "Cuisine",
        "score": 0.811229,
        "topicality": 0.811229
      },
      {
        "mid": "/m/07xgrh",
        "description": "Ingredient",
        "score": 0.8011539,
        "topicality": 0.8011539
      },
      {
        "mid": "/m/088kg2",
        "description": "Oysters rockefeller",
        "score": 0.70525026,
        "topicality": 0.70525026
      },
      {
        "mid": "/m/0fbdv",
        "description": "Shellfish",
        "score": 0.6510715,
        "topicality": 0.6510715
      },
      {
        "mid": "/m/0ffhy",
        "description": "Clam",
        "score": 0.6364975,
        "topicality": 0.6364975
      }
    ]
  }
DSC_1052.PNG
Execution result (sushi)
  {
    "labelAnnotations": [
      {
        "mid": "/m/02q08p0",
        "description": "Dish",
        "score": 0.9934035,
        "topicality": 0.9934035
      },
      {
        "mid": "/m/01ykh",
        "description": "Cuisine",
        "score": 0.9864208,
        "topicality": 0.9864208
      },
      {
        "mid": "/m/02wbm",
        "description": "Food",
        "score": 0.97343695,
        "topicality": 0.97343695
      },
      {
        "mid": "/m/048wsd",
        "description": "Gimbap",
        "score": 0.96859926,
        "topicality": 0.96859926
      },
      {
        "mid": "/m/07030",
        "description": "Sushi",
        "score": 0.9650486,
        "topicality": 0.9650486
      },
      {
        "mid": "/m/0cjyd",
        "description": "Sashimi",
        "score": 0.9185767,
        "topicality": 0.9185767
      },
      {
        "mid": "/m/04q6ng",
        "description": "Comfort food",
        "score": 0.8544887,
        "topicality": 0.8544887
      },
      {
        "mid": "/m/07xgrh",
        "description": "Ingredient",
        "score": 0.8450334,
        "topicality": 0.8450334
      },
      {
        "mid": "/m/05jrv",
        "description": "Nori",
        "score": 0.8431285,
        "topicality": 0.8431285
      },
      {
        "mid": "/m/027lnr6",
        "description": "Sakana",
        "score": 0.8388547,
        "topicality": 0.8388547
      }
    ]
  }
DSC_0883.PNG
Execution result (hamburger)
 {
    "labelAnnotations": [
      {
        "mid": "/m/02q08p0",
        "description": "Dish",
        "score": 0.9934035,
        "topicality": 0.9934035
      },
      {
        "mid": "/m/02wbm",
        "description": "Food",
        "score": 0.9903261,
        "topicality": 0.9903261
      },
      {
        "mid": "/m/01ykh",
        "description": "Cuisine",
        "score": 0.9864208,
        "topicality": 0.9864208
      },
      {
        "mid": "/m/0h55b",
        "description": "Junk food",
        "score": 0.9851551,
        "topicality": 0.9851551
      },
      {
        "mid": "/m/01_bhs",
        "description": "Fast food",
        "score": 0.97022384,
        "topicality": 0.97022384
      },
      {
        "mid": "/m/0cdn1",
        "description": "Hamburger",
        "score": 0.9571771,
        "topicality": 0.9571771
      },
      {
        "mid": "/m/0cc7bks",
        "description": "Buffalo burger",
        "score": 0.94575346,
        "topicality": 0.94575346
      },
      {
        "mid": "/m/03f476",
        "description": "Veggie burger",
        "score": 0.9283731,
        "topicality": 0.9283731
      },
      {
        "mid": "/m/0bp3f6m",
        "description": "Fried food",
        "score": 0.9257971,
        "topicality": 0.9257971
      },
      {
        "mid": "/m/02y6n",
        "description": "French fries",
        "score": 0.92217153,
        "topicality": 0.92217153
      }
    ]
  }
DSC_1050.PNG
Execution result (fried shrimp)
  {
    "labelAnnotations": [
      {
        "mid": "/m/02q08p0",
        "description": "Dish",
        "score": 0.9934035,
        "topicality": 0.9934035
      },
      {
        "mid": "/m/02wbm",
        "description": "Food",
        "score": 0.9903261,
        "topicality": 0.9903261
      },
      {
        "mid": "/m/01ykh",
        "description": "Cuisine",
        "score": 0.9864208,
        "topicality": 0.9864208
      },
      {
        "mid": "/m/0g9vs81",
        "description": "Steamed rice",
        "score": 0.9271187,
        "topicality": 0.9271187
      },
      {
        "mid": "/m/07xgrh",
        "description": "Ingredient",
        "score": 0.9207317,
        "topicality": 0.9207317
      },
      {
        "mid": "/m/0bp3f6m",
        "description": "Fried food",
        "score": 0.9098738,
        "topicality": 0.9098738
      },
      {
        "mid": "/m/0dxjn",
        "description": "Deep frying",
        "score": 0.9049985,
        "topicality": 0.9049985
      },
      {
        "mid": "/m/0f99t",
        "description": "Tonkatsu",
        "score": 0.901048,
        "topicality": 0.901048
      },
      {
        "mid": "/m/0krfg",
        "description": "Meal",
        "score": 0.81980187,
        "topicality": 0.81980187
      },
      {
        "mid": "/m/04q6ng",
        "description": "Comfort food",
        "score": 0.8160322,
        "topicality": 0.8160322
      }
    ]
  }

Results of judging food photos with Cloud Vision API

For the oysters, sushi, and hamburger, the API recognized not only that the photo shows food but also what kind of food it is. For the fried shrimp, it could tell that the dish is fried food, but apparently not that it is specifically "fried shrimp". I also tried photos not shown in this article, and in most cases the API seemed able to identify the type of food. Even when a broad genre such as fried food is easy to recognize, identifying the specific dish seems difficult for photos like the fried shrimp, where the distinguishing element (the shrimp) is hard to see in the image.
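
One rough way to turn the label list into a single "what dish is this" answer is to skip the generic labels and take the highest-scoring remaining one. The sketch below is only an illustration of that idea: the `GENERIC_LABELS` set is hand-picked for this example and the function name `most_specific_label` is hypothetical. The sample data are trimmed from the oyster and fried shrimp results above.

```python
# Assumption: labels treated as too generic to name a dish (hand-picked for this sketch).
GENERIC_LABELS = {'Food', 'Dish', 'Cuisine', 'Ingredient', 'Meal', 'Comfort food'}


def most_specific_label(label_annotations, generic=GENERIC_LABELS):
    """Return the highest-scoring label description that is not in `generic`."""
    candidates = [label for label in label_annotations
                  if label['description'] not in generic]
    if not candidates:
        return None
    return max(candidates, key=lambda label: label['score'])['description']


# Trimmed from the oyster result in this article.
oyster = [
    {'description': 'Oyster', 'score': 0.9910632},
    {'description': 'Food', 'score': 0.9903261},
    {'description': 'Seafood', 'score': 0.9609892},
]

# Trimmed from the fried shrimp result in this article.
fried_shrimp = [
    {'description': 'Dish', 'score': 0.9934035},
    {'description': 'Food', 'score': 0.9903261},
    {'description': 'Steamed rice', 'score': 0.9271187},
    {'description': 'Fried food', 'score': 0.9098738},
    {'description': 'Tonkatsu', 'score': 0.901048},
]

print(most_specific_label(oyster))
print(most_specific_label(fried_shrimp))
```

With these samples the oyster photo yields a dish name, while the fried shrimp photo yields a side item rather than the main dish, which matches the impression that such photos are hard to pin down from labels alone.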

Finally

I tried label detection this time, but the Cloud Vision API can apparently also detect the colors of an image. If I can grasp the color tendencies of food photos, I may be able to understand which colors make food look delicious. Image recognition APIs are also offered by providers other than Google, so I would like to try those as well.
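
As a starting point for the color idea, the request can ask for the IMAGE_PROPERTIES feature instead of LABEL_DETECTION. The following is an untested sketch, not something I ran for this article: the helper names are hypothetical, the response field names (`imagePropertiesAnnotation`, `dominantColors`, `pixelFraction`) are written from my understanding of the API reference, and the sample values are made up for illustration.

```python
def build_color_request(b64_image):
    """Build one request entry asking for image properties (colors)."""
    return {
        'image': {'content': b64_image},
        'features': [{'type': 'IMAGE_PROPERTIES'}],
    }


def dominant_colors(resp):
    """Return (hex color, pixel fraction) pairs, largest fraction first."""
    colors = (resp.get('imagePropertiesAnnotation', {})
                  .get('dominantColors', {})
                  .get('colors', []))
    result = []
    for entry in colors:
        c = entry.get('color', {})
        # A channel may be missing from the JSON when its value is 0.
        rgb = (int(c.get('red', 0)), int(c.get('green', 0)), int(c.get('blue', 0)))
        result.append(('#%02x%02x%02x' % rgb, entry.get('pixelFraction', 0.0)))
    return sorted(result, key=lambda pair: pair[1], reverse=True)


# Made-up sample response entry, for illustration only.
sample_resp = {
    'imagePropertiesAnnotation': {'dominantColors': {'colors': [
        {'color': {'red': 200, 'green': 120, 'blue': 40},
         'score': 0.3, 'pixelFraction': 0.1},
        {'color': {'red': 255, 'green': 255},
         'score': 0.2, 'pixelFraction': 0.25},
    ]}}
}

for hex_color, fraction in dominant_colors(sample_resp):
    print(hex_color, fraction)
```

The entry returned by `build_color_request` would go into the same "requests" list used for label detection.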

[^ 1]: The act of making viewers hungry by posting pictures of delicious meals, for example late at night.
