I had never used the Vision API in my image-processing work. It always looked amazing, but I had never actually done anything with it...
I finally decided to give it at least a quick try, so I wrote some Python! This article is the memo I kept at the time.
By the way, for the registration procedure and so on, I just followed an existing guide.
This time I used the following feature. The explanation below is taken from the official documentation.
**Automatic detection of objects** The Cloud Vision API lets you use object localization to detect and extract multiple objects in an image. Object localization identifies the objects in an image and returns a LocalizedObjectAnnotation for each one. Each LocalizedObjectAnnotation carries information about the object, its position, and the bounding region of the image area where it appears. Object localization detects both prominent and less prominent objects in an image.
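To make that concrete, here is a hand-made sample of what a response containing `localizedObjectAnnotations` looks like and how the relevant fields can be read out. The values are invented for illustration, not taken from a real API call:

```python
# A hand-made sample illustrating the shape of an OBJECT_LOCALIZATION
# response; the name, score, and vertex values here are made up.
sample_response = {
    "responses": [{
        "localizedObjectAnnotations": [{
            "name": "Shoe",
            "score": 0.87,
            "boundingPoly": {
                # Coordinates are normalized to the 0-1 range
                "normalizedVertices": [
                    {"x": 0.1, "y": 0.2},
                    {"x": 0.6, "y": 0.2},
                    {"x": 0.6, "y": 0.9},
                    {"x": 0.1, "y": 0.9},
                ]
            }
        }]
    }]
}

# The first annotation holds the object name and its bounding polygon
annotation = sample_response["responses"][0]["localizedObjectAnnotations"][0]
print(annotation["name"])                                   # Shoe
print(annotation["boundingPoly"]["normalizedVertices"][0])  # {'x': 0.1, 'y': 0.2}
```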
It's rough, but please bear with me... I also wanted the start and end coordinates of the recognized region, so I pull them out with a fairly crude technique. Is this really the right way to check whether a JSON key exists? I'm not so sure.
```python
import base64
import json

import cv2
import requests

ENDPOINT_URL = 'https://vision.googleapis.com/v1/images:annotate'
API_KEY = 'API key'

# JSON keywords
RESPONSES_KEY = 'responses'
LOCALIZED_KEY = 'localizedObjectAnnotations'
BOUNDING_KEY = 'boundingPoly'
NORMALIZED_KEY = 'normalizedVertices'
NAME_KEY = 'name'
X_KEY = 'x'
Y_KEY = 'y'


def get_gcp_info(image):
    image_height, image_width, _ = image.shape
    # Shrink the image to half size before sending it to the API
    # (image_proc.exc_resize is my own resize helper, defined elsewhere)
    min_image = image_proc.exc_resize(int(image_width / 2), int(image_height / 2), image)
    # Encode the image as PNG and then as a base64 string
    _, enc_image = cv2.imencode(".png", min_image)
    image_byte = base64.b64encode(enc_image.tobytes()).decode("utf-8")
    img_requests = [{
        'image': {'content': image_byte},
        'features': [{
            'type': 'OBJECT_LOCALIZATION',
            'maxResults': 5
        }]
    }]
    response = requests.post(ENDPOINT_URL,
                             data=json.dumps({"requests": img_requests}).encode(),
                             params={'key': API_KEY},
                             headers={'Content-Type': 'application/json'})
    result = response.json()
    # If the 'responses' key exists
    if RESPONSES_KEY in result:
        # If the 'localizedObjectAnnotations' key exists
        if LOCALIZED_KEY in result[RESPONSES_KEY][0]:
            annotation = result[RESPONSES_KEY][0][LOCALIZED_KEY][0]
            # If the 'boundingPoly' key exists
            if BOUNDING_KEY in annotation:
                # If the 'normalizedVertices' key exists
                if NORMALIZED_KEY in annotation[BOUNDING_KEY]:
                    name = annotation[NAME_KEY]
                    start_point, end_point = check_recognition_point(
                        annotation[BOUNDING_KEY][NORMALIZED_KEY],
                        image_height,
                        image_width
                    )
                    print(name, start_point, end_point)
                    return True, name, start_point, end_point
    print("non", [0, 0], [0, 0])
    # If the information is incomplete
    return False, "non", [0, 0], [0, 0]


def check_recognition_point(point_list_json, image_height, image_width):
    # X start of the recognized region (ratio of image width)
    x_start_rate = point_list_json[0][X_KEY]
    # Y start of the recognized region (ratio of image height)
    y_start_rate = point_list_json[0][Y_KEY]
    # X end of the recognized region (ratio of image width)
    x_end_rate = point_list_json[2][X_KEY]
    # Y end of the recognized region (ratio of image height)
    y_end_rate = point_list_json[2][Y_KEY]
    # Convert the normalized (0-1) coordinates to pixel coordinates
    x_start_point = int(image_width * x_start_rate)
    y_start_point = int(image_height * y_start_rate)
    x_end_point = int(image_width * x_end_rate)
    y_end_point = int(image_height * y_end_rate)
    return [x_start_point, y_start_point], [x_end_point, y_end_point]
```
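As an aside on the key-checking question above: instead of nesting an `if` per level, the same lookup can be written with `dict.get()` calls that fall back to an empty container at each step. This is only a sketch of an alternative, not what my code does:

```python
def find_first_annotation(result):
    """Return (name, normalized_vertices) from a Vision-style response dict,
    or (None, None) when any key along the path is missing."""
    # Each .get() falls back to a harmless empty value, so a missing key
    # at any level simply flows through instead of raising KeyError
    responses = result.get('responses') or [{}]
    annotations = responses[0].get('localizedObjectAnnotations') or [{}]
    annotation = annotations[0]
    vertices = annotation.get('boundingPoly', {}).get('normalizedVertices')
    if vertices is None:
        return None, None
    return annotation.get('name'), vertices


# Missing keys no longer require nested if-statements:
print(find_first_annotation({}))  # (None, None)
```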
The recognized object's name is returned in name, and the recognized region's coordinates are returned in start_point and end_point.
I tried it on clothes and shoes, and it recognized them properly! (Although the names it returned were pretty rough.) It would also be interesting to build your own model with AutoML.