I tried to sort out the objects from the image of the steak set meal --③ Similar image Heat map detection

Introduction

Continuing from the previous post, "I tried to sort out the objects from the image of the steak set meal-② Overlap number sorting", this time I compared histograms to detect similar images.

Reference

-Calculate image similarity with Python + OpenCV

Source code

Note that the rectangles compared here are all nearly the same size, so I deliberately did not resize the images.

group_image.py


# -*- coding: utf-8 -*-

import cv2
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import selectivesearch
import os

def main():
    # load the input image (cv2.imread returns it in BGR channel order)
    img = cv2.imread("{Image file}")

    # perform selective search
    img_lbl, regions = selectivesearch.selective_search(
        img,
        scale=500,
        sigma=0.9,
        min_size=10
    )

    candidates = set()

    for r in regions:
        # excluding same rectangle (with different segments)
        if r['rect'] in candidates:
            continue

        # excluding regions smaller than 2000 pixels
        if r['size'] < 2000:
            continue

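        # excluding distorted rects (aspect ratio above 1.2 in either direction)
        x, y, w, h = r['rect']

        if w == 0 or h == 0:
            continue

        # float() avoids integer division under Python 2
        if float(w) / h > 1.2 or float(h) / w > 1.2:
            continue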

        candidates.add(r['rect'])

    # draw rectangles on the original image
    fig, ax = plt.subplots(ncols=1, nrows=1, figsize=(6, 6))
    ax.imshow(img)

    overlaps = {}

    # Count, for each rectangle, how many candidate rectangles overlap it.
    # x + y + w + h is used as the key (and later as the file name),
    # so different rectangles with the same sum share one entry.
    for x, y, w, h in candidates:
        group = x + y + w + h

        for x2, y2, w2, h2 in candidates:
            if x2 - w < x < x2 + w2 and y2 - h < y < y2 + h2:

                if group not in overlaps:
                    overlaps[group] = 0

                overlaps[group] = overlaps[group] + 1

    # Write out rectangles with more than 30 overlaps (30 is an arbitrary threshold).
    for x, y, w, h in candidates:
        group = x + y + w + h

        if overlaps.get(group, 0) > 30:
            cv2.imwrite("{File Path}/" + str(group) + '.jpg', img[y:y + h, x:x + w])

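    # Calculate the similarity of images by comparing histograms.
    # Only channel 0 (blue, because of the BGR order) is used, and every
    # ordered pair is compared, so each pair is printed twice.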
    image_dir = "{File Path}/"
    target_files = os.listdir(image_dir)
    files = os.listdir(image_dir)

    for target_file in target_files:
        if target_file == '.DS_Store':
            continue

        target_image_path = image_dir + target_file
        target_image = cv2.imread(target_image_path)
        target_hist = cv2.calcHist([target_image], [0], None, [256], [0, 256])

        for file in files:
            if file == '.DS_Store' or file == target_file:
                continue

            comparing_image_path = image_dir + file
            comparing_image = cv2.imread(comparing_image_path)
            comparing_hist = cv2.calcHist([comparing_image], [0], None, [256], [0, 256])

            # method 0 is the correlation method: 1.0 means identical histograms
            ret = cv2.compareHist(target_hist, comparing_hist, 0)
            probability = ret * 100

            print("target file: " + target_file, "file: " + file, "similarity: " + str(probability) + "%")

if __name__ == "__main__":
    main()
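
To make the score easier to interpret, here is a minimal pure-Python sketch of the correlation measure that OpenCV documents for comparison method 0. It is only an illustration: the helper names `histogram` and `correlation` and the pixel values are made up for this example, and the script above calls `cv2.calcHist` and `cv2.compareHist` rather than anything like this.

```python
import math


def histogram(pixels, bins=256):
    """Count how many pixel values (0..bins-1) fall into each bin."""
    hist = [0] * bins
    for value in pixels:
        hist[value] += 1
    return hist


def correlation(hist_a, hist_b):
    """Pearson correlation of two equal-length histograms (-1.0 to 1.0)."""
    n = float(len(hist_a))
    mean_a = sum(hist_a) / n
    mean_b = sum(hist_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(hist_a, hist_b))
    var_a = sum((a - mean_a) ** 2 for a in hist_a)
    var_b = sum((b - mean_b) ** 2 for b in hist_b)
    if var_a == 0 or var_b == 0:
        # Assumption for this sketch: treat a completely flat histogram as 0.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


# Hypothetical single-channel pixel values standing in for cropped rectangles.
dark = [10, 10, 12, 12, 12, 40]
dark_twice = dark * 2  # same content, twice as many pixels
bright = [200, 200, 210, 250, 250, 250]

same = correlation(histogram(dark), histogram(dark))
scaled = correlation(histogram(dark), histogram(dark_twice))
different = correlation(histogram(dark), histogram(bright))

print("same: %.4f, scaled: %.4f, different: %.4f" % (same, scaled, different))
```

Two things follow from this. The score does not change when one histogram is a scaled copy of the other, so a modest difference in rectangle size matters less than it might seem. The score can also drop below zero, so multiplying it by 100 gives a convenient similarity figure rather than a true probability.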

Result

Pairs with 100% similarity appear to be exactly the same image.

('target file: 1018.jpg', 'file: 250.jpg', 'similarity: 24.8911807562%')
('target file: 1018.jpg', 'file: 369.jpg', 'similarity: 6.78223462382%')
('target file: 1018.jpg', 'file: 389.jpg', 'similarity: 11.1974626968%')
('target file: 1018.jpg', 'file: 432.jpg', 'similarity: 35.179639392%')
('target file: 1018.jpg', 'file: 463.jpg', 'similarity: 79.5281353144%')
('target file: 1018.jpg', 'file: 477.jpg', 'similarity: 51.5870749875%')
('target file: 1018.jpg', 'file: 480.jpg', 'similarity: 55.1832671208%')
('target file: 1018.jpg', 'file: 492.jpg', 'similarity: 88.2822944972%')
('target file: 1018.jpg', 'file: 522.jpg', 'similarity: 76.9528435542%')
('target file: 1018.jpg', 'file: 547.jpg', 'similarity: 84.9997652385%')
('target file: 1018.jpg', 'file: 559.jpg', 'similarity: 77.6441098189%')
('target file: 1018.jpg', 'file: 575.jpg', 'similarity: 76.3571281251%')
('target file: 1018.jpg', 'file: 581.jpg', 'similarity: 76.7456283874%')
('target file: 1018.jpg', 'file: 594.jpg', 'similarity: 31.9957806646%')
('target file: 1018.jpg', 'file: 603.jpg', 'similarity: 85.3813480299%')
('target file: 1018.jpg', 'file: 629.jpg', 'similarity: 88.0957855275%')
('target file: 1018.jpg', 'file: 632.jpg', 'similarity: 60.7236277665%')
('target file: 1018.jpg', 'file: 634.jpg', 'similarity: 62.3073009307%')
('target file: 1018.jpg', 'file: 635.jpg', 'similarity: 65.5935422037%')
('target file: 1018.jpg', 'file: 657.jpg', 'similarity: 56.6421422253%')
('target file: 1018.jpg', 'file: 658.jpg', 'similarity: 82.0967550779%')
('target file: 1018.jpg', 'file: 659.jpg', 'similarity: 89.7396556858%')
('target file: 1018.jpg', 'file: 754.jpg', 'similarity: 78.3236083079%')
('target file: 1018.jpg', 'file: 758.jpg', 'similarity: 79.0903410039%')
('target file: 1018.jpg', 'file: 799.jpg', 'similarity: 89.3025985059%')
('target file: 1018.jpg', 'file: 806.jpg', 'similarity: 97.2873823376%')
('target file: 1018.jpg', 'file: 815.jpg', 'similarity: 93.3345515745%')
('target file: 1018.jpg', 'file: 867.jpg', 'similarity: 81.8261095798%')
('target file: 1018.jpg', 'file: 920.jpg', 'similarity: 93.4987208053%')
('target file: 1018.jpg', 'file: 921.jpg', 'similarity: 90.3518029292%')
('target file: 1018.jpg', 'file: 932.jpg', 'similarity: 94.4258967857%')
('target file: 1018.jpg', 'file: 964.jpg', 'similarity: 10.5652113467%')
('target file: 1018.jpg', 'file: 972.jpg', 'similarity: 98.8755231495%')
---- (the rest of the output is omitted)

Details

What do pairs with a similarity of 90% or more look like?

('target file: 1018.jpg', 'file: 972.jpg', 'similarity: 98.8755231495%')

1018.jpg

972.jpg

('target file: 754.jpg', 'file: 758.jpg', 'similarity: 99.8932682258%')

754.jpg

758.jpg

I feel that the accuracy is quite good.

Summary

The results look reasonably good for now, so next time I would like to group together the images with high similarity.
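
As a rough idea of what such grouping could look like, the sketch below merges files into one group whenever a pair scores at or above a threshold, using a small union-find structure. This is only one possible approach and not necessarily what the next post does; the function name `group_similar` and the 90% threshold are assumptions for illustration, and the scores are rounded values from the output above.

```python
def group_similar(pairs, threshold=90.0):
    """Group file names linked by a similarity at or above the threshold.

    pairs: iterable of (file_a, file_b, similarity_percent) tuples.
    Returns a sorted list of groups that contain two or more files.
    """
    parent = {}

    def find(name):
        # Follow parent links to the representative, shortening the path.
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for file_a, file_b, similarity in pairs:
        root_a, root_b = find(file_a), find(file_b)
        if similarity >= threshold and root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for name in list(parent):
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Scores taken from the output above, rounded to one decimal place.
pairs = [
    ("1018.jpg", "972.jpg", 98.9),
    ("1018.jpg", "806.jpg", 97.3),
    ("1018.jpg", "250.jpg", 24.9),
    ("754.jpg", "758.jpg", 99.9),
]
groups = group_similar(pairs)
print(groups)
```

One caveat with this kind of merging is chaining: two images end up in the same group if they are connected through intermediate images, even when their own pairwise score is below the threshold.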

All page links

-I tried object detection using Python and OpenCV
-I tried to sort out the objects from the image of the steak set meal-① Object detection
-I tried to sort out the objects from the image of the steak set meal-② Overlap number sorting
-I tried to sort out the objects from the image of the steak set meal-③ Similar image heat map detection
-I tried to sort out the objects from the image of the steak set meal-④ Clustering
-I tried to sort out the objects from the image of the steak set meal-⑤ Similar image feature point detection
