Color page judgment of scanned images with Python


To automate this step, I implemented a process that determines whether a scanned image is a color page or a black-and-white page. (The scanner does have automatic detection, but I want to keep the original data in color.)

In short, Python + SVM gave simple and relatively accurate results. I relied on an outside reference when implementing it. Thank you very much.

import cv2
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

class IsColorPageSVM:
    def __init__(self, kernel='linear', random_state=None):
        self.svm = SVC(kernel=kernel, random_state=random_state)
        self.scaler = StandardScaler()

    def learn(self, vec_array, label_array, test_size=0.3, random_state=None):
        # Split into training data and test data
        vec_train, vec_test, label_train, label_test = train_test_split(
            vec_array, label_array, test_size=test_size, random_state=random_state)

        # Standardize the data (the scaler is fitted on the training data only)
        self.scaler.fit(vec_train)
        vec_train_std = self.scaler.transform(vec_train)

        # Train the model
        self.svm.fit(vec_train_std, label_train)

        # Return the accuracy on the training data and on the test data
        return self.test(vec_train, label_train), self.test(vec_test, label_test)

    # Given feature vectors and their correct labels, return the accuracy
    def test(self, vec_array, label_array):
        vec_array_std = self.scaler.transform(vec_array)
        pred = self.svm.predict(vec_array_std)
        return accuracy_score(label_array, pred)


I chose an SVM because it is well suited to two-class classification and tends to be accurate. The other reason is that it looked quick to implement in Python. Trendy deep learning might give different results.

Data format

If the image data were fed in as is, the number of dimensions would vary from image to image, so a conversion step is needed. Reshaping every image to 128 * 128 would be one option, but I looked for a feature better suited to color judgment and found the following post on Oshiete! goo.

I calculated the standard deviation of each HSV element using cvAvgSdv. With the images I had, the saturation (S) came out roughly as follows. Color: about 40 to 50 for pages dominated by a single color; ordinary color pages are usually over 80 or over 100. Black and white: around 20. Yellowed: around 20 to 30. I did not have many samples, so these figures are rough. From this, it seems a threshold around 30 to 40 would work for the judgment.

Given that, I thought a simple threshold might be enough, without any machine learning. I tried it on the images I had, and it was mostly correct, but heavily yellowed pages were judged as color, and when I raised the threshold, pages with only subtle color were judged as black and white. Handling those cases would require looking at the hue as well, and hand-tuning thresholds across several measures is not practical. That is why I decided to use machine learning.

So I convert the image to HSV format, take the standard deviation of each of H, S, and V, and also add the mean, maximum, and minimum values.

In the end, I tested the following 12-dimensional vector as the data format.

(Standard deviation of H, standard deviation of S, standard deviation of V, mean of H, mean of S, mean of V, maximum value of H, maximum value of S, maximum value of V, minimum value of H, minimum value of S, minimum value of V)

Test method

The feature data for color pages and black-and-white pages was loaded as shown below, training was repeated 100 times, and the accuracies were averaged. I also ran the test on subsets of the dimensions, to measure how much each part of the 12-dimensional vector contributes to learning.

Item                                                    Contents
Environment                                             Python 3.8
Number of training samples for color pages              267
Number of training samples for black-and-white pages    267
Number of trials                                        100

def main():
    # Read the 2D arrays from the csv files
    color_vec_array = np.loadtxt('color_page.csv', delimiter=',')
    gray_vec_array = np.loadtxt('gray_page.csv', delimiter=',')
    # There are more black-and-white samples than color samples, so trim them to the same count
    gray_vec_array = gray_vec_array[:len(color_vec_array), :]

    # Merge into one 2D array
    vec_array = np.append(color_vec_array, gray_vec_array, axis=0)

    # Label color pages 0 and black-and-white pages 1
    label_array = [0] * len(color_vec_array) + [1] * len(gray_vec_array)
    # i:j is the range of vector dimensions to extract
    for i in range(0, vec_array.shape[1]):
        for j in range(i + 1, vec_array.shape[1] + 1):
            train_score, test_score = 0.0, 0.0
            # Average the results of 100 training runs
            test_num = 100
            print('{0:2}:{1:2} Element data'.format(i, j), end='  ')
            # Use only columns i to j-1 of every feature vector
            data_list = vec_array[:, i:j]
            for n in range(test_num):
                model = IsColorPageSVM()
                train, test = model.learn(data_list, label_array)
                train_score += train
                test_score += test
            print('Accuracy (training/test): {0:0.3} - {1:0.3}'.format(train_score/test_num, test_score/test_num))

if __name__ == '__main__':
    main()

The following are the accuracy rates for each subset of dimensions. The dimensions entry indicates which part of the 12 dimensions was used (3:6 means the three dimensions from index 3 up to, but not including, index 6, counting from 0). Looking at the results, the standard deviation of S alone already gives a fairly high accuracy, which matches the post quoted in the data format section above.
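
To make the i:j notation concrete, here is a small sketch that maps a range to feature names. The name list `FEATURE_NAMES` and the helper `describe_slice` are made up for illustration and simply follow the order of the 12-dimensional vector; the array is dummy data.

```python
import numpy as np

# Hypothetical labels, in the same order as the 12-dimensional vector
FEATURE_NAMES = [
    'std_H', 'std_S', 'std_V',
    'mean_H', 'mean_S', 'mean_V',
    'max_H', 'max_S', 'max_V',
    'min_H', 'min_S', 'min_V',
]

def describe_slice(i, j):
    """Return the names of the features selected by the range i:j."""
    return FEATURE_NAMES[i:j]

# Dummy data: 2 samples x 12 features
vec_array = np.arange(24).reshape(2, 12)
subset = vec_array[:, 3:6]  # columns 3, 4, 5 of every row

print(describe_slice(3, 6), subset.shape)
```

The range 0:4 therefore selects the three standard deviations plus the mean of H, and 3:9 selects the three means plus the three maximum values.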

The highest accuracy on the test data was 99%, achieved by both 0:4 (standard deviations of H, S, V + mean of H) and 3:9 (means of H, S, V + maximum values of H, S, V). What is surprising is that 3:9, which does not use any standard deviation, ties for first place (and since its training-data score is also higher than that of 0:4, it is the better of the two). As touched on in the data format section, this may be because I labeled monochromatic color pages (such as pages printed in a single blue color) as color as well.

As expected, most of the maximum-only, minimum-only, and brightness-only features are close to useless (with two classes, an accuracy around 50% is effectively random). The minimum value will be 0 whenever there are black pixels, and brightness does not differ between color pages and black-and-white pages.

Dimensions  Accuracy (training data)  Accuracy (test data)  Remarks
0:1         0.774                     0.777                 Standard deviation of H only
0:2         0.958                     0.96                  Standard deviation of H, S
0:3         0.97                      0.968                 Standard deviation of H, S, V
0:4         0.99                      0.99                  (1st place tie) Standard deviation of H, S, V; mean of H
0:5         0.99                      0.987                 Standard deviation of H, S, V; mean of H, S
0:6         0.991                     0.984                 Standard deviation of H, S, V; mean of H, S, V
0:7         0.991                     0.985                 Standard deviation of H, S, V; mean of H, S, V; maximum value of H
0:8         0.993                     0.986                 Standard deviation of H, S, V; mean of H, S, V; maximum value of H, S
0:9         0.996                     0.989                 Standard deviation of H, S, V; mean of H, S, V; maximum value of H, S, V
0:10        0.995                     0.988                 Standard deviation of H, S, V; mean of H, S, V; maximum value of H, S, V; minimum value of H
0:11        0.995                     0.986                 Standard deviation of H, S, V; mean of H, S, V; maximum value of H, S, V; minimum value of H, S
0:12        0.995                     0.987                 All elements
1:2         0.953                     0.952                 Standard deviation of S only
1:3         0.968                     0.964
1:4         0.99                      0.987
1:5         0.989                     0.989
1:6         0.991                     0.986
1:7         0.992                     0.983
1:8         0.992                     0.988
1:9         0.995                     0.989
1:10        0.996                     0.987
1:11        0.994                     0.987
1:12        0.995                     0.985
2:3         0.602                     0.588                 Standard deviation of V only
2:4         0.832                     0.831
2:5         0.928                     0.92
2:6         0.941                     0.937
2:7         0.943                     0.935
2:8         0.991                     0.986
2:9         0.994                     0.988
2:10        0.994                     0.989
2:11        0.994                     0.987
2:12        0.994                     0.986
3:4         0.814                     0.816                 Mean of H only
3:5         0.925                     0.925
3:6         0.937                     0.935
3:7         0.938                     0.939
3:8         0.99                      0.988
3:9         0.994                     0.99                  (1st place tie) Mean of H, S, V; maximum value of H, S, V
3:10        0.994                     0.989
3:11        0.993                     0.986
3:12        0.993                     0.986
4:5         0.772                     0.775                 Mean of S only
4:6         0.761                     0.75
4:7         0.762                     0.754
4:8         0.961                     0.96
4:9         0.97                      0.964
4:10        0.97                      0.961
4:11        0.975                     0.964
4:12        0.981                     0.97
5:6         0.525                     0.491                 Mean of V only
5:7         0.529                     0.492
5:8         0.961                     0.954
5:9         0.969                     0.962
5:10        0.969                     0.964
5:11        0.974                     0.966
5:12        0.982                     0.971
6:7         0.516                     0.477                 Maximum value of H only
6:8         0.957                     0.961
6:9         0.97                      0.963
6:10        0.969                     0.964
6:11        0.973                     0.966
6:12        0.98                      0.969
7:8         0.961                     0.957                 Maximum value of S only
7:9         0.97                      0.965
7:10        0.968                     0.969
7:11        0.973                     0.97
7:12        0.978                     0.968
8:9         0.521                     0.487                 Maximum value of V only
8:10        0.522                     0.484
8:11        0.533                     0.493
8:12        0.903                     0.897
9:10        0.512                     0.477                 Minimum value of H only
9:11        0.518                     0.477
9:12        0.896                     0.891
10:11       0.518                     0.481                 Minimum value of S only
10:12       0.895                     0.891
11:12       0.894                     0.894                 Minimum value of V only

Future plans

Regarding color judgment, what remains to try is reshaping the image data and using it directly, or other methods such as deep learning. I also want to judge whether a page is a novel illustration or a text page, so I may do that too (I got decent results with the same approach, so I may write it up). That's it. Thank you for reading to the end.
