An OCR implementation for my own satisfaction

Plans

First, detect the individual characters (with a Haar cascade or YOLO?).
Enclose each character in a bounding rectangle and process it as follows.
Within each rectangle, plot the brightness values as a 3D surface and build a 3D model from it, so that each character can still be recognized accurately from this map when it is rotated.
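The brightness-as-height idea can be sketched with NumPy alone. This is only an illustration under assumptions: the helper names `brightness_to_points` and `rotate_in_plane` are hypothetical, and the "3D model" is taken to be a plain point cloud of (x, y, brightness) values rotated about the brightness axis.

```python
import numpy as np


def brightness_to_points(tile):
    """Turn a 2-D grayscale tile into an (N, 3) array of (x, y, brightness) points."""
    tile = np.asarray(tile, dtype=np.float64)
    ys, xs = np.mgrid[0:tile.shape[0], 0:tile.shape[1]]
    return np.column_stack([xs.ravel(), ys.ravel(), tile.ravel()])


def rotate_in_plane(points, angle_rad):
    """Rotate the points about the brightness (z) axis through their x-y centroid."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, -s, 0.0],
                    [s, c, 0.0],
                    [0.0, 0.0, 1.0]])
    center = np.array([points[:, 0].mean(), points[:, 1].mean(), 0.0])
    return (points - center) @ rot.T + center


# Toy 2x2 "character" (hypothetical data, not taken from the dataset).
tile = np.array([[0, 50], [100, 255]], dtype=np.uint8)
points = brightness_to_points(tile)
rotated = rotate_in_plane(points, np.pi / 2)
```

The rotation leaves the brightness column untouched, so a rotated character keeps the same height profile and only its x-y layout changes.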

# Modules
from pathlib import Path
from skimage import io
import matplotlib.pyplot as plt
import cv2
import numpy as np

# Put some image files of any documents under ../dataset

p = Path("../dataset")
# Note: no trailing space in the pattern, otherwise nothing matches.
paths = list(p.glob("**/*.jpg"))
data1 = io.imread(paths[0])

# Use a tile strategy: crop one tile of the page to evaluate the recognition accuracy.

mini = data1[1200:1400, 800:1000, 0]
plt.imshow(mini)
print("Showing data to be processed...")
plt.show()

Switch1

After cutting out the whole sheet of paper with an affine transformation, estimate the width of one character in pixels.
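One possible way to estimate the character width is a vertical projection profile. The sketch below assumes the line of text is already deskewed and binarized, and that characters are separated by at least one empty column. The function name `estimate_char_width` is hypothetical, and the text above does not say that this is the method actually intended.

```python
import numpy as np


def estimate_char_width(ink_mask):
    """Median width, in pixels, of the runs of columns that contain any ink."""
    has_ink = np.asarray(ink_mask, dtype=bool).any(axis=0)
    # Pad with False so every run has a clear start and end.
    padded = np.concatenate([[False], has_ink, [False]])
    changes = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(changes == 1)
    ends = np.flatnonzero(changes == -1)
    widths = ends - starts
    return float(np.median(widths)) if widths.size else 0.0


# Toy line of text: three "characters" that are 4, 5 and 4 pixels wide.
line = np.zeros((10, 30), dtype=bool)
line[2:8, 2:6] = True
line[2:8, 9:14] = True
line[2:8, 20:24] = True
width = estimate_char_width(line)
```

The median is used rather than the mean so that a few touching characters or stray specks do not dominate the estimate.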

# Highlighting the target

## Lazy normalization

if mini.max() > 255:  # values above 255 do not fit in uint8
    subject = np.true_divide(mini, 256).astype("uint8")
else:
    subject = mini.astype("uint8")

## Creating a mask to remove noise.

mask = (subject < 200)
                
masked = mask * subject

## Distance transform

distmap = cv2.distanceTransform(masked, cv2.DIST_L1, 3)
                
## Create an all-zero matrix with the same shape and dtype as distmap

featuremap = np.zeros_like(distmap)
                
## Deciding kernel size to convolve.

ksize = 20
                
## Detecting edges with a convolution-like sliding window...

for x in range(ksize, distmap.shape[0] - ksize * 2):
    for y in range(ksize, distmap.shape[1] - ksize * 2):

        ### The pixel holding the largest value inside the kernel window is written as 1 in the feature map.
        ### Taking the maximum is close to max pooling: pooling and normalization are done at the same time.

        if distmap[x, y] > 0 and distmap[x, y] == np.max(distmap[x - ksize:x + ksize, y - ksize:y + ksize]):
            featuremap[x, y] = 1

### Define feature_dilated for imshow, once, after the loop has finished.
### cv2.dilate expects a kernel array, not a (50, 50) tuple.

feature_dilated = cv2.dilate(featuremap, np.ones((50, 50), np.uint8))

print("Masked image is ... : ")
plt.imshow(masked)
plt.show()

plt.imshow(feature_dilated)
print("Feature shape is ... : " + str(feature_dilated.shape))
plt.show()

Feature_mask = (feature_dilated > 0)

Cropped = masked * Feature_mask
plt.imshow(Cropped)
plt.show()
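The nested loop in the edge-detection step is slow in pure Python. As a possible alternative, the sketch below reproduces the same window and the same loop bounds with NumPy's `sliding_window_view` (available in NumPy 1.20 and later). The function name `local_max_featuremap` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_max_featuremap(distmap, ksize):
    """Mark pixels that are positive and equal to the maximum of their window."""
    h, w = distmap.shape
    featuremap = np.zeros_like(distmap)
    if h <= 3 * ksize or w <= 3 * ksize:
        return featuremap  # the loop ranges would be empty in this case
    # windows[i, j] is distmap[i:i + 2*ksize, j:j + 2*ksize]
    windows = sliding_window_view(distmap, (2 * ksize, 2 * ksize))
    window_max = windows.max(axis=(-2, -1))
    # A centre (x, y) uses the window that starts at (x - ksize, y - ksize).
    center = distmap[ksize:h - 2 * ksize, ksize:w - 2 * ksize]
    neighborhood_max = window_max[:h - 3 * ksize, :w - 3 * ksize]
    hits = (center > 0) & (center == neighborhood_max)
    featuremap[ksize:h - 2 * ksize, ksize:w - 2 * ksize] = hits
    return featuremap


# Toy distance map with one strong peak and a weaker neighbour.
demo = np.zeros((12, 12), dtype=np.float32)
demo[5, 5] = 3.0
demo[6, 6] = 2.0
result = local_max_featuremap(demo, ksize=2)
```

Only the stronger peak is marked, because the weaker one shares a window with it.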

Switch2

Send the image files cut out by the tile strategy to a public OCR server.
For single-character recognition, send the cut-out image files to a server running a recognizer, for example one trained on MNIST. (After one-hot vectorization, crop exactly one character at the resulting coordinates.)
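The tile strategy and the one-hot step could look roughly like this. It is a sketch under assumptions: `iter_tiles` and `one_hot` are hypothetical helpers, the tile size and stride are arbitrary, and the call to the OCR server is left out because no API is specified here.

```python
import numpy as np


def iter_tiles(image, tile_size, stride):
    """Yield (row, col, tile) for every full tile of a 2-D image."""
    height, width = image.shape[:2]
    for row in range(0, height - tile_size + 1, stride):
        for col in range(0, width - tile_size + 1, stride):
            yield row, col, image[row:row + tile_size, col:col + tile_size]


def one_hot(labels, num_classes):
    """Encode integer class labels as one-hot row vectors."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype=np.uint8)
    encoded[np.arange(labels.size), labels] = 1
    return encoded


# Toy 4x4 "page" split into four 2x2 tiles, plus two example digit labels.
page = np.arange(16).reshape(4, 4)
tiles = list(iter_tiles(page, tile_size=2, stride=2))
digits = one_hot([3, 0], num_classes=10)
```

Keeping the (row, col) offset with each tile makes it possible to map a recognition result back to its position on the page.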

Future tasks

Next, implement YOLO on a spiking neural network (?) and finish the single-character cropping step with a brute-force analytical approach.
Study projection matrices that are effective for finding the eigenvalues of 3D real matrices.
Implement dimensionality reduction entirely from scratch.
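As a starting point for the last two tasks, here is a minimal from-scratch PCA that also builds the corresponding orthogonal projection matrix. It is a sketch, not the planned implementation: the function name `pca_projection` and the random test data are assumptions made for illustration.

```python
import numpy as np


def pca_projection(data, n_components):
    """Return (mean, basis, projector) for the top principal components of data."""
    mean = data.mean(axis=0)
    centered = data - mean
    covariance = centered.T @ centered / (len(data) - 1)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    basis = eigenvectors[:, order]   # columns are orthonormal directions
    projector = basis @ basis.T      # orthogonal projection matrix
    return mean, basis, projector


# Hypothetical 3-D point cloud that is stretched mostly along two axes.
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.1])
mean, basis, projector = pca_projection(points, n_components=2)
reduced = (points - mean) @ basis    # 3-D points expressed in 2-D
projector_eigenvalues = np.linalg.eigvalsh(projector)
```

An orthogonal projection matrix is idempotent, so its eigenvalues are only 0 and 1. Here two eigenvalues are 1 (the kept directions) and one is 0 (the discarded direction).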

References

Projection matrix
The nature of the projection matrix
[Projection operator](https://ja.wikipedia.org/wiki/projection operator)
HSV conversion

Lazy advent calendar 2019
