This is a summary of how to do image processing (pixel manipulation) in Python. Because the goal is to learn image processing, clarity takes priority over performance and output quality.
The environment is a Mac. Since this is an academic-style exercise, the language is Python, version 2.7.x.
Python
The Mac comes with Python preinstalled. To check, run
python --version
If Python is not installed, install it with
brew install python
This assumes that Homebrew (brew) is already installed.
Install pip, a Python package management tool, to make it easier to install libraries.
easy_install pip
This time we will use numpy and Pillow (PIL). numpy provides convenient tools for numerical calculations, and Pillow provides tools for image processing.
pip install numpy
pip install pillow
First, let's load an image and display it.
from PIL import Image
im = Image.open("./achan.jpg")
im.show()
Just displaying it is not very interesting, so let's rotate it before displaying it.
from PIL import Image
im = Image.open("./achan.jpg")
im.rotate(30).show()
The image should now be rotated 30 degrees counterclockwise around its center.
In practice this is all you need, but it hides what is happening internally. So here I would like to write programs that process each pixel individually, including rotation.
Now let's work with the individual pixels.
We start with negative-positive inversion, a typical and simple pixel operation.
#coding:utf-8
from PIL import Image
#Load the image
im = Image.open("./achan.jpg")
#Convert to RGB
rgb_im = im.convert('RGB')
#Get the image size as (width, height)
size = rgb_im.size
#Create a new empty image with the same size
im2 = Image.new('RGBA',size)
#Loop over every pixel
#x
for x in range(size[0]):
    #y
    for y in range(size[1]):
        #Get the pixel
        r,g,b = rgb_im.getpixel((x,y))
        #Inversion: subtract each channel from the maximum value 255
        r = 255 - r
        g = 255 - g
        b = 255 - b
        #Set the pixel (alpha 255 = fully opaque)
        im2.putpixel((x,y),(r,g,b,255))
#Show
im2.show()
The image is now inverted into a negative.
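To check the arithmetic without an image file, the inversion can be sketched on a tiny hand-made grid of RGB tuples. The function name `invert_pixel` and the sample grid are made up for this illustration; they are not part of Pillow.

```python
def invert_pixel(pixel, max_value=255):
    """Return the negative of one (r, g, b) pixel."""
    r, g, b = pixel
    return (max_value - r, max_value - g, max_value - b)

# A hypothetical 2x2 "image" stored as rows of RGB tuples
grid = [
    [(0, 0, 0), (255, 255, 255)],
    [(10, 128, 200), (255, 0, 0)],
]

# Apply the same per-pixel operation to every pixel
negative = [[invert_pixel(p) for p in row] for row in grid]
print(negative[1][0])  # (245, 127, 55)
```

Inverting twice returns the original values, which is a quick sanity check for this kind of operation.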
A pixel looks gray when r, g, and b all have the same value. Which common value to use is a matter of choice, and several rules exist. Here we take the average of r, g, and b and use it for all three channels (below, only the pixel-operation part is shown).
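#Loop over every pixel
#x
for x in range(size[0]):
    #y
    for y in range(size[1]):
        #Get the pixel
        r,g,b = rgb_im.getpixel((x,y))
        #Average the three channels (// keeps the result an int)
        gray = (r + g + b)//3
        #Set the pixel (alpha 255 = fully opaque)
        im2.putpixel((x,y),(gray,gray,gray,255))
#Show
im2.show()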
The image is now grayscale.
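As noted above, the rule for choosing the gray value is a matter of choice. One common alternative to the plain average is a weighted sum using the ITU-R BT.601 luma weights (0.299, 0.587, 0.114), which reflects that green looks brighter to the eye than blue. The sketch below compares the two rules on single pixels; the function names are hypothetical.

```python
def gray_average(pixel):
    """Plain average of r, g, b, as in the article's code."""
    r, g, b = pixel
    return (r + g + b) // 3

def gray_weighted(pixel):
    """Weighted sum with ITU-R BT.601 luma weights, scaled to integers."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000

pure_green = (0, 255, 0)
pure_blue = (0, 0, 255)
print(gray_average(pure_green), gray_average(pure_blue))    # 85 85
print(gray_weighted(pure_green), gray_weighted(pure_blue))  # 149 29
```

With the plain average, pure green and pure blue become the same gray; with the weighted rule, green comes out noticeably lighter.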
There are several ways to rotate an image, but the most basic one is to use a rotation matrix.
The rotation matrix is given by the following formula. Given the rotation angle θ in radians, it yields the coordinates (x2, y2) obtained by rotating the original coordinates (x1, y1) by that angle.
\begin{equation}
\begin{pmatrix}
x_{2} \\
y_{2}
\end{pmatrix}
=
\begin{pmatrix}
\cos \theta & -\sin \theta \\
\sin \theta & \cos \theta
\end{pmatrix}
\begin{pmatrix}
x_{1} \\
y_{1}
\end{pmatrix}
\end{equation}
In Python, the numpy module lets you express this rotation matrix as follows.
m_matrix = np.matrix([
[np.cos(rad),-1*np.sin(rad)],
[np.sin(rad),np.cos(rad)]
])
This is very convenient: the matrix is written intuitively, and sums and products can be computed with the ordinary arithmetic operators.
np.matrix is used here, but np.array is generally the better choice unless there is a specific reason to use matrix.
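For comparison, here is a minimal sketch of the same rotation with np.array. One difference to keep in mind: with plain arrays, `*` multiplies element by element, so the matrix product is written with np.dot. The helper name `rotation_matrix` is made up for this example.

```python
import numpy as np

def rotation_matrix(rad):
    """Return the 2x2 rotation matrix for angle rad as a plain ndarray."""
    return np.array([
        [np.cos(rad), -np.sin(rad)],
        [np.sin(rad),  np.cos(rad)],
    ])

# Rotate the point (1, 0) by 90 degrees; np.dot performs the matrix product
point = np.array([1.0, 0.0])
rotated = np.dot(rotation_matrix(np.pi / 2), point)
print(rotated)  # approximately (0, 1), up to floating-point error
```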
Now let's write code that rotates an image using the rotation matrix. In the code below, the rotation axis is the upper-left corner, not the center of the image. In addition, putpixel accepts only integer coordinates, so the rotated coordinates are truncated and the result has gaps (this method is used here anyway for the sake of clarity).
#coding:UTF-8
from PIL import Image
import numpy as np
#Load the image
im = Image.open("./achan.jpg")
#Convert to RGB
rgb_im = im.convert('RGB')
#Get the image size as (width, height)
size = rgb_im.size
#Create a new empty image with the same size
im2 = Image.new('RGBA',size)
#30 degrees
rad = np.pi/6
#Rotation matrix (constant, so it is built once outside the loop)
m_matrix = np.matrix([
    [np.cos(rad),-1*np.sin(rad)],
    [np.sin(rad),np.cos(rad)]
])
#Loop over every pixel
#x
for x in range(size[0]):
    #y
    for y in range(size[1]):
        #Get the pixel
        r,g,b = rgb_im.getpixel((x,y))
        #Original coordinates as a column vector
        p_matrix = np.matrix([
            [x],
            [y]
        ])
        #Matrix operation
        p2_matrix = m_matrix * p_matrix
        #Coordinates after moving (putpixel accepts only ints, so convert)
        x2 = int(p2_matrix[0,0])
        y2 = int(p2_matrix[1,0])
        #Only draw if the point is inside the image
        if 0 <= x2 < size[0] and 0 <= y2 < size[1]:
            #Write the original RGB at the moved coordinates
            im2.putpixel((x2,y2),(r,g,b,255))
#Show
im2.show()
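The two limitations mentioned above (rotation about the upper-left corner, and gaps in the output) are commonly addressed by shifting the origin to the image center and by inverse mapping: loop over the output pixels and ask where each one came from. The sketch below is one possible way to do this, not the method used in this article. It works on a plain 2D list instead of a Pillow image so that it can be checked by hand, and the name `rotate_grid` is hypothetical.

```python
import math

def rotate_grid(grid, rad, fill=0):
    """Rotate a 2D grid (list of rows) about its center by inverse mapping."""
    height = len(grid)
    width = len(grid[0])
    cx = (width - 1) / 2.0
    cy = (height - 1) / 2.0
    cos_t = math.cos(rad)
    sin_t = math.sin(rad)
    out = [[fill] * width for _ in range(height)]
    for y2 in range(height):
        for x2 in range(width):
            # Move the origin to the center, then apply the inverse rotation
            # (the transpose of the rotation matrix) to find the source point
            dx = x2 - cx
            dy = y2 - cy
            x1 = cos_t * dx + sin_t * dy + cx
            y1 = -sin_t * dx + cos_t * dy + cy
            # Nearest-neighbor sampling: round to the closest source pixel
            sx = int(round(x1))
            sy = int(round(y1))
            if 0 <= sx < width and 0 <= sy < height:
                out[y2][x2] = grid[sy][sx]
    return out

grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
print(rotate_grid(grid, math.pi / 2))  # [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
```

Because every output pixel is assigned exactly once, no gaps appear; output pixels whose source falls outside the grid keep the `fill` value.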
A matrix can also flip an image about the x-axis, the y-axis, or an arbitrary axis. This is the linear transformation taught in high school mathematics. For example, reflection about the y-axis is given below.
\begin{equation}
\begin{pmatrix}
x_{2} \\
y_{2}
\end{pmatrix}
=
\begin{pmatrix}
-1 & 0 \\
0 & 1
\end{pmatrix}
\begin{pmatrix}
x_{1} \\
y_{1}
\end{pmatrix}
\end{equation}
In Python, this matrix can be written as
#Reflection about the y-axis
y_matrix = np.matrix([
    [-1,0],
    [0,1]
])
Now let's write the code. If you simply reflect about the y-axis, every point moves to a negative x coordinate and nothing is drawn. Therefore, translate the result along the x-axis by the image width minus 1 (the largest valid x coordinate).
#coding:UTF-8
from PIL import Image
import numpy as np
#Load the image
im = Image.open("./achan.jpg")
#Convert to RGB
rgb_im = im.convert('RGB')
#Get the image size as (width, height)
size = rgb_im.size
#Create a new empty image with the same size
im2 = Image.new('RGBA',size)
#Reflection about the y-axis (constant, so it is built once outside the loop)
y_matrix = np.matrix([
    [-1,0],
    [0,1]
])
#Loop over every pixel
#x
for x in range(size[0]):
    #y
    for y in range(size[1]):
        #Get the pixel
        r,g,b = rgb_im.getpixel((x,y))
        #Original coordinates as a column vector
        p_matrix = np.matrix([
            [x],
            [y]
        ])
        #Matrix operation
        p2_matrix = y_matrix * p_matrix
        #Coordinates after moving (putpixel accepts only ints, so convert)
        x2 = int(p2_matrix[0,0]) + size[0] - 1 #Translate x back into the range 0..width-1
        y2 = int(p2_matrix[1,0])
        #Only draw if the point is inside the image
        if 0 <= x2 < size[0] and 0 <= y2 < size[1]:
            #Write the original RGB at the moved coordinates
            im2.putpixel((x2,y2),(r,g,b,255))
#Show
im2.show()
The image is reversed: a so-called left-right flip. The translation uses width - 1 rather than the full width because valid x coordinates run from 0 to width - 1; translating by the full width would shift the result by one extra pixel.
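Since the reflection matrix only negates x, the whole operation reduces to the mapping x2 = -x + (width - 1). The following sketch applies that mapping to a small 2D list so the result can be verified by eye; `mirror_x` and `flip_left_right` are names made up for this example.

```python
def mirror_x(x, width):
    """Map column x to its mirrored column: negate, then shift by width - 1."""
    return -x + (width - 1)

def flip_left_right(grid):
    """Return a left-right flipped copy of a 2D grid (list of rows)."""
    width = len(grid[0])
    out = [[None] * width for _ in grid]
    for y, row in enumerate(grid):
        for x, value in enumerate(row):
            out[y][mirror_x(x, width)] = value
    return out

grid = [
    [1, 2, 3],
    [4, 5, 6],
]
print(flip_left_right(grid))  # [[3, 2, 1], [6, 5, 4]]
```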
Alongside matrix operations, neighborhood processing is another essential technique in image processing. It is used for operations such as blurring and contour (edge) extraction.
Here we will try a relatively simple blur. There are various algorithms, but we will use the simplest one: take the average over the 8-neighborhood and set it as the pixel value.
The 8-neighborhood is the set of eight pixels that surround the reference coordinates.
Get the r, g, b values at each of these coordinates and average them together with the center pixel.
One caveat: at the edges of the image, x-1 may not exist or x+1 may fall outside the image, so these cases need special handling. Now let's write the code.
#coding:utf-8
from PIL import Image
#Load the image
im = Image.open("./achan.jpg")
#Convert to RGB
rgb_im = im.convert('RGB')
#Get the image size as (width, height)
size = rgb_im.size
#Create a new empty image with the same size
im2 = Image.new('RGBA',size)
#Loop over every pixel
#x
for x in range(size[0]):
    #y
    for y in range(size[1]):
        #Get the pixel at the target coordinates
        r0,g0,b0 = rgb_im.getpixel((x,y))
        #Initialization (default every neighbor to the center value, used when a neighbor is outside the image)
        r1 = r2 = r3 = r4 = r5 = r6 = r7 = r8 = r0
        g1 = g2 = g3 = g4 = g5 = g6 = g7 = g8 = g0
        b1 = b2 = b3 = b4 = b5 = b6 = b7 = b8 = b0
        #Get the values at the neighborhood coordinates
        #1
        if x-1 >= 0 and y+1 < size[1]:
            r1,g1,b1 = rgb_im.getpixel((x-1,y+1))
        #2
        if y+1 < size[1]:
            r2,g2,b2 = rgb_im.getpixel((x,y+1))
        #3
        if x+1 < size[0] and y+1 < size[1]:
            r3,g3,b3 = rgb_im.getpixel((x+1,y+1))
        #4
        if x-1 >= 0:
            r4,g4,b4 = rgb_im.getpixel((x-1,y))
        #5
        if x+1 < size[0]:
            r5,g5,b5 = rgb_im.getpixel((x+1,y))
        #6
        if x-1 >= 0 and y-1 >= 0:
            r6,g6,b6 = rgb_im.getpixel((x-1,y-1))
        #7
        if y-1 >= 0:
            r7,g7,b7 = rgb_im.getpixel((x,y-1))
        #8
        if x+1 < size[0] and y-1 >= 0:
            r8,g8,b8 = rgb_im.getpixel((x+1,y-1))
        #Average RGB over the center and its neighborhood (// keeps the result an int)
        r = (r0 + r1 + r2 + r3 + r4 + r5 + r6 + r7 + r8)//9
        g = (g0 + g1 + g2 + g3 + g4 + g5 + g6 + g7 + g8)//9
        b = (b0 + b1 + b2 + b3 + b4 + b5 + b6 + b7 + b8)//9
        #Draw (alpha 255 = fully opaque)
        im2.putpixel((x,y),(r,g,b,255))
#Show
im2.show()
The effect is subtle, but the image is blurred. Stronger or different blurs are possible by enlarging the neighborhood or changing the algorithm.
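As a sketch of what "enlarging the neighborhood" could look like, the eight hand-written cases can be replaced by two inner loops over a window of adjustable size. This version works on a 2D list of gray values rather than a Pillow image, and it handles edges differently from the code above: it averages only the neighbors that actually exist instead of substituting the center value. The name `box_blur` and its `radius` parameter are assumptions made for this illustration.

```python
def box_blur(grid, radius=1):
    """Blur a 2D grid of gray values with a (2*radius+1) x (2*radius+1) window."""
    height = len(grid)
    width = len(grid[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            count = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    nx = x + dx
                    ny = y + dy
                    # Skip neighbors that fall outside the image
                    if 0 <= nx < width and 0 <= ny < height:
                        total += grid[ny][nx]
                        count += 1
            # Average over the pixels that were actually inside the image
            out[y][x] = total // count
    return out

grid = [
    [0, 0, 0],
    [0, 90, 0],
    [0, 0, 0],
]
print(box_blur(grid))  # [[22, 15, 22], [15, 10, 15], [22, 15, 22]]
```

With radius=1 this is the 8-neighborhood plus the center pixel; larger radii give a stronger blur at the cost of more work per pixel.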
There is still a long way to go before this is practical (pixel interpolation, for example), but that covers the basics for now.