Pixel manipulation of images in Python

A summary of how to do image processing (pixel manipulation) in Python. Because the goal is to learn image processing, clarity takes priority over performance and quality.

Environment

The environment is a Mac. Because this is a learning-oriented approach, the language is Python, version 2.7.x.

Usage environment

Mac
Python 2.7.x
Pillow (PIL) for image processing

Installation

Python

The Mac comes with Python preinstalled. To check, run:

python --version

If Python is not installed, install it with:

brew install python

This assumes that Homebrew (brew) is already installed.

pip installation

Install pip, Python's package management tool, to make it easier to install libraries.

easy_install pip

Module (library) installation

This time we will use numpy and Pillow (PIL). numpy includes tools that are convenient for calculations, and Pillow includes tools related to image processing.

pip install numpy
pip install pillow

Try to manipulate the image

First, load an image and display it.

from PIL import Image

im = Image.open("./achan.jpg")
im.show()

Simply displaying the image is not very interesting, so let's rotate it before displaying it.

from PIL import Image

im = Image.open("./achan.jpg")
im.rotate(30).show()

The image is rotated 30 degrees counterclockwise around its center.
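As a side note, rotate keeps the original canvas size by default, so the rotated corners are cut off. The following is a minimal sketch of Pillow's expand argument, assuming a blank in-memory image in place of achan.jpg so that it runs without any file on disk.

```python
from PIL import Image

# A plain 200x100 image stands in for achan.jpg (an assumption for this sketch)
im = Image.new("RGB", (200, 100), (255, 0, 0))

# Default: the output keeps the original size, so the rotated corners are cut off
clipped = im.rotate(30)

# expand=True enlarges the canvas so the whole rotated image fits
expanded = im.rotate(30, expand=True)

print(clipped.size)
print(expanded.size)
```

The second size should be larger than the first in both directions, because the bounding box of a rotated rectangle is bigger than the rectangle itself.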

In practice this is fine, but it hides what is happening internally. So here we will write programs that process each pixel individually, including one that rotates the image.

Try manipulating pixels

Now let's work with the individual pixels.

Try negative-positive inversion

Let's start with negative-positive inversion, a typical and simple pixel operation.

# coding: utf-8
from PIL import Image

# Load the image
im = Image.open("./achan.jpg")

# Convert to RGB
rgb_im = im.convert('RGB')

# Get the image size as (width, height)
size = rgb_im.size

# Create a new empty image of the same size
im2 = Image.new('RGBA', size)

# Loop over every pixel: x across the width, y down the height
for x in range(size[0]):
    for y in range(size[1]):
        # Get the pixel
        r, g, b = rgb_im.getpixel((x, y))

        # Invert each channel
        r = 255 - r
        g = 255 - g
        b = 255 - b

        # Set the pixel (alpha 255 = fully opaque; alpha 0 would be transparent)
        im2.putpixel((x, y), (r, g, b, 255))

# Show the result
im2.show()

The image has been inverted to a negative.
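For comparison, Pillow also ships a ready-made inversion, ImageOps.invert. The sketch below assumes a tiny hand-built 2x2 image in place of achan.jpg and checks that the per-pixel approach and the built-in produce the same bytes. It is only a sanity check, not a replacement for the loop, which is the point of this article.

```python
from PIL import Image, ImageOps

# A tiny 2x2 test image stands in for achan.jpg (an assumption for this sketch)
src = Image.new("RGB", (2, 2))
src.putpixel((0, 0), (0, 0, 0))
src.putpixel((1, 0), (255, 255, 255))
src.putpixel((0, 1), (10, 20, 30))
src.putpixel((1, 1), (200, 100, 50))

# Per-pixel inversion, written out by hand
manual = Image.new("RGB", src.size)
for x in range(src.size[0]):
    for y in range(src.size[1]):
        r, g, b = src.getpixel((x, y))
        manual.putpixel((x, y), (255 - r, 255 - g, 255 - b))

# Pillow's built-in inversion
builtin = ImageOps.invert(src)

# Compare the raw pixel bytes of both results
same = manual.tobytes() == builtin.tobytes()
print(same)
```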

Grayscale

A pixel looks gray when r, g, and b have the same value. Which rule to use to choose that common value depends on the case. Here we take the average of r, g, and b and use it for all three channels (only the pixel-operation loop is shown).

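# Loop over every pixel: x across the width, y down the height
for x in range(size[0]):
    for y in range(size[1]):
        # Get the pixel
        r, g, b = rgb_im.getpixel((x, y))

        # Average the three channels (// keeps the result an int)
        gray = (r + g + b) // 3

        # Set the pixel (alpha 255 = fully opaque)
        im2.putpixel((x, y), (gray, gray, gray, 255))

# Show the result
im2.show()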

The image is now gray.
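As an example of a different rule, a common alternative to the plain average is a weighted sum that gives green the most weight. The sketch below uses the ITU-R 601-2 luma weights, the ones documented for Pillow's convert('L'). The helper names average_gray and luma_gray are hypothetical and exist only for this illustration.

```python
def average_gray(r, g, b):
    """Plain average of the three channels."""
    return (r + g + b) // 3


def luma_gray(r, g, b):
    """Weighted sum using the ITU-R 601-2 luma weights."""
    return (r * 299 + g * 587 + b * 114) // 1000


# With the plain average, pure green and pure blue give the same gray
print(average_gray(0, 255, 0))
print(average_gray(0, 0, 255))

# With the weighted sum, pure green comes out brighter than pure blue
print(luma_gray(0, 255, 0))
print(luma_gray(0, 0, 255))
```

Either function can be dropped into the loop in place of the averaging line.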

Try to rotate

There are several ways to rotate an image, but the most primitive one is to use a rotation matrix.

Rotation matrix

The rotation matrix is given by the following formula. Given the rotation angle θ in radians, it yields the coordinates (x2, y2) of the point (x1, y1) rotated by that angle.

\begin{equation}
\begin{pmatrix}
x_{2} \\
y_{2}
\end{pmatrix}
=
\begin{pmatrix}
\cos \theta & -\sin \theta \\
\sin \theta & \cos \theta
\end{pmatrix}
\begin{pmatrix}
x_{1} \\
y_{1}
\end{pmatrix}
\end{equation}
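
Multiplying out the matrix product gives the two component equations that are evaluated for each point:

\begin{equation}
x_{2} = x_{1} \cos \theta - y_{1} \sin \theta, \qquad
y_{2} = x_{1} \sin \theta + y_{1} \cos \theta
\end{equation}

For example, with θ = π/2 the point (1, 0) maps to (0, 1).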

In Python, the rotation matrix above can be written with the numpy module as follows.

m_matrix = np.matrix([
            [np.cos(rad),-1*np.sin(rad)],
            [np.sin(rad),np.cos(rad)]
        ])

This is convenient because the matrix can be written intuitively, and sums and products can be computed with ordinary arithmetic operators.

np.matrix is used here, but it is generally better to use np.array unless there is a specific reason not to.
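For reference, here is a minimal sketch of the same rotation written with np.array instead. The 90-degree angle and the point (1, 0) are arbitrary choices for illustration. The main difference to keep in mind is that * is element-wise for arrays, so the matrix product is taken with dot().

```python
import numpy as np

# 90 degrees in radians
rad = np.pi / 2

# The same rotation matrix, written as a plain 2-D array
m = np.array([
    [np.cos(rad), -np.sin(rad)],
    [np.sin(rad), np.cos(rad)],
])

# The point (1, 0) as a 1-D array
p = np.array([1.0, 0.0])

# With arrays, * is element-wise, so use dot() for the matrix product
p2 = m.dot(p)

# Rotating (1, 0) by 90 degrees gives (0, 1), up to floating-point error
print(np.round(p2, 6))
```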

Write code

Now let's write code that rotates the image using the rotation matrix. In the code below, the rotation axis is the upper-left corner, not the center of the image. Also, because putpixel accepts only integer coordinates, the rotated coordinates are truncated and the result has gaps (this method is used here anyway for the sake of clarity).

# coding: utf-8
from PIL import Image
import numpy as np

# Load the image
im = Image.open("./achan.jpg")

# Convert to RGB
rgb_im = im.convert('RGB')

# Get the image size as (width, height)
size = rgb_im.size

# Create a new empty image of the same size
im2 = Image.new('RGBA', size)

# Rotation angle: 30 degrees in radians
rad = np.pi / 6

# Rotation matrix (constant, so it is built once outside the loop)
m_matrix = np.matrix([
    [np.cos(rad), -1 * np.sin(rad)],
    [np.sin(rad), np.cos(rad)]
])

# Loop over every pixel: x across the width, y down the height
for x in range(size[0]):
    for y in range(size[1]):
        # Get the pixel
        r, g, b = rgb_im.getpixel((x, y))

        # Original coordinates as a column vector
        p_matrix = np.matrix([
            [x],
            [y]
        ])

        # The matrix product gives the rotated coordinates
        p2_matrix = m_matrix * p_matrix

        # Coordinates after rotation (putpixel needs ints, so truncate)
        x2 = int(p2_matrix[0, 0])
        y2 = int(p2_matrix[1, 0])

        # Draw only if the destination is inside the image
        if 0 <= x2 < size[0] and 0 <= y2 < size[1]:
            # Copy the original RGB to the rotated position (alpha 255 = opaque)
            im2.putpixel((x2, y2), (r, g, b, 255))

# Show the result
im2.show()
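
Both limitations of the code above (the upper-left rotation axis and the gaps) can be avoided. The following is one possible sketch, not the method used in this article: it rotates about the center and loops over the destination pixels, looking up where each one came from (inverse mapping), so every destination pixel receives a value. A nested list stands in for the image so the example runs without Pillow, and the function name rotate_about_center is hypothetical.

```python
import math


def rotate_about_center(pixels, degrees, fill=0):
    """Rotate a 2-D grid (list of rows) about its center by inverse mapping."""
    height = len(pixels)
    width = len(pixels[0])
    cx = (width - 1) / 2.0
    cy = (height - 1) / 2.0
    rad = math.radians(degrees)
    cos_t = math.cos(rad)
    sin_t = math.sin(rad)

    result = [[fill] * width for _ in range(height)]
    for y2 in range(height):
        for x2 in range(width):
            # Apply the inverse rotation (angle -theta) to the destination
            # coordinates to find the source pixel
            dx = x2 - cx
            dy = y2 - cy
            x1 = int(round(cos_t * dx + sin_t * dy + cx))
            y1 = int(round(-sin_t * dx + cos_t * dy + cy))
            if 0 <= x1 < width and 0 <= y1 < height:
                result[y2][x2] = pixels[y1][x1]
    return result


grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
rotated = rotate_about_center(grid, 90)
for row in rotated:
    print(row)
```

Because the y-axis of an image points down, a positive angle with this matrix appears as a clockwise rotation on screen.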

Try to flip

You can also reflect about the x-axis, the y-axis, or an arbitrary axis with a matrix. This is the linear transformation taught in high school mathematics. For example, reflection about the y-axis is given below.

Inversion matrix (y-axis)

\begin{equation}
\begin{pmatrix}
x_{2} \\
y_{2}
\end{pmatrix}
=
\begin{pmatrix}
-1 & 0 \\
0 & 1
\end{pmatrix}
\begin{pmatrix}
x_{1} \\
y_{1}
\end{pmatrix}
\end{equation}

In Python, this matrix can be written as:

# Reflection about the y-axis
y_matrix = np.matrix([
    [-1, 0],
    [0, 1]
])
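
Reflections about other axes follow the same pattern. As a rough sketch, here are the matrices for the y-axis, the x-axis, and the line y = x, applied to an arbitrary sample point (3, 2). The variable names are made up for this illustration, and np.array with dot() is used in place of np.matrix.

```python
import numpy as np

# Reflection about the y-axis (x changes sign)
flip_y_axis = np.array([[-1, 0],
                        [0, 1]])

# Reflection about the x-axis (y changes sign)
flip_x_axis = np.array([[1, 0],
                        [0, -1]])

# Reflection about the line y = x (x and y swap)
flip_diagonal = np.array([[0, 1],
                          [1, 0]])

# An arbitrary sample point
p = np.array([3, 2])

print(flip_y_axis.dot(p))
print(flip_x_axis.dot(p))
print(flip_diagonal.dot(p))
```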

Try to write the code

Now let's write the code. If you simply reflect about the y-axis, every point moves to a negative x coordinate and nothing is drawn. Therefore, translate the result in the x direction by the width of the image minus one, so that x = 0 maps to the last column.

# coding: utf-8
from PIL import Image
import numpy as np

# Load the image
im = Image.open("./achan.jpg")

# Convert to RGB
rgb_im = im.convert('RGB')

# Get the image size as (width, height)
size = rgb_im.size

# Create a new empty image of the same size
im2 = Image.new('RGBA', size)

# Reflection about the y-axis (constant, so it is built once outside the loop)
y_matrix = np.matrix([
    [-1, 0],
    [0, 1]
])

# Loop over every pixel: x across the width, y down the height
for x in range(size[0]):
    for y in range(size[1]):
        # Get the pixel
        r, g, b = rgb_im.getpixel((x, y))

        # Original coordinates as a column vector
        p_matrix = np.matrix([
            [x],
            [y]
        ])

        # The matrix product gives the reflected coordinates
        p2_matrix = y_matrix * p_matrix

        # Coordinates after the flip; shift x by (width - 1) back into the image
        x2 = int(p2_matrix[0, 0]) + size[0] - 1
        y2 = int(p2_matrix[1, 0])

        # Draw only if the destination is inside the image
        if 0 <= x2 < size[0] and 0 <= y2 < size[1]:
            # Copy the original RGB to the flipped position (alpha 255 = opaque)
            im2.putpixel((x2, y2), (r, g, b, 255))

# Show the result
im2.show()

The image is flipped: a so-called left-right reversal. Note that translating by the full width instead of the width minus one would shift the result by one pixel and leave the first column empty.
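
To see why the amount of translation matters at the pixel level, the sketch below applies the mapping x2 = -x + shift to a single row of made-up values. A shift equal to the width sends x = 0 to x2 = width, which is outside the image, while a shift of width - 1 keeps every column in range. The helper flip_row is hypothetical.

```python
def flip_row(row, shift):
    """Mirror one row of pixels: x2 = -x + shift, dropping out-of-range x2."""
    width = len(row)
    result = [None] * width  # None marks a pixel that was never written
    for x, value in enumerate(row):
        x2 = -x + shift
        if 0 <= x2 < width:
            result[x2] = value
    return result


row = [10, 20, 30, 40]

# Shift by the width: x = 0 lands outside, so column 0 stays empty
shift_by_width = flip_row(row, len(row))

# Shift by width - 1: every pixel lands inside the row
shift_by_width_minus_1 = flip_row(row, len(row) - 1)

print(shift_by_width)
print(shift_by_width_minus_1)
```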

Neighborhood processing

Like matrix operations, neighborhood processing is essential in image processing. It is used for blurring and contour (edge) extraction, among other things.

Blur

Here we will try a relatively simple blur. There are various algorithms, but we will use the simplest one: take the average over the 8-neighborhood and set it as the pixel value.

The 8-neighborhood is the eight pixels that surround the reference pixel.
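
As a small illustration, the neighborhood can be enumerated as the offsets -1, 0, and +1 in each direction, minus the reference pixel itself. The helper neighbors8 below is hypothetical and returns only the neighbors that fall inside an image of the given width and height.

```python
def neighbors8(x, y, width, height):
    """Return the in-bounds 8-neighborhood coordinates of (x, y)."""
    coords = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # skip the reference pixel itself
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                coords.append((nx, ny))
    return coords


# An interior pixel has all eight neighbors; a corner pixel has only three
print(len(neighbors8(1, 1, 3, 3)))
print(neighbors8(0, 0, 3, 3))
```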

Get the r, g, b values of each coordinate and average them.

One caveat: at the edges of the image, x-1 may not exist or x+1 may fall outside the image, so those cases need special handling. Now let's write the code.

# coding: utf-8
from PIL import Image

# Load the image
im = Image.open("./achan.jpg")

# Convert to RGB
rgb_im = im.convert('RGB')

# Get the image size as (width, height)
size = rgb_im.size

# Create a new empty image of the same size
im2 = Image.new('RGBA', size)

# Loop over every pixel: x across the width, y down the height
for x in range(size[0]):
    for y in range(size[1]):

        # Get the pixel at the target coordinates
        r0, g0, b0 = rgb_im.getpixel((x, y))

        # Initialize every neighbor to the target pixel's value,
        # so neighbors outside the image fall back to it
        r1 = r2 = r3 = r4 = r5 = r6 = r7 = r8 = r0
        g1 = g2 = g3 = g4 = g5 = g6 = g7 = g8 = g0
        b1 = b2 = b3 = b4 = b5 = b6 = b7 = b8 = b0

        # Get the neighbors that lie inside the image

        # 1
        if x-1 >= 0 and y+1 < size[1]:
            r1, g1, b1 = rgb_im.getpixel((x-1, y+1))

        # 2
        if y+1 < size[1]:
            r2, g2, b2 = rgb_im.getpixel((x, y+1))

        # 3
        if x+1 < size[0] and y+1 < size[1]:
            r3, g3, b3 = rgb_im.getpixel((x+1, y+1))

        # 4
        if x-1 >= 0:
            r4, g4, b4 = rgb_im.getpixel((x-1, y))

        # 5
        if x+1 < size[0]:
            r5, g5, b5 = rgb_im.getpixel((x+1, y))

        # 6
        if x-1 >= 0 and y-1 >= 0:
            r6, g6, b6 = rgb_im.getpixel((x-1, y-1))

        # 7
        if y-1 >= 0:
            r7, g7, b7 = rgb_im.getpixel((x, y-1))

        # 8
        if x+1 < size[0] and y-1 >= 0:
            r8, g8, b8 = rgb_im.getpixel((x+1, y-1))

        # Average RGB over the target pixel and its neighbors (// keeps ints)
        r = (r0 + r1 + r2 + r3 + r4 + r5 + r6 + r7 + r8) // 9
        g = (g0 + g1 + g2 + g3 + g4 + g5 + g6 + g7 + g8) // 9
        b = (b0 + b1 + b2 + b3 + b4 + b5 + b6 + b7 + b8) // 9

        # Draw (alpha 255 = fully opaque)
        im2.putpixel((x, y), (r, g, b, 255))

# Show the result
im2.show()

It is a little hard to see, but the image is blurred. Different kinds of blur are possible by enlarging the neighborhood or changing the algorithm.
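
As one way of enlarging the neighborhood, the sketch below generalizes the average to a (2 * radius + 1) square window. It works on a nested list of single-channel values so that it runs without Pillow. Unlike the code above, it skips neighbors outside the image and divides by the number of pixels actually used, which is a different edge-handling choice. The function name box_blur and the sample grid are made up for this illustration.

```python
def box_blur(pixels, radius=1):
    """Blur a 2-D grid of ints by averaging each (2*radius+1)^2 window.

    Out-of-bounds neighbors are skipped, so edge pixels average fewer values.
    """
    height = len(pixels)
    width = len(pixels[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            count = 0
            # Clamp the window to the image bounds
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += pixels[ny][nx]
                    count += 1
            result[y][x] = total // count
    return result


# A single bright pixel in the middle of a dark 3x3 grid
grid = [
    [0, 0, 0],
    [0, 90, 0],
    [0, 0, 0],
]
blurred = box_blur(grid)
for row in blurred:
    print(row)
```

A larger radius spreads the bright pixel over a wider area, giving a stronger blur.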

There is still a long way to go before this is usable in practice, with things such as pixel correction left to do, but that covers the basics for now.