Learning neural networks using the genetic algorithm (GA)

Introduction

f(x,y)=\frac{1}{80}\left(\frac{\sin^2 x}{\cos y}+x^2-5y+30\right)

This function is learned by a neural network whose weights are trained with a genetic algorithm. (I couldn't get this working in an undergraduate assignment, so this is my revenge.)

What are Genetic Algorithms (GA)?

GA is a search method developed by Holland that looks for optimal solutions by mimicking the evolutionary mechanism of living organisms. It focuses on the evolutionary process of sexually reproducing organisms: individuals that adapt well to their environment pass their genes on to the next generation, crossover between two individuals produces offspring, and mutations occasionally occur. GA is effective for problems where no good exact algorithm is known and the solution space is too vast for exhaustive search, and since it can be applied to a wide range of optimization problems, further development is expected.

GA algorithm

The steps of the GA are shown below; a minimal code sketch of this loop follows the list.

  1. Generate the initial population: randomly generate the initial individuals.
  2. Calculate the fitness (evaluation value): determine the fitness of each individual.
  3. Reproduction (selection): individuals reproduce according to their fitness; individuals with high fitness multiply, and individuals with low fitness are eliminated.
  4. Crossover: generate new individuals from pairs randomly selected from the population chosen in step 3, and repeat this a predetermined number of times.
  5. Mutation: randomly rewrite part of each individual's genes according to the mutation probability.
  6. Check the termination condition: if it is satisfied, stop; otherwise return to step 2.
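As a concrete picture of this loop, here is a minimal sketch on a toy problem (maximizing the number of 1s in a bit string). It only illustrates the six steps above; the names POP, GENES, GENERATIONS and P_MUT are made up for this sketch and are not the parameters used in the NN experiment later.

import random

#Toy GA: maximize the number of 1s in a bit string
POP, GENES, GENERATIONS, P_MUT = 20, 10, 50, 0.05

def fitness(ind):                          #Step 2: evaluation value of one individual
    return sum(ind)

#Step 1: random initial population
pop = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]

for gen in range(GENERATIONS):             #Step 6: stop after a fixed number of generations
    #Step 3: reproduction - keep the fitter half of the population
    pop.sort(key=fitness, reverse=True)
    parents = pop[:POP // 2]

    #Step 4: crossover - build the next generation from random parent pairs
    nxt = []
    while len(nxt) < POP:
        p1, p2 = random.sample(parents, 2)
        cut = random.randint(1, GENES - 1)
        nxt.append(p1[:cut] + p2[cut:])

    #Step 5: mutation - flip each bit with probability P_MUT
    for ind in nxt:
        for i in range(GENES):
            if random.random() < P_MUT:
                ind[i] = 1 - ind[i]
    pop = nxt

print("best fitness:", max(fitness(ind) for ind in pop))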

Properties of GA algorithm

In GA, reproduction concentrates the search on individuals with high evaluation values, while crossover and mutation explore a wide range of the solution space, so good solutions can be expected if these genetic operations work effectively. However, the genetic operations can also destroy information held by the best individuals, so the search may fall into a local optimum, and it is quite common that the global optimum is never found. In particular, the solution space of control parameters like these contains many local optima, so the crossover operation is arguably not well suited to it. Moreover, GA is a search method that basically assumes the decision variables (genes) take discrete values, whereas the information needed to optimize control parameters is continuous, so it may not be the best fit for this kind of problem.
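For reference, crossover operators designed specifically for real-valued genes do exist. The sketch below shows blend (BLX-alpha) crossover, which samples each child gene from an interval around the two parent values; the function blx_alpha and its arguments are illustrative names and are not used in the implementation below.

import random

#Blend (BLX-alpha) crossover for real-valued genes: each child gene is drawn
#uniformly from an interval around the corresponding parent genes, widened by alpha.
def blx_alpha(parent1, parent2, alpha=0.5):
    child = []
    for a, b in zip(parent1, parent2):
        lo, hi = min(a, b), max(a, b)
        d = hi - lo
        child.append(random.uniform(lo - alpha*d, hi + alpha*d))
    return child

print(blx_alpha([0.1, -0.5, 2.0], [0.3, 0.5, 1.0]))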

Implementation

fig1.png

The genes consist of the weights and thresholds (biases) of the neural network, so each individual corresponds to one neural network, and the population is updated with the GA described above. The code is shown below.
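One way to picture the "gene" is as all of the network's weights and biases flattened into a single vector. The encode/decode sketch below illustrates this for a 2-2-1 network; note that the implementation below keeps the parameters as separate arrays rather than flattening them, so this is only a mental model.

import numpy as np

#Illustrative mapping between NN parameters and a flat gene vector (2-2-1 network assumed)
def encode(u, v, bias_h, bias_o):
    return np.concatenate([u.ravel(), v.ravel(), bias_h.ravel(), bias_o.ravel()])

def decode(gene, n_in=2, n_hidden=2, n_out=1):
    i = 0
    u = gene[i:i + n_in*n_hidden].reshape(n_in, n_hidden); i += n_in*n_hidden
    v = gene[i:i + n_hidden*n_out].reshape(n_hidden, n_out); i += n_hidden*n_out
    bias_h = gene[i:i + n_hidden]; i += n_hidden
    bias_o = gene[i:i + n_out]
    return u, v, bias_h, bias_o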

Set parameters etc.

import math
import random
import numpy as np
import matplotlib.pyplot as plt

#Number of generations
GEN = 100

#Number of nodes in each NN layer
In = 2
Hidden = 2
Out = 1

#NN population
Number = 1000

#Number of teacher signals
Num = 1000

#Crossover probability
kousa = 0.8

#Mutation probability
change = 0.05

#Function to learn
def kansu(x):
    return((math.sin(x[0])*math.sin(x[0])/math.cos(x[1]))+x[0]*x[0]-5*x[1]+30)/80

#Sigmoid function
def sigmoid(x):
    return 1/(1+np.exp(-x))

#plot
def plot(data,name):
    fig = plt.figure()
    plt.plot(data)
    fig.show()
    fig.savefig(str(name)+'.pdf')

Teacher signal creation


class Kyoshi:
    def __init__(self):
        self.input = np.random.rand(Num, In)*10-5
        self.output = np.zeros(Num)

    def make_teacher(self):
        for count in range(Num):
            self.output[count]=kansu(self.input[count])

Create a neural network (NN)

class NN:
    def __init__(self):
        self.u = np.random.rand(In, Hidden)*2-1 #Input layer-Hidden layer weight
        self.v = np.random.rand(Hidden, Out)*2-1 #Hidden layer-Output layer weight
        self.bias_h = np.random.rand(Hidden)*2-1 #Hidden layer bias
        self.bias_o = np.random.rand(Out)*2-1 #Output layer bias
        self.Output = 0 #output
        ##For GA
        self.gosa = 0 #Error with teacher data
        self.F = 0 #Goodness of fit

    #Calculate the output given the input
    def calOutput(self, x): #x is the input
        self.Output = 0 #Reset the output from the previous call
        hidden_node = np.zeros(Hidden)
        for j in range(Hidden):
            for i in range(In):
                hidden_node[j] += self.u[i][j]*x[i]
            hidden_node[j] -= self.bias_h[j] #Subtract the bias once per hidden node
            self.Output += sigmoid(self.v[j]*hidden_node[j])
        self.Output -= self.bias_o
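As a quick sanity check (assuming the definitions above have already been executed), a single untrained individual can be compared against the target function on one random input; this snippet is only illustrative and is not part of the training code.

#Compare one untrained NN against the target function
nn = NN()
x = np.random.rand(In)*10-5   #random input in the same range as the teacher data
nn.calOutput(x)
print("NN output:", nn.Output, " target:", kansu(x))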

Optimize NN with GA

class NNGA:
    def __init__(self):
        self.nn = [NN() for _ in range(Number)] #One NN per individual (separate objects)
        self.aveE = 0 #Overall error average

    #Error and goodness of fit calculation
    def error(self, x,y): #x is teacher input, y is teacher output

        self.aveE = 0
        for count in range(Number):
            self.nn[count].gosa = 0
            #Input and output to each NN
            self.nn[count].calOutput(x[count])
            #Calculate error
            self.nn[count].gosa = abs(self.nn[count].Output - y[count])
            #################################
            # for i in range(Num):
            #     #Input and output to each NN
            #     self.nn[count].calOutput(x[i])
            #     #Calculate error
            #     self.nn[count].gosa = abs(self.nn[count].Output - y[i])/Num
            #################################
            self.aveE += self.nn[count].gosa/Num

        #Goodness of fit calculation
        for count in range(Number):
            self.nn[count].F= 1/ self.nn[count].gosa

    #Genetic algorithm (GA)
    def GA(self):
        #Repeat Number/2 times (each iteration produces two children)
        for _ in range(int(Number/2)):

            F_sum = 0 #Total goodness of fit over all individuals
            for count in range(Number):
                F_sum += self.nn[count].F

            #Selection
            p = [0,0] #Indices of the selected parents

            #Roulette selection
            for i in range(2):
                threshold = random.random()*F_sum #Draw the threshold once per parent
                F_temp = 0
                selected = Number-1
                for count in range(Number):
                    F_temp += self.nn[count].F
                    if F_temp > threshold:
                        selected = count
                        break
                p[i] = selected

            #Create child candidates (two separate NN objects)
            child = [NN(), NN()]

            #Uniform crossover (parameters are copied so that later mutation does not modify the parents)
            if random.random() < kousa:
                if random.random() < 0.5:
                    child[0].u = self.nn[p[0]].u.copy()
                    child[1].u = self.nn[p[1]].u.copy()
                else:
                    child[0].u = self.nn[p[1]].u.copy()
                    child[1].u = self.nn[p[0]].u.copy()

                if random.random() < 0.5:
                    child[0].v = self.nn[p[0]].v.copy()
                    child[1].v = self.nn[p[1]].v.copy()
                else:
                    child[0].v = self.nn[p[1]].v.copy()
                    child[1].v = self.nn[p[0]].v.copy()

                if random.random() < 0.5:
                    child[0].bias_h = self.nn[p[0]].bias_h.copy()
                    child[1].bias_h = self.nn[p[1]].bias_h.copy()
                else:
                    child[0].bias_h = self.nn[p[1]].bias_h.copy()
                    child[1].bias_h = self.nn[p[0]].bias_h.copy()

                if random.random() < 0.5:
                    child[0].bias_o = self.nn[p[0]].bias_o.copy()
                    child[1].bias_o = self.nn[p[1]].bias_o.copy()
                else:
                    child[0].bias_o = self.nn[p[1]].bias_o.copy()
                    child[1].bias_o = self.nn[p[0]].bias_o.copy()
            else:
                #No crossover: the children are copies of the parents
                for k in range(2):
                    child[k].u = self.nn[p[k]].u.copy()
                    child[k].v = self.nn[p[k]].v.copy()
                    child[k].bias_h = self.nn[p[k]].bias_h.copy()
                    child[k].bias_o = self.nn[p[k]].bias_o.copy()

            #Inherit the average goodness of fit of parents
            child[0].F = (self.nn[p[0]].F+self.nn[p[1]].F)/2
            child[1].F = (self.nn[p[0]].F+self.nn[p[1]].F)/2

            #mutation
            for count in range(2):
                for j in range(Hidden):
                    for i in range(In):
                        if random.random() < change:
                            child[count].u[i][j] = random.random()*2-1
                    
                    if random.random() < change:
                        child[count].bias_h[j] = random.random()*2-1

                    if random.random() < change:
                        child[count].v[j] = random.random()*2-1

                if random.random() < change:
                    child[count].bias_o = np.random.rand(Out)*2-1 #Keep bias_o as an array
     
            #Add the children to the population
            #Replace the individual with the lowest goodness of fit
            rm1 = int(np.argmin([indiv.F for indiv in self.nn]))
            self.nn[rm1] = child[0]

            #Replace the individual with the second lowest goodness of fit
            rm2 = 0
            min_F = float('inf')
            for count in range(Number):
                if count == rm1:
                    continue
                if min_F > self.nn[count].F:
                    min_F = self.nn[count].F
                    rm2 = count
            self.nn[rm2] = child[1]

main function

def main():
    #Counting the number of generations
    generation=0
    #Generate the initial population
    nnga = NNGA()
   
    #Create the teacher signal (inputs and outputs)
    teacher = Kyoshi()
    teacher.make_teacher()

    #test data
    testTeacher = Kyoshi()
    testTeacher.make_teacher()
    
    #Goodness of fit calculation
    nnga.error(teacher.input, teacher.output)
 
    #Records for plotting
    kiroku = []
    eliteKiroku = []
    minEKiroku = []

    #Start learning
    while(True):
        generation += 1
        #GA optimization
        nnga.GA()
        #Calculate goodness of fit
        nnga.error(teacher.input,teacher.output)      
        #Find the least error elite
        min_E = 100000
        elite = 0
        for count in range(Number):
            if min_E > nnga.nn[count].gosa:
                min_E = nnga.nn[count].gosa
                elite = count
        #Check the elite with test data
        sumE = 0
        for i in range(Num):
            nnga.nn[elite].calOutput(testTeacher.input[i])
            sumE += abs(nnga.nn[elite].Output - testTeacher.output[i])/Num

        #Shuffle teacher data
        # np.random.shuffle(teacher.input)
        # teacher.make_teacher()

        #Record
        kiroku.append(nnga.aveE)
        eliteKiroku.append(sumE)
        minEKiroku.append(min_E)

        print("generation:",generation,"average",nnga.aveE, "elite",min_E,"test", sumE)

        # if min_E < 0.06:
        #     break
        if generation==GEN:
            break

    # plot
    plot(kiroku,"Mean error")
    plot(minEKiroku,"Elite individual error")
    plot(eliteKiroku,"Elite individual error(test data)")


if __name__ == '__main__':
    main()

Experimental results

The graph below shows the average error of each generation's elite individual on the test data. The horizontal axis is the number of generations and the vertical axis is the error.

(Figure: elite test error vs. number of generations — スクリーンショット 2021-01-09 22.00.24.png)

The error dropped sharply over the first 5 generations, but after that there was almost no change.

I tried various parameter settings and design changes, but the performance did not improve. Hmm. I would like to update this article as my knowledge grows. (If you have any advice, please don't hesitate to let me know.)
