With the Python library DEAP, you can easily run simulations with a genetic algorithm (GA). Sometimes you want to fix the initial population, for example because you have prior knowledge about the optimal solution or want to shorten the computation time.
For example, when applying GA to feature selection in machine learning, as in this example, you may want to start from the best model you have tried so far. With 100 to a few hundred features there is no real need to set the initial values, but the need does arise for people who have mechanically generated about 100,000 features without thinking ahead.
The official documentation has a guide for handling such cases. There isn't much information on the web from people who have actually tested it, so I'll write it up here.
First, a regular GA implementation whose evaluation function is the sum of an array of 0s and 1s, to be maximized (OneMax). The optimal solution is an array of all 1s, and the maximum value of the evaluation function equals the length of the array.
http://deap.gel.ulaval.ca/doc/default/examples/ga_onemax.html
http://deap.gel.ulaval.ca/doc/default/examples/ga_onemax_short.html
#Library import
import numpy as np
import random
from deap import algorithms
from deap import base
from deap import creator
from deap import tools
The evaluation function is the sum of a 100-element array of 0s and 1s.
def evalOneMax(individual):
    return sum(individual),  # trailing comma: DEAP expects a tuple even for a single value
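As a quick sanity check, the evaluation function can be called on a few hand-made lists. This is only a sketch: the function is redefined so the snippet runs by itself, and the sample lists are made up for illustration.

```python
def evalOneMax(individual):
    # DEAP expects fitness values as a tuple, hence the trailing comma
    return sum(individual),

all_ones = [1] * 100    # the optimal solution
all_zeros = [0] * 100   # the worst solution
half = [1, 0] * 50      # alternating bits

print(evalOneMax(all_ones))   # (100,)
print(evalOneMax(all_zeros))  # (0,)
print(evalOneMax(half))       # (50,)
```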
Use creator.create to add the FitnessMax and Individual classes to creator. Setting weights=(1.0,) tells DEAP to maximize the evaluation function.
creator.create("FitnessMax", base.Fitness, weights=(1.0,))  # weights must be a tuple, so keep the comma even for a single objective
creator.create("Individual", list, fitness=creator.FitnessMax)
Use register to add methods to the toolbox; the first argument becomes the method name.
toolbox.attr_bool
: random.randint(0, 1), which returns a random 0 or 1
toolbox.individual
: calls toolbox.attr_bool 100 times via tools.initRepeat to build a 100-element list (= one individual)
toolbox.population
: calls toolbox.individual repeatedly to build a population
toolbox = base.Toolbox()
# Attribute generator: a random 0 or 1 (random.randint requires integer bounds on Python 3.12+)
toolbox.register("attr_bool", random.randint, 0, 1)
# Structure initializers
toolbox.register("individual", tools.initRepeat, creator.Individual,
                 toolbox.attr_bool, 100)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
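Conceptually, toolbox.register binds a function to some default arguments under a new name, much like functools.partial. The plain-Python sketch below mimics the three registrations above without DEAP; init_repeat is a hypothetical stand-in I wrote for tools.initRepeat, not the library function itself.

```python
import random
from functools import partial

random.seed(0)

# Rough equivalent of toolbox.register("attr_bool", random.randint, 0, 1)
attr_bool = partial(random.randint, 0, 1)

def init_repeat(container, func, n):
    # Hypothetical stand-in for tools.initRepeat:
    # call func n times and collect the results in container
    return container(func() for _ in range(n))

# Rough equivalents of toolbox.individual and toolbox.population
individual = partial(init_repeat, list, attr_bool, 100)
population = partial(init_repeat, list, individual)

pop = population(n=5)
print(len(pop), len(pop[0]))  # 5 100
```

This is why the population size can be left open at registration time and passed later as n.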
Next, define the methods that algorithms.eaMuPlusLambda needs to run the GA: toolbox.evaluate (evaluation), toolbox.mate (crossover), toolbox.mutate (mutation), and toolbox.select (selection). Using deap.tools, they can be written as follows.
toolbox.register("evaluate", evalOneMax)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutFlipBit, indpb=0.05)
toolbox.register("select", tools.selNSGA2)
pop: the initial population
hof: keeps the best individual
stats: records statistics of the population's evaluation values
"""Genetic algorithm settings
NGEN: number of generations to run
LAMBDA: number of offspring generated in each generation
MU: number of individuals passed on to the next generation
CXPB: crossover probability
MUTPB: mutation probability
"""
random.seed(4)
NGEN = 10
MU = 50
LAMBDA = 100
CXPB = 0.7
MUTPB = 0.3
#Population creation
pop = toolbox.population(n=MU)
hof = tools.HallOfFame(1)
stats = tools.Statistics(lambda ind: ind.fitness.values)
stats.register("avg", np.mean, axis=0)
stats.register("std", np.std, axis=0)
stats.register("min", np.min, axis=0)
stats.register("max", np.max, axis=0)
pop, log = algorithms.eaMuPlusLambda(pop,
                                     toolbox,
                                     mu=MU,
                                     lambda_=LAMBDA,
                                     cxpb=CXPB,
                                     mutpb=MUTPB,
                                     ngen=NGEN,
                                     stats=stats,
                                     halloffame=hof)
gen  nevals  avg      std           min    max
0    50      [50.32]  [4.71355492]  [41.]  [63.]
1    100     [56.]    [2.02977831]  [53.]  [63.]
2    100     [59.3]   [2.3]         [57.]  [68.]
3    100     [62.56]  [2.03135423]  [60.]  [69.]
4    100     [65.58]  [1.96051014]  [63.]  [72.]
5    100     [68.46]  [1.52590956]  [66.]  [73.]
6    100     [70.4]   [1.34164079]  [69.]  [74.]
7    100     [72.36]  [1.10923397]  [71.]  [76.]
8    100     [74.06]  [1.06602064]  [73.]  [77.]
9    100     [75.38]  [0.79724526]  [74.]  [77.]
10   100     [76.36]  [0.62481997]  [76.]  [78.]
Because the run started from random values, the evaluation values of the initial individuals range from 41 to 63. The population has not converged by the 10th generation, but that is fine since convergence is not the goal here.
To start from a fixed, specific initial population, adapt the relevant part of the official documentation as follows.
Create your own initial population as a list of individuals. For example, to start the genetic algorithm from a population L00 made of clones of an all-zero individual, do the following.
MU = 50
# individual
L0 = [0] * 100
# list of individuals
L00 = [L0] * MU
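One thing to be aware of: [L0] * MU repeats a reference to the same list object rather than making 50 separate lists. That is harmless here as long as each entry is copied when it is turned into an individual, which is what building a list-based individual from it does. The plain-Python sketch below uses list(c) as a stand-in for that copy; the name copies is my own.

```python
MU = 50
L0 = [0] * 100
L00 = [L0] * MU

# All 50 entries refer to the same list object
print(L00[0] is L00[1])  # True

# Copying each entry gives independent lists
# (list(c) stands in for constructing a list-based individual from c)
copies = [list(c) for c in L00]
copies[0][0] = 1
print(copies[1][0], L0[0])  # 0 0
```

If you ever mutate L00 directly before wrapping it, build it with a list comprehension instead so the entries are independent.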
The creator step (creating FitnessMax and Individual) is the same as in the random-start version.
Define a method toolbox.population_guess to replace toolbox.population.
toolbox = base.Toolbox()
# Function used by population_guess: wraps each entry of `file` (here a list of lists) as an individual
def initPopulation(pcls, ind_init, file):
    return pcls(ind_init(c) for c in file)
# Register the population_guess method
# creator.Individual attaches a Fitness to each individual
toolbox.register("population_guess", initPopulation, list, creator.Individual, L00)
# The rest is the same as before
toolbox.register("evaluate", evalOneMax)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutFlipBit, indpb=0.05)
toolbox.register("select", tools.selNSGA2)
hof and stats are defined the same way as before, so they are omitted here.
# Create the initial population with population_guess instead
# pop = toolbox.population(n=MU)
pop = toolbox.population_guess()
pop, log = algorithms.eaMuPlusLambda(pop,
                                     toolbox,
                                     mu=MU,
                                     lambda_=LAMBDA,
                                     cxpb=CXPB,
                                     mutpb=MUTPB,
                                     ngen=NGEN,
                                     stats=stats,
                                     halloffame=hof)
gen  nevals  avg      std           min    max
0    50      [0.]     [0.]          [0.]   [0.]
1    100     [3.54]   [3.04111822]  [0.]   [9.]
2    100     [8.3]    [2.0808652]   [6.]   [15.]
3    100     [12.4]   [2.45764115]  [10.]  [21.]
4    100     [16.38]  [1.92758917]  [14.]  [22.]
5    100     [20.36]  [2.10485154]  [18.]  [29.]
6    100     [24.52]  [1.5651198]   [22.]  [29.]
7    100     [28.02]  [1.74917123]  [26.]  [33.]
8    100     [31.3]   [2.21133444]  [29.]  [39.]
9    100     [35.]    [1.69705627]  [33.]  [40.]
10   100     [38.16]  [1.71300905]  [36.]  [43.]
The generation-0 population has a minimum evaluation value of 0 and a maximum of 0, so the GA did start from my hand-made initial population.