I tried to make a lot of hits by googled with "Traveling salesman problem genetic algorithm".
-Theory -Python code -Execution result
The ** Traveling Salesman Problem ** is the problem of finding the shortest route back to the original point, going through all $ N $ points once. The total number of routes that pass through all $ N $ points and return is $ (N-1)! / 2 $. If there are 3 points, there is one way. If there are 4 points, there are 3 ways. If there are 5 points, there are 12 ways. As the number of points increases, the total number of routes will increase explosively. It takes an enormous amount of time to find the shortest route among them.
However, if conditional, you can find the route in a practical amount of time.
For example, if you know that a point is at the apex of a convex polygon, the shortest path is that convex polygon. In addition, there are various search methods such as genetic algorithms, annealing methods, and self-organizing maps, as long as they are ** almost ** shortest.
** Genetic algorithm ** is one of the search algorithms for combinatorial optimization problems, and has the motif of the evolution of living things. Since the traveling salesman problem is also one of the combination optimization problems, a genetic algorithm can be used. The procedure for searching for a genetic algorithm is as follows.
The genetic algorithm is likened to a living thing and is explained using the following words.
-** Individual ** ... Each combination. -** Chromosome ** ... A combination is expressed in an appropriate way. -** Gene ** ... It is a basic component of the chromosome. -** Goodness of fit ** ... A numerical value that measures the goodness of each individual to the problem. -** Generation ** ... A set of individuals. -** Crossing ** ... This is the procedure to generate a new individual's chromosome from the existing individual's chromosome. -** Mutation ** ... 1 Changes the chromosome of an individual to generate a new individual. -** Crossing ** ... Combining the chromosomes of two individuals to create a new individual. -** Selection ** ... Select individuals to be used for crossing from generations according to their fitness. -** Selection ** ... From the set of crossed individuals and the parent generation, discarding the individuals with low fitness and creating a certain number of individuals for the next generation.
Commentary by Shizuoka Institute of Science and Technology Suganuma Laboratory Getting started with the Genetic Algorithm! --SlideShare
Arrange the genetic algorithm for the traveling salesman problem.
Let $ p_0, p_1, \ dots, p_N $ be the $ N + 1 $ points that the salesman passes through. The path starting from $ p_0 $, passing through all the points in the numerical order of the points and returning to $ p_0 $ can be expressed as $ p_0, p_1, \ dots, p_N, p_0 $. $ p_1, p_2, \ dots, p_N, p_0, p_1 $ are routes that start the same circle from $ p_1 $. Since the length of the looped route is the same regardless of which point is selected as the starting point, the starting point is fixed to $ p_0 $ and the route is expressed as $ p_ {i_1}, \ dots, p_ {i_N} $. Also, even if the whole is translated, the length of the path is the same, so set $ p_0 = O $ and move the other points in parallel. The path created in this way is an individual.
For fitness, use the length of the path represented by the individual.
When the distance between two points $ p, p'$ is $ d (p, p') $, the length of the route $ p_ {i_1}, \ dots p_ {i_N} $ is $ d (p_0, p_ {i_1}) + d (p_0, p_ {i_N} + \ sum_ {k = 1} ^ {N-1} d (p_ {i_k}, p_ {i_ {k + 1}}) $. The smaller the length, the better the route. The larger the number, the better the fitness, so use the reciprocal length or sign inversion. This time I used sign inversion.
If the smaller the number, the better the fitness, use the length as it is.
At the time of crossing, instead of generating child generations from all individuals of the parent generation, ** selection ** is made so that individuals with higher fitness are more likely to have offspring.
Also, during selection, selection is made so that individuals with higher fitness are more likely to remain in the offspring.
There are three main choices. All have advantages and disadvantages, so use them in combination.
Select in descending order of fitness. Since excellent individuals remain, it converges quickly. On the other hand, the influence of excellent individuals in the early stages is strong, and with each generation there is a possibility that diversity will decrease and we will fall into a dead end of evolution (= local solution).
Select based on the probability calculated from fitness. Genes with low fitness remain (although less likely), so diversity does not decrease. On the other hand, genes with high fitness may be discarded and no solution may be reached.
There are various ways to calculate the probability.
--Roulette probability ... $ (individual fitness) \ div (sum of generation fitness) $. This is effective when the fitness is positive and the variance is large. --Ranking probability ... Associates the fitness rank and probability with a ratio of 1: 1. It can be used even if the fitness is negative. The difference in fitness no longer corresponds to the difference in probability. --Function conversion ... Converts fitness to probability with a sigmoid function. Individuals with a higher fitness than the roulette probability are more likely to be selected.
Randomly extract $ n $ individuals from the population and leave the one with the highest fitness. Repeat this until the required number of individuals is collected. Genes with low fitness remain stochastically, so diversity does not decrease much. On the other hand, genes with high fitness may be discarded and no solution may be reached.
I will introduce two ways of expressing (coding) the route.
This is an expression method in which the dot numbers are arranged as they are. The genes are the dot numbers $ 1, \ dots, N $. Chromosomes are columns $ g_1, \ dots, g_N $ in which genes are arranged without duplication. If $ g_i = k $, it means passing through $ p_k $ in the $ i $ th. Below is an example of a route through 5 points and its permutation coding.
\{ p_4, p_1, p_3, p_5, p_2 \} \rightarrow 4, 1, 3, 5, 2
This representation always keeps $ g_i \ neq g_j (i \ neq j) $.
Ordinal coding is an arbitrary name. I didn't know the official name.
This is an expression method that uses the ordinal number of the point sequence excluding the passed points. The $ i $ th gene $ g_i $ takes up to $ N-i + 1 $, the number of remaining points. When there is a sequence of points $ P_1 = p_1, p_2, \ dots, p_N $, the procedure for determining $ g_i $ is as follows.
Below is an example of a path through 5 points and its ordinal coding.
\{ p_4, p_1, p_3, p_5, p_2 \} \rightarrow 4, 1, 2, 2, 1 \\
\begin{array}{lll}
p_1, p_2, p_3, p_4, p_5 & -, -, -, -, - &Remove points while recording the position in the sequence\\
p_1, p_2, p_3, p_5 & 4, -, -, -, - & p_4 is the 4th\\
p_2, p_3, p_5 & 4, 1, -, -, - & p_1 is the first\\
p_2, p_5 & 4, 1, 2, -, - & p_3 is the second\\
p_2 & 4, 1, 2, 2, - & p_5 is the second\\
& 4, 1, 2, 2, 1 & p_2 is the first
\end{array}
With this representation, $ g_i \ le N-i + 1 $ is always kept. Instead, the same number may appear in the chromosome ($ g_i = g_j (i \ neq j) $). The correspondence between the gene $ g_i $ and the dots is not intuitive and can only be found by examining $ g_1, \ dots, g_ {i-1} $.
In crossing of organisms, it is common to exchange paired chromosomes or exchange corresponding sections between chromosomes. In addition, there are various mutations because there are no restrictions on the position of genes and the length of chromosomes is flexible.
Since the salesman's chromosome has coding-dependent restrictions, we define crossovers and mutations within that range.
――Since the restriction of permutation coding is that each gene is once, it is necessary to operate so that there is no duplication in the crossing of chromosomes (fortunately, smart people have devised crossings for genes in order. ). On the other hand, exchanging genes in a chromosome is easy to describe because it does not break the constraints. ――In ordinal coding, restrictions are determined by the position of genes, so it is easier to exchange genes at the same position between chromosomes or simply change genes, rather than exchanging gene positions. Can be described.
From the two chromosomes, find the group with the same set of points and the same set of positions, and exchange the groups.
\begin{array}{cc}
Parent 1& 1\acute{2}345678 \\
Parent 2& 6\grave{7}852341
\end{array}
\rightarrow
\begin{array}{c}
1\acute{2}3456\acute{7}8 \\
6\grave{7}8523\grave{4}1
\end{array}
\rightarrow \dots \rightarrow
\begin{array}{c}
1\acute{2}3\acute{4}\acute{5}6\acute{7}8 \\
6\grave{7}8\grave{5}\grave{2}3\grave{4}1
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& 1\hat{7}3\hat{5}\hat{2}6\hat{4}8 \\
Child 2& 6\hat{2}8\hat{4}\hat{5}3\hat{7}1
\end{array}
Choose a random position $ r $ and look at the genes $ {g_1} _r, {g_2} _r $ at that position in parent 1 and parent 2. The genes $ {g_1} _r and {g_2} _r $ are exchanged between parent 1 and parent 2, respectively. Repeat this several times to generate the chromosomes for child 1 and child 2.
\begin{array}{cc}
Parent 1& 12\acute{3}4567\grave{8} \\
Parent 2& 67\grave{8}52\acute{3}41
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& 12\acute{8}4567\acute{3} \\
Child 2& 67\grave{3}52\grave{8}41
\end{array}
Parent 1 inherits the random section of the chromosome as is to the child. Parent 2 fills in the blanks by taking over the missing genes in the offspring so that they are not out of order.
\begin{array}{cc}
Parent 1& \acute{1}\acute{2}\acute{3}|456|\acute{7}\acute{8} \\
Parent 2& 6\grave{7}\grave{8}5\grave{2}\grave{3}4\grave{1}
\end{array}
\rightarrow
\begin{array}{cc}
Child& \grave{7}\grave{8}\grave{2}|456|\grave{3}\grave{1}
\end{array}
Determine some random positions $ r_1, \ dots, r_k $ and look at the genes $ g_ {r_1}, \ dots, g_ {r_k} $ at those positions in parent 1. Sort those genes in parent 2 $ g_ {r_1}, \ dots, g_ {r_k} $ in the order in parent 1 to generate chromosomes in child 1. Swap parent 1 and parent 2 and do the same to generate child 2.
\begin{array}{cc}
Parent 1& 1\acute{2}3\acute{4}\acute{5}67\acute{8} \\
Parent 2& 67\grave{8}\grave{5}\grave{2}3\grave{4}1
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& 67\acute{2}\acute{4}\acute{5}3\acute{8}1
\end{array}
Determine some random positions $ r_1, \ dots, r_k $. Child 1 inherits the genes $ g_ {r_1}, \ dots, g_ {r_k} $ at those positions from parent 1 as is. The missing genes are inherited from parent 2 in a consistent manner. Swap parent 1 and parent 2 and do the same to generate child 2.
\begin{array}{cc}
Parent 1& 1\acute{2}3\acute{4}\acute{5}67\acute{8} \\
Parent 2& \grave{6}\grave{7}852\grave{3}4\grave{1}
\end{array}
\rightarrow
\begin{array}{cc}
Child& \grave{6}\acute{2}\grave{7}\acute{4}\acute{5}\grave{3}\grave{1}\acute{8}
\end{array}
If there is only one child, uniform order crossing and uniform position crossing are equivalent.
It searches two chromosomes for a section in which the same set of genes is lined up, and exchanges the sections to generate children 1 and 2.
\begin{array}{cc}
Parent 1& 1\acute{2}\acute{3}\acute{4}\acute{5}678 \\
Parent 2& 678\grave{5}\grave{3}\grave{2}\grave{4}1
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& 1\grave{5}\grave{3}\grave{2}\grave{4}678 \\
Child 2& 678\acute{2}\acute{3}\acute{4}\acute{5}1
\end{array}
--Partial Mapped Crossover (PMX) --Edge Recombination Crossover (ERX) --Edge Assembly Crossover (EAX)
Genes are exchanged for each interval of two randomly selected intervals.
\begin{array}{cc}
parent& 1|\acute{2}\acute{3}\acute{4}|56|\grave{7}\grave{8}|
\end{array}
\rightarrow
\begin{array}{cc}
Child& 1|\grave{7}\grave{8}|56|\acute{2}\acute{3}\acute{4}|
\end{array}
Moves a randomly selected section to another position.
\begin{array}{cc}
parent& 1|\acute{2}\acute{3}\acute{4}\acute{5}|678
\end{array}
\rightarrow
\begin{array}{cc}
Child& 167|\acute{2}\acute{3}\acute{4}\acute{5}|8
\end{array}
Randomly select two genes, $ g_1, g_2 $, and move $ g_1 $ before $ g_2 $. A special form of translocation.
\begin{array}{cc}
parent& 1\acute{2}3456\grave{7}8
\end{array}
\rightarrow
\begin{array}{cc}
Child& 1\grave{7}\acute{2}34568
\end{array}
Inverts the order of genes for randomly selected intervals.
\begin{array}{cc}
parent& 1|\acute{2}\acute{3}\acute{4}|5678
\end{array}
\rightarrow
\begin{array}{cc}
Child& 1|\acute{4}\acute{3}\acute{2}|5678
\end{array}
Randomly rearrange the order of genes for randomly selected intervals.
\begin{array}{cc}
parent& 1|\acute{2}\acute{3}\acute{4}\acute{5}|678
\end{array}
\rightarrow
\begin{array}{cc}
Child& 1|\acute{4}\acute{2}\acute{5}\acute{3}|678
\end{array}
Two chromosomes are cut and exchanged at one random location.
\begin{array}{cc}
Parent 1& g_1 g_2 g_3 | g_4 g_5 g_6 \\
Parent 2& h_1 h_2 h_3 | h_4 h_5 h_6
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& g_1 g_2 g_3 | h_4 h_5 h_6 \\
Child 2& h_1 h_2 h_3 | g_4 g_5 g_6
\end{array}
Swap the sections where two chromosomes are cut at two random points.
\begin{array}{cc}
Parent 1& g_1 g_2 | g_3 g_4 g_5 | g_6 \\
Parent 2& h_1 h_2 | h_3 h_4 h_5 | h_6
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& g_1 g_2 | h_3 h_4 h_5 | g_6 \\
Child 2& h_1 h_2 | g_3 g_4 g_5 | h_6
\end{array}
Swap the sections where two chromosomes are cut at several random points.
\begin{array}{cc}
Parent 1& g_1 | g_2 g_3 | g_4 g_5 | g_6 \\
Parent 2& h_1 | h_2 h_3 | h_4 h_5 | h_6
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& g_1 | h_2 h_3 | g_4 g_5 | h_6 \\
Child 2& h_1 | g_2 g_3 | h_4 h_5 | g_6
\end{array}
The genes of each of the two chromosomes are exchanged with an appropriate probability.
\begin{array}{cc}
Parent 1& g_1 g_2 g_3 g_4 g_5 g_6 \\
Parent 2& h_1 h_2 h_3 h_4 h_5 h_6
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& g_1 h_2 g_3 h_4 h_5 g_6 \\
Child 2& h_1 g_2 h_3 g_4 g_5 h_6
\end{array}
Replaces a gene at a random position with another gene.
\begin{array}{cc}
parent& g_1 g_2 g_3 g_4 g_5 g_6
\end{array}
\rightarrow
\begin{array}{cc}
Child& g_1 h_2 g_3 h_4 g_5 g_6
\end{array}
It produces entirely new individuals, not crossovers or mutations.
I was planning to publish the program by briefly introducing the genetic algorithm / traveling salesman problem, but since the introduction of the genetic algorithm has become a long sentence, I will write the program in a separate article.
Recommended Posts