# I made a program in Java that solves the traveling salesman problem with a genetic algorithm

I happened to be doing research to solve with a genetic algorithm, I made it in the sense of increasing the number of steps for the future.

# What is the traveling salesman problem?

It is a problem to think about the order in which the shortest passage can be made by writing a single stroke at a certain point in $n$. If the number of cities to write in one stroke is $n$, the number of possible combinations is $\ frac {(n-1)!} {2}$.

# What is a genetic algorithm?

It is one of the evolutionary computations that model the evolution of living things. The combination of solutions is regarded as a chromosome, and the solution is evolved by repeating genetic manipulations such as crossover and mutation to finally find the best solution.

# Class flow

Since we are programming in Java, we will look at each class in order. The class diagram is as below. Click here for the references used in the design [^ 1]

The class of this program is as follows.

• Point class: $x$, $y$ Stores each coordinate.
• Map class: A class that aggregates the data of the Point class and puts it together as a map.
• Tour class: A class that stores routes based on the Map class.
• TourWithGA class: A class derived from the Tour class that can solve genetic algorithms.
• tspSample class: This is the main class.
• tspIO class: A class that reads coordinates from a file.
• Sort class: A class for ranking individuals. (Not included in the class diagram above.)
• Chromosome class: A class of chromosomes. The genes are included in a sequence.
• Individuals class: A class of populations. A set of Chromosome classes and an elite chromosome of Chromosome type are stored.

Among them, three genetic algorithms are used: ** TourWithGA **, ** Chromosome **, and ** Individuals **.

Regarding genetic manipulation, the Chromosome class implements methods for calculating mutation and gene compatibility, and the Individual class implements crossover, gene regeneration / selection, and elite conservation as methods.

As a strategy, it is created with reference to the branch exchange crossing (EXX) method proposed by Maekawa et al. [^ 2].

# Behavior of genetic algorithm

Since there is a lot of source code, please see the source code for details. I will explain the specifications by focusing on the operation in the genetic algorithm.

## Map data

Enter the location of the city in the traveling salesman problem as a CSV file. The contents of the CSV file are

#### berlin52.csv


52,
565,575
25,185
345,750
945,685
845,655
880,660
25,230
・
・
・


In the first line, enter the number of cities that can be read. From the second line onward, enter the $x$ and $y$ coordinates of the city.

The main class of the program is ** tspSample **. When running in java

$java tspSample <CSV file to read>  Please execute like this. If the reading is successful, city_num= 52 565.000 575.000 25.0000 185.000 345.000 750.000 945.000 685.000 845.000 655.000 ・ ・ ・  In this way, the number of read cities and city data are output to the console. ## Initial individual generation Generate an initial population that genetically manipulates Individuals type children. Specifically, in the Chromosome class, it is expressed as a permutation of cities as a gene sequence. The Tour class has a method called makeTour () that randomly generates a circuit. After generating the circuit, the circuit is copied to the gene sequence. ## Goodness of fit calculation Goodness of fit is calculated based on power law scaling. The number of cities$ n $can be obtained from the Map class, and the circuit length$ l $can be obtained by the getDistance () method of the Tour class. \mathrm{Fitness} = \left(\frac{n}{l} \right)^3  ## Play / select In the program, we use two Individuals classes. Perform genetic manipulation with ** parents **. In the next generation, ** children ** will survive the chromosomes in their parents based on the goodness of fit of the chromosomes.$ N_ {\ mathrm {p}} $is stored in parents, and$ N_ {\ mathrm {c}} $is stored in children. (However,$ N_ {\ mathrm {p}} \ geq N_ {\ mathrm {c}} $) As for survival, the top$ N_ {\ mathrm {c}} $chromosomes in parents are survived in descending order of fitness. However, since elite individuals must always be included in the population, if the goodness of fit of the chromosome with the highest goodness of fit in the population is greater than the goodness of fit of the elite chromosome, the elite individual is the best fit in the population. Update to a high chromosome. If not, replace the$ N_ {\ mathrm {c}} $th chromosome with an elite chromosome. The method of selecting the parent chromosome to be genetically engineered from the surviving population children is to calculate the expected value of survival in the parent individual from the following formula when the goodness of fit of the chromosome$ C_i $in the children is$ f_i $. \mathrm{Experience} = \frac{N_{\mathrm{p}} \times f_i}{N_{\mathrm{c}} \times \overline{f}} $ \ Overline {f} $is the average goodness of fit of chromosomes in children. Select the parent chromosome as many as the expected value (rounded down to the nearest whole number) and copy it to parent. The part after the decimal point is used for roulette selection. For all chromosomes, if you select as many parents as you expect, parents will have some empty chromosomes. The remaining parents select their parents by roulette selection. In order to prevent the same chromosome from being selected from the selected parents, two are randomly selected from the parent chromosomes so that they do not overlap, and they are crossed. ## Crossing / mutation Crossover is based on crossover probability, and mutation is based on mutation probability. See reference [2] for more information. ## Gene correction The genetic algorithm has a strong ability to find a global optimum solution, but has a weakness in local search ability. Therefore, the gene is corrected by the 2-opt method, which is one of the sequential improvement methods for the traveling salesman problem, before the calculation of the goodness of fit. # I actually tried it$ N_ {\ mathrm {c}} = $100,$ N_ {\ mathrm {p}} = \$ 150, maximum number of generations: 300, crossover probability: 0.8, mutation probability: 0.1, 52 city benchmark problem Berlin52 (optimal value) : 7544.366)). .. For the completed route, the data will be output to the result.csv in the same directory.

city_num= 52
565.000  575.000
25.0000  185.000
345.000  750.000

(Omitted)

595.000  360.000
1340.00  725.000
1740.00  245.000
start...
0    7742.647
1    7742.647
2    7742.647
3    7742.647
4    7544.366
5    7544.366
6    7544.366

(Omitted)

294    7544.366
295    7544.366
296    7544.366
297    7544.366
298    7544.366
299    7544.366
300    7544.366
7544.366
stop.
Processing time: 3157/ms


# at the end

This branch exchange crossing is said to be the second best method after the branch assembly crossing method in solving the cyclic scheduling problem with a genetic algorithm. Currently, the method using branch assembly crossover is said to be the best method for solving the genetic algorithm for the traveling salesman problem. I would like to try it someday. Since I am studying scheduling problems, I would like to create a program that solves scheduling problems.

# Source code

The source code is open to the public. The code may be dirty because I was in a lab that doesn't use much Java ... (Please feel free to give me advice.) https://github.com/SHIMADONBEY/GAwithTSP.git