Introduction

Motivation

Somehow I came up with it.

If you want to solve Sudoku without human power, it is well known that you can solve it by using linear programming or Hopfield network. However, I thought that even if I didn't know the rules of Sudoku, I could solve it by pushing the number of trials, so I actually tried it.

What is Sudoku?

A type of puzzle game. See wikipedia for details. Sudoku

What is oputuna

optuna A python library for optimizing hyperparameters developed by Preferred Networks. The hyperparameter search algorithm uses TPE, which is the same as HyperOpt.

In a nutshell, omitting detailed explanations, even if the optimization target is a black box, it will optimize the objective function in a nice way.

Method

As mentioned above, we will use optuna. Therefore, the objective function is set so that the value becomes smaller as the answer approaches the correct answer.

Objective function

The objective function decides to count the number of rule violations. Here, fewer violations are synonymous with getting closer to the correct answer. I don't know the correct answer explicitly, but if I minimize the value of the objective function, the number of violations will eventually reach 0, and the correct answer will be reached.

def evaluate(answer):
    #Returns the number that violates the Sudoku loop. If the answer is correct, the evaluation will be 0
    #answer is a 9x9 two-dimensional array with numbers 1-9 in each element
    tmp = np.reshape(answer, [3, 3, 3, 3])
    loss = np.sum((
        #Number of violations in each column
        np.sum([np.count_nonzero(np.logical_not(np.any(answer == i, axis=0))) for i in range(9)]),
        #Number of violations in each line
        np.sum([np.count_nonzero(np.logical_not(np.any(answer == i, axis=1))) for i in range(9)]),
        #Number of violations per 3x3 area
        np.sum([np.count_nonzero(np.logical_not(np.any(tmp == i, axis=(1, 3)))) for i in range(9)]),
    ))
    return loss

The number of violations does not count the number of duplicate numbers, but the number of numbers that did not appear.

If the rules are met, 1-9 will appear once for each condition. If there is a duplicate due to a rule violation, the number of appearances of one of the other numbers has decreased to 0, so count this.

In the case of duplication, there are various patterns from 2 to 9 times, but if the number of appearances decreases, there are only 0 times, so I think it will be simpler to count this.

Calling optuna, etc. (whole code)

The problem I wanted to solve was hard-coded in the source code. It's basically the same as the code in the optuna documentation.

from itertools import product

import numpy as np
import optuna
import click


"""
The problem is https://ja.wikipedia.org/wiki/%E6%95%B0%E7%8B%From the image of the AC example
 -----------------        
|5|3| | |7| | | | |
|-+-+-+-+-+-+-+-+-|
|6| | |1|9|5| | | |
|-+-+-+-+-+-+-+-+-|
| |9|8| | | | |6| |
|-+-+-+-+-+-+-+-+-|
|8| | | |6| | | |3|
|-+-+-+-+-+-+-+-+-|
|4| | |8| |3| | |1|
|-+-+-+-+-+-+-+-+-|
|7| | | |2| | | |6|
|-+-+-+-+-+-+-+-+-|
| |6| | | | |2|8| |
|-+-+-+-+-+-+-+-+-|
| | | |4|1|9| | |5|
|-+-+-+-+-+-+-+-+-|
| | | | |8| | |7|9|
 -----------------        
"""
#The part with the value in advance
preset = {'p00': 5, 'p01': 3, 'p04': 7,
          'p10': 6, 'p13': 1, 'p14': 9, 'p15': 5,
          'p21': 9, 'p22': 8, 'p27': 6,
          'p30': 8, 'p34': 6, 'p38': 3,
          'p40': 4, 'p43': 8, 'p45': 3, 'p48': 1,
          'p50': 7, 'p54': 2, 'p58': 6,
          'p61': 6, 'p66': 2, 'p67': 8,
          'p73': 4, 'p74': 1, 'p75': 9, 'p78': 5,
          'p84': 8, 'p87': 7, 'p88': 9}


def evaluate(answer):
    #As above
    
    
def objective(trial):
    candidate = (1, 2, 3, 4, 5, 6, 7, 8, 9)
    
    answer = np.empty([9, 9], dtype=np.uint8)
    for i, j in product(range(9), repeat=2):
        key = 'p{}{}'.format(i, j)
        if key in preset:
            answer[i, j] = preset[key]
        else:
            answer[i, j] = trial.suggest_categorical(key, candidate)
            
    return evaluate(answer)
    
    
def run(n_trials):
    study_name = 'sudoku'
    study = optuna.create_study(study_name=study_name, storage='sqlite:///sudoku.db', load_if_exists=True)
    study.optimize(objective, n_trials=n_trials)
    
    show_result(study.best_params, study.best_value)
    
    df = study.trials_dataframe()
    df.to_csv('tpe_result.csv')


def show_result(best_params, best_value):
    for i in range(9):
        for j in range(9):
            key = 'p{}{}'.format(i, j)
            if key in preset:
                print('*{:1d}'.format(preset[key]), end='')
            else:
                print('{:2d}'.format(best_params[key]), end='')
        print('')
    print('loss: {}'.format(best_value))
        

@click.command()
@click.option('--n-trials', type=int, default=1000)
def cmd(n_trials):
    run(n_trials)
    
    
def main():
    cmd()
    
    
if __name__ == '__main__':
    main()

Experimental result

I'm sorry for those who read this far in anticipation.

I couldn't solve it at all.

The time required for each evaluation is much longer than expected, and the number of trials is not enough. The number of trials is 0 for about 7 seconds and 200 for about 25 seconds.

Even if we continue as it is, it will take hundreds of thousands to millions of trials to reach the correct answer, so I think that it is impossible to achieve in reality.

Finally, for the time being, the output at the end of 200 times is shown.

*5*3 5 5*7 2 6 1 2
*6 7 8*1*9*5 7 3 5
 4*9*8 6 3 7 8*6 4
*8 1 4 7*6 9 7 6*3
*4 3 9*8 5*3 2 7*1
*7 8 9 3*2 1 5 4*6
 3*6 1 6 7 9*2*8 7
 5 7 3*4*1*9 8 1*5
 4 7 6 2*8 4 3*7*9
loss: 73.0

The numbers with * on the left side are the numbers from the beginning as hints for the answer.

Summary

Sudoku cannot be solved with optuna (TPE algorithm). I was planning to push by the number of trials, but the calculation time was much longer than expected and I could not push.

optuna (TPE algorithm) takes longer to sample as the number of trials increases.

Future plans

The cause of this failure was the long calculation time of the TPE algorithm, so I think that this problem can be solved by using another algorithm. Since the TPE algorithm does not consider conditional distributions in the first place, it is a rather unsuitable algorithm for solving Sudoku.

Next time, I would like to try an algorithm that samples from conditional distributions. Therefore, you need to understand the rules of Sudoku before you solve it. It would have been a reckless attempt to reach the correct answer without knowing the rules of Sudoku like this time.

The schedule is completely undecided. I think that the algorithm will be simulated annealing, but the posting time is undecided because I will investigate it from now on.

I tried to let optuna solve Sudoku