Optuna and Hyperopt are both optimization frameworks. I was wondering which one is better, so I will compare them on a function optimization problem. The two frameworks are each introduced in a separate article, so please refer to those: "Try function optimization using Optuna" and "Try function optimization using Hyperopt".
This time we will minimize x^2 + y^2 + z^2. Since the result differs from run to run, I will run the experiment three times.
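Written out explicitly (with the same search range used in the code below), the problem is:

minimize f(x, y, z) = x^2 + y^2 + z^2, with x, y, z ∈ [-100, 100]

The global minimum is f(0, 0, 0) = 0, so the closer the final best value is to 0, the better a framework has done.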
The code used for this experiment is as follows.
# -*- coding: utf-8 -*-
import optuna
import hyperopt
from hyperopt import hp
from hyperopt import fmin
from hyperopt import tpe
from hyperopt import Trials
import matplotlib.pyplot as plt
# Objective function for Optuna (this time x^2 + y^2 + z^2)
def objective_optuna(trial):
    # Parameters to optimize
    param = {
        'x': trial.suggest_uniform('x', -100, 100),
        'y': trial.suggest_uniform('y', -100, 100),
        'z': trial.suggest_uniform('z', -100, 100)
    }
    # Return the evaluation value (Optuna minimizes it by default)
    return param['x'] ** 2 + param['y'] ** 2 + param['z'] ** 2
# Optimization with Optuna
def optuna_exe():
    # Create the study object
    study = optuna.create_study()
    # Run the optimization
    study.optimize(objective_optuna, n_trials=500)
    # Show the best parameters
    print(study.best_params)
    # Show the best objective function value
    print(study.best_value)
    epoches = []  # stores the trial numbers
    values = []   # stores the best value so far
    best = 100000
    # Track the best value over the trials
    for i in study.trials:
        if best > i.value:
            best = i.value
        epoches.append(i.number + 1)
        values.append(best)
    return epoches, values
# Objective function for Hyperopt
def objective_hyperopt(args):
    x, y, z = args
    return x ** 2 + y ** 2 + z ** 2
# Optimization with Hyperopt
def hyperopt_exe():
    # Search space settings
    space = [
        hp.uniform('x', -100, 100),
        hp.uniform('y', -100, 100),
        hp.uniform('z', -100, 100)
    ]
    # Object for recording the state of the search
    trials = Trials()
    # Start the search
    best = fmin(objective_hyperopt, space, algo=tpe.suggest, max_evals=500, trials=trials)
    # Output the result
    print(best)
    epoches = []  # stores the trial numbers
    values = []   # stores the best value so far
    best = 100000
    # Track the best value over the trials
    for i, n in zip(trials.trials, range(500)):
        if best > i['result']['loss']:
            best = i['result']['loss']
        epoches.append(n + 1)
        values.append(best)
    return epoches, values
def plot_graph():
    result_optuna = optuna_exe()
    result_hyperopt = hyperopt_exe()
    epoch_optuna = result_optuna[0]
    value_optuna = result_optuna[1]
    epoch_hyperopt = result_hyperopt[0]
    value_hyperopt = result_hyperopt[1]
    # Draw the graph
    fig, ax = plt.subplots()
    ax.set_xlabel("trial")
    ax.set_ylabel("value")
    ax.set_title("Optuna vs Hyperopt")
    ax.grid()  # grid lines
    ax.plot(epoch_optuna, value_optuna, color="red", label="Optuna")
    ax.plot(epoch_hyperopt, value_hyperopt, color="blue", label="Hyperopt")
    ax.legend(loc=0)  # show the legend
    plt.show()  # display the graph

if __name__ == '__main__':
    plot_graph()
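As noted above, the results differ from run to run because both libraries sample randomly. If you wanted reproducible runs instead, one option would be to fix the seeds. The snippet below is a minimal sketch of that idea, not part of the experiment: the seed value 42 is arbitrary, and objective_optuna, objective_hyperopt, and space refer to the definitions in the script above (space is local to hyperopt_exe there). Note also that older hyperopt versions expect np.random.RandomState for rstate rather than np.random.default_rng.

import numpy as np

# Optuna: pass a seeded sampler when creating the study (seed 42 is an arbitrary choice)
study = optuna.create_study(sampler=optuna.samplers.TPESampler(seed=42))
study.optimize(objective_optuna, n_trials=500)

# Hyperopt: pass a seeded random generator via rstate
# (newer hyperopt versions take np.random.default_rng, older ones np.random.RandomState)
best = fmin(objective_hyperopt, space, algo=tpe.suggest,
            max_evals=500, trials=Trials(), rstate=np.random.default_rng(42))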
Optuna: 'x': 0.2690396239515218, 'y': -1.75236444646743, 'z': 0.3724308175904496, best_value:3.2818681863901693
Hyperopt: 'x': -2.9497423868903834, 'y': 0.13662455602710644, 'z': -3.844496541052724, best_value:23.499800072493738
Optuna achieves the better final best_value. From the graph, Optuna also appears superior in terms of convergence speed.
Optuna: 'x': 0.7811129871251672, 'y': 0.4130867942356189, 'z': 0.6953642534092288, best_value:1.2643096431468364
Hyperopt: 'x': -3.7838067947126675, 'y': -2.595648793357423, 'z': -2.683504623035553, best_value:28.255783580024783
In the second run, as in the first, Optuna appears superior in both the final best_value and the convergence speed.
Optuna: 'x': -0.19339325990518663, 'y': -0.0030977352573082623, 'z': 0.4961595538587318, best_value:0.2835848518257752
Hyperopt: 'x': 2.810074634010315, 'y': -1.2603362587820195, 'z': -0.7356174272489406, best_value:10.026099933181214
Optuna again achieved the better final best_value in the third run. This time, however, the convergence speed does not seem to differ much.
The conclusion is that Optuna is superior in terms of both the final best objective function value and the convergence speed. If the optimization problem were made a little more difficult, would the gap become even larger ...
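As one way to follow up on that question, a harder test function could be swapped into the same objectives. The sketch below uses the Rastrigin function over its conventional domain [-5.12, 5.12]; this is purely my own choice for illustration and not part of the original experiment.

import math

# Rastrigin function: many local minima, global minimum 0 at the origin
def rastrigin(xs, A=10):
    return A * len(xs) + sum(x ** 2 - A * math.cos(2 * math.pi * x) for x in xs)

# Drop-in replacement for objective_optuna
def objective_optuna_hard(trial):
    xs = [trial.suggest_uniform(name, -5.12, 5.12) for name in ('x', 'y', 'z')]
    return rastrigin(xs)

# Drop-in replacement for objective_hyperopt
def objective_hyperopt_hard(args):
    return rastrigin(list(args))

For the Hyperopt side, the search space would also need matching hp.uniform('x', -5.12, 5.12) entries instead of the [-100, 100] ranges used above.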