DeepMind has announced MuZero, a more general version of AlphaZero. It is powerful in that it applies not only to two-player games with clear rules but also to single-player games such as Atari games, and its performance seems to be quite high. I did not even know it existed for about a year (I was on childcare leave at the time...), but it was in the news again recently and I finally learned about it, so I have been playing around with it lately. Here is what I found.
A very nice PyTorch-based implementation called muzero-general has already been published as a repository, so I will mainly introduce that.
--Various games are included from the start, and adding your own is very easy (just add one file)
--Hyperparameters are easy to adjust
--If a GPU is available, it is used
--You can watch reward acquisition, loss curves, and the pace of training and self-play in real time on TensorBoard
--Easy to start using
--The source code is also easy to read
The repository's Getting started section is worth reading, but you can get going like this.
git clone https://github.com/werner-duvaud/muzero-general.git
cd muzero-general
pip install -r requirements.txt
python muzero.py
# -> A menu is displayed
# -> Select the game type
# -> Choose what to do: Training, LoadModel, TestPlay(MuZero vs Human), ViewPlay(MuZero Vs MuZero), etc
To begin with, I think it is best to look at tictactoe (known as the "Marubatsu game" in Japanese).
To add your own game, just add a file to the games/ directory.
In that file you implement methods such as reset, step, to_play, and legal_actions in a class called Game. The directory already contains a number of Game implementations that you can use as references, so it is easy to see what to do.
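To give a feel for what such a class involves, here is a minimal standalone sketch of tictactoe. It does not import muzero-general: only the names Game, reset, step, to_play, and legal_actions come from the description above, while the return values, the board encoding, and the helpers has_won and get_observation are my own assumptions for illustration. Check the existing files in the games/ directory for the exact interface the repository expects.

```python
class Game:
    """Standalone tictactoe sketch. Signatures and return values are assumptions."""

    WIN_LINES = [
        (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
        (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
        (0, 4, 8), (2, 4, 6),             # diagonals
    ]

    def __init__(self):
        self.reset()

    def reset(self):
        """Start a new game and return the initial observation."""
        self.board = [0] * 9  # 0 = empty, 1 = first player, -1 = second player
        self.player = 1
        return self.get_observation()

    def to_play(self):
        """Return the index (0 or 1) of the player who moves next."""
        return 0 if self.player == 1 else 1

    def legal_actions(self):
        """Return the indices of the empty cells."""
        return [i for i, cell in enumerate(self.board) if cell == 0]

    def step(self, action):
        """Place a mark for the current player and return (observation, reward, done)."""
        if self.board[action] != 0:
            raise ValueError(f"illegal action: {action}")
        self.board[action] = self.player
        won = self.has_won(self.player)
        done = won or not self.legal_actions()  # finished on a win or a full board
        reward = 1 if won else 0
        self.player *= -1  # hand the turn to the other player
        return self.get_observation(), reward, done

    def has_won(self, player):
        """Hypothetical helper: True if the given player has three in a row."""
        return any(all(self.board[i] == player for i in line) for line in self.WIN_LINES)

    def get_observation(self):
        """Hypothetical helper: the board seen from the player about to move."""
        return [cell * self.player for cell in self.board]


if __name__ == "__main__":
    game = Game()
    for action in (0, 3, 1, 4, 2):  # the first player completes the top row
        observation, reward, done = game.step(action)
    print(reward, done)
```

The reward here goes to the player who just moved, and the turn switches inside step, which is one common convention for two-player games; the actual implementation may handle this differently.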
There may be a bug in the implementation for two-player games, so if training does not go well, try applying this Pull Request. At first, tictactoe stayed weak even after half a day of training, but after applying this PR it reached a satisfactory strength within a few hours.
I have added a couple of games that are common as smartphone game apps, so take a look if you are interested.
--File: make2048.py
--File: x2blocks.py
It feels like I have gotten a very fun toy. If you train connect4 for 1 to 2 days, it becomes quite strong. Unless you play carefully, it beats you with a nasty trap ...