On May 24, 2017, OpenAI released open-source reinforcement learning code that looked easy to use, so I tried it out :smiley:
The artificial intelligence research group "OpenAI" releases "DQN" and three variants of the reinforcement learning algorithm
Baselines github
The following two are introduced as tutorials to run:
baselines.deepq.experiments.train_cartpole
baselines.deepq.experiments.train_pong
macOS Sierra 10.12.4
Python 3.6.1
Note that this does not work with the Python 2.7 series.
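Since the Python 2.7 series does not work, a guard like the one below can catch a wrong interpreter before anything else runs. This is only a minimal sketch: the helper name `is_supported_python` is made up for illustration, and the cutoff at Python 3 is an assumption based on the note above, not a documented requirement of Baselines.

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True for Python 3 or newer (hypothetical helper)."""
    # Assumption: only the major version matters; 2.7 is known not to work.
    return version_info[0] >= 3


if not is_supported_python():
    raise SystemExit("Python 3 is required; found %d.%d" % sys.version_info[:2])
```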
cartpole
This tutorial appears to be a game in which you keep the pole on a cart from falling over.
For now, I just ran it without thinking too hard.
It complained about missing modules, so I installed them one by one.
# Command for training
python -m baselines.deepq.experiments.train_cartpole
# Command for playing with the trained model
python -m baselines.deepq.experiments.enjoy_cartpole
Reinforcement learning in progress ... Incidentally, training stopped at episode 690.
How do you actually play this ... :thinking: It looks like the game resets whenever the black object is judged to have fallen, but I can't make sense of what I'm seeing.
pong
This seems to be a competitive game similar to ice hockey.
Here too, I installed the modules that were missing.
By the way, if you are told that there is no cv2, that means OpenCV, so refer to the following article.
Make OpenCV3 available from python3 installed with pyenv
As the article below explains, setting up OpenCV on Linux is fairly troublesome.
The easiest way to use OpenCV with python
So I switched to Anaconda and ran it there. (Recommended, because it gets you set up quickly.)
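Rather than discovering the missing modules one error at a time, you can check them up front. The sketch below uses the standard library's `importlib.util.find_spec`; the helper name `missing_modules` and the list of module names are assumptions for illustration (only `cv2` came up explicitly in my run), so adjust the list to whatever your error messages mention.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: an illustrative list, not an official dependency list.
REQUIRED = ["baselines", "gym", "cv2"]

print("missing:", missing_modules(REQUIRED))
```

Note that the import name is `cv2` even though the library is called OpenCV, which is why the error message does not mention OpenCV by name.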
# Command for training
python -m baselines.deepq.experiments.train_pong
# Command for playing with the trained model
python -m baselines.deepq.experiments.enjoy_pong
Training in progress ...
If one episode takes 90 seconds and it repeats 690 times, that comes to about 62,100 seconds, or 17 hours and 15 minutes ...
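The estimate above is simple arithmetic, spelled out here as a quick check. The 90 seconds per episode and the 690 episodes are the rough figures from my run, and `split_duration` is just a hypothetical helper name.

```python
def split_duration(total_seconds):
    """Split a number of seconds into (hours, minutes, seconds)."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


seconds_per_episode = 90  # rough figure observed for one pong episode
episodes = 690            # episode count at which cartpole training stopped

total_seconds = seconds_per_episode * episodes
print(total_seconds, split_duration(total_seconds))  # 62100 (17, 15, 0)
```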
~~I stopped halfway through~~
I tried it :innocent:
It ran for about 1160 episodes, so it took a long time ... I can't give an exact figure because the machine went into sleep mode partway through, but I think it took about 8 hours.
See the video below for the results of playing.
[Try OpenAI Baselines on Windows (WinPython).](http://qiita.com/tmizu23/items/ff1d5c89bc99292410c0)
(By the way, I was hoping this would let a human play against the trained model, but that's not how it works ... I wanted to fight the reinforcement learner ...)