This post assumes you have worked through Reinforcement Learning 12. I am using Ubuntu 18.04. I took the CartPole program I made earlier and swapped CartPole-v0 for MountainCar-v0. It seems to be a step up in difficulty.
I swapped the environment in as-is, but the results were not quite right... so I set gamma to 0.99.
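To get a feel for why gamma matters here, the following is a rough sketch and not part of the original program. MountainCar-v0 gives a reward of -1 on every step and cuts the episode off at 200 steps, so the discounted return only distinguishes a success from a failure if gamma is close enough to 1. The 120-step episode and the comparison value gamma=0.9 are made up for illustration.

```python
def discounted_return(n_steps, gamma, reward_per_step=-1.0):
    """Sum of reward_per_step * gamma**t for t = 0 .. n_steps - 1."""
    total = 0.0
    for t in range(n_steps):
        total += reward_per_step * gamma ** t
    return total


if __name__ == "__main__":
    for gamma in (0.9, 0.99):
        slow = discounted_return(200, gamma)  # never reaches the flag
        fast = discounted_return(120, gamma)  # hypothetical 120-step success
        print(f"gamma={gamma}: 200 steps -> {slow:.2f}, 120 steps -> {fast:.2f}")
```

With the smaller gamma the two returns are almost identical, so the agent gets very little signal that reaching the flag sooner is better. With gamma = 0.99 the gap is clearly visible.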
Looking around at other sites, people train for a very large number of steps. Maybe the trick is simply to train more? I set it up as follows.
chainerrl.experiments.train_agent_with_evaluation(
    agent, env,
    steps=1000000,            # Train the agent for 1,000,000 steps
    eval_n_steps=None,        # Evaluate by episode count, not by step count
    eval_n_episodes=1,        # 1 episode is sampled for each evaluation
    eval_max_episode_len=200, # Maximum length of each evaluation episode
    eval_interval=100,        # Evaluate the agent after every 100 steps
    outdir='result')          # Save everything to the 'result' directory
print('Finished.')
I set epsilon = 0.003.
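For reference, this is roughly what that epsilon controls in epsilon-greedy exploration. It is a minimal standalone sketch, not ChainerRL's own explorer code. The function name `epsilon_greedy` and the Q-values are made up for illustration; the three entries stand for MountainCar-v0's three actions (push left, no push, push right).

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    rng = random.Random(0)
    q_values = [-50.0, -48.5, -49.2]  # made-up Q-values for the 3 actions
    picks = [epsilon_greedy(q_values, 0.003, rng) for _ in range(100_000)]
    non_greedy = sum(1 for a in picks if a != 1)
    print(f"non-greedy picks: {non_greedy} of {len(picks)}")
```

With epsilon = 0.003 the agent follows its current best guess almost all the time and only rarely tries a random action.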
Training took a while, but the car can now climb the hill.
Here is the learning curve after 2,000 rounds of training.
And here is the curve after 10,000 rounds.
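If you want to pull the numbers behind a curve like this out yourself, something along these lines should work. It is a sketch under an assumption: that the evaluation results are written to a tab-separated `scores.txt` in the output directory, with a header row containing `steps` and `mean` columns. Check the header of your own file and adjust the column names if they differ. The helper name `load_curve` is my own.

```python
import csv
import os


def load_curve(path, x_col="steps", y_col="mean"):
    """Read (x, y) pairs from a tab-separated file with a header row."""
    points = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            points.append((int(float(row[x_col])), float(row[y_col])))
    return points


if __name__ == "__main__":
    path = os.path.join("result", "scores.txt")  # assumed file name and location
    if os.path.exists(path):
        # Show the last few evaluation points
        for steps, mean in load_curve(path)[-5:]:
            print(steps, mean)
```

The resulting list of pairs can be handed to any plotting tool to redraw the curve.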
The 10,000-round run takes 85 minutes. I wish I had a spare computer that I wasn't using. What do you do when the only machine is the laptop you carry around for everyday use?
I'm thinking of starting to use a GPU around the 30th post in this series. I've been looking into the preparation, and Chainer turns out to be extremely small: about 8 MB on disk. TensorFlow is large, at over 300 MB. I'd like to use a Radeon as the GPU, but I wonder whether Chainer works with it.