0. Introduction

I looked into how to render OpenAI Gym environments on Google Colab, so here are my notes.

1. Challenges

Calling the render() method of gym.Env to display the environment raises a NoSuchDisplayException.

import gym
env = gym.make('CartPole-v1')
env.reset()
env.render()

NoSuchDisplayException                    Traceback (most recent call last)
<ipython-input-3-74ea9519f385> in <module>()
      2 env = gym.make('CartPole-v1')
      3 env.reset()
----> 4 env.render()

2. Countermeasures

As far as I could find, there are three ways to use Gym's rendering on Colab. Each has advantages and disadvantages, and I could not narrow them down to one, so I describe all three.

2.1 Common preparation

All three methods use Xvfb, the X11 virtual display server, so install it first.

!apt update
!apt install xvfb

(If you run Jupyter Notebook yourself, for example from a Docker image, OpenGL support is also required, so run `apt install python-opengl` as well.)

In addition, to use Xvfb from Google Colab (Jupyter Notebook), use PyVirtualDisplay.

!pip install pyvirtualdisplay

from pyvirtualdisplay import Display

d = Display()
d.start()

Some write-ups set the "DISPLAY" environment variable to {display number}.{screen number} by hand, but [the author of PyVirtualDisplay told me that this is unnecessary](https://github.com/ponty/PyVirtualDisplay/issues/54).

According to him, the screen number only matters when there are multiple screens. PyVirtualDisplay creates only one screen, so the number is always 0, and a value that omits the screen number is interpreted as screen 0 anyway. (See StackOverflow.)

In other words, pyvirtualdisplay.Display.start() already sets the environment variable, so there is no need to change it from outside. (Confirmed at least in 1.3.2, the latest version as of June 18, 2020.)
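The rule that a missing screen number means screen 0 can be illustrated with a small sketch. The `parse_display` helper below is hypothetical (it is not part of PyVirtualDisplay or Xlib) and assumes the usual `host:display.screen` layout of the DISPLAY value. It only shows why ":1" and ":1.0" end up pointing at the same screen.

```python
import re

# Hypothetical helper for illustration only; not a PyVirtualDisplay API.
_DISPLAY_RE = re.compile(r"(?P<host>[^:]*):(?P<display>\d+)(?:\.(?P<screen>\d+))?")


def parse_display(value):
    """Split a DISPLAY string into (host, display number, screen number)."""
    m = _DISPLAY_RE.fullmatch(value)
    if m is None:
        raise ValueError("not a DISPLAY string: {!r}".format(value))
    screen = m.group("screen")
    # A missing screen number is treated as screen 0.
    return m.group("host"), int(m.group("display")), int(screen) if screen else 0


print(parse_display(":1"))
print(parse_display(":1.0"))
```

Both calls give the same display and screen, which is why writing the screen number into the variable yourself adds nothing when there is only one screen.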

2.2 Method 1

The first method simply draws the frame data with matplotlib, erasing and redrawing it at every step.

The disadvantages are that it is not very fast and that the animation plays only once. On the other hand, each frame overwrites the previous one and no frame data is retained, so it copes even with very long runs.

import gym
from IPython import display
from pyvirtualdisplay import Display
import matplotlib.pyplot as plt

d = Display()
d.start()

env = gym.make('CartPole-v1')

o = env.reset()

img = plt.imshow(env.render('rgb_array'))
for _ in range(100):
    o, r, done, i = env.step(env.action_space.sample())  # In practice, pass the action chosen by your DNN

    display.clear_output(wait=True)
    img.set_data(env.render('rgb_array'))
    plt.axis('off')
    display.display(plt.gcf())

    if done:
        env.reset()

2.3 Method 2

The second method uses matplotlib.animation.FuncAnimation to display an animation.

The animation can be replayed as often as you like, and the display interval between frames can be set freely. However, all of the frame data has to be retained, so it needs a lot of memory, and unless you tune the display size and the number of frames it can cause memory errors. (Imagine hitting that error partway through a long training run ...)

import gym
from IPython import display
from pyvirtualdisplay import Display
import matplotlib.pyplot as plt
from matplotlib import animation


d = Display()
d.start()

env = gym.make('CartPole-v1')

o = env.reset()

img = []
for _ in range(100):
    o, r, done, i = env.step(env.action_space.sample())  # In practice, pass the action chosen by your DNN

    display.clear_output(wait=True)
    img.append(env.render('rgb_array'))  # Keep every frame for the animation

    if done:
        env.reset()

dpi = 72
interval = 50 # ms

plt.figure(figsize=(img[0].shape[1]/dpi,img[0].shape[0]/dpi),dpi=dpi)
patch = plt.imshow(img[0])
plt.axis('off')  # Hide the axes
animate = lambda i: patch.set_data(img[i])
ani = animation.FuncAnimation(plt.gcf(),animate,frames=len(img),interval=interval)
display.display(display.HTML(ani.to_jshtml()))
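To get a feel for the memory cost of keeping every frame, here is a rough back-of-the-envelope sketch. The 400 x 600 RGB frame size is an assumption (check `img[0].shape` in your own run), and `frames_nbytes` is a hypothetical helper, not part of Gym or matplotlib. The second half shows one possible way to bound memory, keeping only the most recent frames in a `collections.deque`.

```python
from collections import deque


def frames_nbytes(height, width, channels, n_frames, bytes_per_value=1):
    """Bytes needed to keep n_frames uncompressed frames (uint8 by default)."""
    return height * width * channels * n_frames * bytes_per_value


# Assumed frame size: 400 x 600 pixels, 3 channels, one byte per value.
total = frames_nbytes(400, 600, 3, 100)
print(total / 1e6, "MB for 100 frames")

# One way to cap memory: a deque with maxlen silently drops the oldest frame.
recent = deque(maxlen=50)
for step in range(200):
    recent.append(step)  # Stand-in for env.render('rgb_array')
print(len(recent), "frames kept, oldest is step", recent[0])
```

The trade-off of the capped buffer is that the animation only covers the tail of the run, so it suits cases where the latest behavior is what you want to watch.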

2.4 Method 3

The last method saves the frames as a video using gym.wrappers.Monitor. You do not need to call render(); frames are recorded automatically each time you call step(action).

import base64
import io
import gym
from gym.wrappers import Monitor
from IPython import display
from pyvirtualdisplay import Display

d = Display()
d.start()

env = Monitor(gym.make('CartPole-v1'),'./')

o = env.reset()

for _ in range(100):
    o, r, done, i = env.step(env.action_space.sample())  # In practice, pass the action chosen by your DNN

    if done:
        env.reset()

for f in env.videos:
    # f[0] is the path of a recorded video file
    with io.open(f[0], 'rb') as fp:
        video = fp.read()
    encoded = base64.b64encode(video)

    display.display(display.HTML(data="""
        <video alt="test" controls>
        <source src="data:video/mp4;base64,{0}" type="video/mp4" />
        </video>
        """.format(encoded.decode('ascii'))))
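If you embed videos in several places, the encoding step above can be pulled out into a small function. This is only a sketch: `video_tag` is a hypothetical name, not something provided by Gym or IPython, and it assumes the bytes are an MP4 file as in the loop above.

```python
import base64


def video_tag(video_bytes, mime="video/mp4"):
    """Return an HTML <video> element with the video embedded as a data URI."""
    encoded = base64.b64encode(video_bytes).decode("ascii")
    return (
        '<video alt="test" controls>'
        '<source src="data:{0};base64,{1}" type="{0}" />'
        "</video>"
    ).format(mime, encoded)


# Dummy bytes stand in for the contents of a recorded video file.
html = video_tag(b"abc")
print(html)
```

In a notebook, the returned string can be passed to `display.display(display.HTML(...))` exactly as in the loop above.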

3. Library: Gym-Notebook-Wrapper

Writing out the methods above every time is tedious, so I turned them into a library.

3.1 Installation

It's published on PyPI, so you can install it with pip install gym-notebook-wrapper.

!apt update && apt install xvfb
!pip install gym-notebook-wrapper

Of course, it can also be used outside Google Colab, but Linux is required because it relies on Xvfb.
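If you want a notebook to fail early on an unsupported machine, a check along these lines is one option. It is a sketch under assumptions: `can_use_xvfb` is a hypothetical helper (not part of gym-notebook-wrapper), and it assumes the Xvfb binary is on the PATH under the name `Xvfb`.

```python
import shutil
import sys


def can_use_xvfb(platform=sys.platform, which=shutil.which):
    """Return True when running on Linux and an Xvfb executable is found."""
    if not platform.startswith("linux"):
        return False
    # shutil.which returns None when the executable is not on the PATH.
    return which("Xvfb") is not None


if not can_use_xvfb():
    print("Xvfb is not available; install it with apt on a Linux machine.")
```

The `platform` and `which` parameters exist only so the check can be exercised without changing the real environment.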

3.2 How to use

The name gym-notebook-wrapper is long and contains hyphens (-), so the importable module name is gnwrapper.

Method 1 → gnwrapper.Animation
Method 2 → gnwrapper.LoopAnimation
Method 3 → gnwrapper.Monitor

3.2.1 `gnwrapper.Animation` (= 2.2 Method 1)

import gnwrapper
import gym

env = gnwrapper.Animation(gym.make('CartPole-v1'))  # Starts Xvfb

o = env.reset()

for _ in range(100):
    o, r, done, i = env.step(env.action_space.sample())  # In practice, pass the action chosen by your DNN
    env.render()  # Erases the previous frame and draws the new step
    if done:
        env.reset()

3.2.2 `gnwrapper.LoopAnimation` (= 2.3 Method 2)

import gnwrapper
import gym

env = gnwrapper.LoopAnimation(gym.make('CartPole-v1'))  # Starts Xvfb

o = env.reset()

for _ in range(100):
    o, r, done, i = env.step(env.action_space.sample())  # In practice, pass the action chosen by your DNN
    env.render()  # Stores the frame data
    if done:
        env.reset()

env.display()  # Plays the stored frames as an animation

3.2.3 `gnwrapper.Monitor` (= 2.4 Method 3)

import gnwrapper
import gym

env = gnwrapper.Monitor(gym.make('CartPole-v1'),directory="./")  # Starts Xvfb

o = env.reset()

for _ in range(100):
    o, r, done, i = env.step(env.action_space.sample())  # In practice, pass the action chosen by your DNN
    if done:
        env.reset()

env.display()  # Shows the frames that were saved as video

4. Finally

I gathered various information from around the web and summarized three ways to render OpenAI Gym on Google Colab. The code should be what I actually ran and verified several times, but my apologies if I made any copy-and-paste mistakes.

Gym-Notebook-Wrapper is still rough and may have bugs, so feel free to open an issue if you run into anything. I would be glad if you gave it a try.

[Reinforcement learning] How to draw OpenAI Gym on Google Colab (2020.6 version)