Running Universe in Kubernetes


#1

Hello! I’m trying to run Universe in a Kubernetes pod, but my agent container keeps erroring out with this stack trace:

  File "KubePodTest.py", line 10, in <module>
    observation_n, reward_n, done_n, info = env.step(action_n)
  File "/usr/local/lib/python3.5/dist-packages/gym/core.py", line 110, in step
    observation, reward, done, info = self._step(action)
  File "/usr/local/universe/universe/wrappers/timer.py", line 20, in _step
    observation_n, reward_n, done_n, info = self.env.step(action_n)
  File "/usr/local/lib/python3.5/dist-packages/gym/core.py", line 110, in step
    observation, reward, done, info = self._step(action)
  File "/usr/local/universe/universe/wrappers/render.py", line 30, in _step
    observation_n, reward_n, done_n, info_n = self.env.step(action_n)
  File "/usr/local/lib/python3.5/dist-packages/gym/core.py", line 110, in step
    observation, reward, done, info = self._step(action)
  File "/usr/local/universe/universe/wrappers/throttle.py", line 117, in _step
    observation_n, reward_n, done_n, info = self._substep(action_n)
  File "/usr/local/universe/universe/wrappers/throttle.py", line 132, in _substep
    observation_n, reward_n, done_n, info = self.env.step(action_n)
  File "/usr/local/lib/python3.5/dist-packages/gym/core.py", line 110, in step
    observation, reward, done, info = self._step(action)
  File "/usr/local/universe/universe/envs/vnc_env.py", line 476, in _step
    self._handle_crashed_n(info_n)
  File "/usr/local/universe/universe/envs/vnc_env.py", line 549, in _handle_crashed_n
    raise error.Error('{}/{} environments have crashed! Most recent error: {}'.format(len(self.crashed), self.n, errors))
universe.error.Error: 1/1 environments have crashed! Most recent error: {'0': 'Rewarder session failed: Lost connection: connection was closed uncleanly (peer dropped the TCP connection without previous WebSocket closing handshake) (clean=False code=1006)'}

The agent and remote containers are both running in the same pod, and the Python I’m running is just the example code with env.configure(remotes="vnc://localhost:5900+15900"). Ports 5900 and 15900 are exposed as a NodePort service through Kubernetes.
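One way to narrow this down is to confirm that both ports in the remotes string accept TCP connections from the agent container before the environment is configured. The sketch below uses only the Python standard library. It assumes the string has the form vnc://host:vnc_port+rewarder_port, as in the setup described above. The helper names (parse_remote, port_open, wait_for_remote) are made up for illustration and are not part of Universe. It only checks reachability, so it will not explain a connection that drops later.

```python
import socket
import time


def parse_remote(remote):
    """Split 'vnc://host:vnc_port+rewarder_port' into (host, vnc_port, rewarder_port).

    The string layout is an assumption based on the remotes value used above.
    """
    address = remote[len("vnc://"):] if remote.startswith("vnc://") else remote
    host, ports = address.rsplit(":", 1)
    vnc_port, rewarder_port = ports.split("+")
    return host, int(vnc_port), int(rewarder_port)


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def wait_for_remote(remote, attempts=30, delay=1.0):
    """Poll until both the VNC and rewarder ports accept connections."""
    host, vnc_port, rewarder_port = parse_remote(remote)
    for _ in range(attempts):
        if port_open(host, vnc_port) and port_open(host, rewarder_port):
            return True
        time.sleep(delay)
    return False
```

In the agent script this could be called as wait_for_remote("vnc://localhost:5900+15900") just before env.configure(...). If it returns False, the remote container is probably not listening yet, or not on localhost from the agent's point of view.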

EDIT: The image I’m using is quay.io/openai/universe.flashgames:0.20.29
EDIT2: I’m also curious what this line in the console output of the remote container means: [nginx] 2017/08/09 21:10:07 [info] 65#65: *1 client sent invalid request while reading client request line, client: 127.0.0.1, server: , request: "CONNECT www.google.com:443 HTTP/1.1"


#2

Check out these Kubernetes tutorials for more help.