FrozenLake Save Action/State/Outcome


#1

I’d like to create 3 files using the FrozenLake env: initialize the game board in a random state, take an action, see the outcome, and save all three (state, action, outcome). When I run the code below, it doesn’t save the observation, just a “0” value.

import gym
env = gym.make('FrozenLake-v0')
observation = env.reset()

actions = []
state = []
outcome = []

for _ in range(1000):
    render = env.render()
    print 'obs:'.format(observation)
    action = env.action_space.sample()
    state.append(observation)
    actions.append(action)
    env.step(action)

print actions, state

That outputs:
[0, 3, 1, 0, 3, 3, 3, 3, 1, 3, 1, 2, 0, 3, …] [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …]

When I print “observation” it shows the game board, but it gets appended to the list as a 0.
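
A likely reason for the zeros: the loop calls env.step(action) but throws away what it returns, so "observation" still holds the value from env.reset() (the start square, index 0) on every iteration. Below is a minimal sketch of capturing the return value and writing the three lists to three files. To keep it runnable without gym installed, it uses StandInLake, a made-up stand-in class that only mimics the classic reset()/step() shape (step returning observation, reward, done, info); its dynamics are not FrozenLake's. The helper names collect and save and the file names state.json, action.json, outcome.json are assumptions for illustration as well.

```python
import json
import os
import random


class StandInLake:
    """Hypothetical stand-in for the gym env, NOT the real FrozenLake.

    It only mimics the classic API shape: reset() returns an integer state,
    step(action) returns (observation, reward, done, info).
    """

    def __init__(self, n_states=16, seed=0):
        self.n_states = n_states
        self.rng = random.Random(seed)
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Made-up dynamics: jump to a random state; the last state is the goal.
        self.state = self.rng.randrange(self.n_states)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done, {}


def collect(env, n_steps):
    """Record one (state, action, outcome) triple per step."""
    states, actions, outcomes = [], [], []
    observation = env.reset()
    for _ in range(n_steps):
        action = random.randrange(4)  # stand-in for env.action_space.sample()
        # The key change: keep what step() returns instead of discarding it.
        next_observation, reward, done, _info = env.step(action)
        states.append(observation)
        actions.append(action)
        outcomes.append(next_observation)
        # Start a new episode when the current one has ended.
        observation = env.reset() if done else next_observation
    return states, actions, outcomes


def save(states, actions, outcomes, directory):
    """Write the three lists to three JSON files (file names are arbitrary)."""
    named = (("state", states), ("action", actions), ("outcome", outcomes))
    for name, values in named:
        with open(os.path.join(directory, name + ".json"), "w") as handle:
            json.dump(values, handle)
```

With the real environment the same pattern would apply to env.step(action), though the exact tuple that step() and reset() return depends on the gym version, so it is worth checking that against the installed release.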


#2

I found a solution myself; posting it here in case it is handy for future users:

import gym
from copy import copy

env = gym.make('FrozenLake-v0')
observation = env.reset()

actions = []  # lists to store
state = []

for _ in range(1000):
    render = env.render(mode='ansi')  # set mode to 'ansi'
    contents = render.getvalue()  # use StringIO's .getvalue()
    action = env.action_space.sample()
    state.append(contents)  # append state and action to list
    actions.append(action)
    env.step(action)

print actions, state

Summarised, this outputs (which can be cleaned easily):

[0, 3, 1, 0, 3, 3, 3, 3, 1, 3, 1, 2, 0, 3, 2, 0, 0, 0, 2, 1, 2, 3, 3, 2, 0, 1, 1, 1, 1, 0, 1, 0, 3, 0, 3, 1, 2, 3, 3, 0, 2, 3, 0, 1, 3, 1, 3, 3, 2, 3, 0, 1, 1, 1, 3, 0, 3, 2, 0, 3, 3, 2, 3, 2, 3, 0, 2, 0, 0, 0, 1, 1, 2, 0, 0, 1, 3, 0, 1, 2, 2, 3, 0, 1, 1, 3, 1, 1, 3, 2, 3, 3, 2, 2, 3, 0, 2, 3, 1, 0, 1, 2, 0, 3, 0, 2, 0, 3, 3, 0, 3, 0, 0, 0, 0, 2, 3, 0, 3, 2, 3, 3, 1, 1, 1, 0, 1, 1, 1, 3, 0, 3, 1, 2, 0, 1, 2, 0, 2, 0, 1, 3, 2, 2, 1, 0, 3, 1, 1, 3, 0, 2, 2, 3, 2, 3, 3, 3, 2, 1, 2, 2, 3, 2, 3, 3, 2, 2, 3, 0, 1, 2, 3, 2,… ] […(Down)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Right)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Left)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Down)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Right)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Down)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Right)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Down)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Up)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Right)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Up)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Down)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Right)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Up)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Right)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Right)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Right)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ (Left)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n’, u’ ]
This might be useful to others in the future, but if this is not an appropriate place for it, just delete it!
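
For the cleaning step mentioned above, here is a rough sketch of turning one rendered string into plain rows. It assumes the frames look like the ones in the output (an optional "(Action)" first line, then the grid) and that the \x1b[41m ... \x1b[0m highlight marks the agent's square; the function name parse_frame is made up for this example.

```python
import re

# Matches ANSI colour/style sequences such as "\x1b[41m" and "\x1b[0m".
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")


def parse_frame(frame):
    """Return (last_action, rows, agent_position) for one rendered frame.

    last_action is None when the frame has no "(Action)" line, and
    agent_position is a (row, column) tuple or None if nothing is highlighted.
    """
    lines = [line for line in frame.split("\n") if line.strip()]
    action = None
    if lines and lines[0].strip().startswith("("):
        action = lines.pop(0).strip().strip("()")
    agent = None
    rows = []
    for row_index, line in enumerate(lines):
        # Assumes at most one highlight per row, so the offset of the
        # escape sequence equals the column of the highlighted square.
        marker = line.find("\x1b[41m")
        if marker != -1:
            agent = (row_index, marker)
        rows.append(ANSI_ESCAPE.sub("", line))
    return action, rows, agent
```

Applied to a frame like ' (Down)\nSFFF\nF\x1b[41mH\x1b[0mFH\nFFFH\nHFFG\n', this gives the action name, the four grid rows without escape codes, and the highlighted position as (row, column). If an integer state is preferred, row * 4 + column should correspond to the observation index on the 4x4 map.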