Pendulum - Done state


#1

Hi all,

Please help me get my head around this. In the Pendulum environment, step(self, u) returns “self._get_obs(), -costs, False, {}”, where “False” is the “done” flag.

def step(self,u):
    th, thdot = self.state # th := theta

    g = 10.
    m = 1.
    l = 1.
    dt = self.dt

    u = np.clip(u, -self.max_torque, self.max_torque)[0]
    self.last_u = u # for rendering
    costs = angle_normalize(th)**2 + .1*thdot**2 + .001*(u**2)

    newthdot = thdot + (-3*g/(2*l) * np.sin(th + np.pi) + 3./(m*l**2)*u) * dt
    newth = th + newthdot*dt
    newthdot = np.clip(newthdot, -self.max_speed, self.max_speed) #pylint: disable=E1111

    self.state = np.array([newth, newthdot])
    return self._get_obs(), -costs, False, {}

If the step method always returns “False”, how does the environment issue the “done” bit for the final state? At which point are we done?
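
To make the question concrete, here is a minimal standalone sketch of the same dynamics. It is my own reduction, not the Gym source: the class name PendulumSketch, the helper count_done, the angle_normalize implementation, and the constants max_speed = 8, max_torque = 2 and dt = 0.05 are all assumptions for illustration. Stepping it in a loop never produces a done flag that is True:

```python
import numpy as np


def angle_normalize(x):
    # Wrap an angle into the interval [-pi, pi) (assumed implementation).
    return ((x + np.pi) % (2 * np.pi)) - np.pi


class PendulumSketch:
    # Assumed constants for illustration; they are not given in the snippet above.
    max_speed = 8.0
    max_torque = 2.0
    dt = 0.05

    def __init__(self, th=np.pi, thdot=0.0):
        self.state = np.array([th, thdot])

    def step(self, u):
        th, thdot = self.state
        g, m, l, dt = 10.0, 1.0, 1.0, self.dt

        u = np.clip(u, -self.max_torque, self.max_torque)[0]
        costs = angle_normalize(th) ** 2 + 0.1 * thdot ** 2 + 0.001 * (u ** 2)

        newthdot = thdot + (-3 * g / (2 * l) * np.sin(th + np.pi)
                            + 3.0 / (m * l ** 2) * u) * dt
        newth = th + newthdot * dt
        newthdot = np.clip(newthdot, -self.max_speed, self.max_speed)

        self.state = np.array([newth, newthdot])
        # The third element is hard-coded, exactly as in the snippet above.
        return self.state.copy(), -costs, False, {}


def count_done(n_steps=1000):
    # Apply zero torque repeatedly and count how often done is True.
    env = PendulumSketch()
    dones = 0
    for _ in range(n_steps):
        _, _, done, _ = env.step(np.array([0.0]))
        dones += int(done)
    return dones
```

Calling count_done() with any number of steps returns 0, so if an episode ends at all, the signal must come from somewhere outside step itself.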

Thank you.