Learning Path


Hi everyone!

I’m fairly new to RL and Deep RL.
I have found that a good place to start is the David Silver lectures + the Sutton & Barto book.

Please suggest any other resources or a sensible learning path.
I want to play with the Gym once I have a good grasp :slight_smile:

I’m a DL, CV geek.


I am also working from Sutton & Barto, and watching the David Silver lectures.

First, if you are using the old printed version of Sutton & Barto, then check out the latest draft of second edition (it is free to download as a PDF): http://incompleteideas.net/book/the-book-2nd.html

IMO, it is worth doing some of the exercises from the early chapters of the book, before diving into the more advanced algorithms.
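For example, the chapter 2 exercises revolve around the k-armed bandit testbed. Here is a sketch of an epsilon-greedy agent with incremental sample-average updates in that spirit (the setup and all names are my own, not taken from the book's code):

```python
import random

random.seed(0)

K = 10
true_values = [random.gauss(0, 1) for _ in range(K)]  # hidden mean reward of each arm

def pull(arm):
    """Reward = the arm's true mean plus unit Gaussian noise."""
    return random.gauss(true_values[arm], 1)

epsilon = 0.1
Q = [0.0] * K  # value estimates
N = [0] * K    # pull counts

for step in range(5000):
    if random.random() < epsilon:
        arm = random.randrange(K)                # explore
    else:
        arm = max(range(K), key=lambda a: Q[a])  # exploit
    r = pull(arm)
    N[arm] += 1
    Q[arm] += (r - Q[arm]) / N[arm]              # incremental mean, eq. 2.3 in the book

print("best arm:", max(range(K), key=lambda a: true_values[a]),
      "| most-pulled arm:", max(range(K), key=lambda a: N[a]))
```

Writing this kind of thing yourself, then comparing epsilon values or trying optimistic initial values, teaches you far more than reading the chapter alone.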

There is a gap between the theory in the book and practice with actual code using neural networks for value approximation (or policy networks). This is where things like Gym come in - but you will want to build a learning agent yourself, as opposed to using a pre-built library where you just set learning parameters. I am struggling myself with ways to bridge this gap other than just “having a go” at personal projects, trying to code some of the algorithms from scratch.

I have found it useful to rehearse the pseudo-code for a variety of algorithms from Sutton & Barto, as opposed to looking at it once, understanding it, then moving on. For some of the key algorithms, such as Q-learning and Actor-Critic, I have actually memorised the pseudo-code, and I practice writing it out in a notebook every now and then (not just copying, but thinking about the steps and why each one is needed). It also helps to work on very small projects, such as a TicTacToe bot built from scratch, including the environment itself. It is not much effort to write a really simple environment.
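To show just how little code a "from scratch" environment needs, here is a bare-bones TicTacToe environment (the interface names are my own choice):

```python
class TicTacToe:
    """Minimal two-player TicTacToe environment; X and O alternate turns."""
    WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
            (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
            (0, 4, 8), (2, 4, 6)]              # diagonals

    def reset(self):
        self.board = [" "] * 9
        self.player = "X"
        return tuple(self.board)

    def legal_moves(self):
        return [i for i, c in enumerate(self.board) if c == " "]

    def step(self, move):
        """Place the current player's mark; return (state, reward, done).
        Reward is +1 from the mover's perspective if that move wins."""
        assert self.board[move] == " ", "illegal move"
        self.board[move] = self.player
        won = any(all(self.board[i] == self.player for i in line)
                  for line in self.WINS)
        done = won or " " not in self.board
        self.player = "O" if self.player == "X" else "X"
        return tuple(self.board), (1.0 if won else 0.0), done

# Quick smoke test: X takes the top row (X: 0, 1, 2; O: 3, 4) and wins.
env = TicTacToe()
env.reset()
for move in (0, 3, 1, 4, 2):
    state, reward, done = env.step(move)
print(reward, done)  # 1.0 True
```

Once the environment exists, you can plug in a tabular agent keyed on the board tuple and watch self-play learning happen with nothing but the standard library.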