Tic Tac Toe



I’m a software programmer who wants to learn how to use machine learning. I’m starting with zero knowledge of AI.

The idea of reinforcement learning seems really attractive to me, so I’d like to start there if possible.

Using OpenAI, I want to make a simple tic tac toe bot, with the goal of grasping the fundamentals.

Where do I start?


Depends how you like to learn. I started with a Udacity course: https://classroom.udacity.com/courses/ud600 and went on to read Reinforcement Learning: An Introduction. For me that has been a theory-heavy approach and I only recently wrote a TicTacToe bot (using Q-learning). However, when I did get around to writing it, I was very confident about what I was doing and why.
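
For a feel of what the Q-learning mentioned above involves, here is a minimal sketch of the tabular update rule. The function name `q_update`, the state labels, and the values of `alpha` and `gamma` are illustrative assumptions, not code from the bot described above.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Best value reachable from the next state; 0.0 when no actions remain
    # (i.e. the game is over).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Q-table that returns 0.0 for state/action pairs it has not seen yet.
q = defaultdict(float)

# A winning move: reward 1.0, no further actions available.
q_update(q, "s1", 4, 1.0, "terminal", [])

# An earlier move that leads to "s1": no reward yet, but the value
# learned for "s1" is propagated backwards through the update.
q_update(q, "s0", 0, 0.0, "s1", [4])
```

In a two-player game such as tic tac toe, the "next state" is usually taken to be the position after the opponent has replied, which this sketch leaves out.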

The starting point in OpenAI Gym could be to follow a tutorial on one of the other simple environments (e.g. CartPole), of which there are many online. Once you understand that, try changing the environment to TicTacToe and adapting the code to learn that.
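
As a rough idea of what "changing the environment to TicTacToe" could look like, here is a sketch of a tic tac toe environment that mimics the classic Gym convention of `reset()` and `step(action)` returning observation, reward, done and info. The class `TicTacToeEnv` is hypothetical and does not import or subclass anything from Gym; the board encoding and reward scheme are assumptions.

```python
class TicTacToeEnv:
    """Hypothetical tic tac toe environment with a Gym-like reset/step interface."""

    WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
                 (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
                 (0, 4, 8), (2, 4, 6)]              # diagonals

    def reset(self):
        self.board = [0] * 9      # 0 = empty, 1 = X, -1 = O
        self.player = 1           # X moves first
        return tuple(self.board)

    def legal_actions(self):
        return [i for i, cell in enumerate(self.board) if cell == 0]

    def step(self, action):
        if self.board[action] != 0:
            raise ValueError("cell already taken")
        self.board[action] = self.player
        won = any(all(self.board[i] == self.player for i in line)
                  for line in self.WIN_LINES)
        done = won or not self.legal_actions()
        # Reward is from the point of view of the player who just moved.
        reward = 1.0 if won else 0.0
        self.player = -self.player
        return tuple(self.board), reward, done, {}


env = TicTacToeEnv()
obs = env.reset()
for move in [0, 3, 1, 4, 2]:      # X takes the top row, O plays 3 and 4
    obs, reward, done, info = env.step(move)
```

Keeping the same `reset`/`step` shape as the tutorial environment means most of the learning loop from a CartPole tutorial can be reused with few changes.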

At some point you will need to take a detour from “pure” reinforcement learning and learn at least a little supervised machine learning. Andrew Ng’s course on Coursera might be a good start for that.

You may want to brush up on basic stats, calculus and linear algebra (matrices, vectors etc). You don’t need to know them in great depth, but the basics are essential to understanding the theory and taking things further - both in reinforcement learning and supervised learning.


My recommended progression of concepts to learn and exercises to try:

  • Biological Neurones and Neural Networks (NN)
  • Artificial perceptron
  • Artificial Neural Network (Weights, Activation, forward feed, back propagation)
  • Exercise check point
    • build a very simple NN without any library (plenty of examples out there)
    • build an NN to do simple classifications (with or without library)
  • Neural Network continued (NN as function approximators, NN convergence, NN limitations, effects of NN architecture, effects of learning rate, effects of different activation functions)
  • Exercise check point (I recommend you use python with Keras)
    • build a simple image recognition agent (I recommend building a classifier for digits from 0-9 based on image input)
    • build an agent that can win all the time at tic tac toe using NN
  • Recurrent Neural Network
  • Long Short Term Memory
  • Exercise check point
    • Build a character generator NN that can generate text similar to what it has been trained on (I built one that generates Bible verses. I think it thinks too highly of itself and doesn’t really like Eve, more training needed)
  • Understanding Markov Decision Process
  • Understanding how to use Bellman equation to solve an MDP
  • Reinforcement Learning (Concepts)
  • Exercise point: You are now ready to install OpenAI gym and try some exercises
    • Build an agent that can win all the time at tic tac toe using Q Learning and/or Q prediction (Tree search), or Q memory (storing all Q values for all states)
    • Solving Cartpole using deterministic Bellman / Q tree Search
    • Solving Cartpole with Q function estimator
    • Using your algorithm on other examples to solve harder problems
  • Further learnings
    • Actor Critic
    • A2C and A3C
    • TRPO
    • Generalized Advantage estimators
  • Exercise point
    • Solving complex 3D motion with very delayed rewards
    • Bipedal walker
    • Bipedal walker playing soccer
    • Playing Atari games and other gym / universe games
    • Sky is the limit (You have now reached state of the art AI research)
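
To make the "use the Bellman equation to solve an MDP" step in the list above more concrete, here is a small sketch of value iteration on a made-up four-state chain. The MDP itself (states 0 to 3, a reward of 1.0 for reaching state 3, discount 0.9) and all names are assumptions chosen only to keep the example tiny.

```python
GAMMA = 0.9
STATES = [0, 1, 2, 3]     # state 3 is terminal
ACTIONS = [-1, +1]        # move left or move right


def step(state, action):
    """Deterministic transition: returns (next_state, reward)."""
    next_state = min(max(state + action, 0), 3)
    reward = 1.0 if next_state == 3 else 0.0
    return next_state, reward


def value_iteration(sweeps=50):
    """Repeatedly apply the Bellman optimality update to every state."""
    v = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        for s in STATES[:-1]:          # the terminal state keeps value 0
            v[s] = max(r + GAMMA * v[s2]
                       for s2, r in (step(s, a) for a in ACTIONS))
    return v


values = value_iteration()
```

Each state's value ends up as the discounted reward for walking right to the goal: 1.0 one step away, 0.9 two steps away, 0.81 three steps away.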

There are so many links and research articles about all these sections that listing them would be too long a task (and post). You can check out my GitHub for some examples, but really you should search Google and YouTube for a lot of these sections. (I might create such a resource on my GitHub page: https://github.com/FitMachineLearning/FitML)

Also, each of these sections involves a lot of math and sometimes math intuition. I highly recommend sitting through many of the free AI university classes online (YouTube). Message me as well if you can’t find good material and teachers about a specific section (some teachers are bad, and some don’t know the material that well.)

PS: You are doing the right thing, i.e. asking the right question from the start. Half of the problem is formulating the right question.


I don’t remember more than basic algebra; would you mind sharing a detailed list of things to brush up on (immediately after basic algebra) up to calculus and linear algebra?

I have started brushing up on matrices and vectors, but calculus and linear algebra are comprehensive and I don’t want to waste time learning parts I won’t use in Udacity.

For example, at the end of a linear algebra online course the instructor started talking about angles, sin cos tan, things I remember almost nothing about and totally threw me off.

So I had to add trigonometry to the learning list, but I don’t know what my full list is.

I’d like to learn the pre-requisites and then do the Udacity course within a 6 month timespan (2 for prereq, 4 for course).


Caveat: I cannot advise in great detail because I don’t know you well enough to assess your current level of knowledge. Also I am not a teacher or lecturer, so cannot construct a course plan, and definitely not something that fits any kind of timetable.

Instead, I would advise you to pick up minimal basics but in a way that makes you confident that you really understand them. Then start tackling the machine learning that is your goal - whenever you hit some part which you don’t understand, take a step back and research that topic. As long as you are not enrolled in some class where you have to submit work every week, there is no need to try and create your own complete course guide from the outset.

There is not much subject area in between basic algebra and linear algebra or linear algebra and calculus. If you have basic algebra then both linear algebra and calculus are understandable. You only need the basics in each subject. E.g. in linear algebra you will need:

  • What vectors and matrices are

  • How to add, subtract and multiply vectors and matrices
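
As a rough check of whether these operations feel comfortable, here is a sketch in plain Python with vectors as lists and matrices as lists of rows. The helper names are made up for illustration; in practice a library such as NumPy would normally do this for you.

```python
def vec_add(a, b):
    return [x + y for x, y in zip(a, b)]


def vec_sub(a, b):
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    """Dot product: multiply matching entries and sum them."""
    return sum(x * y for x, y in zip(a, b))


def mat_vec(m, v):
    """Matrix times vector: one dot product per row of the matrix."""
    return [dot(row, v) for row in m]


def mat_mul(a, b):
    """Matrix times matrix: each row of a dotted with each column of b."""
    columns = list(zip(*b))
    return [[dot(row, col) for col in columns] for row in a]


v = [1, 2]
w = [3, 4]
m = [[1, 2],
     [3, 4]]
n = [[0, 1],
     [1, 0]]

total = vec_add(v, w)        # [4, 6]
difference = vec_sub(w, v)   # [2, 2]
inner = dot(v, w)            # 11
transformed = mat_vec(m, v)  # [5, 11]
product = mat_mul(m, n)      # [[2, 1], [4, 3]]
```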

In calculus you will need:

  • What a derivative is, and what a partial derivative is

  • The “chain rule” or how to take a derivative when you have a function of a function
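
One way to build intuition for these two ideas is to estimate derivatives numerically and compare them with what the rules predict. The sketch below does that with a central difference; the example functions and the step size `h` are arbitrary choices for illustration.

```python
import math


def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def partial(f, point, index, h=1e-6):
    """Estimate the partial derivative of f with respect to one argument."""
    up, down = list(point), list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)


def inner(x):
    return x ** 2


def composed(x):
    # A function of a function: sin(x^2).
    return math.sin(inner(x))


x0 = 1.5
numeric = derivative(composed, x0)
# Chain rule: derivative of the outer function (cos) evaluated at inner(x),
# times the derivative of the inner function (2x).
analytic = math.cos(inner(x0)) * 2 * x0


def g(x, y):
    return x ** 2 * y


# Partial derivative with respect to x treats y as a constant: 2 * x * y.
dg_dx = partial(g, (3.0, 2.0), 0)
```

Here `numeric` and `analytic` should agree to several decimal places, and `dg_dx` should come out close to 12.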

In stats you will need:

  • Different types of average (mean vs median, geometric mean vs arithmetic mean)

  • What the variance, standard deviation and standard error are
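
These quantities can all be computed with Python's built-in `statistics` module, which makes it easy to experiment. The data set below is made up, and `geometric_mean` assumes Python 3.8 or newer.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)                 # arithmetic mean
median = statistics.median(data)             # middle value of the sorted data
geo_mean = statistics.geometric_mean(data)   # n-th root of the product

# Population variance and standard deviation (divide by n).
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

# Standard error of the mean: sample standard deviation (divide by n - 1)
# divided by the square root of the sample size.
std_error = statistics.stdev(data) / math.sqrt(len(data))
```

For this data the mean is 5, the median is 4.5, the population variance is 4 and the population standard deviation is 2. The geometric mean is never larger than the arithmetic mean.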

Those are the basics you will need to know comfortably well in order to understand the theories and notation in machine learning. It doesn’t hurt to understand them more deeply.