Q-Learning with continuous states


I would like to try Q-learning for trading (am relatively new in this field).

The problem I’m dealing with is that I have continuous states (for example, the closing price or a trend).

I have read that one approach is to discretize the values. As far as I understand, the discretization is based on min and max values. Doesn’t the entire model break if, for example, the stock price increases or decreases by 50%?
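For concreteness, here is a minimal discretization sketch (all values are illustrative): bin edges come from the training range, and out-of-range prices are clipped into the edge bins so a +50% move degrades the state representation instead of breaking the lookup outright.

```python
import numpy as np

# Hypothetical example: discretize a closing price into 10 bins
# using the min/max observed in training data.
train_min, train_max = 80.0, 120.0   # assumed historical range
n_bins = 10
edges = np.linspace(train_min, train_max, n_bins + 1)

def discretize(price):
    # np.digitize returns 0 below the first edge and n_bins + 1
    # above the last; clipping keeps out-of-range prices in the
    # edge bins instead of producing an invalid table index.
    bin_idx = np.digitize(price, edges) - 1
    return int(np.clip(bin_idx, 0, n_bins - 1))

print(discretize(95.0))   # an interior bin
print(discretize(180.0))  # +50% move still maps to the top bin
```

Clipping avoids a crash, but every price above the training max collapses into one bin, which is exactly the information loss you are worried about.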

Is there any other simple way how to handle this?

Thankful for any kind of advice 🙂


You can just use function approximation for continuous state spaces.


Check out the landmark DQN paper (https://arxiv.org/pdf/1312.5602.pdf). It is fairly standard practice nowadays to use a nonlinear function approximator such as a neural network to approximate the Q-function over a continuous state space. In your case, I believe a standard feedforward neural net will work well as the function approximator.
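To show the core idea without the full DQN machinery (replay buffer, target network), here is a minimal sketch of Q-learning with a *linear* function approximator over continuous features; the feature count, action names, and hyperparameters are all illustrative.

```python
import numpy as np

# Q(s, a) = w[a] . s, where s is a continuous feature vector
# (e.g. normalized price, trend). Setup is illustrative only.
n_actions, n_features = 3, 2        # e.g. buy / hold / sell
w = np.zeros((n_actions, n_features))
alpha, gamma = 0.1, 0.99            # learning rate, discount

def q_values(state):
    return w @ state                # one Q-value per action

def td_update(state, action, reward, next_state, done):
    # Standard TD target; bootstrap from the best next action.
    target = reward if done else reward + gamma * q_values(next_state).max()
    td_error = target - q_values(state)[action]
    w[action] += alpha * td_error * state   # gradient step on w[action]
    return td_error

# One illustrative transition on random feature vectors.
rng = np.random.default_rng(0)
s, s_next = rng.random(n_features), rng.random(n_features)
td_update(s, action=1, reward=1.0, next_state=s_next, done=False)
print(q_values(s))
```

A DQN replaces the linear map with a neural network and adds experience replay and a target network for stability, but the TD update above is the same idea.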


The term “function approximation” used to throw me off, but it can also be thought of as supervised learning.


I did this about 10 years ago with forex.

Essentially, you need a sliding window over the price action.
Within each window there will be a min and max value, e.g. 1.31 and 1.55. You standardize the window to the 0–1 range.
The output is whether the price went up or down by x points over the next hour, day, etc., binarizing this data to 0 or 1.
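The windowed feature/label construction described above can be sketched like this; the price series, window length, horizon, and threshold x are all placeholder values.

```python
import numpy as np

# Toy price series; in practice this would be real closes.
prices = np.array([1.31, 1.35, 1.40, 1.38, 1.45, 1.55, 1.50, 1.52])
window, horizon, x = 4, 1, 0.0   # illustrative settings

features, labels = [], []
for i in range(len(prices) - window - horizon + 1):
    win = prices[i:i + window]
    lo, hi = win.min(), win.max()
    # Min-max standardize each window to the 0..1 range.
    features.append((win - lo) / (hi - lo))
    # Binary label: 1 if price rose by more than x over the horizon.
    future = prices[i + window + horizon - 1]
    labels.append(int(future - win[-1] > x))

print(np.round(features[0], 2), labels[0])
```

Each (features, label) pair is then a plain supervised-learning sample you can feed to a feedforward network.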

You train that network to predict your output. As a heads up, you will be oh so disappointed to find it only gets this right about 55% of the time. WTF? Well, that is what a 5% edge looks like in the biggest game of blackjack in the world. You will also quickly realize that YOU can’t get rich off such a small edge alone; you need something else. But this is not a computational finance forum, so I’ll stop here.