Gym and agent for algorithmic stock and cryptocurrency trading


Hi everyone,

I don’t have much experience with AI in general (other than the basics), or with TensorFlow, Gym, or Universe specifically. But when I read about Universe, I wondered whether a gym could be created that contains past trading data shown in chronological order (just like a market ticker), with “buy” and “sell” as the actions the agent can take. The money made by trading would be the score / reward for the agent.
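To make the idea concrete, here is a toy sketch of such an environment. It is purely illustrative: it follows the Gym `reset()`/`step()` convention but is written as a plain class (no `gym` dependency), and the window size, action encoding, and reward are my own simplifying assumptions, not a real market model.

```python
class TickerEnv:
    """Toy environment following the Gym reset()/step() convention.

    The observation is a sliding window of past prices shown in
    chronological order (like a ticker); actions are 0 = buy/hold one
    unit and 1 = sell/stay in cash; the reward is the profit or loss
    realised on each step.
    """

    def __init__(self, prices, window=10):
        self.prices = [float(p) for p in prices]
        self.window = window
        self.t = 0
        self.position = 0  # 0 = cash, 1 = long one unit

    def reset(self):
        self.t = self.window
        self.position = 0
        return self.prices[self.t - self.window:self.t]

    def step(self, action):
        self.position = 1 if action == 0 else 0
        # Reward is the price change captured while holding the asset.
        reward = self.position * (self.prices[self.t] - self.prices[self.t - 1])
        self.t += 1
        done = self.t >= len(self.prices)
        obs = self.prices[self.t - self.window:self.t]
        return obs, reward, done, {}
```

An agent would then loop over `reset()`/`step()` exactly as with any other Gym environment, and its cumulative reward is the trading P&L.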

  1. Does this make sense at all or are there better ways to train an AI for algo trading?
  2. Why is this not more popular? Is algo trading already “solved”, making this not a useful case for Gym / Universe?
  3. How would you start doing this? What are the best starting points for creating said gym and agent?

Anyone interested in joining forces? I’m a software engineer and Bitcoin / Blockchain enthusiast / trader.


I think it’s because algo trading can be done via APIs; you would just be adding an extra layer, the human interface, which would slow down how fast the trading bot could work.


Well, for high frequency trading yes. But wouldn’t it also be interesting
to see how an AI behaves with the tools / possibilities of a day trader?

And do Universe gyms and agents require an actual full blown GUI? Maybe a
command line interface is enough? That would decrease the performance loss.


I would be interested :slight_smile:


I made this a few months back; it creates a trading environment, but not in Universe.

If you are interested we can talk. I am currently working on a dynamic portfolio optimization algorithm for stocks using RL.


I’m interested and posted a similar message a few days ago. We can join forces. I’ve been successfully trading for 3 years writing MetaTrader experts. joe at joemagicdeveloper dot com.


I’ve put together a very simple gym environment for single-instrument trading reinforcement learning algos here:


I’m new to OpenAI and gym. Would you mind posting the process and sequence of commands needed to create a new gym environment like yours (I’m using Ubuntu 14.04 and have all the dependencies installed)? I’ve searched for an entire day but couldn’t find any detailed explanation.
Thank you.


Step 1: Install all the dependencies. You said you’ve already done this. If any of the steps below fail, it may be due to missing bits, in which case you need to find out what’s missing and install it, either as a base package on your machine or as something Python-specific via pip install.
Step 2: Get an existing custom gym environment running. For this, git clone e.g. the gym-trading GitHub project mentioned above and run e.g. pip install . (or pip install --user .) in said project’s main directory (the one that contains setup.py). Then you will want some test code that you either write yourself (e.g. import gym_trading and go from there) or that has been provided (gym_trading ships with several examples).
Step 3: Once you have someone else’s custom env working, have a look at the different files in it and figure out what you’d need to change to provide your own version. Or you could just fork the repo and make changes in the fork. Essentially you need to provide a SomethingEnv class that inherits from gym.Env. I’ve figured out how to do this myself and am in the middle of making some custom envs; check out my GitHub and look for a gym-minimal repo.
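As a concrete illustration of Step 3, here is a minimal skeleton of such a SomethingEnv class. The class name, the spaces, and the zero reward are placeholders of my own; you would substitute real market observations and trading P&L. It assumes the classic 4-tuple `step()` API from the gym versions discussed in this thread.

```python
import gym
import numpy as np
from gym import spaces

class MinimalTradingEnv(gym.Env):
    """Bare-bones custom env skeleton: replace the placeholder
    observations and reward with real market data and trading PnL."""

    def __init__(self, n_steps=100):
        super().__init__()
        self.action_space = spaces.Discrete(2)  # 0 = buy, 1 = sell
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf, shape=(1,), dtype=np.float32)
        self.n_steps = n_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return np.zeros(1, dtype=np.float32)  # placeholder observation

    def step(self, action):
        self.t += 1
        obs = np.zeros(1, dtype=np.float32)   # placeholder observation
        reward = 0.0                          # placeholder reward
        done = self.t >= self.n_steps
        return obs, reward, done, {}
```

With a setup.py around it, `pip install .` makes the env importable from your own test scripts like any other package.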


I just started researching how cryptocurrency, day trading, and artificial intelligence work together and would love to receive some recommendations on where to begin with those topics. My goal is also to develop a solution able to predict Bitcoin (or other virtual coin) prices to help anyone make better trades. I would love it if anyone could suggest some good materials. My contact is ‘me at felipedearaujo dot com’.



Thinking of modifying the script to carry out the following:

  1. Buy at bar open, exit at close (action 0)
  2. Sell at bar open, exit at close (action 1)
  3. Do nothing (action 2)

It should run on any time frame: 1, 5, 30, 60, or 240 minutes, daily, weekly, or monthly.

And perhaps add some indicators, e.g. ATR and a moving average.
This would provide a time-based option for exiting trades rather than a price-based one.

Any thoughts on this?
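For the indicator part, here is a minimal sketch of the two indicators mentioned. Note one simplifying assumption: the ATR here is a plain moving average of the true range, not Wilder's exponential smoothing that charting platforms usually apply.

```python
def sma(values, n):
    """Simple moving average; None until n values are available."""
    return [None if i < n - 1 else sum(values[i - n + 1:i + 1]) / n
            for i in range(len(values))]

def atr(highs, lows, closes, n=14):
    """Average True Range, computed as a simple moving average of the
    true range: max(high-low, |high-prev_close|, |low-prev_close|)."""
    trs = []
    for i in range(len(closes)):
        if i == 0:
            trs.append(highs[0] - lows[0])  # no previous close on bar 0
        else:
            trs.append(max(highs[i] - lows[i],
                           abs(highs[i] - closes[i - 1]),
                           abs(lows[i] - closes[i - 1])))
    return sma(trs, n)
```

Both functions are time-frame agnostic: they only see a sequence of bars, so the same code works on 1-minute or monthly data.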




I am actively working on building a stock market gym for OpenAI. If you are interested in working together on it let me know.


I’m new to Universe and AI, but this is a project that I would like to work on. Is there a group working on this that I could follow along with?


this sounds like a very convoluted way to backtest… which most good platforms support out of the box…

My understanding of reinforcement learning is that it assumes something is repeatable, with rewards and costs… simply sticking Gym on a trading platform is no different from backtesting… please correct me if I’m wrong.


Currently running some tests, producing results using 25% of the data as in-sample data and 75% as test data, with 2 indicators.

Have a look here; it requires CSV data for your favorite stock or forex pair.

Still under development; new indicators will be added soon.



Hi everyone!
I used the Henry Bee project (with some improvements) to gather all that knowledge and merge it with OpenAI Baselines.

I hope you will find it interesting; let me know if you have any questions or doubts.



Hi Everyone,

I’m relatively new to Q-learning; I’ve done research using machine learning techniques on a topic that is an extension of this one. I’ll dig into that in the coming days and weeks, but does anyone have an idea how we could change the problem architecture a little to have the agent select a basket of assets within a universe at every period? This moves the problem from a pure time-series approach to a cross-sectional one. As an example, say we have the S&P 500 and we would like the agent to select 25 stocks out of the 500 every month based on stock-specific features. We could easily replicate the problem with 10 currency pairs to start, having the agent select 3 of them. The reward function could be relatively unchanged, but we could use a risk/reward measure instead of a pure reward measure. Let me know if any of you would like to help on this journey.
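The cross-sectional step described above could be sketched like this. Both the top-k selection rule and the std-dev penalty are illustrative assumptions (a real agent would learn the scores, and one might prefer a Sharpe-style ratio as the risk/reward measure):

```python
def select_basket(scores, k):
    """The cross-sectional 'action': indices of the k highest-scoring
    assets (e.g. 25 of 500 stocks, or 3 of 10 currency pairs)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def basket_reward(returns, idx, risk_aversion=1.0):
    """Risk-adjusted reward for the selected basket: mean return
    minus a penalty on its standard deviation."""
    basket = [returns[i] for i in idx]
    mean = sum(basket) / len(basket)
    var = sum((r - mean) ** 2 for r in basket) / len(basket)
    return mean - risk_aversion * var ** 0.5
```

Each period, the agent scores the universe from stock-specific features, forms the basket, and receives the risk-adjusted basket return as its reward.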




I have been developing an auto-trading bot based on Q-learning and a deep Q-network; after training, trades are executed automatically. Interesting results so far; prototype development is still ongoing.

Let me know what you think; info can be found here
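For anyone following along, the core of the tabular Q-learning that such a bot builds on can be written in a few lines (a deep Q-network replaces the table with a neural net). The dict-of-dicts Q-table and the state/action names are illustrative:

```python
def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step:
    Q[s][a] += alpha * (r + gamma * max_a' Q[s'][a'] - Q[s][a]).
    Q is a dict mapping state -> {action: value}."""
    nxt = Q.get(next_state)
    best_next = max(nxt.values()) if nxt else 0.0  # 0 for unseen states
    Q.setdefault(state, {}).setdefault(action, 0.0)
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
    return Q[state][action]
```

In a trading setting the state would be a discretised market feature vector, the actions buy/sell/hold, and the reward the realised P&L of the step.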



We have developed an automated high speed cryptocurrency trading code that can be adapted to all possible trading strategies.

Visit our page for more info. Contact us if you want to use the system or our tested bots.



I’ve been interested in this idea as well. I thought about using a classification algorithm instead of reinforcement learning, to just determine whether a stock/asset is overpriced or underpriced and to buy or sell accordingly. It seems futile to try to compete with high-frequency traders and quant funds who already have trading algorithms figured out, unless you have the money and manpower to brute-force higher speed or efficiency. My idea was to approach it more from a Warren Buffett-style, value-based angle to optimize low-frequency, low-volume investments or buy-and-hold strategies.
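The classification framing could look something like the toy sketch below. The fair-value estimate is assumed to come from fundamentals (the hard part a real classifier would learn), and the 10% band is an arbitrary illustrative threshold:

```python
def label_valuation(price, fair_value_estimate, band=0.1):
    """Label an asset relative to a fair-value estimate, treating
    prices within +/- band of the estimate as fairly priced."""
    if price < fair_value_estimate * (1 - band):
        return "underpriced"
    if price > fair_value_estimate * (1 + band):
        return "overpriced"
    return "fair"

def decide(label):
    """Map the classification to a low-frequency buy-and-hold action."""
    return {"underpriced": "buy", "overpriced": "sell", "fair": "hold"}[label]
```

A trained classifier would replace `label_valuation`, predicting the label directly from fundamental features instead of requiring an explicit fair-value number.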