Reinforcement learning code

RL software for this course

There are many frameworks for reinforcement learning available on the web. We will be using the most widely used interface, called RL-glue. The interface is meant to provide only the basic functionality required to run a reinforcement learning experiment. The idea is to standardize the communication between three components common to every RL experiment:

  • the agent program (learning algorithm and action selection mechanism)
  • the environment program (defines the problem, a bandit or MDP)
  • the experiment program (defines how an experiment is run and what performance measures are used)

RL-glue is a collection of functions that are called by the experiment program. These functions in turn call the functions defined by the agent and environment programs. RL-glue is the glue that holds the three parts together, standardizing communication and also preventing the agent from directly communicating with the environment. 

 rl-glue.jpg

 The figure above visualizes this simple interface, and the arrows indicate which entity calls which. 
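As a rough illustration of the call structure described above, here is a minimal sketch in C. Every name and signature in it (env_start, env_step, agent_start, agent_step, agent_end, rl_start, rl_step, rl_episode, rl_num_steps, run_experiment) is an assumption made for this example, loosely modeled on RL-glue naming; the provided code may differ, so treat the README.md and the actual source as the reference. The toy environment and the always-move-right agent are also hypothetical. The point is only that the experiment calls the glue, and the glue calls the agent and the environment, which never call each other.

```c
/* Sketch only: names and signatures are illustrative assumptions,
   not the API of the provided RL-glue code. */
#include <assert.h>

/* ---- environment program: defines the problem ---- */
static int env_state;

/* Start an episode and return the first observation. */
int env_start(void) {
    env_state = 0;
    return env_state;
}

/* Apply an action, report reward and termination, return the next observation.
   Toy problem: action 1 moves right, the episode ends on reaching state 3. */
int env_step(int action, double *reward, int *terminal) {
    assert(action == 0 || action == 1);
    if (action == 1) env_state++;
    *terminal = (env_state >= 3);
    *reward = *terminal ? 1.0 : 0.0;
    return env_state;
}

/* ---- agent program: action selection (no learning in this sketch) ---- */
int agent_start(int observation) { (void)observation; return 1; }
int agent_step(double reward, int observation) {
    (void)reward; (void)observation;
    return 1; /* always move right */
}
void agent_end(double reward) { (void)reward; }

/* ---- glue: the only code that talks to both agent and environment ---- */
static int last_action;
static double total_reward;
static int num_steps;

void rl_start(void) {
    int obs = env_start();
    last_action = agent_start(obs);
    total_reward = 0.0;
    num_steps = 0;
}

/* Advance one time step; returns 1 if the episode ended. */
int rl_step(void) {
    double reward;
    int terminal;
    int obs = env_step(last_action, &reward, &terminal);
    total_reward += reward;
    num_steps++;
    if (terminal) agent_end(reward);
    else last_action = agent_step(reward, obs);
    return terminal;
}

/* Run one episode, capped at max_steps; return the total reward. */
double rl_episode(int max_steps) {
    rl_start();
    while (num_steps < max_steps && !rl_step()) { /* keep stepping */ }
    return total_reward;
}

int rl_num_steps(void) { return num_steps; }

/* ---- experiment program: calls only the glue ---- */
double run_experiment(int num_episodes, int max_steps) {
    double sum = 0.0;
    for (int i = 0; i < num_episodes; i++) sum += rl_episode(max_steps);
    return sum; /* sum of episode returns */
}
```

An experiment program would call something like run_experiment(5, 100) from its main function and record the result. Because the agent only ever sees observations and rewards passed to it by the glue, swapping in a different agent or environment does not require touching the other two parts.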

 

The best way to learn to use RL-glue is to actually look at the code and try it out. The code is usually self-evident or commented. I have provided the interface in C, along with a sample agent, environment, and experiment. I personally believe C is the best language for RL experiments because it is very fast, and RL experiments are typically much more computationally intensive than supervised learning experiments.

We will also allow C++ for implementation. Please use SILO or SHARKS as a test bed for which modern C++ standards/features you can use. Personally I would stick with the C++98 or C++11 standards. Please see the README.md file for more details on how to use C++.

Source code

The code linked below contains an agent that just selects random actions, ignoring the state/observation; an environment that randomly generates rewards and observations; and an experiment program that runs many episodes and plots the cumulative reward versus episode. The code includes all the RL-glue files in a subdirectory and a configurable makefile to compile everything. If you find any bugs let me know!!

RLClass_dist.zip

The code above could be considered a stripped-down version of RL-glue. The official RL-glue software uses network socket communication to facilitate multi-language support. The socket stuff is far too complex for what we want to achieve in this course, and it is certainly much slower than the code I have provided.
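To give a feel for what the sample agent, environment, and experiment described above are doing, here is a small sketch in C. It is not taken from the zip file: the function names (random_agent_step, random_env_reward, cumulative_reward), the NUM_ACTIONS constant, and the choice of rewards drawn uniformly from [0, 1) are all assumptions made for illustration, and the provided code may be organized quite differently.

```c
/* Sketch only: hypothetical helpers, not the code in RLClass_dist.zip. */
#include <assert.h>
#include <stdlib.h>

#define NUM_ACTIONS 4 /* assumed number of actions for this example */

/* Random agent: ignores the observation and picks any action.
   (rand() % n has a slight modulo bias, which is fine for a sketch.) */
int random_agent_step(int observation) {
    (void)observation;
    return rand() % NUM_ACTIONS;
}

/* Random environment: reward drawn uniformly from [0, 1). */
double random_env_reward(void) {
    return (double)rand() / ((double)RAND_MAX + 1.0);
}

/* Experiment side: turn per-episode returns into the cumulative
   reward curve that gets plotted against the episode number. */
void cumulative_reward(const double *episode_return, int n, double *out) {
    assert(n >= 0);
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        total += episode_return[i];
        out[i] = total;
    }
}
```

For example, per-episode returns of 1, 2, and 3 give a cumulative curve of 1, 3, and 6. Call srand once at the start of the experiment so that runs can be repeated with a known seed.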

 

To run your first experiment

To get the code working follow these steps:

  1. download the zip file 
  2. unzip the file
  3. on the command line (e.g. the Terminal app in OS X) type: make
  4. on the command line type: ./RL_Exp or make run
  5. you should see RL_EXP_OUT.dat once the experiment is complete
  6. to plot, open R on the command line and type: source("plot.r")

This demo requires a C/C++ compiler to run (the makefile uses gcc/g++ by default) and R to plot. Don't worry too much about the plotting; you can choose your own plotting language.

If everything works out you should see an output similar to this:

 RLplotwindow.jpg

And that's it!

 

Requirements for programming questions and projects

You are required to use the RL-glue specification for all your programming questions.  

 

Some Tips:

  1. Read the README.md in the framework. I worked hard on this, and it has useful tips and general explanations of how most parts of the code work (if you find mistakes let me know).
  2. Make sure you understand what each function is doing and in what order things are called.
  3. Get it working on your setup before the day homework is due. (RL experiments can contain elusive bugs.)
  4. Ask questions if you are stuck. The point of using the framework is to make everyone's lives much easier!
  5. To recompile everything from scratch, run 'make clean' first; this removes your current compilation completely.
  6. As a last resort, make sure your code compiles on SILO/SHARKS. This is where the TAs will grade if your code doesn't work on their machines. (You don't have to worry about ensuring your LIBS and INCLUDE variables in the makefile are correct for our machines when submitting.)

 

What to turn in with your assignment:

Zip/tar your code and upload it alongside the .pdf with your answers. The compressed file should contain, at the bare minimum: your plotting code; all your agents, environments, and experiments; and your Makefile.settings file. You can also just zip all the code, including the provided code, and submit this, but please don't make any changes to any files in the rlglue folder.

If you are unsure whether to include a file, include it.

Test on SILO/SHARKS to make sure it will compile on a machine everyone has access to.