Reinforcement learning code

RL software for this course

There are many frameworks for reinforcement learning available on the web. We will be using the most widely used interface, called RL-glue. The interface is meant to provide only the basic functionality required to run a reinforcement learning experiment. The idea is to standardize the communication between three components common to every RL experiment:

the agent program (learning algorithm and action selection mechanism)
the environment program (defines the problem, a bandit or MDP)
the experiment program (defines how an experiment is run and what performance measures are used)

RL-glue is a collection of functions that are called by the experiment program. These functions in turn call the functions defined by the agent and environment programs. RL-glue is the glue that holds the three parts together, standardizing communication and also preventing the agent from directly communicating with the environment.
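To make the call structure concrete, here is a minimal sketch in C of how the glue can sit between the agent and the environment. It is not the course code: the function names are loosely modeled on the RL-glue naming style, and the simplified types, the five-step toy environment, and the step cap in RL_episode are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical simplified type; the course code may define this differently. */
typedef struct {
    double reward;
    int observation;
    int terminal;              /* 1 when the episode has ended */
} reward_observation_t;

/* ---- agent program (assumed: picks one of two actions at random) ---- */
static int agent_start(int observation) { (void)observation; return rand() % 2; }
static int agent_step(double reward, int observation) {
    (void)reward; (void)observation;
    return rand() % 2;
}
static void agent_end(double reward) { (void)reward; }

/* ---- environment program (assumed: a toy task that ends after 5 steps) ---- */
static int env_position;
static int env_start(void) { env_position = 0; return env_position; }
static reward_observation_t env_step(int action) {
    reward_observation_t ro;
    (void)action;              /* this toy environment ignores the action */
    env_position++;
    ro.reward = 1.0;
    ro.observation = env_position;
    ro.terminal = (env_position >= 5);
    return ro;
}

/* ---- glue: the only functions the experiment program calls ---- */
static int last_action;
static double total_reward;
static int num_steps;

void RL_start(void) {
    int obs = env_start();             /* the glue asks the environment... */
    last_action = agent_start(obs);    /* ...and passes the result to the agent */
    total_reward = 0.0;
    num_steps = 0;
}

int RL_step(void) {
    reward_observation_t ro = env_step(last_action);
    total_reward += ro.reward;
    num_steps++;
    if (ro.terminal) agent_end(ro.reward);
    else last_action = agent_step(ro.reward, ro.observation);
    return ro.terminal;
}

/* Runs one episode, capped at max_steps; returns 1 if it terminated. */
int RL_episode(int max_steps) {
    int terminal = 0;
    assert(max_steps > 0);
    RL_start();
    while (!terminal && num_steps < max_steps) terminal = RL_step();
    return terminal;
}

double RL_return(void) { return total_reward; }
int RL_num_steps(void) { return num_steps; }
```

Note that the agent functions never call the environment functions, or the other way around. Every exchange goes through RL_start and RL_step, which is what keeps the agent from communicating with the environment directly.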

The figure above visualizes this simple interface, and the arrows indicate which entity calls which.

The best way to learn to use RL-glue is to actually look at the code and try it out. The code is usually self-evident or commented. I have provided the interface in C, along with a sample agent, environment, and experiment. I personally believe C is the best language for RL experiments because it is very fast and RL experiments are typically much more computationally intensive than supervised learning experiments.

Source code

The code linked below contains an agent that just selects random actions, ignoring the state/observation; an environment that randomly generates rewards and observations; and an experiment program that runs many episodes and plots the cumulative reward versus episode. The code includes all the RL-glue files in a subdirectory and a simple makefile to compile everything. If you find any bugs, let me know!
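As a rough illustration of what an experiment program of this kind does, the sketch below accumulates reward over a series of episodes. It is not taken from the provided code: run_one_episode is a hypothetical stand-in for running one episode through the glue, and run_experiment and its parameters are names chosen here for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for running one episode through the glue:
   returns that episode's total reward, drawn uniformly from [0, 1). */
static double run_one_episode(void) {
    return (double)rand() / ((double)RAND_MAX + 1.0);
}

/* Fills result[i] with the reward accumulated over episodes 0..i. */
void run_experiment(double *result, int num_episodes) {
    double cumulative = 0.0;
    int i;
    assert(result != NULL && num_episodes >= 0);
    for (i = 0; i < num_episodes; i++) {
        cumulative += run_one_episode();
        result[i] = cumulative;
    }
}
```

The resulting array is the kind of data a cumulative-reward-versus-episode plot is drawn from. Because the rewards in this sketch are never negative, the curve can only rise or stay flat.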

The code above could be considered a stripped-down version of RL-glue. The official RL-glue software uses network socket communication to facilitate multi-language support. The socket machinery is far too complex for what we want to achieve in this course, and it is certainly much slower than the code I have provided.

To run your first experiment

To get the code working, follow these steps:

download the zip file
unzip the file
on the command line (e.g. the Terminal app in OS X) type: make
on the command line type: ./RL_exp

This demo requires a C compiler and gnuplot, both installed and callable from the command line. If you don't have gnuplot, just comment out line 68 in SimpleExp.c:

plotResults(result, numEpisodes);
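If you comment out the plotting call, you may still want to see the numbers. One possible approach, sketched below, is to print them as plain text. print_results is a hypothetical helper that is not part of the provided code, and it assumes result is an array of doubles holding one value per episode.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes one "episode value" pair per line to out.
   Returns the number of lines successfully written. */
int print_results(FILE *out, const double *result, int num_episodes) {
    int i;
    assert(out != NULL && result != NULL);
    for (i = 0; i < num_episodes; i++) {
        if (fprintf(out, "%d %f\n", i, result[i]) < 0) return i;
    }
    return num_episodes;
}
```

Under those assumptions, calling print_results(stdout, result, numEpisodes) in place of the plotting call prints two columns. You can redirect them to a file and plot it later with any tool you like.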

If everything works out you should see an output similar to this:

And that's it!

Requirements for programming questions and projects

You are required to use the RL-glue specification for all your programming questions. You are not required to use C; that is a suggestion, and if you follow it you get free sample code. I will not provide code in other languages.

If you choose to program in a language other than C

If you decide not to use C and program in another language, you will have to implement your own RL-glue interface. That means all the functionality described here. All code must be compatible with the RL-glue interface specification.