Projects

Ambition

The idea behind the project is that you should begin a real research study, something that is novel in some way. For example, you could (1) ask an empirical question about existing methods whose answer is not known in the literature, (2) come up with a new learning method (it could be heuristic), or (3) implement and test a novel application of RL in a practically relevant or interesting domain. You don't have to submit a complete conference paper, but if you were given more time (say, 1-2 more months) you would have something worthy of submission to a major AI or machine learning conference.

Possible topics

Below I list several possible projects with links to get you started. In most cases I have just listed the title of the project. Over the term I will add more projects to this list and in most cases add a bit more description to each of them. Please feel free to ask me about any of these in person. A few of the projects were specified by Prof Martha White. In those cases she would be happy to meet with you once and describe what she had in mind about those projects.

An empirical evaluation framework for value function learning (prediction)
Representation learning (adapting the features during learning)
Review, implement, and compare continuous-action methods in reinforcement learning. Good starting place
Empirical comparison of step-size adaptation methods. Start here
Implement and test the alpha bounds meta-learning algorithm on the Arcade Learning Environment
Implement and test a lambda adaptation algorithm on the Arcade Learning Environment
Empirical comparisons of state-of-the-art bandit learning methods. Start here
Adapting contextual bandit algorithms to reinforcement learning problems (Prof Martha White)
A comparison of evolutionary computing (EC) and value-function learning on reinforcement learning tasks. When should we expect value-function methods to outperform EC? Illustrate with experimental evidence. Previous study
Empirical comparison of fixed feature representations in RL, specifically tile coding, radial basis functions, large random representations, and Fourier basis
Learning general value functions on robot data. Learning multiple predictions about the sensory data and then using those predictions as features. Code and data available. Paper
Hard applications:

HIV management
Octopus arm
Helicopter hovering (safe exploration)
Simulated soccer
More potential applications here

Projects that involve experiments

Please read my page on how to conduct RL experiments!!!
I highly recommend you use the RL-Glue code provided to the class. It is well designed and has been used for research for over a decade.

Project proposal

The objective of this one or two page document is to

specify the research question
cite the relevant literature and explain why the research to be done will be novel
motivate why this is an important question
outline a plan for how you expect to complete the project
explain how you plan to evaluate your contribution

There are no marks for the project proposal and it is optional, but it is your chance to get my feedback about your project early on!!!

Final report

The final report should include motivation, problem description, and related work sections.

In addition, the report will describe your approach:

for applications, this means how you got your agent to learn and how you modeled the domain (what are the states, actions, rewards, etc.)
for empirical comparisons, this is how you conducted your experiments
for new algorithms, this is how you came up with the method (was it derived from an objective?)
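
As an illustration of what "modeling the domain" means for an application project, here is a minimal sketch in Python that writes down the states, actions, and rewards of a toy problem explicitly. The Corridor domain, its reward of -1 per step, and the reset/step interface are all hypothetical choices made for this example. This is not the RL-Glue interface provided to the class.

```python
from dataclasses import dataclass


@dataclass
class Corridor:
    """Hypothetical toy domain: start at the left end, reach the right end."""
    length: int = 5            # states: integer positions 0 .. length - 1
    step_reward: float = -1.0  # reward: the same value on every transition
    position: int = 0          # current state

    ACTIONS = (-1, +1)         # actions: move one cell left or right

    def reset(self) -> int:
        """Start a new episode and return the initial state."""
        self.position = 0
        return self.position

    def step(self, action: int):
        """Apply an action; return (reward, next_state, terminal)."""
        if action not in self.ACTIONS:
            raise ValueError(f"unknown action {action!r}")
        # Moves that would leave the corridor keep the agent at the wall.
        self.position = min(max(self.position + action, 0), self.length - 1)
        terminal = self.position == self.length - 1
        return self.step_reward, self.position, terminal


def run_episode(env, policy, max_steps=1000):
    """Run one episode and return the undiscounted sum of rewards."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        reward, state, terminal = env.step(policy(state))
        total += reward
        if terminal:
            break
    return total


if __name__ == "__main__":
    # A hand-coded policy that always moves right.
    print(run_episode(Corridor(), lambda state: +1))
```

Whatever interface you use, the report should make the same three things explicit: what the agent observes, what it can do, and what it is rewarded for.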

Your report will also have to evaluate your contribution. Usually this is an experiment showing that your application or new method works when compared to reasonable baselines (what is the current state of the art? a random policy, a hand-coded policy, etc.). If your project is an empirical comparison between existing methods, your contribution is the design and execution of the experiments.
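
One possible shape for such a comparison is sketched below in Python: each method is run several times with different random seeds, and the mean performance is reported together with its standard error. The toy corridor task, the two policies, and the choice of 30 runs are assumptions made for this example only; in a real project your own method and stronger baselines would be plugged in as extra entries.

```python
import math
import random
import statistics


def corridor_return(policy, rng, length=5, max_steps=1000):
    """Return of one episode on a toy corridor: -1 per step until the right end."""
    position, total = 0, 0.0
    for _ in range(max_steps):
        if position == length - 1:
            break
        move = policy(position, rng)
        position = min(max(position + move, 0), length - 1)
        total -= 1.0
    return total


def random_policy(position, rng):
    """Baseline: move left or right with equal probability."""
    return rng.choice((-1, +1))


def hand_coded_policy(position, rng):
    """Baseline: always move right."""
    return +1


def evaluate(policy, n_runs=30, seed=0):
    """Mean return and standard error over independent, seeded runs."""
    returns = []
    for run in range(n_runs):
        rng = random.Random(seed + run)  # one seed per run, for repeatability
        returns.append(corridor_return(policy, rng))
    mean = statistics.mean(returns)
    std_err = statistics.stdev(returns) / math.sqrt(n_runs)
    return mean, std_err


if __name__ == "__main__":
    for name, policy in [("random", random_policy),
                         ("hand-coded", hand_coded_policy)]:
        mean, std_err = evaluate(policy)
        print(f"{name:>10}: {mean:8.2f} +/- {std_err:.2f}")
```

Reporting a measure of spread next to each mean is what lets a reader judge whether a difference between your method and a baseline is larger than the run-to-run variation.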

Sample projects

Since the idea is something that with more time could become a paper, the sample projects are published RL papers:

Group work

Projects must be done individually.