Research Projects
Ambition
The idea behind the project is that you should begin a real research study, something that is novel in some way. Ask an empirical question about existing methods whose answer is not known in the literature, or come up with a new learning method (it could be heuristic), or prove some open result, or implement and test a novel application of RL in a practically relevant domain. You don't have to submit something that is already a conference paper, but given more time, say 1-2 more months, you should have something worthy of submission to a major AI or machine learning conference.
Possible topics
Below I list several possible projects with links to get you started. In most cases I have just listed the title of the project. Over the term I will add more projects to this list and in most cases add a bit more description to each of them. Please feel free to ask me about any of these in person. A few of the projects were specified by Prof Martha White. In those cases she would be happy to meet with you once and describe what she had in mind about those projects.
- An empirical evaluation framework for value function learning (prediction)
- Representation learning (adapting the features during learning)
- Review, implement, and compare continuous-action methods in reinforcement learning. Good starting place.
- Empirical comparison of step-size adaptation methods. Start here.
- Improving learning algorithms for temporal difference networks (Prof Martha White). Start here.
- Implement and test the alpha bounds meta learning algorithm on the Arcade Learning Environment.
- Implement and test a lambda adaptation algorithm on the Arcade Learning Environment.
- Empirical comparisons of state-of-the-art bandit learning methods. Start here.
- Adapting contextual bandit algorithms to reinforcement learning problems (Prof Martha White)
- Theory project: survey of current finite-sample results for value function learning in RL. Work towards extending these bounds to other algorithms (Prof Martha White). Start here.
- A comparison of evolutionary computing (EC) and value-function learning on reinforcement learning tasks. When should we expect value function methods to outperform EC? Illustrate your answer with experimental evidence. Previous study.
- Empirical comparison of fixed feature representations in RL, specifically tile-coding, radial basis functions, large random representations, and Fourier basis.
- Learning general value functions on robot data. Learning multiple predictions about the sensory data and then using those predictions as features. Code and data available. Paper.
- Applying reinforcement learning to Kuhn poker (3-card poker). Can we learn and use a model of our opponent? Can we train through self-play and outperform humans?
- Comparison of LSTD-Q and Sarsa with function approximation in control domains. Specifically, investigate whether Sarsa has an advantage in non-stationary domains. How much non-stationarity is needed, and what kinds?
- Hard applications:
Projects that involve experiments
- Please read my page on how to conduct RL experiments!!!
- I highly recommend you use the RL-glue code provided to the class. It is well designed and has been used for research for over a decade.
Project proposal
The objective of this one- or two-page document is to
- specify the research question
- cite the relevant literature and explain why the research to be done will be novel
- motivate why this is an important question
- outline a plan for how you expect to complete the project
- describe how you plan to evaluate your contribution
Final report
An expanded version of the proposal with additional sections. You should be able to reuse the motivation, problem description, and related work sections from your proposal (integrating the feedback from the instructor).
In addition, the report will describe your approach:
- for an application, this means how you got your agent to learn
- for an empirical comparison, this is how you conducted your experiments
- for a new algorithm, this is how you came up with the method (derived from an objective?)
- for theory, this would be your proofs
Your report will also have to evaluate your contribution. Usually this is an experiment showing that your application or new method works compared to reasonable baselines (what is the current state of the art? a random policy, a hand-coded policy, etc.). For theory, you might want to describe how your result compares to existing results in the literature. If your project is an empirical comparison between existing methods, your contribution is the design and execution of the experiments.
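As a rough illustration of the baseline comparison described above, here is a minimal sketch, not a required design. Everything in it is an assumption made for the example: the two-armed Gaussian bandit, the arm means, the epsilon value of 0.1, and the helper names `summarize`, `run_bandit`, and `compare`. The point it tries to show is only the shape of the evaluation: several independent runs per method, each with its own seed, reported as a mean with a standard error next to a random-policy baseline.

```python
import math
import random
import statistics


def summarize(returns):
    """Return (mean, standard error) over a list of per-run returns."""
    mean = statistics.fmean(returns)
    stderr = statistics.stdev(returns) / math.sqrt(len(returns))
    return mean, stderr


def run_bandit(policy, num_steps, seed):
    """One independent run on a hypothetical 2-armed Gaussian bandit.

    Returns the average reward per step for that run.
    """
    rng = random.Random(seed)      # separate random stream for each run
    arm_means = [0.0, 1.0]         # illustrative values only
    estimates = [0.0, 0.0]         # sample-average estimate of each arm
    counts = [0, 0]
    total = 0.0
    for _ in range(num_steps):
        if policy == "random" or rng.random() < 0.1:
            arm = rng.randrange(2)                            # explore
        else:
            arm = max(range(2), key=lambda a: estimates[a])   # act greedily
        reward = rng.gauss(arm_means[arm], 1.0)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total / num_steps


def compare(num_runs=30, num_steps=1000):
    """Evaluate the baseline and the learner over the same number of runs."""
    results = {}
    for policy in ("random", "epsilon_greedy"):
        returns = [run_bandit(policy, num_steps, seed) for seed in range(num_runs)]
        results[policy] = summarize(returns)
    return results


if __name__ == "__main__":
    for name, (mean, stderr) in compare().items():
        print(f"{name}: {mean:.3f} +/- {stderr:.3f}")
```

In a real project the bandit would be replaced by your actual domain and agent, and the random policy would be only one of several baselines.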
Sample projects
Since the idea is something that with more time could become a paper, the sample projects are published RL papers:
- Empirical comparison example
- New method example
- Theory example
- Novel application example
Group work
Projects can be done in groups of up to 3 people. You will submit one project proposal and one final report. All group members will receive the same grade, regardless of how much work each person contributes. Pick partners wisely.