CS B659: Reinforcement Learning

News:

Notes on converting between expectations and matrix forms: matrixBellman.pdf
A great reference to brush up on statistics is the book All of Statistics
Research Project page is now available. It describes what is expected for your projects and lists several possible projects. It will be updated over the term, so check it every once in a while.
Here are two projects from last year that were well done: here and here
The lecture slides are available below in the Class schedule
Notes on MDPs and Value functions: markovMath.pdf

Basic info:

Class meets Tuesday & Thursday 4:00pm - 5:15pm, Swain West 103
Instructor: Adam White

office: 301h Lindley Hall
office hours: Friday 1-3pm
email: adamw@indiana.edu
web: adamwhite.ca

AI: Mrinmoy Maity

office: Lindley Hall 406
office hours: Tuesday 11am - 1pm
email: mmaity@umail.iu.edu

Class schedule

This schedule is tentative and may change throughout the semester.

Readings and assignment questions are from the main course text: Sutton and Barto

Supplementary material can be found in Szepesvari's book.

These slides will be released progressively; look at the date on the first page to see how recently they have been updated.

Week	Date	Lecture	Readings and Deadlines
1	Jan 12	Chapter 1: Introduction	Read Chapter 1 of Sutton and Barto
	Jan 14	Chapter 2: Evaluative feedback
2	Jan 19	Chapter 2: Evaluative feedback continued Chapter2_2white.pdf	Thought Questions about Chapters 1 & 2 due Jan 18th at 11:59pm; Assignment #1 released (bandit problems and MDPs)
	Jan 21	Chapter 3: The reinforcement learning problem Chapter3white.pdf
3	Jan 26	Chapter 3 continued Chapter3white_v2.pdf	Thought Questions about Chapter 3 due
	Jan 28	Chapter 3 completed. Start of Chapter 4 Dynamic programming Chapter4white.pdf
4	Feb 2	Finish Chapter 4: Dynamic programming Chapter4white_v2.pdf
	Feb 4	Chapter 5: Monte Carlo methods Chapter5white.pdf
5	Feb 9	Chapter 5: Monte Carlo methods Chapter5white_v2.pdf	Assignment #1 due; Assignment #2 released (elementary solution methods)
	Feb 11	Course Projects: Projects_Review.pdf
6	Feb 16	Chapter 6: Temporal difference learning Chapter6white.pdf	Thought Questions about Chapters 4 & 5 due
	Feb 18	Chapter 6: Temporal difference learning
7	Feb 23	Chapter 6: Review lecture, review_Ch2thruCh6.pdf	Thought Questions about Chapter 6 due
	Feb 25	Chapter 7: Eligibility traces	Assignment #2 due; Assignment #3 released (TD methods and eligibility traces)
8	Mar 1	Chapter 7: Eligibility traces TD_lambdaCh7.pdf
	Mar 3	Chapter 7: Eligibility traces TD_lambdaCh7_v2.pdf	Thought Questions about Chapter 7 due
9	Mar 8	Chapter 8: Planning and learning Chapter8.pdf
	Mar 10	Chapter 8: Planning and learning Chapter8_v2.pdf	Project proposals due
10	Mar 15	No classes: Spring break
	Mar 17	No classes: Spring break
11	Mar 22	Chapter 9: On-Policy Prediction with Approximation: Chapter9_white.pdf	Assignment #3 due; Assignment #4 released (tabular planning methods)
	Mar 24	Chapter 9: On-Policy Prediction with Approximation: Chapter9_white_v2.pdf	Thought Questions about Chapter 8 due
12	Mar 29	Chapter 9 continued, projects, & average reward RL: Projects-AverageReward.pdf
	Mar 31	Policy gradient methods: AverageReward-PG.pdf	Thought Questions about Chapter 9 due
13	Apr 5	Advanced topics (Off-policy gradient TD): ofPolicy-GTD.pdf
	Apr 7	Advanced topics (Off-policy gradient TD continued): ofPolicy-GTD2.pdf	Assignment #4 due; Assignment #5 released (RL and function approximation)
14	Apr 12	Advanced topics (Least squares TD): LSTD.pdf
	Apr 14	Research project discussion
15	Apr 19	RL and Psychology: animalLearning.pdf
	Apr 21	Chapter 11: Case studies & Review: summary.pdf
16	Apr 26	No class	Assignment #5 due
	Apr 28	No class; office hours still on
17	May 3	Office hours still on
	May 5	No exam	Research projects due