Home page
CS B659: Reinforcement Learning
News:
- Notes on converting between expectations and matrix forms: matrixBellman.pdf
- A great reference for brushing up on statistics is the book All of Statistics.
- The Research Project page is now available. It describes what is expected for your projects and lists several possible projects. It will be updated over the term, so check it every once in a while.
- Here are two projects from last year that were well done: here and here
- The lecture slides are available below in the Class schedule
- Notes on MDPs and Value functions: markovMath.pdf
Basic info:
- Class meets Tuesday & Thursday 4:00pm - 5:15pm, Swain West 103
- Instructor: Adam White
- office: 301h Lindley Hall
- office hours: Friday 1-3pm
- email: adamw@indiana.edu
- web: adamwhite.ca
- AI: Mrinmoy Maity
- office: Lindley Hall 406
- office hours: Tuesday 11am - 1pm
- email: mmaity@umail.iu.edu
Class schedule
This schedule is tentative and may change throughout the semester.
Readings and assignment questions are from the main course text: Sutton and Barto.
Supplementary material can be found in Szepesvari's book.
These slides will be released progressively; look at the date on the first page to see how recently they have been updated.
Week | Date | Lecture | Readings and Deadlines |
1 | Jan 12 | Chapter 1: Introduction | Read Chapter 1 of Sutton and Barto |
1 | Jan 14 | Chapter 2: Evaluative feedback | |
2 | Jan 19 | Chapter 2: Evaluative feedback continued: Chapter2_2white.pdf | Thought questions about Chapters 1 & 2 due Jan 18th at 11:59pm; Assignment #1 released (bandit problems and MDPs) |
2 | Jan 21 | Chapter 3: The reinforcement learning problem: Chapter3white.pdf | |
3 | Jan 26 | Chapter 3 continued: Chapter3white_v2.pdf | Thought questions about Chapter 3 due |
3 | Jan 28 | Chapter 3 completed; start of Chapter 4: Dynamic programming: Chapter4white.pdf | |
4 | Feb 2 | Finish Chapter 4: Dynamic programming: Chapter4white_v2.pdf | |
4 | Feb 4 | Chapter 5: Monte Carlo methods: Chapter5white.pdf | |
5 | Feb 9 | Chapter 5: Monte Carlo methods: Chapter5white_v2.pdf | Assignment #1 due; Assignment #2 released (elementary solution methods) |
5 | Feb 11 | Course projects: Projects_Review.pdf | |
6 | Feb 16 | Chapter 6: Temporal difference learning: Chapter6white.pdf | Thought questions about Chapters 4 & 5 due |
6 | Feb 18 | Chapter 6: Temporal difference learning | |
7 | Feb 23 | Chapter 6: Review lecture: review_Ch2thruCh6.pdf | Thought questions about Chapter 6 due |
7 | Feb 25 | Chapter 7: Eligibility traces | Assignment #2 due; Assignment #3 released (TD methods and eligibility traces) |
8 | Mar 1 | Chapter 7: Eligibility traces: TD_lambdaCh7.pdf | |
8 | Mar 3 | Chapter 7: Eligibility traces: TD_lambdaCh7_v2.pdf | Thought questions about Chapter 7 due |
9 | Mar 8 | Chapter 8: Planning and learning: Chapter8.pdf | |
9 | Mar 10 | Chapter 8: Planning and learning: Chapter8_v2.pdf | Project proposals due |
10 | Mar 15 | No classes: Spring break | |
10 | Mar 17 | No classes: Spring break | |
11 | Mar 22 | Chapter 9: On-policy prediction with approximation: Chapter9_white.pdf | Assignment #3 due; Assignment #4 released (tabular planning methods) |
11 | Mar 24 | Chapter 9: On-policy prediction with approximation: Chapter9_white_v2.pdf | Thought questions about Chapter 8 due |
12 | Mar 29 | Chapter 9 continued, projects, & average reward RL: Projects-AverageReward.pdf | |
12 | Mar 31 | Policy gradient methods: AverageReward-PG.pdf | Thought questions about Chapter 9 due |
13 | Apr 5 | Advanced topics (Off-policy gradient TD): ofPolicy-GTD.pdf | |
13 | Apr 7 | Advanced topics (Off-policy gradient TD continued): ofPolicy-GTD2.pdf | Assignment #4 due; Assignment #5 released (RL and function approximation) |
14 | Apr 12 | Advanced topics (Least squares TD): LSTD.pdf | |
14 | Apr 14 | Research project discussion | |
15 | Apr 19 | RL and Psychology: animalLearning.pdf | |
15 | Apr 21 | Chapter 11: Case studies & Review: summary.pdf | |
16 | Apr 26 | No class | Assignment #5 due |
16 | Apr 28 | No class; office hours still on | |
17 | May 3 | Office hours still on | |
17 | May 5 | No exam | Research projects due |