Algorithmic Foundations of Interactive Learning

Spring 2025. 17-740. Tuesday / Thursday 11:00-12:20.

Announcements 📣

Hello World!

Nov 10

We can’t wait to meet you! 👋

Course Overview 📝

Interactive learning is an approach to machine learning in which systems learn and adapt through continual interaction with their environment or users, receiving feedback and adjusting their behavior in response. These techniques are seeing a resurgence across artificial intelligence and machine learning, from robotics to language modeling. In this advanced theory course, students will explore interactive learning from its foundational principles to recent applications, including fine-tuning Large Language Models (LLMs) and robot learning from demonstration.

Key topics include:

  1. Online Learning: Learning under distribution shift.
  2. Game Solving: Using no-regret algorithms to compute equilibria.
  3. Reinforcement Learning: Sequential decision making. Model-free, model-based, and hybrid RL.
  4. Imitation Learning & Applications to Robotics: Learning from demonstrations. Behavioral cloning, DAgger, and inverse RL.
  5. RL from Human Feedback & Applications to Language Modeling: Learning from preferences. PPO, DPO, SPO.

Schedule (Tentative) 📅

Online Learning

Jan. 14
Course Overview
Syllabus
Jan. 16
Intro to Online Learning / Hedge
JAB Course Note, Hedge
Jan. 21
Information Theory and Maximum Entropy
MacKay Ch. 2, MaxEnt
Jan. 23
Online Gradient Descent
OGD
Jan. 26
Buffer / Follow-the-Leader
FTRL

Game Solving

Jan. 28
Computing Equilibria I
AdaBoost, Roth Textbook Ch. 2
Feb. 4
Computing Equilibria II
AdaBoost, Roth Textbook Ch. 2

Sequential Decision Making (RL/IL)

Feb. 6
Foundations of MDPs
Nan’s Note 1, HW #1 Out
Feb. 11
Covariate Shift in IL
Invitation to Imitation, DAgger
Feb. 13
Approximate Policy Iteration
CPI, PSDP, NRPI
Feb. 18
Policy Gradients
Wen’s Slides
Feb. 20
The Natural Policy Gradient
NPG, Covariant Policy Search
Feb. 25
TRPO & PPO
TRPO, PPO
Feb. 27
HW #1 Pres.
HW #2 Out

Spring Break 🏝️

Mar. 11
Model-based RL
Wen’s Simulation Lemma Note
Mar. 13
Hybrid RL (Guest Lecture: Yuda Song)
HyQ, LAMPS
Mar. 18
Learning by Cheating & Model-Predictive Control
LBC, SequIL, MCTS
Mar. 20
Abstraction in MBRL
Nan’s Note 4, DREAMER, ACS

Inverse RL

Mar. 25
IL as Game Solving
MaxEnt IRL, Moment Matching
Mar. 27
Efficient Inverse RL
FILTER, Hybrid IRL
Apr. 1
IRL2: Inverse RL In Real Life (Guest Lecture: Sanjiban Choudhury)
Diffusion Policy
Apr. 3
HW #2 Pres.

RLHF

Apr. 8
RLHF I
RL from Prefs., PPO+RM, DPO
Apr. 10
RLHF II
RL from Prefs., PPO+RM, DPO
Apr. 15
REBEL and REFUEL (Guest Lecture: Wen Sun)
REBEL, REFUEL
Apr. 17
RLHF as Game Solving
SPO

Project Presentations

Apr. 22
Project Pres.
Presenters TBD
Apr. 25
Project Pres.
Presenters TBD

Instructors 👨‍🏫
