Algorithmic Foundations of Interactive Learning
Spring 2025. 17-740. Tuesday / Thursday 11:00-12:20. GHC 4215.
Announcements 📣
Course Overview 📝
Interactive learning is a dynamic approach to machine learning where systems learn and adapt through continuous interaction with their environment or users, receiving feedback and adjusting their behavior in response. These techniques are currently experiencing a resurgence across various domains of artificial intelligence and machine learning, from robotics to language modeling. In this advanced theory course, students will explore interactive learning from its foundational principles to recent applications, including fine-tuning Large Language Models (LLMs) and robot learning from demonstration.
Key topics include:
- Online Learning: Learning under distribution shift.
- Game Solving: Using no-regret algorithms to compute equilibria.
- Reinforcement Learning: Sequential decision making. Model-free, model-based, and hybrid RL.
- Imitation Learning & Applications to Robotics: Learning from demonstrations. Behavioral cloning, DAgger, and inverse RL.
- RL from Human Feedback & Applications to Language Modeling: Learning from preferences. PPO, DPO, SPO.
Schedule (Tentative) 📅
Online Learning
- Jan. 14
- Jan. 16
  - Intro to Online Learning / Hedge
  - JAB Course Note, Weighted Majority, Hedge, Universal Portfolios
- Jan. 21
  - Information Theory and Maximum Entropy
  - MacKay Ch. 2, MaxEnt
- Jan. 23
  - Online Gradient Descent
  - OGD
- Jan. 26
  - Buffer / Follow-the-Leader
  - FTRL
Game Solving
- Jan. 28
  - Computing Equilibria I
  - AdaBoost, Roth Textbook Ch. 2
- Feb. 4
  - Computing Equilibria II
  - AdaBoost, Roth Textbook Ch. 2
Sequential Decision Making (RL/IL)
- Feb. 6
  - Foundations of MDPs
  - Nan’s Note 1, HW #1 Out
- Feb. 11
  - DAgger & Covariate Shift in IL
  - Invitation to Imitation, DAgger
- Feb. 13
- Feb. 18
  - Policy Gradients
  - Wen’s Slides
- Feb. 20
  - The Natural Policy Gradient
  - NPG, Covariant Policy Search
- Feb. 25
- Feb. 27
  - HW #1 Pres.
  - HW #2 Out
Spring Break 🏝️
- Mar. 11
  - Model-based RL
  - Wen’s Simulation Lemma Note, Nan’s Note 4
- Mar. 13
- Mar. 18
- Mar. 20
  - Model Predictive Control & Test-Time Scaling
  - MCTS
Inverse RL
- Mar. 25
  - IL as Game Solving
  - MaxEnt IRL, Moment Matching
- Mar. 27
  - Efficient Inverse RL
  - FILTER, Hybrid IRL
- Apr. 1
  - IRL2: Inverse RL In Real Life (Guest Lecture: Sanjiban Choudhury)
  - Diffusion Policy, DREAMER
- Apr. 3
  - HW #2 Pres.
RLHF
Project Presentations
- Apr. 22
  - Project Pres.
  - Presenters TBD
- Apr. 24
  - Project Pres.
  - Presenters TBD