Algorithmic Foundations of Interactive Learning
Spring 2025. 17-740. Tuesday / Thursday 11:00-12:20.
Announcements 📣
Course Overview 📝
Interactive learning is an approach to machine learning in which systems learn and adapt through continual interaction with their environment or users, receiving feedback and adjusting their behavior in response. These techniques are seeing renewed interest across artificial intelligence and machine learning, from robotics to language modeling. In this advanced theory course, students will explore interactive learning from its foundational principles to recent applications, including fine-tuning large language models (LLMs) and robot learning from demonstration.
Key topics include:
- Online Learning: Learning under distribution shift.
- Game Solving: Using no-regret algorithms to compute equilibria.
- Reinforcement Learning: Sequential decision making. Model-free, model-based, and hybrid RL.
- Imitation Learning & Applications to Robotics: Learning from demonstrations. Behavioral cloning, DAgger, and inverse RL.
- RL from Human Feedback & Applications to Language Modeling: Learning from preferences. PPO, DPO, SPO.
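
To give a flavor of how the first two topics connect, here is a minimal sketch (not course material) of two Hedge learners playing a zero-sum matrix game against each other. Because each learner has low regret, the time-averaged strategies approach an equilibrium. The function names, the example payoff matrix, the number of rounds, and the horizon-tuned step size are all illustrative assumptions.

```python
import math


def hedge_update(probs, losses, eta):
    """One Hedge (multiplicative-weights) step: scale each action by exp(-eta * loss)."""
    scaled = [p * math.exp(-eta * l) for p, l in zip(probs, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def solve_zero_sum(payoff, rounds=5000):
    """Approximate an equilibrium of a zero-sum matrix game via Hedge self-play.

    payoff[i][j] is the row player's loss and the column player's gain.
    Returns the time-averaged strategies (row, column).
    """
    n, m = len(payoff), len(payoff[0])
    # Assumed step size: the usual sqrt(log(#actions) / horizon) tuning.
    eta = math.sqrt(math.log(max(n, m)) / rounds)
    x, y = [1.0 / n] * n, [1.0 / m] * m
    x_sum, y_sum = [0.0] * n, [0.0] * m
    for _ in range(rounds):
        x_sum = [s + p for s, p in zip(x_sum, x)]
        y_sum = [s + p for s, p in zip(y_sum, y)]
        # Expected loss of each pure action against the opponent's current mix.
        row_loss = [sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n)]
        col_loss = [-sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m)]
        x = hedge_update(x, row_loss, eta)
        y = hedge_update(y, col_loss, eta)
    return [s / rounds for s in x_sum], [s / rounds for s in y_sum]


if __name__ == "__main__":
    # Hypothetical 2x2 game whose unique equilibrium is (0.4, 0.6) for both players.
    game = [[2.0, -1.0], [-1.0, 1.0]]
    print(solve_zero_sum(game))
```

For the example game above, both averaged strategies should land near (0.4, 0.6). The averaging matters: the per-round strategies can keep cycling even as their averages converge.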
Schedule (Tentative) 📅
Online Learning
- Jan. 14
- Course Overview
- Syllabus
- Jan. 16
- Intro to Online Learning / Hedge
- JAB Course Note, Hedge
- Jan. 21
- Information Theory and Maximum Entropy
- MacKay Ch. 2, MaxEnt
- Jan. 23
- Online Gradient Descent
- OGD
- Jan. 26
- Buffer / Follow-the-Leader
- FTRL
Game Solving
- Jan. 28
- Computing Equilibria I
- AdaBoost, Roth Textbook Ch. 2
- Feb. 4
- Computing Equilibria II
- AdaBoost, Roth Textbook Ch. 2
Sequential Decision Making (RL/IL)
- Feb. 6
- Foundations of MDPs
- Nan’s Note 1, HW #1 Out
- Feb. 11
- Covariate Shift in IL
- Invitation to Imitation, DAgger
- Feb. 13
- Feb. 18
- Policy Gradients
- Wen’s Slides
- Feb. 20
- The Natural Policy Gradient
- NPG, Covariant Policy Search
- Feb. 25
- Feb. 27
- HW #1 Pres.
- HW #2 Out
Spring Break 🏝️
- Mar. 11
- Model-based RL
- Wen’s Simulation Lemma Note
- Mar. 13
- Mar. 18
- Mar. 20
- Abstraction in MBRL
- Nan’s Note 4, DREAMER, ACS
Inverse RL
- Mar. 25
- IL as Game Solving
- MaxEnt IRL, Moment Matching
- Mar. 27
- Efficient Inverse RL
- FILTER, Hybrid IRL
- Apr. 1
- IRL2: Inverse RL In Real Life (Guest Lecture: Sanjiban Choudhury)
- Diffusion Policy
- Apr. 3
- HW #2 Pres.
RLHF
- Apr. 8
- RLHF I
- RL from Prefs., PPO+RM, DPO
- Apr. 10
- RLHF II
- RL from Prefs., PPO+RM, DPO
- Apr. 15
- Apr. 17
- RLHF as Game Solving
- SPO
Project Presentations
- Apr. 22
- Project Pres.
- Presenters TBD
- Apr. 25
- Project Pres.
- Presenters TBD