Stanford CS234: Reinforcement Learning | Winter 2019 | Lecture 5 – Value Function Approximation
Professor Emma Brunskill, Stanford University
http://onlinehub.stanford.edu/
Professor Emma Brunskill
Assistant Professor, Computer Science
Stanford AI for Human Impact Lab
Stanford Artificial Intelligence Lab
Statistical Machine Learning Group
To follow along with the course schedule and syllabus, visit: http://web.stanford.edu/class/cs234/index.html
0:00 Introduction
1:19 Class Structure
3:18 Value Function Approximation (VFA)
4:26 Motivation for VFA
5:01 Benefits of Generalization
10:03 Function Approximators
11:16 Review: Gradient Descent
13:47 Value Function Approximation for Policy Evaluation with an Oracle
15:11 Stochastic Gradient Descent
18:02 Model-Free VFA Policy Evaluation
18:22 Model-Free VFA Prediction / Policy Evaluation
19:06 Feature Vectors
30:06 MC Linear Value Function Approximation for Policy Evaluation
35:48 Baird (1995)-Like Example with MC Policy Evaluation
43:55 Convergence Guarantees for Linear Value Function Approximation for Policy Evaluation: Preliminaries
50:43 Batch Monte Carlo Value Function Approximation
53:48 Recall: Temporal Difference Learning with Lookup Table
54:42 Temporal Difference (TD(0)) Learning with Value Function Approximation
57:40 TD(0) Linear Value Function Approximation for Policy Evaluation
58:10 Baird Example with TD(0) On-Policy Evaluation
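For reference while following the lecture, here is a minimal runnable sketch of TD(0) policy evaluation with linear value function approximation, the algorithm covered around 54:42-58:10. The 5-state random walk, uniformly random policy, and one-hot features below are hypothetical stand-ins chosen for illustration; they are not from the lecture slides.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_STATES = 5  # hypothetical 5-state random walk

def env_step(s, a):
    """Random-walk dynamics: action 1 moves right, anything else moves left.
    Exiting on the right yields reward +1; all other rewards are 0."""
    s_next = s + (1 if a == 1 else -1)
    if s_next < 0:
        return s_next, 0.0, True   # fell off the left edge
    if s_next >= NUM_STATES:
        return s_next, 1.0, True   # exited on the right
    return s_next, 0.0, False

def policy(s):
    return int(rng.integers(2))    # uniformly random policy

def features(s):
    x = np.zeros(NUM_STATES)
    x[s] = 1.0  # one-hot features: linear VFA reduces to a lookup table
    return x

def td0_linear_policy_evaluation(num_episodes=5000, alpha=0.05, gamma=1.0):
    """Estimate V^pi(s) ~ w . x(s) with semi-gradient TD(0) updates."""
    w = np.zeros(NUM_STATES)
    for _ in range(num_episodes):
        s = NUM_STATES // 2        # start each episode in the middle state
        done = False
        while not done:
            a = policy(s)
            s_next, r, done = env_step(s, a)
            x = features(s)
            # Bootstrapped TD target; a terminal state has value zero.
            target = r if done else r + gamma * w @ features(s_next)
            # Semi-gradient step: w <- w + alpha * (target - w.x) * x.
            # The Monte Carlo variant (30:06) instead uses the full return
            # G_t as the target, updated at the end of each episode.
            w += alpha * (target - w @ x) * x
            s = s_next
    return w

print(td0_linear_policy_evaluation())  # approaches [1/6, 2/6, 3/6, 4/6, 5/6]
```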