CS 294: Deep Reinforcement Learning, Fall 2017
Instructors: Sergey Levine, Abhishek Gupta, Joshua Achiam
Lectures: Monday and Wednesday 10:30am-12pm in 306 Soda Hall
Communication: Piazza will be used for announcements, general questions and discussions, clarifications about assignments, student questions to each other, and so on. To sign up, go to Piazza and sign up with “UC Berkeley” and “CS294-112”. We request that enrollment be restricted to students enrolled in the class or on the waitlist.
For people who are not enrolled, but interested in following and discussing the course, there is a subreddit forum here: reddit.com/r/berkeleydeeprlcourse/
For any student looking to enroll in the fall 2017 offering of this course: here is the form that you may fill out to provide us with some information about your background and sign up for the waitlist. Please do not email the instructors about enrollment: the form will be used to collect all information we need. This waitlist will be used as the official waitlist, not the one on CalCentral.
All students should enroll in three units by default. Students may enroll in fewer units, but the course content, homeworks, and grading are exactly the same.
CS189/CS289A, or an equivalent course from another institution, is a strict prerequisite. Online courses (e.g., Coursera, Udacity) do not satisfy this requirement; it must be a university machine learning course.
Fall 2017 Materials
Lectures, Readings, and Assignments
Coming soon!
Previous Offerings
A full version of this course was offered in Spring 2017. Lecture videos from Spring 2017 are available here.
An abbreviated version of this course was offered in Fall 2015.
CS189 or equivalent is a prerequisite for the course. This course will assume some familiarity with reinforcement learning, numerical optimization, and machine learning. Students who are not familiar with the concepts below are encouraged to brush up using the references provided right below this list. We’ll review this material in class, but the review will be rather cursory.
- Reinforcement learning and MDPs
  - Definition of MDPs
  - Exact algorithms: policy and value iteration
  - Search algorithms
- Numerical Optimization
  - Gradient descent, stochastic gradient descent
  - Backpropagation algorithm
- Machine Learning
  - Classification and regression problems: what loss functions are used, how to fit linear and nonlinear models
  - Training/test error, overfitting
For introductory material on RL and MDPs, see
- CS188 EdX course, starting with Markov Decision Processes I
- Sutton & Barto, Ch 3 and 4.
- For a concise intro to MDPs, see Ch 1-2 of Andrew Ng’s thesis
- David Silver’s course, links below
For introductory material on machine learning and neural networks, see
John's lecture series at MLSS
- Lecture 1: intro, derivative free optimization
- Lecture 2: score function gradient estimation and policy gradients
- Lecture 3: actor critic methods
- Lecture 4: trust region and natural gradient methods, open problems
- David Silver’s course on reinforcement learning / Lecture Videos
- Nando de Freitas’ course on machine learning
- Andrej Karpathy’s course on neural networks
- Deep Learning
- Sutton & Barto, Reinforcement Learning: An Introduction
- Szepesvari, Algorithms for Reinforcement Learning
- Bertsekas, Dynamic Programming and Optimal Control, Vols I and II
- Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming
- Powell, Approximate Dynamic Programming