Prediction and Control with Function Approximation

  • Rating: 4.8
  • Approx. 22 hours to complete

Course Summary

This course teaches you how to solve prediction and control problems in reinforcement learning when the state space is too large to represent values exactly, using function approximation. You will learn how to extend Monte Carlo and TD methods to parameterized value functions, construct features with coarse coding and tile coding, use neural networks for non-linear approximation, and learn policies directly with policy gradient and actor-critic methods.

Key Learning Points

  • Learn how function approximation lets prediction and control methods scale to large, high-dimensional, and continuous state spaces
  • Master feature construction techniques such as coarse coding and tile coding, and non-linear approximation with neural networks
  • Apply policy gradient and actor-critic methods to continuous-state and continuous-action control tasks

Learning Outcomes

  • Understand how value estimation can be framed as a supervised learning problem with function approximation
  • Apply gradient Monte Carlo, semi-gradient TD, and Sarsa with function approximation to prediction and control problems
  • Implement policy gradient methods, including actor-critic with softmax and Gaussian policies

Prerequisites and recommended background

  • Basic knowledge of programming and statistics
  • Familiarity with the Python programming language

Course Difficulty Level

Intermediate

Course Format

  • Online Self-paced Course
  • Video Lectures
  • Assignments and Quizzes

Similar Courses

  • Applied Machine Learning
  • Data Science Essentials
  • Machine Learning for Business Professionals

Notable People in This Field

  • Andrew Ng
  • Deepti Sharma
  • Hugo Bowne-Anderson

Related Books

  • Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto (the textbook used for the weekly readings)

Description

In this course, you will learn how to solve problems with large, high-dimensional, and potentially infinite state spaces. You will see that estimating value functions can be cast as a supervised learning problem---function approximation---allowing you to build agents that carefully balance generalization and discrimination in order to maximize reward. We will begin this journey by investigating how our policy evaluation or prediction methods like Monte Carlo and TD can be extended to the function approximation setting. You will learn about feature construction techniques for RL, and representation learning via neural networks and backprop. We conclude this course with a deep-dive into policy gradient methods; a way to learn policies directly without learning a value function. In this course you will solve two continuous-state control tasks and investigate the benefits of policy gradient methods in a continuous-action environment.
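
To give a flavour of the prediction part of the course, below is a minimal sketch of semi-gradient TD(0) for policy evaluation with a linear value function over state-aggregation features, in the spirit of the course's "Semi-Gradient TD for Policy Evaluation" and "State Aggregation with Monte Carlo" lessons. This is not course material: the random-walk environment, the feature function, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch only (not course code): a 100-state random walk with
# 10 state-aggregation groups.
NUM_STATES, NUM_GROUPS = 100, 10

def features(state):
    """One-hot state-aggregation features: NUM_STATES states binned into NUM_GROUPS groups."""
    x = np.zeros(NUM_GROUPS)
    x[state * NUM_GROUPS // NUM_STATES] = 1.0
    return x

def step(state, rng):
    """Assumed random-walk dynamics: move left or right; terminate at either end."""
    next_state = state + rng.choice([-1, 1])
    if next_state < 0:
        return None, -1.0   # left terminal state, reward -1
    if next_state >= NUM_STATES:
        return None, 1.0    # right terminal state, reward +1
    return next_state, 0.0

def semi_gradient_td0(num_episodes=5000, alpha=0.1, gamma=1.0, seed=0):
    """Estimate v_pi(s) ~ w . x(s) with the semi-gradient TD(0) update."""
    rng = np.random.default_rng(seed)
    w = np.zeros(NUM_GROUPS)
    for _ in range(num_episodes):
        state = NUM_STATES // 2          # start each episode in the middle
        while state is not None:
            next_state, reward = step(state, rng)
            x = features(state)
            v_next = 0.0 if next_state is None else w @ features(next_state)
            # Semi-gradient update: the bootstrapped target is held fixed,
            # so the gradient is taken only through the current estimate w . x.
            w += alpha * (reward + gamma * v_next - w @ x) * x
            state = next_state
    return w

if __name__ == "__main__":
    print(np.round(semi_gradient_td0(), 2))   # approximate value of each group
```

The key idea the course develops is visible in the update line: the TD target bootstraps from the current weights but is treated as a constant when taking the gradient, which is why the method is called "semi-gradient".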

Outline

  • Welcome to the Course!
  • Course 3 Introduction
  • Meet your instructors!
  • Read Me: Pre-requisites and Learning Objectives
  • Reinforcement Learning Textbook
  • On-policy Prediction with Approximation
  • Moving to Parameterized Functions
  • Generalization and Discrimination
  • Framing Value Estimation as Supervised Learning
  • The Value Error Objective
  • Introducing Gradient Descent
  • Gradient Monte Carlo for Policy Evaluation
  • State Aggregation with Monte Carlo
  • Semi-Gradient TD for Policy Evaluation
  • Comparing TD and Monte Carlo with State Aggregation
  • Doina Precup: Building Knowledge for AI Agents with Reinforcement Learning
  • The Linear TD Update
  • The True Objective for TD
  • Week 1 Summary
  • Module 1 Learning Objectives
  • Weekly Reading: On-policy Prediction with Approximation
  • On-policy Prediction with Approximation
  • Constructing Features for Prediction
  • Coarse Coding
  • Generalization Properties of Coarse Coding
  • Tile Coding
  • Using Tile Coding in TD
  • What is a Neural Network?
  • Non-linear Approximation with Neural Networks
  • Deep Neural Networks
  • Gradient Descent for Training Neural Networks
  • Optimization Strategies for NNs
  • David Silver on Deep Learning + RL = AI?
  • Week 2 Review
  • Module 2 Learning Objectives
  • Weekly Reading: On-policy Prediction with Approximation II
  • Constructing Features for Prediction
  • Control with Approximation
  • Episodic Sarsa with Function Approximation
  • Episodic Sarsa in Mountain Car
  • Expected Sarsa with Function Approximation
  • Exploration under Function Approximation
  • Average Reward: A New Way of Formulating Control Problems
  • Satinder Singh on Intrinsic Rewards
  • Week 3 Review
  • Module 3 Learning Objectives
  • Weekly Reading: On-policy Control with Approximation
  • Control with Approximation
  • Policy Gradient
  • Learning Policies Directly
  • Advantages of Policy Parameterization
  • The Objective for Learning Policies
  • The Policy Gradient Theorem
  • Estimating the Policy Gradient
  • Actor-Critic Algorithm
  • Actor-Critic with Softmax Policies
  • Demonstration with Actor-Critic
  • Gaussian Policies for Continuous Actions
  • Week 4 Summary
  • Congratulations! Course 4 Preview
  • Module 4 Learning Objectives
  • Weekly Reading: Policy Gradient Methods
  • Policy Gradient Methods

Course Details

  • Language: English
  • Availability: Available now
  • Duration: Approx. 22 hours to complete
  • Instructors: Martha White, Adam White
  • Offered by: University of Alberta, Alberta Machine Intelligence Institute
  • Platform: Coursera

Instructor

Martha White

  • Rating: 4.8