Fundamentals of Reinforcement Learning

Rating: 4.8 (2,780 reviews)

This course is part of the Reinforcement Learning Specialization

Instructors: Martha White

93,098 already enrolled

5 modules

Gain insight into a topic and learn the fundamentals.

Intermediate level

Flexible schedule: approx. 15 hours, learn at your own pace

92% of learners liked this course

What you'll learn

Formalize problems as Markov Decision Processes
Understand basic exploration methods and the exploration / exploitation tradeoff
Understand value functions, as a general-purpose tool for optimal decision-making
Know how to implement dynamic programming as an efficient solution approach to an industrial control problem

Skills you'll gain

Function Approximation
Artificial Intelligence (AI)
Reinforcement Learning
Machine Learning
Intelligent Systems

Details to know

Shareable certificate

Assessments: 5 assignments

Taught in English

When you enroll in this course, you are also enrolled in the Reinforcement Learning Specialization.

There are 5 modules in this course

Reinforcement Learning is a subfield of Machine Learning, but it is also a general-purpose formalism for automated decision-making and AI. This course introduces you to statistical learning techniques where an agent explicitly takes actions and interacts with the world. Understanding the challenges of building learning agents that make decisions matters more than ever, as more and more companies take an interest in interactive agents and intelligent decision-making.

This course introduces you to the fundamentals of Reinforcement Learning. When you finish this course, you will:

- Formalize problems as Markov Decision Processes
- Understand basic exploration methods and the exploration/exploitation tradeoff
- Understand value functions, as a general-purpose tool for optimal decision-making
- Know how to implement dynamic programming as an efficient solution approach to an industrial control problem

This course teaches you the key concepts of Reinforcement Learning, underlying classic and modern algorithms in RL. After completing this course, you will be able to start using RL for real problems, where you have or can specify the MDP. This is the first course of the Reinforcement Learning Specialization.

Welcome to: Fundamentals of Reinforcement Learning, the first course in a four-part specialization on Reinforcement Learning brought to you by the University of Alberta, Onlea, and Coursera. In this pre-course module, you'll be introduced to your instructors, get a flavour of what the course has in store for you, and be given an in-depth roadmap to help make your journey through this specialization as smooth as possible.

Included

4 videos, 2 readings, 1 discussion prompt

For the first week of this course, you will learn how to understand the exploration-exploitation trade-off in sequential decision-making, implement incremental algorithms for estimating action-values, and compare the strengths and weaknesses of different algorithms for exploration. For this week’s graded assessment, you will implement and test an epsilon-greedy agent.
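To make the ideas in this module description concrete, here is a minimal sketch of an epsilon-greedy agent that estimates action-values incrementally. It is not the course's graded assignment; the class name `EpsilonGreedyAgent`, the helper `run_bandit`, and the Gaussian reward model are all assumptions chosen for illustration.

```python
import random


class EpsilonGreedyAgent:
    """Sample-average epsilon-greedy agent for a k-armed bandit (illustrative)."""

    def __init__(self, num_arms, epsilon, seed=0):
        self.epsilon = epsilon
        self.q = [0.0] * num_arms   # action-value estimates, one per arm
        self.n = [0] * num_arms     # number of times each arm was pulled
        self.rng = random.Random(seed)

    def select_action(self):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        best = max(self.q)
        # Break ties at random among the greedy arms.
        greedy_arms = [a for a, value in enumerate(self.q) if value == best]
        return self.rng.choice(greedy_arms)

    def update(self, action, reward):
        # Incremental sample average: Q <- Q + (R - Q) / N
        self.n[action] += 1
        self.q[action] += (reward - self.q[action]) / self.n[action]


def run_bandit(agent, true_means, steps, seed=1):
    """Run the agent on a bandit with unit-variance Gaussian rewards (assumed)."""
    env_rng = random.Random(seed)
    total = 0.0
    for _ in range(steps):
        action = agent.select_action()
        reward = env_rng.gauss(true_means[action], 1.0)
        agent.update(action, reward)
        total += reward
    return total / steps  # average reward per step
```

A call such as `run_bandit(EpsilonGreedyAgent(3, epsilon=0.1), [0.2, 0.5, 1.0], steps=1000)` returns the average reward collected. Setting `epsilon=0.0` gives a purely greedy agent, which is one way to compare exploration strategies.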

Included

8 videos, 3 readings, 1 assignment, 1 programming assignment, 1 discussion prompt, 2 plugins

8 videos (total 46 minutes)

Sequential Decision Making with Evaluative Feedback (5 minutes)
Learning Action Values (4 minutes)
Estimating Action Values Incrementally (5 minutes)
What is the trade-off? (7 minutes)
Optimistic Initial Values (6 minutes)
Upper-Confidence Bound (UCB) Action Selection (5 minutes)
Jonathan Langford: Contextual Bandits for Real World Reinforcement Learning (8 minutes)
Week 1 Summary (3 minutes)
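
One of the videos listed above covers Upper-Confidence Bound (UCB) action selection. As a rough sketch of the usual rule (pick the arm that maximizes the estimate plus an uncertainty bonus), the function below may help. The name `ucb_select`, its parameters, and the choice to try untested arms first are assumptions, not taken from the course materials.

```python
import math


def ucb_select(q, n, t, c=2.0):
    """Choose an arm by an upper-confidence-bound rule (illustrative sketch).

    q: current action-value estimates, one per arm
    n: number of times each arm has been pulled
    t: current time step (>= 1)
    c: exploration strength; c = 0 reduces to greedy selection
    """
    # An arm that has never been tried is treated as maximally uncertain.
    for arm, count in enumerate(n):
        if count == 0:
            return arm
    # Estimate plus a bonus that shrinks as an arm is pulled more often.
    scores = [q[a] + c * math.sqrt(math.log(t) / n[a]) for a in range(len(q))]
    # Ties go to the lowest-indexed arm.
    return max(range(len(q)), key=scores.__getitem__)
```

With `c=0.0` the bonus vanishes and the rule picks the arm with the highest estimate; larger `c` favors arms that have been pulled less often.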

3 readings (total 70 minutes)

Module 1 Learning Objectives (10 minutes)
Weekly Reading (30 minutes)
Chapter Summary (30 minutes)

1 assignment (total 45 minutes)

Sequential Decision-Making (45 minutes)

1 programming assignment (total 30 minutes)

Bandits and Exploration/Exploitation (30 minutes)

1 discussion prompt (total 10 minutes)

Compare bandits to supervised learning (10 minutes)

2 plugins (total 30 minutes)

Let's play a game! (15 minutes)
What's underneath? (15 minutes)

When you’re presented with a problem in industry, the first and most important step is to translate that problem into a Markov Decision Process (MDP). The quality of your solution depends heavily on how well you do this translation. This week, you will learn the definition of MDPs, you will understand goal-directed behavior and how this can be obtained from maximizing scalar rewards, and you will also understand the difference between episodic and continuing tasks. For this week’s graded assessment, you will create three example tasks of your own that fit into the MDP framework.
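
As an illustration of what translating a problem into an MDP can look like in code, the sketch below stores a small MDP as a nested dictionary and computes a discounted return from a reward sequence. The thermostat example, the names `TOY_MDP`, `validate_mdp`, and `discounted_return`, and the data layout are all hypothetical. For an episodic task the reward list is finite and `gamma` may be 1; for a continuing task a `gamma` below 1 keeps the return bounded.

```python
# transitions[state][action] = list of (probability, next_state, reward)
TOY_MDP = {
    "cold": {
        "heat": [(0.9, "warm", 1.0), (0.1, "cold", -1.0)],
        "idle": [(1.0, "cold", -1.0)],
    },
    "warm": {
        "heat": [(1.0, "warm", 0.0)],
        "idle": [(0.7, "warm", 1.0), (0.3, "cold", -1.0)],
    },
}


def validate_mdp(transitions, tol=1e-9):
    """Check that outcome probabilities sum to 1 and next states are known."""
    for state, actions in transitions.items():
        for action, outcomes in actions.items():
            total = sum(prob for prob, _, _ in outcomes)
            if abs(total - 1.0) > tol:
                raise ValueError(f"probabilities for ({state}, {action}) sum to {total}")
            for _, next_state, _ in outcomes:
                if next_state not in transitions:
                    raise ValueError(f"unknown next state: {next_state}")
    return True


def discounted_return(rewards, gamma):
    """Return r_1 + gamma * r_2 + gamma**2 * r_3 + ... for a finite reward list."""
    g = 0.0
    # Work backwards so each step applies G_t = R_{t+1} + gamma * G_{t+1}.
    for reward in reversed(rewards):
        g = reward + gamma * g
    return g
```

The scalar reward is the only signal that encodes the goal here: changing the reward numbers in `TOY_MDP` changes which behavior counts as good, without touching the dynamics.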

Included

7 videos, 2 readings, 1 assignment, 1 peer review, 1 discussion prompt

7 videos (total 36 minutes)

Markov Decision Processes (6 minutes)
Examples of MDPs (4 minutes)
The Goal of Reinforcement Learning (3 minutes)
Michael Littman: The Reward Hypothesis (12 minutes)
Continuing Tasks (5 minutes)
Examples of Episodic and Continuing Tasks (3 minutes)
Week 2 Summary (1 minute)

2 readings (total 40 minutes)

Module 2 Learning Objectives (10 minutes)
Weekly Reading (30 minutes)

1 assignment (total 45 minutes)

MDPs (45 minutes)

1 peer review (total 60 minutes)

Graded Assignment: Describe Three MDPs (60 minutes)

1 discussion prompt (total 10 minutes)

Is the reward hypothesis sufficient? (10 minutes)

Once the problem is formulated as an MDP, finding the optimal policy is more efficient when using value functions. This week, you will learn the definition of policies and value functions, as well as Bellman equations, which are the key technology that all of our algorithms will use.
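
For reference, the Bellman equations mentioned above are commonly written as follows. The notation is an assumption, since this page does not fix one: $\pi(a \mid s)$ is the policy, $p(s', r \mid s, a)$ the MDP dynamics, and $\gamma$ the discount factor.

```latex
% Bellman expectation equation: value of state s under policy \pi
v_\pi(s) = \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a)
           \bigl[ r + \gamma \, v_\pi(s') \bigr]

% Bellman optimality equation: value of state s under an optimal policy
v_*(s) = \max_{a} \sum_{s', r} p(s', r \mid s, a)
         \bigl[ r + \gamma \, v_*(s') \bigr]

% A policy that is greedy with respect to v_* is optimal
\pi_*(s) \in \arg\max_{a} \sum_{s', r} p(s', r \mid s, a)
         \bigl[ r + \gamma \, v_*(s') \bigr]
```

Each equation relates the value of a state to the values of its possible successor states, which is what lets an algorithm compute values by repeated local updates instead of enumerating whole trajectories.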

Included

9 videos, 3 readings, 2 assignments, 1 discussion prompt

9 videos (total 56 minutes)

Specifying Policies (4 minutes)
Value Functions (6 minutes)
Rich Sutton and Andy Barto: A brief History of RL (7 minutes)
Bellman Equation Derivation (6 minutes)
Why Bellman Equations? (5 minutes)
Optimal Policies (7 minutes)
Optimal Value Functions (5 minutes)
Using Optimal Value Functions to Get Optimal Policies (8 minutes)
Week 3 Summary (4 minutes)

3 readings (total 53 minutes)

Module 3 Learning Objectives (10 minutes)
Weekly Reading (30 minutes)
Chapter Summary (13 minutes)

2 assignments (total 90 minutes)

[Practice] Value Functions and Bellman Equations (45 minutes)
[Graded] Value Functions and Bellman Equations (45 minutes)

1 discussion prompt (total 10 minutes)

Check-in (10 minutes)

This week, you will learn how to compute value functions and optimal policies, assuming you have the MDP model. You will implement dynamic programming to compute value functions and optimal policies and understand the utility of dynamic programming for industrial applications and problems. Further, you will learn about Generalized Policy Iteration as a common template for constructing algorithms that maximize reward. For this week’s graded assessment, you will implement an efficient dynamic programming agent in a simulated industrial control problem.
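
The loop of policy evaluation followed by policy improvement described above can be sketched in a few lines. This is a minimal illustration on a made-up two-state MDP, not the course's industrial control assignment; `EXAMPLE_MDP`, the function names, and the tolerance values are all assumptions.

```python
# mdp[state][action] = list of (probability, next_state, reward); hypothetical example
EXAMPLE_MDP = {
    "low": {"wait": [(1.0, "low", 0.0)], "work": [(1.0, "high", 1.0)]},
    "high": {"wait": [(1.0, "high", 2.0)], "work": [(1.0, "low", 0.0)]},
}


def q_value(mdp, v, state, action, gamma):
    """Expected one-step reward plus discounted value of the next state."""
    return sum(p * (r + gamma * v[s2]) for p, s2, r in mdp[state][action])


def policy_evaluation(mdp, policy, gamma, theta=1e-10):
    """Iterative policy evaluation: sweep until values change by less than theta."""
    v = {s: 0.0 for s in mdp}
    while True:
        delta = 0.0
        for s in mdp:
            new_value = q_value(mdp, v, s, policy[s], gamma)
            delta = max(delta, abs(new_value - v[s]))
            v[s] = new_value
        if delta < theta:
            return v


def policy_iteration(mdp, gamma):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = {s: next(iter(mdp[s])) for s in mdp}  # start from the first listed action
    while True:
        v = policy_evaluation(mdp, policy, gamma)
        stable = True
        for s in mdp:
            best = max(mdp[s], key=lambda a: q_value(mdp, v, s, a, gamma))
            # Switch only on a strict improvement, so ties cannot cause cycling.
            if q_value(mdp, v, s, best, gamma) > q_value(mdp, v, s, policy[s], gamma) + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return policy, v
```

Calling `policy_iteration(EXAMPLE_MDP, gamma=0.9)` returns a policy and its value function. The sketch assumes `gamma` is below 1 so that evaluation converges, and it requires the full model (all transition probabilities and rewards), which matches the assumption stated in the module description.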

Included

10 videos, 3 readings, 1 assignment, 1 programming assignment, 1 discussion prompt

10 videos (total 72 minutes)

Policy Evaluation vs. Control (4 minutes)
Iterative Policy Evaluation (8 minutes)
Policy Improvement (4 minutes)
Policy Iteration (8 minutes)
Flexibility of the Policy Iteration Framework (4 minutes)
Efficiency of Dynamic Programming (5 minutes)
Warren Powell: Approximate Dynamic Programming for Fleet Management (Short) (7 minutes)
Warren Powell: Approximate Dynamic Programming for Fleet Management (Long) (21 minutes)
Week 4 Summary (2 minutes)
Congratulations! (3 minutes)

3 readings (total 70 minutes)

Module 4 Learning Objectives (10 minutes)
Weekly Reading (30 minutes)
Chapter Summary (30 minutes)

1 assignment (total 45 minutes)

Dynamic Programming (45 minutes)

1 programming assignment (total 30 minutes)

Optimal Policies with Dynamic Programming (30 minutes)

1 discussion prompt (total 10 minutes)

Where can you use dynamic programming? (10 minutes)

Instructors

Instructor ratings

4.7 (800 ratings)

Martha White

University of Alberta

4 courses, 98,420 learners

Adam White

University of Alberta

4 courses, 98,420 learners

Offered by

University of Alberta

Alberta Machine Intelligence Institute

Learner reviews

4.8 (2,780 reviews)

5 stars: 81.71%
4 stars: 14.55%
3 stars: 2.55%
2 stars: 0.43%
1 star: 0.75%

Frequently Asked Questions

Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:

The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.
The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

If you subscribed, you get a 7-day free trial during which you can cancel at no penalty. After that, we don’t give refunds, but you can cancel your subscription at any time. See our full refund policy.