What Is Reinforcement Learning?

Written by Coursera Staff • Updated on

Reinforcement learning is a type of machine learning in which a robot or other artificial intelligence learns to solve problems through trial and error in unpredictable environments. Discover the uses of reinforcement learning below.


Reinforcement learning is a branch of machine learning. When it is combined with deep neural networks, it is often called deep reinforcement learning. For example, you could use various reinforcement learning techniques to teach a robot how to perform a task. The key component of reinforcement learning is the reward signal: the robot receives a reward when it performs the task correctly and learns to favor the actions that earn that reward.

The term reinforcement dates back nearly a century to the work of physiologist Ivan Pavlov, published in English in 1927. At the time, Pavlov described a form of learning that requires a stimulus to add positive or negative reinforcement to a behavior. Pavlov’s work helped demonstrate that if we perform an action and get a reward, we’re more likely to repeat that behavior, while the opposite is true for a punishment.

Today, reinforcement learning refers to the way machine learning algorithms and artificial intelligence systems, such as robotic factory equipment or autonomous vehicles, learn to make decisions.

What is reinforcement learning?

Reinforcement learning is a type of machine learning that processes information through trial and error, similar to how humans might approach a problem. Unlike other kinds of learning, such as supervised learning, reinforcement learning can happen without a human watching and responding to the interaction. Reinforcement learning allows a machine learning algorithm to learn through experience by trying different things and assigning a positive or negative association to each action based on the outcome. 

This allows robots and machines to navigate new or unexpected situations and learn which actions lead to the best outcomes without direct human supervision. 

How is reinforcement learning used?

Reinforcement learning is useful when a machine learning agent, such as a robot, attempts to complete a task in an unexplored or hard-to-predict environment. It builds on the Markov decision process, a mathematical framework for modeling decisions whose outcomes depend partly on the agent’s choice and partly on chance.
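To make the idea of a Markov decision process more concrete, here is a minimal Python sketch. The two states, two actions, probabilities, and rewards are all invented for illustration (a robot that is either "charged" or "low" on battery), and the helper names `expected_reward` and `step` are hypothetical rather than part of any library.

```python
import random

# Hypothetical two-state MDP. Every name and number here is an illustrative assumption.
# transitions[state][action] is a list of (probability, next_state, reward) outcomes.
transitions = {
    "charged": {
        "work": [(0.8, "charged", 5.0), (0.2, "low", 5.0)],
        "rest": [(1.0, "charged", 0.0)],
    },
    "low": {
        # Working on a low battery sometimes forces a costly rescue and recharge.
        "work": [(0.5, "low", 5.0), (0.5, "charged", -10.0)],
        "rest": [(1.0, "charged", 0.0)],
    },
}

def expected_reward(state, action):
    """Average immediate reward of an action, weighted by outcome probability."""
    return sum(p * r for p, _, r in transitions[state][action])

def step(state, action, rng):
    """Sample one random outcome of taking `action` in `state`."""
    outcomes = transitions[state][action]
    weights = [p for p, _, _ in outcomes]
    _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward

if __name__ == "__main__":
    print(expected_reward("charged", "work"))
    print(expected_reward("low", "work"))
    print(step("low", "work", random.Random(0)))
```

The random element lives in the probabilities: the same action in the same state can lead to different outcomes, which is why the agent has to reason about averages rather than single results.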

A reinforcement learning system must have four parts:

  1. The agent or the machine learning algorithm

  2. The environment that the agent works in

  3. The policy or basic instructions the agent is following

  4. A reward signal for the agent to observe based on its actions
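The four parts above can be sketched in a few lines of Python. This is a simplified illustration, not a standard implementation: the `Corridor` environment, the `always_right` policy, and the reward values (+10 for reaching the goal, -1 for every other step) are all assumptions made up for this example.

```python
class Corridor:
    """Environment: positions 0..length-1. Reaching the last position ends the episode."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (move left) or +1 (move right); walls clamp the position.
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 10.0 if done else -1.0  # reward signal (assumed values)
        return self.position, reward, done

def always_right(state):
    """Policy: a fixed rule mapping the observed state to an action."""
    return 1

def run_episode(env, policy, max_steps=100):
    """Agent loop: observe the state, act, collect the reward, repeat."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward

if __name__ == "__main__":
    print(run_episode(Corridor(), always_right))
```

Here the policy is fixed by hand. A learning algorithm would replace `always_right` with a policy that changes as rewards come in.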

Types of reinforcement learning algorithms

Reinforcement learning can fall into one of two main categories: model-free or model-based. A model-based algorithm begins to solve the problem by building a model of its environment and all possible actions that it can take. In contrast, a model-free algorithm skips that step and begins to solve the problem with a trial-and-error approach, observing which actions have the best outcome. 
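As a rough illustration of the model-based approach, the sketch below first records what each action does in a tiny corridor world and then plans using only that recorded model. The environment, the reward values, and the function names (`real_step`, `learn_model`, `plan`) are hypothetical. Because this toy environment is deterministic, a single sample per action is enough. A realistic model-based method would have to estimate probabilities from many samples.

```python
N_STATES = 5        # corridor positions 0..4; position 4 is the goal (assumed setup)
ACTIONS = (-1, 1)   # step left or step right

def real_step(state, action):
    """The true environment, which the agent can only sample."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 10.0 if done else -1.0
    return next_state, reward, done

def learn_model():
    """Step 1: build a model by trying every action in every non-goal state once."""
    return {(s, a): real_step(s, a) for s in range(N_STATES - 1) for a in ACTIONS}

def backup(model, values, s, a, gamma):
    """Value of taking action a in state s according to the learned model."""
    next_state, reward, done = model[(s, a)]
    return reward + (0.0 if done else gamma * values[next_state])

def plan(model, gamma=0.9, sweeps=50):
    """Step 2: plan with value iteration, using the model instead of the environment."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(N_STATES - 1):
            values[s] = max(backup(model, values, s, a, gamma) for a in ACTIONS)
    policy = {
        s: max(ACTIONS, key=lambda a: backup(model, values, s, a, gamma))
        for s in range(N_STATES - 1)
    }
    return values, policy

if __name__ == "__main__":
    values, policy = plan(learn_model())
    print(values)
    print(policy)
```

Once the model exists, planning needs no further interaction with the environment, which is the main trade-off against model-free methods.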

Model-free algorithms also appear in two main types: value-based and policy-based. A value-based reinforcement learning algorithm assigns a value to each action based on the outcome. This value can also be called “goodness,” or the extent to which the given action was “good.” The algorithm will then proceed through trial and error to determine which actions result in the highest value or the greatest amount of “goodness.” On the other hand, a policy-based algorithm learns the policy directly: it adjusts the rule that maps each situation to an action so that the rule earns more reward over time, without first estimating a value for every action.
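One well-known value-based, model-free method is tabular Q-learning. The sketch below applies it to a made-up corridor world in which the agent starts at position 0 and is rewarded for reaching position 4. The environment, the reward values, and the settings for `alpha`, `gamma`, and `epsilon` are illustrative assumptions rather than recommended values.

```python
import random

N_STATES = 5        # corridor positions 0..4; position 4 is the goal (assumed setup)
ACTIONS = (-1, 1)   # step left or step right

def step(state, action):
    """Toy environment: +10 for reaching the goal, -1 for every other move."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 10.0 if done else -1.0
    return next_state, reward, done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a value ("goodness") for every state-action pair by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Epsilon-greedy: usually take the best-known action, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # Q-learning update: nudge the estimate toward reward + discounted best next value.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q

def greedy_policy(q):
    """Pick the highest-valued action in each non-goal state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}

if __name__ == "__main__":
    print(greedy_policy(train()))
```

The agent never sees the rules of the corridor. It only observes rewards, yet the learned values end up favoring the moves that lead toward the goal.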

Examples of reinforcement learning

Reinforcement learning allows artificial intelligence to begin tackling more complicated problems. A few examples of the kind of problems that reinforcement learning can solve include: 

  • AI gaming: In video games, reinforcement learning makes it possible for artificial intelligence opponents to respond to the unique moves of human players. 

  • Autonomous vehicles: A car on the road encounters so many variables that reinforcement learning can help the algorithm predict the best course of action in unpredictable situations. 

  • Automated robots: Some stores use robots to measure the inventory on the shelves and order more as needed, using reinforcement learning to navigate a store with unpredictable moving customers. 

Benefits of reinforcement learning

One of the main benefits of reinforcement learning algorithms is the ability to solve problems in complicated and unpredictable environments. This ability is what may someday enable safe autonomous vehicles that can accurately respond to variables such as pedestrians. 

Another advantage is that reinforcement learning doesn’t need a pre-labeled data set to begin. Instead, the algorithm can learn on its own through trial and error to find the actions that maximize its reward. Reinforcement learning algorithms can operate without a human supervisor directing the learning.

One additional benefit to consider is that reinforcement learning often yields more original results than models operating from a large data set. Supervised learning algorithms risk inheriting biases from the individual characteristics, experiences, and knowledge of the humans who selected and labeled their training data, but reinforcement learning allows the algorithm to look for novel ways to solve the problem with less influence from the biases of the humans observing the work.

Who uses reinforcement learning? And how to get started 

Reinforcement learning has applications in many industries, including health care, automotive, video games, retail, civil engineering, and energy. A few potential careers in reinforcement learning include machine learning engineer, data scientist, and machine learning research scientist.

Machine learning engineer

Average annual salary in the US (Glassdoor): $123,136 [1]

Job outlook (projected growth from 2022 to 2032): 12 percent [2]

Education requirements: A bachelor’s degree in computer science or a related field is typically required to become a machine learning engineer. Some employers may require you to earn a master’s degree. 

As a machine learning engineer, you will create algorithms that use artificial intelligence to solve problems. You will also evaluate existing models to identify areas for improvement, test for bugs, and provide appropriate documentation. Machine learning engineers work in industries such as cybersecurity, health care, finance, and more. 

Data scientist

Average annual salary in the US (Glassdoor): $113,340 [3]

Job outlook (projected growth from 2022 to 2032): 35 percent [4]

Education requirements: The most common educational requirement for a data scientist is a bachelor’s degree, although some employers will prefer or require a master’s degree. Common areas of study include math, statistics, computer science, and engineering. 

As a data scientist, you will collect, process, and analyze data to look for trends, make recommendations, and communicate your findings with senior leadership. You must be familiar with data visualization tools and the models and algorithms needed for machine learning. In this field, you will have the chance to work on several projects, from improving business processes to conducting research. 

Machine learning research scientist

Average annual salary in the US (Glassdoor): $147,841 [5]

Job outlook (projected growth from 2022 to 2032): 23 percent [6]

Education requirements: You will usually need a master's degree to become a machine learning research scientist, typically in computer science or a related field such as robotics or data science. 

Machine learning research scientists work to create artificial intelligence capable of learning on its own with accurate results. In this role, you may also create algorithms and models to analyze large data sets. You’ll work with data scientists and engineers, as well as other professionals. As a machine learning research scientist, you can work on projects for health care, marketing, human resources, and more. 

Learn more about reinforcement learning with Coursera

In machine learning, reinforcement learning is a technique for training machines through trial and error. In this technique, machines receive positive or negative reinforcement and use it to learn which actions lead to the best outcomes.

If you’re ready to take the next step and start a career in reinforcement learning and artificial intelligence, consider completing the Reinforcement Learning Specialization offered by the University of Alberta on Coursera. This four-course series takes approximately two months to complete at 10 hours per week and can help you learn skills in artificial intelligence, machine learning, reinforcement learning, function approximation, and intelligent systems. 

Article sources

1. Glassdoor. “Salary: Machine Learning Scientist,” https://www.glassdoor.com/Salaries/machine-learning-scientist-salary-SRCH_KO0,26.htm. Accessed July 30, 2024.


This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.