Machine Learning: Regression

Rating: 4.8 (5,581 reviews)

This course is part of Machine Learning Specialization

Instructors: Emily Fox

165,351 already enrolled

8 modules

Gain insight into a topic and learn the fundamentals.

4.8

(5,581 reviews)

2 weeks to complete

at 10 hours a week

Flexible schedule

Learn at your own pace

94%

Most learners liked this course

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

15 assignments

Taught in English

Build your subject-matter expertise

This course is part of the Machine Learning Specialization

When you enroll in this course, you'll also be enrolled in this Specialization.

Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate

There are 8 modules in this course

Case Study - Predicting Housing Prices

In our first case study, predicting house prices, you will create models that predict a continuous value (price) from input features (square footage, number of bedrooms and bathrooms, ...). This is just one of the many places where regression can be applied. Other applications range from predicting health outcomes in medicine, stock prices in finance, and power usage in high-performance computing, to analyzing which regulators are important for gene expression.

In this course, you will explore regularized linear regression models for the task of prediction and feature selection. You will be able to handle very large sets of features and select between models of various complexity. You will also analyze the impact of aspects of your data (such as outliers) on your selected models and predictions. To fit these models, you will implement optimization algorithms that scale to large datasets.

Learning Outcomes: By the end of this course, you will be able to:
- Describe the input and output of a regression model.
- Compare and contrast bias and variance when modeling data.
- Estimate model parameters using optimization algorithms.
- Tune parameters with cross validation.
- Analyze the performance of the model.
- Describe the notion of sparsity and how LASSO leads to sparse solutions.
- Deploy methods to select between models.
- Exploit the model to form predictions.
- Build a regression model to predict prices using a housing dataset.
- Implement these techniques in Python.

Regression is one of the most important and broadly used machine learning and statistics tools out there. It allows you to make predictions from data by learning the relationship between features of your data and some observed, continuous-valued response. Regression is used in a massive number of applications ranging from predicting stock prices to understanding gene regulatory networks. This introduction to the course provides you with an overview of the topics we will cover and the background knowledge and resources we assume you have.

What's included

5 videos, 4 readings

5 videos (total 20 minutes)

Welcome! (1 minute)
What is the course about? (3 minutes)
Outlining the first half of the course (5 minutes)
Outlining the second half of the course (5 minutes)
Assumed background (4 minutes)

4 readings (total 35 minutes)

Important Update regarding the Machine Learning Specialization (10 minutes)
Slides presented in this module (10 minutes)
Reading: Software tools you'll need (10 minutes)
Get help and meet other learners. Join your Community! (5 minutes)

Our course starts from the most basic regression model: just fitting a line to data. This simple model for forming predictions from a single, univariate feature of the data is appropriately called "simple linear regression". In this module, we describe the high-level regression task and then specialize these concepts to the simple linear regression case. You will learn how to formulate a simple regression model and fit the model to data using both a closed-form solution and an iterative optimization algorithm called gradient descent. Based on this fitted function, you will interpret the estimated model parameters and form predictions. You will also analyze the sensitivity of your fit to outlying observations. You will examine all of these concepts in the context of a case study of predicting house prices from the square footage of the house.
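As a rough illustration of the two fitting approaches named above, the following Python sketch fits a line to a handful of points both ways. It is not taken from the course materials: the function names, the toy data, the step size, and the tolerance are all assumptions chosen for illustration, and it uses plain Python lists rather than any particular library.

```python
def fit_closed_form(x, y):
    """Least-squares intercept and slope for the model y ~ w0 + w1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # slope = (sum of co-deviations) / (sum of squared x deviations)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    w1 = sxy / sxx
    w0 = mean_y - w1 * mean_x
    return w0, w1


def fit_gradient_descent(x, y, step_size=0.01, tolerance=1e-8, max_iter=100_000):
    """Minimise RSS(w0, w1) = sum_i (y_i - (w0 + w1 * x_i))^2 iteratively."""
    w0, w1 = 0.0, 0.0
    for _ in range(max_iter):
        errors = [(w0 + w1 * xi) - yi for xi, yi in zip(x, y)]
        grad_w0 = 2 * sum(errors)
        grad_w1 = 2 * sum(e * xi for e, xi in zip(errors, x))
        # Stop once the gradient magnitude is (numerically) zero.
        if (grad_w0 ** 2 + grad_w1 ** 2) ** 0.5 < tolerance:
            break
        w0 -= step_size * grad_w0
        w1 -= step_size * grad_w1
    return w0, w1


# Hypothetical toy data: square feet (in thousands) and price (in $100k).
sqft = [1.0, 1.5, 2.0, 2.5, 3.0]
price = [2.1, 2.9, 4.2, 4.8, 6.0]

w0_cf, w1_cf = fit_closed_form(sqft, price)
w0_gd, w1_gd = fit_gradient_descent(sqft, price)
predicted_price = w0_cf + w1_cf * 2.2  # prediction for a 2,200 sq ft house
```

On this toy data both approaches should agree to several decimal places. The step size matters: a value that is too large for the scale of the inputs makes gradient descent diverge, which is one reason the closed-form solution is convenient when it is available.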

What's included

25 videos, 5 readings, 2 assignments

25 videos (total 122 minutes)

A case study in predicting house prices (1 minute)
Regression fundamentals: data & model (8 minutes)
Regression fundamentals: the task (2 minutes)
Regression ML block diagram (4 minutes)
The simple linear regression model (2 minutes)
The cost of using a given line (6 minutes)
Using the fitted line (6 minutes)
Interpreting the fitted line (6 minutes)
Defining our least squares optimization objective (3 minutes)
Finding maxima or minima analytically (7 minutes)
Maximizing a 1d function: a worked example (2 minutes)
Finding the max via hill climbing (6 minutes)
Finding the min via hill descent (3 minutes)
Choosing stepsize and convergence criteria (6 minutes)
Gradients: derivatives in multiple dimensions (5 minutes)
Gradient descent: multidimensional hill descent (6 minutes)
Computing the gradient of RSS (7 minutes)
Approach 1: closed-form solution (5 minutes)
Approach 2: gradient descent (7 minutes)
Comparing the approaches (1 minute)
Influence of high leverage points: exploring the data (4 minutes)
Influence of high leverage points: removing Center City (7 minutes)
Influence of high leverage points: removing high-end towns (3 minutes)
Asymmetric cost functions (3 minutes)
A brief recap (1 minute)

5 readings (total 50 minutes)

Slides presented in this module (10 minutes)
Optional reading: worked-out example for closed-form solution (10 minutes)
Optional reading: worked-out example for gradient descent (10 minutes)
Download notebooks to follow along (10 minutes)
Fitting a simple linear regression model on housing data (10 minutes)

2 assignments (total 60 minutes)

Simple Linear Regression (30 minutes)
Fitting a simple linear regression model on housing data (30 minutes)

The next step in moving beyond simple linear regression is to consider "multiple regression", where multiple features of the data are used to form predictions. More specifically, in this module, you will learn how to build models of more complex relationships between a single variable (e.g., 'square feet') and the observed response (like 'house sales price'). This includes things like fitting a polynomial to your data, or capturing seasonal changes in the response value. You will also learn how to incorporate multiple input variables (e.g., 'square feet', '# bedrooms', '# bathrooms'). You will then be able to describe how all of these models can still be cast within the linear regression framework, but now using multiple "features". Within this multiple regression framework, you will fit models to data, interpret estimated coefficients, and form predictions. Here, you will also implement a gradient descent algorithm for fitting a multiple regression model.
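To make the idea of "features" concrete, here is a minimal Python sketch in which a single input is expanded into polynomial features and the resulting multiple regression model is fit by gradient descent. The helper names (`polynomial_features`, `predict`, `fit_gradient_descent`), the noise-free toy data, and the step size are assumptions made for this illustration and do not come from the course notebooks.

```python
def polynomial_features(x, degree):
    """Map a single input x to the feature vector [1, x, x^2, ..., x^degree]."""
    return [x ** p for p in range(degree + 1)]


def predict(features, weights):
    """Linear model: dot product of a feature vector and the weight vector."""
    return sum(h * w for h, w in zip(features, weights))


def fit_gradient_descent(H, y, step_size, tolerance=1e-9, max_iter=100_000):
    """Minimise RSS(w) = sum_i (y_i - h(x_i)^T w)^2 over the weight vector w."""
    weights = [0.0] * len(H[0])
    for _ in range(max_iter):
        errors = [predict(row, weights) - yi for row, yi in zip(H, y)]
        # Partial derivative for feature j: 2 * sum_i error_i * h_j(x_i)
        gradient = [2 * sum(e * row[j] for e, row in zip(errors, H))
                    for j in range(len(weights))]
        if sum(g * g for g in gradient) ** 0.5 < tolerance:
            break
        weights = [w - step_size * g for w, g in zip(weights, gradient)]
    return weights


# Hypothetical noise-free data generated from y = 1 + 2x + 3x^2.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]

# The model is nonlinear in x but still linear in the weights.
H = [polynomial_features(x, degree=2) for x in xs]
weights = fit_gradient_descent(H, ys, step_size=0.05)
prediction_at_2 = predict(polynomial_features(2.0, degree=2), weights)
```

Because the data were generated without noise from a quadratic, the fitted weights should come out close to 1, 2, and 3. The same `fit_gradient_descent` function would work unchanged if each row of `H` instead held several different inputs (for example square feet and number of bedrooms).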

What's included

19 videos, 5 readings, 3 assignments

19 videos (total 86 minutes)

Multiple regression intro (0 minutes)
Polynomial regression (3 minutes)
Modeling seasonality (8 minutes)
Where we see seasonality (3 minutes)
Regression with general features of 1 input (2 minutes)
Motivating the use of multiple inputs (4 minutes)
Defining notation (3 minutes)
Regression with features of multiple inputs (3 minutes)
Interpreting the multiple regression fit (7 minutes)
Rewriting the single observation model in vector notation (6 minutes)
Rewriting the model for all observations in matrix notation (4 minutes)
Computing the cost of a D-dimensional curve (9 minutes)
Computing the gradient of RSS (3 minutes)
Approach 1: closed-form solution (3 minutes)
Discussing the closed-form solution (4 minutes)
Approach 2: gradient descent (2 minutes)
Feature-by-feature update (9 minutes)
Algorithmic summary of gradient descent approach (4 minutes)
A brief recap (1 minute)

5 readings (total 50 minutes)

Slides presented in this module (10 minutes)
Optional reading: review of matrix algebra (10 minutes)
Exploring different multiple regression models for house price prediction (10 minutes)
Numpy tutorial (10 minutes)
Implementing gradient descent for multiple regression (10 minutes)

3 assignments (total 90 minutes)

Multiple Regression (30 minutes)
Exploring different multiple regression models for house price prediction (30 minutes)
Implementing gradient descent for multiple regression (30 minutes)

Having learned about linear regression models and algorithms for estimating the parameters of such models, you are now ready to assess how well a given method should perform in predicting new data. You are also ready to select amongst possible models to choose the best-performing one. This module is all about these important topics of model selection and assessment. You will examine both theoretical and practical aspects of such analyses. You will first explore the concept of measuring the "loss" of your predictions, and use this to define training, test, and generalization error. For these measures of error, you will analyze how they vary with model complexity and how they might be utilized to form a valid assessment of predictive performance. This leads directly to an important conversation about the bias-variance tradeoff, which is fundamental to machine learning. Finally, you will devise a method to first select amongst models and then assess the performance of the selected model. The concepts described in this module are key to all machine learning problems, well beyond the regression setting addressed in this course.
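The select-then-assess workflow described above can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than course code: the two candidate models (a constant and a line), the squared-error loss, the toy data, and the deterministic split, which stands in for the shuffled split one would normally use.

```python
def fit_constant(x, y):
    """Simplest candidate: always predict the mean training output."""
    mean_y = sum(y) / len(y)
    return lambda query: mean_y


def fit_line(x, y):
    """Second candidate: least-squares line (closed form)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return lambda query: intercept + slope * query


def average_squared_error(model, x, y):
    """Average squared-error loss of a fitted model on a dataset."""
    return sum((model(a) - b) ** 2 for a, b in zip(x, y)) / len(x)


def subset(values, remainders):
    """Keep the items whose index modulo 4 falls in `remainders`."""
    return [v for i, v in enumerate(values) if i % 4 in remainders]


# Hypothetical data: a linear trend plus a small alternating perturbation.
xs = list(range(12))
ys = [3 + 2 * x + (0.5 if x % 2 == 0 else -0.5) for x in xs]

# Deterministic 6/3/3 split for reproducibility; in practice, shuffle first.
train_x, train_y = subset(xs, {0, 1}), subset(ys, {0, 1})
valid_x, valid_y = subset(xs, {2}), subset(ys, {2})
test_x, test_y = subset(xs, {3}), subset(ys, {3})

# Fit every candidate on the training set only.
candidates = {"constant": fit_constant, "line": fit_line}
fitted = {name: fit(train_x, train_y) for name, fit in candidates.items()}

# Select on the validation set, then assess the winner once on the test set.
validation_errors = {name: average_squared_error(model, valid_x, valid_y)
                     for name, model in fitted.items()}
best_name = min(validation_errors, key=validation_errors.get)
test_error = average_squared_error(fitted[best_name], test_x, test_y)
```

The point of the three-way split is that the test set plays no role in either fitting or selection, so `test_error` is a fair estimate of how the chosen model behaves on new data.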

What's included

14 videos, 2 readings, 2 assignments

14 videos (total 93 minutes)

Assessing performance intro (0 minutes)
What do we mean by "loss"? (4 minutes)
Training error: assessing loss on the training set (7 minutes)
Generalization error: what we really want (8 minutes)
Test error: what we can actually compute (4 minutes)
Defining overfitting (2 minutes)
Training/test split (1 minute)
Irreducible error and bias (6 minutes)
Variance and the bias-variance tradeoff (6 minutes)
Error vs. amount of data (6 minutes)
Formally defining the 3 sources of error (14 minutes)
Formally deriving why 3 sources of error (20 minutes)
Training/validation/test split for model selection, fitting, and assessment (7 minutes)
A brief recap (1 minute)

2 readings (total 20 minutes)

Slides presented in this module (10 minutes)
Polynomial Regression (10 minutes)

2 assignments (total 60 minutes)

Assessing Performance (30 minutes)
Exploring the bias-variance tradeoff (30 minutes)

You have examined how the performance of a model varies with increasing model complexity, and can describe the potential pitfall of complex models becoming overfit to the training data. In this module, you will explore a very simple, but extremely effective technique for automatically coping with this issue. This method is called "ridge regression". You start out with a complex model, but now fit the model in a manner that not only incorporates a measure of fit to the training data, but also a term that biases the solution away from overfitted functions. To this end, you will explore symptoms of overfitted functions and use this to define a quantitative measure to use in your revised optimization objective. You will derive both a closed-form solution and a gradient descent algorithm for fitting the ridge regression objective; these forms are small modifications of the original algorithms you derived for multiple regression. To select the strength of the bias away from overfitting, you will explore a general-purpose method called "cross validation". You will implement both cross-validation and gradient descent to fit a ridge regression model and select the regularization constant.
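A minimal Python sketch of the two pieces mentioned above, gradient descent on a ridge objective and k-fold cross validation over candidate penalties, might look as follows. It assumes the objective RSS(w) + l2_penalty * ||w||^2 and, for brevity, penalises the intercept along with the other weights. The function names, toy data, candidate penalties, step size, and iteration count are all hypothetical choices for this example.

```python
def predict(row, weights):
    """Linear model: dot product of a feature row and the weight vector."""
    return sum(h * w for h, w in zip(row, weights))


def ridge_gradient_descent(H, y, l2_penalty, step_size=0.01, n_iter=2000):
    """Minimise RSS(w) + l2_penalty * ||w||^2 by gradient descent.

    Simplification: the intercept is penalised like every other weight.
    """
    weights = [0.0] * len(H[0])
    for _ in range(n_iter):
        errors = [predict(row, weights) - yi for row, yi in zip(H, y)]
        # Gradient = RSS gradient + 2 * l2_penalty * w (the only change
        # relative to unregularised multiple regression).
        gradient = [2 * sum(e * row[j] for e, row in zip(errors, H))
                    + 2 * l2_penalty * weights[j]
                    for j in range(len(weights))]
        weights = [w - step_size * g for w, g in zip(weights, gradient)]
    return weights


def k_fold_error(H, y, l2_penalty, k):
    """Average validation squared error over k interleaved folds."""
    total = 0.0
    for fold in range(k):
        train = [i for i in range(len(y)) if i % k != fold]
        valid = [i for i in range(len(y)) if i % k == fold]
        weights = ridge_gradient_descent([H[i] for i in train],
                                         [y[i] for i in train], l2_penalty)
        total += sum((predict(H[i], weights) - y[i]) ** 2 for i in valid)
    return total / len(y)


# Hypothetical data: quadratic trend plus an alternating perturbation.
xs = [(2 * i - 11) / 11 for i in range(12)]  # 12 evenly spaced points in [-1, 1]
ys = [1 + 2 * x + 3 * x ** 2 + 0.3 * (-1) ** i for i, x in enumerate(xs)]
H = [[1.0, x, x ** 2] for x in xs]

# Pick the penalty with the lowest cross-validation error, then refit on all data.
penalties = [0.0, 0.1, 1.0, 10.0]
cv_errors = {lam: k_fold_error(H, ys, lam, k=3) for lam in penalties}
best_penalty = min(cv_errors, key=cv_errors.get)
weights = ridge_gradient_descent(H, ys, best_penalty)
```

Larger penalties shrink the weight vector toward zero, trading a little extra bias for lower variance; cross validation is simply a data-driven way of choosing where on that tradeoff to sit.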

What's included

16 videos, 5 readings, 3 assignments

16 videos (total 84 minutes)

Symptoms of overfitting in polynomial regression (2 minutes)
Overfitting demo (7 minutes)
Overfitting for more general multiple regression models (3 minutes)
Balancing fit and magnitude of coefficients (7 minutes)
The resulting ridge objective and its extreme solutions (5 minutes)
How ridge regression balances bias and variance (1 minute)
Ridge regression demo (9 minutes)
The ridge coefficient path (4 minutes)
Computing the gradient of the ridge objective (5 minutes)
Approach 1: closed-form solution (6 minutes)
Discussing the closed-form solution (5 minutes)
Approach 2: gradient descent (9 minutes)
Selecting tuning parameters via cross validation (3 minutes)
K-fold cross validation (5 minutes)
How to handle the intercept (6 minutes)
A brief recap (1 minute)

5 readings (total 50 minutes)

Slides presented in this module (10 minutes)
Download the notebook and follow along (10 minutes)
Download the notebook and follow along (10 minutes)
Observing effects of L2 penalty in polynomial regression (10 minutes)
Implementing ridge regression via gradient descent (10 minutes)

3 assignments (total 90 minutes)

Ridge Regression (30 minutes)
Observing effects of L2 penalty in polynomial regression (30 minutes)
Implementing ridge regression via gradient descent (30 minutes)

A fundamental machine learning task is to select amongst a set of features to include in a model. In this module, you will explore this idea in the context of multiple regression, and describe how such feature selection is important for both interpretability and efficiency of forming predictions. To start, you will examine methods that search over an enumeration of models including different subsets of features. You will analyze both exhaustive search and greedy algorithms. Then, instead of an explicit enumeration, we turn to lasso regression, which implicitly performs feature selection in a manner akin to ridge regression: a complex model is fit based on a measure of fit to the training data plus a measure of overfitting different from that used in ridge. The lasso method has had an impact in numerous applied domains, and the ideas behind the method have fundamentally changed machine learning and statistics. You will also implement a coordinate descent algorithm for fitting a lasso model. Coordinate descent is another general optimization technique, which is useful in many areas of machine learning.
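The following Python sketch shows one way coordinate descent for lasso can be written. It assumes the objective RSS(w) + l1_penalty * ||w||_1, feature columns scaled to unit length, and an unpenalised intercept in column 0; under those assumptions each coordinate update reduces to soft thresholding at l1_penalty / 2. The function names and the tiny orthogonal design are invented for this example and are not the course's assignment code.

```python
def normalize_columns(H):
    """Scale every feature column to unit 2-norm; also return the norms."""
    n_features = len(H[0])
    norms = [sum(row[j] ** 2 for row in H) ** 0.5 for j in range(n_features)]
    normalized = [[row[j] / norms[j] for j in range(n_features)] for row in H]
    return normalized, norms


def soft_threshold(rho, threshold):
    """Shrink rho toward zero by `threshold`, snapping small values to 0."""
    if rho > threshold:
        return rho - threshold
    if rho < -threshold:
        return rho + threshold
    return 0.0


def lasso_coordinate_descent(H, y, l1_penalty, tolerance=1e-10, max_sweeps=10_000):
    """Cycle through the weights, minimising the objective over one at a time.

    Assumes the columns of H already have unit 2-norm.
    """
    n_features = len(H[0])
    weights = [0.0] * n_features
    for _ in range(max_sweeps):
        max_change = 0.0
        for j in range(n_features):
            # rho_j: feature j against the residual computed without feature j.
            rho = sum(row[j] * (yi - sum(row[k] * weights[k]
                                         for k in range(n_features) if k != j))
                      for row, yi in zip(H, y))
            # Column 0 is treated as the intercept and left unpenalised.
            new_w = rho if j == 0 else soft_threshold(rho, l1_penalty / 2)
            max_change = max(max_change, abs(new_w - weights[j]))
            weights[j] = new_w
        if max_change < tolerance:
            break
    return weights


# Hypothetical design: constant column, one strong feature, one weak feature.
raw_H = [[1.0, 1.0, 1.0],
         [1.0, -1.0, 1.0],
         [1.0, 1.0, -1.0],
         [1.0, -1.0, -1.0]]
y = [3 + 2 * row[1] + 0.1 * row[2] for row in raw_H]

H, norms = normalize_columns(raw_H)
weights = lasso_coordinate_descent(H, y, l1_penalty=1.0)
selected = [j for j, w in enumerate(weights) if w != 0.0]
```

With this penalty the weak feature's weight is set exactly to zero while the strong feature is kept (with a shrunken weight), which is the sparsity behaviour that distinguishes lasso from ridge. The returned weights refer to the normalised columns; dividing each by its entry in `norms` would express them on the original feature scale.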

What's included

22 videos, 4 readings, 3 assignments

22 videos (total 125 minutes)

The feature selection task (3 minutes)
All subsets (6 minutes)
Complexity of all subsets (3 minutes)
Greedy algorithms (7 minutes)
Complexity of the greedy forward stepwise algorithm (2 minutes)
Can we use regularization for feature selection? (3 minutes)
Thresholding ridge coefficients? (4 minutes)
The lasso objective and its coefficient path (7 minutes)
Visualizing the ridge cost (7 minutes)
Visualizing the ridge solution (6 minutes)
Visualizing the lasso cost and solution (7 minutes)
Lasso demo (5 minutes)
What makes the lasso objective different (3 minutes)
Coordinate descent (5 minutes)
Normalizing features (3 minutes)
Coordinate descent for least squares regression (normalized features) (8 minutes)
Coordinate descent for lasso (normalized features) (5 minutes)
Assessing convergence and other lasso solvers (2 minutes)
Coordinate descent for lasso (unnormalized features) (1 minute)
Deriving the lasso coordinate descent update (19 minutes)
Choosing the penalty strength and other practical issues with lasso (5 minutes)
A brief recap (3 minutes)

4 readings (total 40 minutes)

Slides presented in this module (10 minutes)
Download the notebook and follow along (10 minutes)
Using LASSO to select features (10 minutes)
Implementing LASSO using coordinate descent (10 minutes)

3 assignments (total 90 minutes)

Feature Selection and Lasso (30 minutes)
Using LASSO to select features (30 minutes)
Implementing LASSO using coordinate descent (30 minutes)

Up to this point, we have focused on methods that fit parametric functions, like polynomials and hyperplanes, to the entire dataset. In this module, we instead turn our attention to a class of "nonparametric" methods. These methods allow the complexity of the model to increase as more data are observed, and result in fits that adapt locally to the observations. We start by considering a simple and intuitive example of nonparametric methods, nearest neighbor regression: the prediction for a query point is based on the outputs of the most related observations in the training set. This approach is extremely simple, but can provide excellent predictions, especially for large datasets. You will deploy algorithms to search for the nearest neighbors and form predictions based on the discovered neighbors. Building on this idea, we turn to kernel regression. Instead of forming predictions based on a small set of neighboring observations, kernel regression uses all observations in the dataset, but the impact of these observations on the predicted value is weighted by their similarity to the query point. You will analyze the theoretical performance of these methods in the limit of infinite training data, and explore the scenarios in which these methods work well versus struggle. You will also implement these techniques and observe their practical behavior.
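The contrast between the two predictors described above can be seen in a short Python sketch. It assumes a single numeric input with absolute difference as the distance, and a Gaussian kernel with a `bandwidth` parameter for the kernel-weighted average; the function names, that kernel choice, and the toy data are illustrative assumptions only.

```python
import math


def knn_predict(train_x, train_y, query, k):
    """Average the outputs of the k training points closest to the query."""
    by_distance = sorted(zip(train_x, train_y),
                         key=lambda pair: abs(pair[0] - query))
    neighbors = by_distance[:k]
    return sum(output for _, output in neighbors) / k


def gaussian_kernel(distance, bandwidth):
    """Similarity weight that decays smoothly as the distance grows."""
    return math.exp(-(distance ** 2) / (2 * bandwidth ** 2))


def kernel_regression_predict(train_x, train_y, query, bandwidth):
    """Weighted average of ALL training outputs, weighted by similarity."""
    weights = [gaussian_kernel(abs(x - query), bandwidth) for x in train_x]
    weighted_sum = sum(w * y for w, y in zip(weights, train_y))
    return weighted_sum / sum(weights)


# Hypothetical training set: outputs follow y = x^2.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [1.0, 4.0, 9.0, 16.0, 25.0]

query = 2.9
one_nn = knn_predict(train_x, train_y, query, k=1)      # nearest point only
three_nn = knn_predict(train_x, train_y, query, k=3)    # mean of 3 neighbours
smooth = kernel_regression_predict(train_x, train_y, query, bandwidth=1.0)
narrow = kernel_regression_predict(train_x, train_y, query, bandwidth=0.1)
```

A very small bandwidth makes kernel regression behave almost like 1-nearest neighbor, since nearly all of the weight lands on the closest point, while a larger bandwidth averages over more of the data. In that sense `k` and `bandwidth` play a similar role in controlling the bias-variance tradeoff for these methods.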

What's included

13 videos, 2 readings, 2 assignments

13 videos (total 62 minutes)

Limitations of parametric regression (3 minutes)
1-Nearest neighbor regression approach (8 minutes)
Distance metrics (4 minutes)
1-Nearest neighbor algorithm (3 minutes)
k-Nearest neighbors regression (7 minutes)
k-Nearest neighbors in practice (3 minutes)
Weighted k-nearest neighbors (4 minutes)
From weighted k-NN to kernel regression (6 minutes)
Global fits of parametric models vs. local fits of kernel regression (6 minutes)
Performance of NN as amount of data grows (7 minutes)
Issues with high-dimensions, data scarcity, and computational complexity (3 minutes)
k-NN for classification (1 minute)
A brief recap (1 minute)

2 readings (total 20 minutes)

Slides presented in this module (10 minutes)
Predicting house prices using k-nearest neighbors regression (10 minutes)

2 assignments (total 60 minutes)

Nearest Neighbors & Kernel Regression (30 minutes)
Predicting house prices using k-nearest neighbors regression (30 minutes)

In the conclusion of the course, we will recap what we have covered. This includes both techniques specific to regression and foundational machine learning concepts that will appear throughout the specialization. We also briefly discuss some important regression techniques we did not cover in this course. We conclude with an overview of what's in store for you in the rest of the specialization.

What's included

5 videos, 1 reading

Instructors

Instructor ratings

4.8 (176 ratings)

Emily Fox

University of Washington

6 courses, 491,897 learners

Carlos Guestrin

University of Washington

8 courses, 492,684 learners

Offered by

University of Washington

Learner reviews

4.8

5,581 reviews

5 stars: 80.92%
4 stars: 15.89%
3 stars: 1.88%
2 stars: 0.46%
1 star: 0.84%

Showing 3 of 5581

Reviewed on Apr 6, 2016

This is an excellent course. The presentation is clear, the graphs are very informative, the homework is well-structured and it does not beat around the bush with unnecessary theoretical tangents.

Reviewed on Aug 30, 2016

it's a nice course. I have learnt many new concepts. I am from information systems background and want my career towards data science. This course helped me a lot in learning new concepts.

Reviewed on Jun 11, 2016

This course start from problems. So this great to motivate the content and let student know why. However, there are lot of confusion questions that lead to miss understand the exercise problems.

Frequently asked questions

To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

If you subscribed, you get a 7-day free trial during which you can cancel at no penalty. After that, we don’t give refunds, but you can cancel your subscription at any time. See our full refund policy.