Developing insights about your organization, business, or research project depends on effective modeling and analysis of the data you collect. Building effective models requires understanding the different types of questions you can ask and how to map those questions to your data. Different modeling approaches can be chosen to detect interesting patterns in the data and identify hidden relationships.
Modeling Data in the Tidyverse
This course is part of the Tidyverse Skills for Data Science in R Specialization
Instructors: Carrie Wright, PhD
What you'll learn
Describe different types of data analytic questions
Conduct hypothesis tests of your data
Apply linear modeling techniques to answer multivariable questions
Apply machine learning workflows to detect complex patterns in your data
Details to know
8 assignments
There are 11 modules in this course
Developing insights about your organization, business, or research project depends on effective modeling and analysis of the data you collect. Building effective models requires understanding the different types of questions you can ask and how to map those questions to your data. Different modeling approaches can be chosen to detect interesting patterns in the data and identify hidden relationships.
What's included
16 readings, 1 assignment
Inferential analysis is what analysts carry out after they’ve described and explored their dataset. Once they understand the dataset better, analysts often try to infer something from the data, typically by using statistical tests. Earlier we touched on how models can be used for both inference and prediction. What does this mean?
What's included
3 readings, 1 assignment
Linear models are the most commonly used models in data analysis because of their computational efficiency and their ease of interpretation. Having a solid understanding of linear models and how they work is critical for any work in data science. The tidyverse provides a set of tools for making linear modeling more efficient and streamlined.
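As a rough illustration of why linear models are cheap to compute and easy to read, here is a minimal sketch of fitting a straight line by ordinary least squares. It is written in plain Python rather than with the course's tidyverse tools, and the `hours` and `score` values are made-up illustrative data, not taken from the course.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data: hours studied and exam score
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [52.0, 55.0, 61.0, 64.0, 68.0]

intercept, slope = fit_line(hours, score)
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

The slope reads directly as the expected change in the response for a one-unit change in the predictor, which is the ease of interpretation the description refers to.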
What's included
12 readings, 1 assignment
Multiple linear regression is needed when you want to account for confounding factors or include additional predictors of the response in your model. R provides a straightforward way to do this via the formula interface to the lm() function.
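To make the idea of several predictors concrete, the following sketch fits a least squares model with an intercept and two predictors by solving the normal equations. It is a plain Python illustration, not the lm() implementation; the function name `fit_ols` and the data are hypothetical, with the response constructed to equal 1 + 2*x1 + 3*x2 exactly so the recovered coefficients are easy to check.

```python
def fit_ols(rows, y):
    """Least squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    # Prepend a column of ones for the intercept
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Build X'X (p x p) and X'y (length p)
    A = [[sum(xi[j] * xi[k] for xi in X) for k in range(p)] for j in range(p)]
    b = [sum(xi[j] * yi for xi, yi in zip(X, y)) for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(p)]

# Hypothetical predictors (x1, x2) and a response built as 1 + 2*x1 + 3*x2
predictors = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
response = [1.0, 3.0, 4.0, 6.0, 8.0]

coefs = fit_ols(predictors, response)
print([round(c, 6) for c in coefs])
```

Each coefficient is the estimated effect of its predictor with the other predictor held fixed, which is what adjusting for a confounder means in this setting. In R, a formula such as `y ~ x1 + x2` passed to lm() expresses the same model.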
What's included
1 reading, 1 assignment
While we’ve focused on linear regression in this lesson on inference, linear regression isn’t the only analytical approach out there. It is, however, arguably the most commonly used. Beyond that, many statistical tests and approaches are slight variations on linear regression, so a solid understanding of linear regression makes these other tests and approaches much simpler to learn. For example, what if you didn’t want to measure the linear relationship between two variables, but instead wanted to know whether the observed average differs from an expected value?
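The question of whether an observed average differs from an expected value is commonly framed as a one-sample t-test. The sketch below computes only the test statistic, in plain Python; the sample values and the expected mean of 4 are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, expected_mean):
    """t statistic: distance of the sample mean from the expected mean, in standard errors."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(sample) - expected_mean) / standard_error

# Hypothetical measurements and a hypothetical expected value
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t_stat = one_sample_t(sample, expected_mean=4)
degrees_of_freedom = len(sample) - 1

print(f"t = {t_stat:.3f} on {degrees_of_freedom} degrees of freedom")
```

To turn the statistic into a p-value, it would be compared against a t distribution with n - 1 degrees of freedom, which statistical software does for you.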
What's included
3 readings
Hypothesis testing describes a family of statistical techniques for determining whether the data you collect provides evidence for the value of an unknown parameter of interest. The goal of hypothesis tests is to make inferences while accounting for variability in the data that can lead to spurious results.
What's included
3 readings, 1 assignment, 1 plugin
Prediction modeling is an essential activity in data science and involves building systems for making predictions based on previously observed data. These models are typically very flexible and can capture a range of different relationships.
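A minimal sketch of the core prediction workflow, assuming a simple linear model stands in for the more flexible models the module covers: fit on one part of the data, then measure error on observations the model has not seen. The data and the every-third-point holdout rule are hypothetical choices made for illustration; real workflows usually split at random.

```python
from math import sqrt
from statistics import mean

def fit_line(x, y):
    """Least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope

# Hypothetical observations: roughly y = 2x + 1 with small noise
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1.2, 2.9, 5.1, 6.8, 9.2, 10.9, 13.1, 14.8, 17.2, 18.9]

# Hold out every third observation as a test set; train on the rest
test_idx = [i for i in range(len(x)) if i % 3 == 2]
train_idx = [i for i in range(len(x)) if i % 3 != 2]

intercept, slope = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])

# Evaluate only on the held-out observations
errors = [y[i] - (intercept + slope * x[i]) for i in test_idx]
rmse = sqrt(mean(e ** 2 for e in errors))
print(f"held-out RMSE: {rmse:.3f}")
```

Reporting error on held-out data rather than on the training data gives a more honest picture of how the model would perform on new observations.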
What's included
12 readings, 1 assignment
There are incredibly helpful packages available in R thanks to the work of RStudio. There are hundreds of different machine learning algorithms, and the tidymodels R packages have put many of them into a single framework, allowing you to use many different machine learning models easily.
What's included
5 readings, 1 assignment
This case study demonstrates an approach to building a model that predicts outdoor air pollution concentrations in the United States.
What's included
17 readings, 1 ungraded lab
The tidymodels collection of packages can be overwhelming at first glance. Here, we provide a quick summary chart to help navigate all of the packages and when they should be used.
What's included
1 reading
In this project, you will practice building models with the tidyverse for classifying consumer complaints data from the Consumer Financial Protection Bureau (CFPB). This project includes both a Peer Review step, in which you'll upload R Markdown and knitted HTML files, and a Quiz step, in which you'll answer questions about the predictions made by your classification algorithm.
What's included
1 reading, 1 assignment, 1 peer review