This course is an introduction to data science and statistical thinking. Learners will gain experience exploring, visualizing, and analyzing data to understand natural phenomena, investigate patterns, and model outcomes, all in a reproducible and shareable manner. Topics covered include data visualization and transformation for exploratory data analysis. Learners will be introduced to problems and case studies based on real-world questions and data through lecture and live-coding videos as well as interactive programming exercises. The course uses the R statistical computing language, with an emphasis on packages from the Tidyverse, along with the RStudio integrated development environment, Quarto for reproducible reporting, and Git and GitHub for version control. The skills learners gain in this course will prepare them for a variety of careers, including data scientist, data analyst, quantitative analyst, and statistician.
Data Visualization and Transformation with R
Instructor: Mine Çetinkaya-Rundel
Sponsored by Duke University
What you'll learn
Transform, visualize, summarize, and analyze data in R, with packages from the Tidyverse, using RStudio
Carry out analyses in a reproducible and shareable manner with Quarto
Communicate results effectively through an optional written project that is version-controlled with Git and hosted on GitHub
Details to know
3 assignments
There are 3 modules in this course
Hello World! In the first module, you will learn what data science is and how data science techniques are used to make meaning from data and inform data-driven decisions. The module also discusses the importance of reproducibility in science and the techniques used to achieve it. Next, you will be introduced to the tools used in the course (R, RStudio, Quarto, and GitHub) and their roles in data science and reproducibility.
What's included
4 videos, 10 readings, 1 assignment, 2 discussion prompts, 1 plugin
In our second module, we'll advance our understanding of R to set the stage for creating data visualizations with the Tidyverse's data visualization package, ggplot2. We'll learn about different data types and the visualization techniques appropriate for plotting each of them. Most of this module is devoted to understanding ggplot2 syntax and how it relates to the Grammar of Graphics. By the end of this module, you will have started building the foundation of the statistical toolkit needed to create basic data visualizations in R.
What's included
4 videos, 5 readings, 1 assignment, 1 discussion prompt, 1 plugin
In this module, we will take a step back and learn about tools for transforming data that might not yet be ready for visualization, as well as tools for summarizing data, using the Tidyverse's data wrangling package, dplyr. In addition to describing distributions of single variables, you will learn to explore relationships between two or more variables. Finally, you will continue to hone your data visualization skills with plots for various data types.
What's included
8 videos, 14 readings, 1 assignment, 2 discussion prompts, 1 plugin