Johns Hopkins University
Data Science Specialization

Launch Your Career in Data Science. A ten-course introduction to data science, developed and taught by leading professors.

Instructors: Roger D. Peng, PhD; Brian Caffo, PhD; Jeff Leek, PhD

Sponsored by ARS SCINet/AI-COE

493,400 already enrolled

Get in-depth knowledge of a subject
4.5 (38,736 reviews)

Beginner level

Recommended experience

7 months at 10 hours a week
Flexible schedule
Learn at your own pace

What you'll learn

  • Use R to clean, analyze, and visualize data.

  • Navigate the entire data science pipeline from data acquisition to publication.

  • Use GitHub to manage data science projects.

  • Perform regression analysis, least squares and inference using regression models.

Details to know

Shareable certificate

Add to your LinkedIn profile

Taught in English

Advance your subject-matter expertise

  • Learn in-demand skills from university and industry experts
  • Master a subject or tool with hands-on projects
  • Develop a deep understanding of key concepts
  • Earn a career certificate from Johns Hopkins University

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV

Share it on social media and in your performance review

Specialization - 10 course series

The Data Scientist’s Toolbox

Course 1 • 17 hours • 4.6 (33,933 ratings)

What you'll learn

  • Set up R, RStudio, GitHub, and other useful tools

  • Understand the data, problems, and tools that data analysts use

  • Explain essential study design concepts

  • Create a GitHub repository

Skills you'll gain

  • R Programming
  • Data Analysis
  • Version Control
  • Data Science
  • Git (Version Control System)
  • Configuration Management
  • Software Development Tools
  • Statistical Programming
  • GitHub
  • Rmarkdown
  • Software Development Life Cycle
  • Scientific Methods
  • Software Versioning
  • Software Development
  • Research Methodologies
  • General Science and Research
  • Application Lifecycle Management
  • Statistical Machine Learning
  • Research
  • Software Configuration Management

R Programming

Course 2 • 57 hours • 4.5 (22,245 ratings)

What you'll learn

  • Understand critical programming language concepts

  • Configure statistical programming software

  • Make use of R loop functions and debugging tools

  • Collect detailed information using the R profiler

Skills you'll gain

  • R Programming
  • Statistical Programming
  • Simulation and Simulation Software
  • Engineering Software
  • Software Engineering Tools
  • Software Development
  • Artificial Intelligence and Machine Learning (AI/ML)
  • Applied Machine Learning
  • Debugging
  • Computer Programming
  • Performance Tuning
  • Machine Learning
  • Software Development Tools
  • Computer Programming Tools
  • Statistical Machine Learning
  • Computer Science
  • Programming Principles
  • Application Performance Management
  • Software Engineering
  • Simulations
  • Machine Learning Methods

Getting and Cleaning Data

Course 3 • 19 hours • 4.5 (8,062 ratings)

What you'll learn

  • Understand common data storage systems

  • Apply data cleaning basics to make data "tidy"

  • Use R for text and date manipulation

  • Obtain usable data from the web, APIs, and databases

Skills you'll gain

  • R Programming
  • Tidyverse (R Package)
  • Data Integration
  • Data Quality
  • Database Architecture and Administration
  • Data Analysis
  • Data Processing
  • Data Wrangling
  • Statistical Machine Learning
  • Data Management
  • Data Access
  • Information Systems
  • Data Mapping
  • Data Transformation
  • Data Science
  • Data Governance
  • Data Manipulation
  • Database Systems
  • Data Strategy
  • Extract, Transform, Load
  • Statistical Programming
  • Database Management
  • Data Engineering
  • Database Administration
  • Data Storage
  • Databases
  • Database Management Systems
  • Data Import/Export
  • Information Management
  • Data Architecture
  • Big Data

Exploratory Data Analysis

Course 4 • 54 hours • 4.7 (6,068 ratings)

What you'll learn

  • Understand analytic graphics and the base plotting system in R

  • Use advanced graphing systems such as the Lattice system

  • Make graphical displays of very high dimensional data

  • Apply cluster analysis techniques to locate patterns in data

Skills you'll gain

  • Data Analysis
  • Statistical Analysis
  • Statistical Visualization
  • Exploratory Data Analysis
  • Data Science
  • Data Presentation
  • Data Visualization Software
  • Ggplot2
  • Plot (Graphics)
  • Data Visualization
  • Statistical Programming
  • Data Storytelling
  • Analytics
  • Probability & Statistics
  • Statistical Machine Learning
  • Interactive Data Visualization
  • R Programming
  • Statistics
  • Statistical Methods
  • Dimensionality Reduction
  • Dashboard

Reproducible Research

Course 5 • 7 hours • 4.6 (4,173 ratings)

What you'll learn

  • Organize data analysis to help make it more reproducible

  • Write up a reproducible data analysis using knitr

  • Determine the reproducibility of an analysis project

  • Publish reproducible web documents using Markdown

Skills you'll gain

  • Rmarkdown
  • Knitr
  • R Programming
  • Data Analysis
  • Information Management
  • Data Management
  • General Science and Research
  • Research
  • Data Science
  • Research Methodologies
  • Scientific Methods
  • Data Sharing
  • Data Governance
  • Statistical Programming

Statistical Inference

Course 6 • 54 hours • 4.2 (4,434 ratings)

What you'll learn

  • Understand the process of drawing conclusions about populations or scientific truths from data

  • Describe variability, distributions, limits, and confidence intervals

  • Use p-values, confidence intervals, and permutation tests

  • Make informed data analysis decisions

Skills you'll gain

  • Statistical Analysis
  • Probability & Statistics
  • Statistical Methods
  • Statistics
  • Statistical Inference
  • Data Analysis
  • Probability
  • Applied Mathematics
  • Mathematics and Mathematical Modeling
  • Statistical Hypothesis Testing
  • Data Science
  • Mathematical Modeling
  • Statistical Modeling
  • Probability Distribution
  • Advanced Mathematics
  • Analytics

Regression Models

Course 7 • 53 hours • 4.4 (3,358 ratings)

What you'll learn

  • Use regression analysis, least squares and inference

  • Understand ANOVA and ANCOVA model cases

  • Investigate analysis of residuals and variability

  • Describe novel uses of regression models such as scatterplot smoothing

Skills you'll gain

  • Probability & Statistics
  • Statistical Analysis
  • Statistics
  • Statistical Modeling
  • Mathematical Modeling
  • Statistical Methods
  • Data Analysis
  • Data Science
  • Analytics
  • Regression Analysis
  • Probability
  • Mathematics and Mathematical Modeling
  • Applied Mathematics
  • Business Analytics

Practical Machine Learning

Course 8 • 8 hours • 4.5 (3,246 ratings)

What you'll learn

  • Use the basic components of building and applying prediction functions

  • Understand concepts such as training and test sets, overfitting, and error rates

  • Describe machine learning methods such as regression or classification trees

  • Explain the complete process of building prediction functions

Skills you'll gain

  • Data Science
  • Predictive Modeling
  • Predictive Analytics
  • Statistical Modeling
  • Machine Learning
  • Applied Machine Learning
  • Artificial Intelligence and Machine Learning (AI/ML)
  • Statistical Programming
  • Statistics
  • Mathematical Modeling
  • Probability & Statistics
  • Statistical Analysis
  • R Programming
  • Feature Engineering
  • Machine Learning Algorithms
  • Artificial Intelligence
  • Classification And Regression Tree (CART)
  • Random Forest Algorithm
  • Advanced Analytics
  • Machine Learning Methods
  • Statistical Machine Learning
  • Business Analytics
  • Data Analysis
  • Computer Science
  • Supervised Learning

Developing Data Products

Course 9 • 10 hours • 4.6 (2,255 ratings)

What you'll learn

  • Develop basic applications and interactive graphics using googleVis

  • Use Leaflet to create interactive annotated maps

  • Build an R Markdown presentation that includes a data visualization

  • Create a data product that tells a story to a mass audience

Skills you'll gain

  • R Programming
  • Rmarkdown
  • Shiny (R Package)
  • Statistical Visualization
  • Leaflet (Software)
  • Data Visualization
  • Data Presentation
  • Data Storytelling
  • Data Visualization Software
  • Plotly
  • Interactive Data Visualization
  • Statistical Programming

Data Science Capstone

Course 10 • 5 hours • 4.5 (1,226 ratings)

What you'll learn

  • Create a useful data product for the public

  • Apply your exploratory data analysis skills

  • Build an efficient and accurate prediction model

  • Produce a presentation deck to showcase your findings

Skills you'll gain

  • Statistical Modeling
  • Data Science
  • Predictive Analytics
  • Data Analysis
  • Artificial Intelligence and Machine Learning (AI/ML)
  • Predictive Modeling
  • Mathematical Modeling
  • Product Management
  • Applied Machine Learning
  • Business Analytics
  • Exploratory Data Analysis
  • Statistics
  • Machine Learning
  • Analytics
  • Advanced Analytics
  • Product Development
  • Statistical Analysis
  • Probability & Statistics
  • Statistical Methods

Instructors

Roger D. Peng, PhD
Johns Hopkins University
37 Courses • 1,608,001 learners
Brian Caffo, PhD
Johns Hopkins University
30 Courses • 1,633,988 learners
Jeff Leek, PhD
Johns Hopkins University
32 Courses • 1,662,514 learners
