When you enroll in this course, you'll also be enrolled in this Professional Certificate.
Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate from IBM
There are 6 modules in this course
In an increasingly data-centric world, the ability to derive meaningful insights from raw data is essential. The IBM Data Analyst Capstone Project gives you the opportunity to apply the skills and techniques learned throughout the IBM Data Analyst Professional Certificate. Working with actual datasets, you will carry out tasks commonly performed by professional data analysts, such as data collection from multiple sources, data wrangling, exploratory analysis, statistical analysis, data visualization, and creating interactive dashboards. Your final deliverable will include a comprehensive data analysis report, complete with an executive summary, detailed insights, and a conclusion for organizational stakeholders.
Throughout the project, you will demonstrate your proficiency in tools such as Jupyter Notebooks, SQL, relational databases (RDBMS), and Business Intelligence (BI) tools like IBM Cognos Analytics. You will also apply Python libraries, including Pandas, NumPy, Scikit-learn, SciPy, Matplotlib, and Seaborn.
We recommend completing the previous courses in the Professional Certificate before starting this capstone project, as it integrates all key concepts and techniques into a single, real-world scenario.
In this module, you’ll apply key data collection and analysis techniques using APIs and web scraping. You’ll start by exploring HTTP requests and using APIs to retrieve and paginate job postings across different technologies. Then, you’ll work with a JSON endpoint to collect job data through API requests. Next, you’ll use web scraping techniques to download webpages, extract links and images, and gather data from HTML tables into a CSV file. By the end of this module, you’ll have hands-on experience with real-world data collection methods. You’ll also complete a graded quiz to check your understanding.
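As a rough illustration of the pagination step described above, the sketch below walks an API page by page until an empty page comes back. It is a minimal sketch under stated assumptions, not the course's lab code: `iter_jobs`, `count_by_technology`, `fake_fetch`, and the record fields are hypothetical names, and the HTTP call is replaced by an in-memory stand-in so the example runs offline.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical fetcher type: takes a 1-based page number and returns the
# job records on that page. An empty list is assumed to mean "no more pages".
FetchPage = Callable[[int], List[Dict[str, str]]]


def iter_jobs(fetch_page: FetchPage, max_pages: int = 100) -> Iterator[Dict[str, str]]:
    """Yield records page by page until an empty page or max_pages is reached."""
    for page in range(1, max_pages + 1):
        records = fetch_page(page)
        if not records:
            break
        yield from records


def count_by_technology(jobs: Iterable[Dict[str, str]], field: str = "technology") -> Dict[str, int]:
    """Count postings per value of `field` (the field name is an assumption)."""
    counts: Dict[str, int] = {}
    for job in jobs:
        key = job.get(field, "unknown")
        counts[key] = counts.get(key, 0) + 1
    return counts


# Made-up sample data standing in for a real JSON endpoint.
_FAKE_PAGES = {
    1: [{"title": "Data Analyst", "technology": "Python"},
        {"title": "BI Developer", "technology": "SQL"}],
    2: [{"title": "ML Engineer", "technology": "Python"}],
}


def fake_fetch(page: int) -> List[Dict[str, str]]:
    """Offline stand-in for an HTTP GET that returns one page of results."""
    return _FAKE_PAGES.get(page, [])


job_counts = count_by_technology(iter_jobs(fake_fetch))
print(job_counts)
```

Passing the fetcher in as a function keeps the paging loop independent of any particular HTTP library, so a real request function could be swapped in without changing the loop.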
What's included
2 videos • 4 readings • 4 assignments • 5 app items
2 videos•Total 7 minutes
Course Introduction•2 minutes
Project Overview•5 minutes
4 readings•Total 40 minutes
Prerequisites and Course Syllabus•5 minutes
Emerging Trends in Data Analytics•10 minutes
Project Scenario•10 minutes
About the Dataset•15 minutes
4 assignments•Total 62 minutes
Graded Quiz: Data Collection•30 minutes
Checklist: Collecting Data Using APIs•10 minutes
Checklist: Collecting Data Using Webscraping•8 minutes
Checklist: Exploring Data•14 minutes
5 app items•Total 180 minutes
(Optional) Lab 1: Review Of Accessing APIs•30 minutes
Lab 2: Collecting Data Using APIs•30 minutes
Lab 3: Review Of Web Scraping•30 minutes
Lab 4: Collecting Data Using Web Scraping•60 minutes
Lab 5: Exploring the Dataset•30 minutes
Data Wrangling
Module 2•6 hours to complete
Module details
In this module, you will apply the data-wrangling techniques needed to clean and prepare datasets for analysis. Through hands-on activities, you will identify and handle common data issues, including duplicate entries and missing values. You will remove duplicate records, apply suitable imputation strategies for missing data, and normalize datasets to ensure consistency and accuracy. You will also complete a graded quiz to assess your understanding and reinforce the concepts covered.
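The three wrangling steps named above (removing duplicates, imputing missing values, normalizing) can be sketched in a few lines. This is a minimal sketch using only the Python standard library in place of Pandas; the rows are made up, and the choice of median imputation and min-max scaling is an assumption for illustration, since the page does not say which strategies the labs use.

```python
from statistics import median

# Made-up survey-style rows; None marks a missing compensation value.
rows = [
    {"respondent": 1, "country": "India", "comp": 50000.0},
    {"respondent": 2, "country": "Germany", "comp": None},
    {"respondent": 2, "country": "Germany", "comp": None},  # exact duplicate
    {"respondent": 3, "country": "Brazil", "comp": 90000.0},
    {"respondent": 4, "country": "Kenya", "comp": 70000.0},
]


def drop_duplicates(records):
    """Keep the first occurrence of each fully identical record."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def impute_median(records, field):
    """Replace missing values in `field` with the median of the observed ones."""
    fill = median(r[field] for r in records if r[field] is not None)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]


def min_max_normalize(records, field, out_field):
    """Add `out_field`, rescaling `field` to the range 0..1."""
    values = [r[field] for r in records]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values are equal
    return [{**r, out_field: (r[field] - lo) / span} for r in records]


clean = min_max_normalize(
    impute_median(drop_duplicates(rows), "comp"), "comp", "comp_norm"
)
for row in clean:
    print(row)
```

The order matters here: duplicates are dropped first so that repeated rows do not distort the median used for imputation.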
What's included
1 reading • 7 assignments • 6 app items
1 reading•Total 5 minutes
Assignment Overview•5 minutes
7 assignments•Total 104 minutes
Graded Quiz: Data Wrangling•30 minutes
Checklist: Finding Duplicates•14 minutes
Checklist: Removing Duplicates•10 minutes
Checklist: Finding Missing Values•16 minutes
Checklist: Imputing Missing Values•12 minutes
Checklist: Normalizing Data•10 minutes
Checklist: Data Wrangling•12 minutes
6 app items•Total 240 minutes
Lab 6: Finding Duplicates•30 minutes
Lab 7: Removing Duplicates•30 minutes
Lab 8: Finding Missing Values•30 minutes
Lab 9: Impute Missing Values•60 minutes
Lab 10: Normalizing Data•60 minutes
Lab 11: Data Wrangling•30 minutes
Exploratory Data Analysis
Module 3•4 hours to complete
Module details
In this module, you will engage in essential exploratory data analysis (EDA) techniques to uncover meaningful insights from your data set. You will start by identifying the distribution of the data through plotting distribution curves and histograms, which are crucial for understanding how values are spread across different features. Next, you will detect outliers that may skew your analysis and learn how to effectively remove them to ensure data integrity. Additionally, you will explore correlations between various features in the data set, revealing relationships that can inform your overall analysis. Finally, you will create a new DataFrame to organize and present your findings. The module includes a graded quiz to test your knowledge.
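To make the outlier and correlation steps above concrete, here is a minimal sketch using only the Python standard library. The 1.5 × IQR rule and the Pearson coefficient are common choices swapped in for illustration; the page does not state which methods the labs use, and the function names and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(values, k=1.5):
    """Split values into (kept, outliers) using the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    kept = [v for v in values if lower <= v <= upper]
    outliers = [v for v in values if v < lower or v > upper]
    return kept, outliers


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up compensation values (in thousands) with one extreme entry.
comp = [40, 45, 50, 55, 60, 65, 70, 500]
kept, outliers = split_outliers(comp)
print("outliers:", outliers)

# Made-up, perfectly linear pair of features.
r = pearson([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print("correlation:", r)
```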
What's included
1 reading • 5 assignments • 4 app items
1 reading•Total 2 minutes
Assignment Overview•2 minutes
5 assignments•Total 92 minutes
Graded Quiz: Exploratory Data Analysis•30 minutes
Checklist: Exploratory Data Analysis•22 minutes
Checklist: Analyzing the Data Distribution•14 minutes
Checklist: Handling Outliers•14 minutes
Checklist: Correlation•12 minutes
4 app items•Total 120 minutes
Lab 12: Exploratory Data Analysis•30 minutes
Lab 13: Finding How The Data Is Distributed•30 minutes
Lab 14: Finding Outliers•30 minutes
Lab 15: Finding Correlation•30 minutes
Data Visualization
Module 4•7 hours to complete
Module details
In this module, you will apply data visualization techniques to extract meaningful insights from the Stack Overflow survey data set. You will start by visualizing the distribution of data using histograms and box plots to understand the spread of compensation and age. Next, you will explore relationships between features through scatter plots and bubble plots, followed by examining the composition of data with pie charts and stacked charts. Additionally, you will compare data across categories using line and bar charts. The module includes a graded quiz that assesses your knowledge of these concepts and prepares you for further analysis in your final project.
What's included
1 reading • 6 assignments • 9 app items
1 reading•Total 2 minutes
Assignment Overview•2 minutes
6 assignments•Total 78 minutes
Graded Quiz: Data Visualization•30 minutes
Checklist: Data Visualization•16 minutes
Checklist: Visualizing Distribution of Data•8 minutes
Checklist: Visualizing Relationship•8 minutes
Checklist: Visualizing Composition of Data•8 minutes
Checklist: Visualizing Comparison of Data•8 minutes
9 app items•Total 330 minutes
Lab 16: Data Visualization•60 minutes
Lab 17: Histograms•30 minutes
Lab 18: Box Plots•30 minutes
Lab 19: Scatter Plot•30 minutes
Lab 20: Bubble Plots•30 minutes
Lab 21: Pie Charts•30 minutes
Lab 22: Stacked Charts•30 minutes
Lab 23: Line Charts•60 minutes
Lab 24: Bar Charts•30 minutes
Building A Dashboard
Module 5•2 hours to complete
Module details
In this module, you will create dashboards from Stack Overflow survey data using either IBM Cognos Analytics or Google Looker Studio. The assignment is divided into Part A: Building a Dashboard with IBM Cognos Analytics and Part B: Building a Dashboard with Google Looker Studio. You will design a dashboard with sections on Current Technology Usage, Future Technology Trends, and Demographics. After completing the assignment, you will submit a link to the Cognos or Looker Studio dashboard you built. The module also includes a checklist to help you confirm that you have completed all necessary tasks before moving on.
What's included
1 reading • 2 assignments • 2 plugins
1 reading•Total 10 minutes
Assignment Overview•10 minutes
2 assignments•Total 40 minutes
Graded Quiz: Building a Dashboard•30 minutes
Checklist: Dashboards•10 minutes
2 plugins•Total 60 minutes
Lab 25: Option A - Building A Dashboard With IBM Cognos Analytics•45 minutes
Lab 26: Option B - Building A Dashboard With Google Looker Studio•15 minutes
Final Assignment: Present Your Findings
Module 6•3 hours to complete
Module details
In the final module, you will focus on presenting your data findings effectively. You will begin by exploring key elements contributing to a successful data findings report, including structuring your report, using best practices for data visualization, and presenting complex information in an engaging, accessible format. The module also includes labs covering basics in PowerPoint, foundational presentation techniques, and saving your presentation as a PDF to ensure a polished, professional final product. Finally, you will complete and submit a final presentation highlighting insights derived from the Stack Overflow Developer Survey data for evaluation through AI Grading or Peer Review.
At IBM, we know how rapidly tech evolves and recognize the crucial need for businesses and professionals to build job-ready, hands-on skills quickly. As a market-leading tech innovator, we’re committed to helping you thrive in this dynamic landscape. Through IBM Skills Network, our expertly designed training programs in AI, software development, cybersecurity, data science, business management, and more provide the essential skills you need to secure your first job, advance your career, or drive business success. Whether you’re upskilling yourself or your team, our courses, Specializations, and Professional Certificates build the technical expertise that helps you and your organization excel in a competitive world.
"To be able to take courses at my own pace and rhythm has been an amazing experience. I can learn whenever it fits my schedule and mood."
Jennifer J.
Learner since 2020
"I directly applied the concepts and skills I learned from my courses to an exciting new project at work."
Larry W.
Learner since 2021
"When I need courses on topics that my university doesn't offer, Coursera is one of the best places to go."
Chaitanya A.
"Learning isn't just about being better at your job: it's so much more than that. Coursera allows me to learn without limits."
Learner reviews
4.6 • 1,344 reviews
5 stars: 77.69%
4 stars: 14.94%
3 stars: 4.08%
2 stars: 1.18%
1 star: 2.08%
BK • 5 stars • Reviewed on Feb 14, 2025
Everything went quite well but instructions provided didn't help me at all. Had to take help from outside sources to complete this.
HD • 5 stars • Reviewed on Dec 26, 2020
This course was a very good opportunity to practice almost all the materials studied in this specialization. Taking the role of an associative data analyst was very helping
MK • 5 stars • Reviewed on Jul 17, 2022
A good beginner friendly course in data analysis. Using the jupyter notebook was easier than going over to some websites to open the same jupyter notebook.
Data analysis is the process of inspecting, cleaning, transforming, and modeling data to uncover useful information, make informed decisions, and support business strategies. It involves techniques such as statistical analysis, visualization, and reporting to identify trends, patterns, and insights from datasets.
Can I enroll in the professional certificate program during or after I take the course?
When you subscribe to a course that is part of a Certificate, you’re automatically subscribed to the full Certificate. Visit your learner dashboard to track your progress.
Is this course really 100% online? Do I need to attend any classes in person?
This course is completely online, so there’s no need to show up to a classroom in person. You can access your lectures, readings, and assignments anytime and anywhere through the web or your mobile device.
What credentials will I earn by completing this course?
Each course you complete earns you an IBM course badge that certifies successful completion. Add this credential to your LinkedIn profile, resume, or CV, and share it on social media and in performance reviews.
What real-world dataset and tools will I use to build my analytics dashboard?
This capstone utilizes the extensive Stack Overflow Developer Survey dataset to simulate a large-scale corporate analysis. You will gain hands-on experience building dynamic, interactive business intelligence dashboards using industry-standard BI platforms: IBM Cognos Analytics or Google Looker Studio. Your dashboards will be meticulously structured into professional reporting sections tracking Current Technology Usage, Future Technology Trends, and global Developer Demographics, providing a highly visible asset to share with employers.
How does this project test my end-to-end data engineering and Python skills?
Rather than giving you clean, pre-packaged data, this capstone requires you to manage the entire data lifecycle. You will write Python code to execute programmatic data collection via REST APIs (including pagination handling) and web scrape HTML tables using libraries like Pandas and NumPy. From there, you will perform essential data wrangling—such as duplicate removal, outlier detection, and data imputation—to prepare data frames for deep statistical analysis using SciPy and Scikit-learn.
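For a sense of what scraping an HTML table involves, the sketch below extracts table rows and writes them out as CSV. It is a minimal sketch that swaps in Python's standard-library `html.parser` and `csv` modules rather than the libraries named above; `TableParser`, `table_to_csv`, and the sample markup are hypothetical, and the figures in the sample are made up.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html: str) -> str:
    """Parse the table rows found in `html` and return them as CSV text."""
    parser = TableParser()
    parser.feed(html)
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Made-up sample markup; the figures are illustrative, not real data.
SAMPLE_HTML = """
<table>
  <tr><th>Language</th><th>Average Salary</th></tr>
  <tr><td>Python</td><td>100000</td></tr>
  <tr><td>SQL</td><td>90000</td></tr>
</table>
"""

csv_text = table_to_csv(SAMPLE_HTML)
print(csv_text)
```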
What final deliverables will I produce to showcase to stakeholders?
The ultimate goal of a data analyst is data storytelling. For your final submission, you will compile your Exploratory Data Analysis (EDA) and visualizations into a comprehensive, boardroom-ready executive report and slide presentation. You will apply presentation best practices to translate complex technical correlations, histograms, and scatter plots into an accessible format for organizational leaders. Your work will undergo rigorous evaluation via peer review or AI grading to certify your readiness for a professional analytics role.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade, but it also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Certificate?
When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.