
Learner Reviews & Feedback for Introduction to Data Science in Python by University of Michigan

26,999 ratings

About the Course

This course will introduce the learner to the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library. The course will introduce data manipulation and cleaning techniques using the popular Python pandas data science library and introduce the abstraction of the Series and DataFrame as the central data structures for data analysis, along with tutorials on how to use functions such as groupby, merge, and pivot tables effectively. By the end of this course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses. This course should be taken before any of the other Applied Data Science with Python courses: Applied Plotting, Charting & Data Representation in Python, Applied Machine Learning in Python, Applied Text Mining in Python, Applied Social Network Analysis in Python....

Top reviews



Overall a good introductory course on Python for data science, but I feel it should have covered the basics in more detail, especially for those who do not have any prior programming background.



Super hard course. I enjoyed it, but the thing I didn't like, especially since this is supposed to be the learning phase, is that the assignments were hard and sometimes push you to look for the solution.


3601 - 3625 of 5,942 Reviews for Introduction to Data Science in Python

By Shrikant S T

Aug 20, 2020


By Vaibhav D K

Aug 8, 2020



Jul 13, 2020


By Prathmesh M

Jul 9, 2020


By 장동희

Jun 10, 2020

By Priya G K

Jun 1, 2020


By Junaid L S

May 14, 2019


By merve s

Mar 11, 2019


By Srinivasarao M

Feb 4, 2019


By David A D V

Jan 22, 2019


By Anyi L

Aug 20, 2018


By Khushpreet S

Apr 23, 2018


By Thomas

Dec 29, 2017


By James R

Feb 5, 2017


By Howard C

Nov 6, 2017

Very good content. One downside for me was that being new to Python, Pandas, Numpy, Scipy etc, I found the amount of new information being thrown at me to be a bit overwhelming. Each of these languages/packages could be a separate course even before you start talking about Data Analysis concepts. I was able to complete all the assignments, but I feel like I know "just enough to be dangerous".

Speaking of the assignments, if you're a newbie like me, give yourself plenty of time to work on them. My rule of thumb was to multiply the "estimated time" for each assignment by a factor of 4. The assignment that was supposed to take 2 hours ended up taking my whole Saturday, and the 4 hour project at the end of the course pretty much consumed an entire weekend. This might not apply if you have previous experience in this development environment or are just smarter than me ;-)

Not everything that you need to know to do the homework is provided in the lecture, so expect to spend a lot of time in StackOverflow. The discussion forums are also very useful. Sometimes a teaching assistant will offer some hints that make all the difference.

One gripe I have is with the automated grader. It's a great idea, but sometimes you can submit a fairly complicated bit of code and the only feedback you get from the grader is: "Wrong!". My suggestion: have two data sets, one for testing and another for grading. Then students could openly discuss and debug their test results in the discussion forums without violating the Honor Code. They would still have to submit a valid algorithm to pass against the test data.

By Dionyssios M

Nov 19, 2017

I am a PhD scientist and heavy user of matlab, R, Stata, bash scripting, and some more esoteric computer languages. I took this course with the idea of covering some background in python skills in a structured manner, the goal being to move many of my data science and some of my data processing code to python.

I found the exercises useful. The lectures are not bad; I just felt they were an overview that either didn't connect much with some of the minutiae of the assignments or were not always key to me given my background. E.g., I found the week 2 videos more interesting and the week 4 videos far less so, especially the video about running a t-test in Python (my statistical skillset is far more advanced).

The real point of frustration is the grader which is extremely sensitive to slight variations. I feel there should be a feedback system where users/students document such cases that could then become a FAQ. Examples:

Grader chokes on type but won't tell me: Submitting string 'True' instead of Boolean True.

Grader chokes on useless (non)significant digits: using round(*,2) at one point crashes the submitted work.

These "errors" are so slight that they are almost beyond the human ability to catch. The result is that, in part, the course turns from 'learning python skills' to 'getting to understand minutiae of what the grader does', which can be really frustrating.
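To illustrate why the two cases above can fail, here is a minimal sketch of a strict answer check. The function `strict_check` is hypothetical and is not the course's actual grader; it only assumes that a grader compares both type and value.

```python
import math


def strict_check(submitted, expected, rel_tol=1e-9):
    """Hypothetical strict comparison: type must match, then value."""
    # A string 'True' is a different type from the Boolean True.
    if type(submitted) is not type(expected):
        return False
    # Floats are compared with a tight tolerance, so rounding to
    # two decimals changes the answer enough to fail.
    if isinstance(expected, float):
        return math.isclose(submitted, expected, rel_tol=rel_tol)
    return submitted == expected


string_vs_bool = strict_check("True", True)                # type mismatch
bool_vs_bool = strict_check(True, True)                    # exact match
rounded_vs_full = strict_check(round(0.123456, 2), 0.123456)  # value drift
```

Running a check like this locally before submitting can surface a type or rounding mismatch that a terse grader message would not explain.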

In sum, I believe there is value in this course but the grader is fairly broken and needs a FAQ or similar to warn re choke points generated from trivial differences. I am subtracting stars in the review for that particular reason.

By Declan C

Sep 18, 2017

Overall I would certainly recommend this course; I've found it immediately relevant in my field of science/engineering. It is fast-paced and difficult, yes, especially for those of us with limited python experience, but the kick it gives leaves you with some solid, immediately-applicable skills.

Where it goes well: Strong content and excellent delivery.

Fosters independence. The course starts from the basics but accelerates at a fast pace, introducing you to roots of concepts but expecting you to expand on them yourself with outside resources rather than rote-learning. In this manner it leaves you VERY prepared to tackle unscripted challenges.

Concise content. The lecture videos themselves contain almost zero fluff. The lecturer conveys relevant information in a very smooth and efficient manner. Replaying specific parts to revise or solidify understanding becomes a pleasure due to this.

Where it could be improved: Could be more polished.

Time estimates for the assignments were WAY off. I do not mind a challenging assignment; however, if it advertises that it will take 3 hours, I would not expect to spend closer to 20 hours, yet this certainly was the case. Multiply the estimated time by 4-5 to get a more realistic time.

The assignment wording can sometimes be a little ambiguous. It's almost mandatory to go through the forum posts for clarification. I realise some things were noticed after the publication of the course and handled by pinned posts in the forum, but perhaps the next installment of the course could be updated to iron out some of these wrinkles.

By Nguyen P A

Feb 8, 2020

Teaching an online course is difficult since the audience comes from various backgrounds. The lecturer in this course is very careful in choosing the materials to include so that the course covers the needs of a large audience. The title is quite misleading though, because it is neither an introductory course nor a data science course. The true title should be: "Getting pro at data handling in Python"

Although the course has been pretty well designed and is truly at a level much better than many online courses, I think several improvements should be made to avoid learners' frustration:

1. Lecturer speaks too fast. I feel like he was busy reading from his script without truly caring whether people were really following or not.

2. Assignments are poorly written: although the concepts introduced in the assignments are relevant to real Data Science problems, the instructions are unclear and get worse from week 1 to week 4. The most unorganized instructions are in the week 4 assignment, which seems to have been prepared by several different TAs and then hurriedly combined together.

3. The pre-requisites should clearly state that learners should have some real programming experience to be able to debug their own code. Completing one or two courses in Python previously will not help in performing well in this course.

4. The instructor should also state who should not take this course; it would be a waste of time for someone with many years of programming experience.

By Mark P

Jul 1, 2017

On balance, I'd recommend the course, but use it as only one of many sources you'll use to learn data science. My main complaints: 1. the course concentrates on mechanics of Python rather than on understanding of Data Science as a discipline, 2. It's rushed.


Good information on how to use the pandas library to manipulate data.

Information on "advanced" Python functionality.

Challenging assignments. Give yourself time (a weekend or several evenings for each assignment) and you'll learn a lot.


Very little information is given in the way of conceptual understanding of data science as a discipline. Focuses on the "how" but not the "why." Not much context given for the examples, so they seem a bit contrived. (For the opposite extreme, see the Johns Hopkins Data Science specialization.)

Lectures are rushed. Looks like the prof is reading cue cards straight down without pausing for the students to digest anything. (The one or two lectures given by the TA just fly by with nary a pause for breath.) If you're trying to follow along, keep your finger on the pause button.

The Jupyter notebooks are useful, but contain minimal commenting (barely more than a few headers), so if you plan to refer to the notebooks later (for reference), you'll need to put comments in yourself.

Given code is sometimes sloppy (e.g., standard Python naming conventions are not always followed.)

The automated grader is very picky. Read through the discussion board!

By Tony K

May 14, 2020

I really did like the course overall. The content and its concept are excellent. Make sure you already have Python fundamentals or you will work a little harder (I already had Python experience under my belt). The final project was a good first hands-on approach to doing something most employers would consider to be a real project. Since an autograder is used, the data sets were already provided, but one could imagine where one would have to acquire the data themselves and then follow similar procedures to accomplish a statistical task. My only complaints are: 1 (Minor): the attitude of some of the instructors in the Discussion Forums is off-putting. I don't know if this is a language barrier problem or if they truly just lack "soft skills" in working with people, but you have to take them with a grain of salt. Not all of them, but some in particular. 2 (Major): the autograder is years behind. The APIs (mostly Pandas) that you will use in real life have features that will work for your assignments, but then fail in the Autograder. I spent 1.5 to 2x as long on this month of the specialization due to how archaic the autograder is. I will definitely be warning my employer about this if other people in my organization want to take the course. Figuring out these issues exist requires going to the Discussion Forums (the warnings are in the assignments), and there you will encounter problem #1 I pointed out.

By Eric N

Jan 8, 2020

Most of the course is great. The lecture content is interesting, well explained and backed up by examples. The notebooks for following along with the videos are very good too. The assignment problems are also interesting, applicable and at a reasonable level of difficulty. However, there is one significant problem which, while not big enough to stop me recommending the course, is very frustrating. The assignment auto-grader has so many issues and peculiarities that on the third and fourth assignments I spent more time getting the autograder to accept my answers, which were already correct, than I spent actually obtaining the answers in the first place. Just one example of such a problem is that the autograder works with an older version of pandas, so there is some syntax which will work perfectly when run in the jupyter notebook but which will fail when the autograder tries to run it. I know it is not practical to have all assignments reviewed by a real person, but something needs to be done about this when it can take more time to get the assignment submitted than to actually complete it. It would also be good to have access to optimised assignment solutions after the assignment has been passed, because it is certainly possible to pass the assignments without having a very efficient or 'pandorable' solution.
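One way to spot the version mismatch described above is to compare the locally installed pandas version (available as `pd.__version__`) with whatever version the grader runs. The sketch below uses only the standard library; the helper names and the value of `ASSUMED_GRADER_VERSION` are hypothetical placeholders, since the review does not say which version the grader uses.

```python
import re

# Hypothetical placeholder: the real grader version is not stated in the review.
ASSUMED_GRADER_VERSION = "0.19.2"


def version_tuple(version):
    """Turn a string like '1.0.3' into (1, 0, 3) using the first three numbers."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def is_newer(local_version, grader_version=ASSUMED_GRADER_VERSION):
    """Return True if the local library is newer than the assumed grader version."""
    return version_tuple(local_version) > version_tuple(grader_version)


# In a notebook, local_version would typically come from pd.__version__.
local_is_newer = is_newer("1.0.3")
```

If the local version turns out to be newer, it may be worth checking the pandas release notes before relying on a recently added method or parameter.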

By Robert C

Feb 20, 2021

I felt the course material was helpful to improve my skills in data preparation within Pandas. I believe the course is a bit advanced to be considered an "Introduction" but perhaps it only gets much harder from here :). The programming assignments are estimated at 3 hr each week, but I found myself putting in much more time than this, at least 10 hrs per assignment. I do not consider this a bad thing: the assignments are where you roll up your sleeves and actually become proficient at this stuff. The biggest negative to the course (cost 1 star on my rating) is in the layout and autograder for the assignments. Most of the assignments require significant levels of work that have nothing to do with data science or the course material, e.g., doing web searches on which city a major league team is based out of, interpreting census data acronyms, etc. This should be provided in some form of a summary or cheat sheet so the student can focus their time on learning the course material. On some assignments, the autograder has known issues and will mark a correct answer as wrong. This can be frustrating and, if known by the staff, should be corrected. In summary: good course, learned a lot, could be great with better assignment planning and a more robust autograder.

By Jim

Jul 2, 2017

It was a good course, and I found I learned a good amount from it. I feel the following changes would give future participants even more value:

First, assignment grading is too rigid. There are questions I failed that looked perfectly OK to me. If a question fails, hints on how to get it to pass need to be provided. One great example had something like "incorrect output. Be sure that DataFrame.loc['Texas']['2012q3'] returns a float64" or something like that. I tried that, found out I didn't strip spaces (I had 'Texas '), sorted that out, and got the right answer. I feel spending too much time on that is "hacking the grader" vs. learning the material.
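The whitespace problem described above can be reproduced and fixed in a few lines. This is a minimal sketch with made-up data; only the labels 'Texas' and '2012q3' come from the review, and the values and the second row are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Made-up data; the trailing space in "Texas " mimics the reviewer's bug.
df = pd.DataFrame(
    {"2012q3": [150000.0, 210000.0]},
    index=["Texas ", "Ohio"],
)

# Before cleaning, the label "Texas" does not exist in the index.
found_before = "Texas" in df.index

# Strip leading and trailing whitespace from every index label.
df.index = df.index.str.strip()
found_after = "Texas" in df.index

# A single-cell lookup on a float column returns a NumPy float64 scalar.
value = df.loc["Texas", "2012q3"]
is_float64 = isinstance(value, np.float64)
```

Checking `found_before` and `is_float64` locally gives the same information as the grader hint the reviewer quotes, without a submission round trip.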

Second, the course emphasized following along in the notebook, playing with it, and doing the assignments via checking Google / Stack Overflow. That works, but it doesn't lead to as "Pandorable" code as it could otherwise. Some of my solutions were slow, and I started to get into "one-trick pony" mode using .apply() everywhere. That oftentimes isn't fast enough. Pandas, by nature of optimizations under the hood, is a very "meta" language. Teaching the primitives and building to those pieces could use more emphasis.
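As a small illustration of the `.apply()` habit mentioned above, the sketch below computes the same column two ways. The column names and values are made up for the example.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})

# Row-wise apply: a Python function is called once per row.
total_apply = df.apply(lambda row: row["price"] * row["qty"], axis=1)

# Vectorized: a single column-level multiplication.
total_vec = df["price"] * df["qty"]

totals_match = total_apply.tolist() == total_vec.tolist()
```

Both forms give the same numbers here; the vectorized form avoids the per-row Python call, which is usually what makes it faster on large frames.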

By Sashi B

Jun 10, 2017

Overall a good learning experience. The assignments were challenging and time consuming at times which is primarily what I am basing my experience on.

The lectures, on the other hand, fell short on substance and were not so helpful in understanding the philosophy behind using the prescribed python tools. Most of my learning was self taught, trying to solve the assignments, which I really enjoyed! The lectures felt rushed and crammed. The instructor focused more on python tools for Data Science rather than on why to use Python for data science in the first place and the pros and/or cons the tools bring to Data Science. I felt I learnt more on why to use python for Data Science from the course specialization "Python for Everybody" taught by Prof. Charles Severance from the same University.

The Jupyter notebook interface was great! I really like the fact that you could play with the code shown on lecture slides/videos.

This course delves deep into Python tools such as pandas. For python newbies (like myself) I recommend taking some introductory python course(s), which would greatly help with solving assignments in this course.

By Nicholas B

Oct 22, 2017

The course material is relevant to real-world data science and, while not every topic needed for the assignments is covered in the class, this is also realistic, as you commonly have to search documentation to discover the "right" object or method needed for your particular task/dataset.

The only reason I hold back from a five-star rating is that the assignments are often stated in vague terms and it's not always clear exactly what measure your code should return. For example, one assignment asks students to measure a decline in some value across a date period. This could be a difference or a ratio, but it wasn't stated in the assignment. Only in the forums was it clarified that we should calculate a ratio.

Finally, the expected time requirements are very underestimated. There's no way that these assignments can be completed within only 2-4 hours unless you already know pandas. Having plenty of python experience and being new to pandas, I still had to search documentation extensively for the right methods to transform data appropriately.
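The ambiguity is easy to see with numbers. In this sketch the function names and the values are hypothetical, and the ratio is defined as end divided by start, which is only one of several possible conventions; the review does not say which one the assignment expected.

```python
def decline_as_difference(start, end):
    """Decline measured as an absolute drop: start minus end."""
    return start - end


def decline_as_ratio(start, end):
    """Decline measured as a ratio, assumed here to be end / start."""
    return end / start


# Made-up values for illustration.
start_value, end_value = 200.0, 150.0
difference = decline_as_difference(start_value, end_value)
ratio = decline_as_ratio(start_value, end_value)
```

The two measures give 50.0 and 0.75 for the same data, so a grader expecting one will reject the other even though both describe the same decline.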