Learner Reviews & Feedback for Introduction to Data Science in Python by University of Michigan

4.5
stars
27,115 ratings

About the Course

This course will introduce the learner to the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library. The course will introduce data manipulation and cleaning techniques using the popular Python pandas data science library and introduce the abstraction of the Series and DataFrame as the central data structures for data analysis, along with tutorials on how to use functions such as groupby, merge, and pivot tables effectively. By the end of this course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses. This course should be taken before any of the other Applied Data Science with Python courses: Applied Plotting, Charting & Data Representation in Python, Applied Machine Learning in Python, Applied Text Mining in Python, Applied Social Network Analysis in Python....

Top reviews

YH

Sep 28, 2021

This is a practical course. It covers concepts and assignments such as pandas, DataFrames, merge, and time. Assignments 3 and 4 are difficult, but I think they are very useful and improved my ability.

CB

Feb 6, 2023

The assessments, quizzes, and course coverage are quite good. The main points are covered, although it does not cover everything. Additionally, it provides opportunities to learn and conduct research.

5101 - 5125 of 5,961 Reviews for Introduction to Data Science in Python

By Anantha K

Jun 4, 2019

Good

By Md M H

Aug 14, 2018

good

By Quang M

Sep 4, 2017

good

By jihoonbaek

Oct 27, 2019

di.

By Sweta c

Aug 27, 2020

gd

By BIKASH R H

Jan 22, 2025

.

By PRAVEEN M

Sep 19, 2020

i

By Mr. S D P

May 3, 2020

V

By Saulie

Mar 25, 2020

-

By 黄远(Mr.依然)

Dec 28, 2019

f

By Jin-Hee C

Feb 17, 2017

I have a background in OOP programming but not in Python, and I also program in MATLAB. I chose this course to learn more technical skills for research in data mining, data science, and machine learning. The instructor is fine, but the lecture notes are not really helpful. Students need clearer written information than video lectures alone, since referring to lecture notes as needed would be more efficient. Each week has one programming assignment consisting of multiple problems. I think the expected time shown for each assignment is totally unrealistic. It took a tremendous amount of time to complete each one, or at least to pass it. One good thing is that as long as you complete it within the session, there is no penalty and you can get a passing score. However, all the programming is graded by the autograder, which has a lot of bugs. For example, exactly the same solutions that the autograder previously passed as correct are graded as incorrect on the next attempt. The staff try to help, but I don't understand their approach of 'this may help, but I am not sure if the autograder will accept it'. I believe they should try to fix the bugs and use a 'trustworthy' autograder so students can rely on the feedback it provides. We are not here to learn how to keep the autograder from generating errors. I believe many students are taking this course to improve their career performance. In my case, I am working full-time while taking this course. Due to the lack of time and the inefficient, faulty autograder, I lost a lot of my precious time. I passed this course with 95 and will continue with the rest of the courses in this specialization because I already paid for the whole specialization package and am supported by my company. But I believe the staff team should improve the technical aspects so students can save a lot of time. Also, many of the programming problems are not sufficiently clear. For each problem, there should be an example so students do not have to spend too much time understanding or guessing the solution. I found many inconsistencies in the solutions, although I guessed my way through by spending a lot of time to pass this course.

By Ben L

Aug 16, 2017

I have mixed feelings about this course. On the one hand, finishing this course gives you some level of satisfaction due to the challenges and real world examples for you to work on. The Jupyter interface is excellent as a teaching tool by providing interactive code. However, there were several times when I thought this was more challenging than it needed to be and some improvements to pedagogy could be made. (To give you an idea of my skill level, I have coded in R for a few years and finished Dr. Chuck's entire Python specialization before starting this course.) For example, in most standard lessons, a new concept is introduced and then a learner can practice that concept with a simple problem. That simple problem can be followed by progressively harder problems. However, in this course, we're often asked to try hard problems soon after seeing the concept for the first time, which can be frustrating. In addition, the lessons are often too fast and some examples are presented unexplained. One example that comes to mind is the discussion of the merge function. It requires passing in references to "left" and "right" but the instructor never explains what these refer to. I figured it out eventually but saying explicitly that "left refers to the first data frame and right refers to the second" takes 5 seconds to say and spares the user from resolving the source of the terms. Seeking resources outside the course platform is required throughout. This is expected to supplement learning every now and then in different courses. However, in this course, the degree of seeking outside help just seems so high that I could do this on my own while performing my own data science projects. I understand that this is probably the first iteration of the course and that they will likely find ways to improve it. The lessons and assignments seem akin to something you would do in a real job, so finishing the tasks provides some feeling of accomplishment.
The subject material is awesome and I think this course will remain popular.
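
A minimal sketch of the "left" and "right" point raised in the review above, using the pandas merge function. The two small frames and their column names are hypothetical illustration data, not taken from the course materials.

```python
import pandas as pd

# Hypothetical example data (not from the course).
staff = pd.DataFrame({"name": ["Kelly", "Sally", "James"],
                      "role": ["Director", "Liaison", "Grader"]})
students = pd.DataFrame({"name": ["James", "Mike", "Sally"],
                         "school": ["Business", "Law", "Engineering"]})

# "left" is the first frame passed to merge, "right" is the second.
# how="left" keeps every row of the left frame (staff); names with no
# match in students get NaN in the "school" column.
left_join = pd.merge(staff, students, how="left", on="name")

# how="right" keeps every row of the right frame (students); names with
# no match in staff get NaN in the "role" column.
right_join = pd.merge(staff, students, how="right", on="name")

print(left_join)
print(right_join)
```

Swapping the order of the two arguments swaps which frame counts as left and which as right, so `pd.merge(students, staff, how="left", on="name")` keeps every student instead of every staff member.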

By Jonathan A

Mar 11, 2018

I have mixed views about this course. The net result IS worthwhile and you definitely learn by being thrown in the deep end (this is not a softball "what's a for loop / Programming 101" type course.)

First of all, be aware that the "estimated time of completion" for the assignments is low to put it very mildly: assignments that are estimated "90 minutes" may be more like eight to ten hours to complete (verified by many different course-takers, all of whom had extensive previous programming experience.) Do not take this course unless you can spend at least ten hours a week completing the assignments (unless you're already a prodigy in Python/Pandas -- but if so, why take this course?)

Second of all, the lectures do not contain anywhere near all of the material you need to actually complete the assignments (the course creators even acknowledge this.) It took me a couple of assignments to realize this was so. It really made me think watching the lectures was a slight waste of time, so if you find yourself frustrated thinking you "missed something" because you don't know how to complete the assignment after viewing the lectures, you most likely _didn't_ miss anything: just expect to spend a lot of time Googling answers in order to find what you need.

Third: the autograder here is really quirky. Once I got the hang of it and just reviewed the whole .py file generated to see where the problem was (versus just using the IPython window) it clicked pretty well, but I definitely spent a few hours flailing around trying to get my code to submit successfully. I hadn't had any issues with any other courses in this department.

That being said: the skills learned are definitely "deep" and quantifiable and you get right into the thick of things after the first assignment. I'd venture to say that if you complete this entire 5 course sequence you'd probably have at least a passing knowledge of the subject matter for an interview in this field.

By Ershad S

Sep 17, 2017

Well, I was new to Python, and this course has great material and helps develop decent skills in working with DataFrames, Series, and the pandas and NumPy libraries.

Overall, I am happy with what I learned. But there are a couple of IMPORTANT points to consider:

1. The assignments require a huge amount of self-study and take way more than the suggested time to complete.

2. The assignment grader may act up or be very picky about data types and not credit you for the right answer.

3. The course can get very frustrating, as the answer to the final question of assignment #4 determines whether you pass the course or not (it is 50% of the total grade!!). Although you should be able to get to the right answer through the tips in the forum, it is really hard to figure out the solution on your own, since there is plenty of room for small mistakes that lead to the wrong answer.

Suggestion: I think the questions should be designed so that there are small points for each step of the solution. That way, we get to the solution step by step. Having 50% for just one final answer is not reasonable and makes it frustrating for the students.

4. The course subscription is monthly, which makes it even more frustrating when you are stuck with a wrong answer that holds up course completion. I think that where a huge effort is required to pass the course, a monthly subscription rather than a one-time payment is not a good incentive at all; it is very frustrating when you get close to the deadline and can't fix your code to get the answer.

Finally, I want to thank Dr. Brooks and the staff (especially Sophie) for the good course. I would like to sign up for the next course in this specialization, but I am debating it since I don't want to get stuck with the unreasonable monthly subscription method and the frustrating assignment grading system.

Thanks,

Ershad

By Ashvin L

Jan 1, 2017

The course is quite demanding from the get go. If you are trying to get by with casual interest, then this is not the course for you. Much of my December vacation (which I had earmarked for playing video games) was spent coding on this course.

The biggest gripe I have about the course is the grader system. It gives a binary output indicating whether you got the question right or wrong. Unfortunately, that is often not good enough to debug your code. I spent hours figuring out what was going on. The forums were also not helpful, since there were not many people taking the course.

At the end of all this, if I ask myself, did I learn a lot? The answer is probably no. However, I did get a lot of coding experience. Debugging experience.

We have many databases (MySQL, MongoDB, Cassandra, etc.) which have far more powerful features than what pandas offers. Therefore, as a system designer, it is unclear to me why I would ever pick pandas over the rest. It appears quite slow (compared to the likes of time-tested databases). It offers very few features (when compared to a DB). Lastly, I can use it only with Python. To me, it appears to be a no-brainer to use any one of those DBs to store, modify and massage my data. Maybe there are valid applications that can make use of pandas-like features, but I did not learn that from the course.

Summary:

Better Motivation to use Pandas over standard Databases

Better grader design.

By Shiqi A H

Apr 9, 2021

The course material is good and fine but some of the assignment instructions are very unclear. This has led to a lot of wasted time and frustration attempting to understand the instructions, especially when the autograder's requirements are so exact.

Assignment 3 has some unclear instructions, the most important of which is that the returned value from answer_one() is to be used as the dataframe for all the following questions. ("Questions 2-13 rely on your Question 1 answer" is ambiguous in literally requiring the returned value to be used.)

Assignment 4 has unnecessarily complicated instructions, the most egregious of which are in the first paragraph thereof. Researching whether or not a team is to be mapped to a certain metropolitan area is confusing and also seems to be extraneous to a coding test - especially when the information is so complicated and specific to a field (sports) in a geography (USA) one might not have any contextual information about.

Otherwise, the course also needs a lot more hands-on practice. Some of the lectures covered a lot of material, none of which was practiced until the last humongous assignment. By which time the vague instructions, lack of practice, and exact autograder requirements made it a frustrating experience.

By Paolo M

May 6, 2020

On the one hand, this course is stuck in 2016. Even the assignments' tools are dated. In fact, if you are working offline and you want to make sure your results match the grader, you'd better create a virtual environment with Pandas v0.19.2.

There isn't that much activity on the discussion forum. Nevertheless, you can see there is always someone answering students' questions. There are quite a few questions related to getting a better understanding of what the assignment expects. There are also some bugs the grader has kept showing since 2016.

It's also a very dense course, with a lot of information. For some videos, I had to spend a lot of time going through every single concept shared and do my own research.

On the other hand, I felt challenged to complete the assignments and, due to the fact the lectures only give you the basics to start investigating, I ended up learning a lot going through online resources (i.e. mainly Stack Overflow). Even though it's been hard, I feel like I've learned a lot.

I think Coursera should require course providers either to update their material or to make visible when it was last updated and, especially when it comes to tools, which version of those tools should be used.

By Dmitry Z

Jun 2, 2017

Not a bad starter, considering it's a free course, but the lectures were somewhat short. Some things in the lectures and tasks were pretty discouraging. For example, in week 4 there is a task that culminates in doing a t-test, but all that the lecturer says about it is "I'm not going to go into detail here, read wikipedia page or take a statistics course to know what the t-test is". Well, that's definitely not a nice thing to say in an introductory course. The same goes for the Pearson correlation coefficient: it's in the task but not explained anywhere in the course. Not all of us have a background in statistics.

Tasks are implemented in the form of IPython notebooks (essentially a web page where one can write Python code) and rated by an automated rater, which sometimes gives good clues, but sometimes "result was incorrect" is all you get in response. What I felt was not very convenient is that the rater is asynchronous: you can't run it directly on your current notebook, you have to send your work to be rated using a special button instead. Then after some minutes you can see the result on a separate page. It works, but slows down the process a bit.
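
For readers in the same position as this reviewer, here is a rough sketch of what the two statistics mentioned above compute, written with only the Python standard library. The function names and sample numbers are hypothetical, and the review does not say which t-test variant the assignment uses; Welch's unequal-variance form is assumed here. The sketch returns only the t statistic, not a p-value.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation: covariance of x and y scaled by their spreads."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def welch_t(a, b):
    """Welch's t statistic: difference in sample means divided by its
    estimated standard error (assumes two independent samples)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    std_err = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / std_err


# Hypothetical sample data, for illustration only.
hours = [1, 2, 3, 4]
scores = [2, 4, 6, 8]
group_a = [2, 4, 6]
group_b = [1, 2, 3]

r = pearson_r(hours, scores)   # close to +1: scores rise in step with hours
t = welch_t(group_a, group_b)  # positive: group_a has the larger mean
print(r, t)
```

A value of r near +1 or -1 indicates a strong linear relationship, and a t statistic far from 0 indicates that the two sample means differ by a lot relative to their variability. Turning t into a p-value requires the t distribution, which a statistics library would normally supply.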

By john w

Jan 29, 2018

While there are some great things about this course, I was still somewhat disappointed in the manner of teaching. Too often, what was discussed was basic examples of pandas without really explaining how pandas functions. This led to frustration and excessive scouring of the online API, Stack Overflow, or the forums to find out how to program a task. Personally, I found learning SQL easier than the pandas library. There is a great deal of good stuff here though, such as the read and response tasks. These add a great deal of depth and perspective to the class and Data Science in general. Also, the subjects of the assignments are mostly interesting and realistic problems. As is, though, I'm not sure I'd recommend this class. On one hand, the assignments do set deadlines and motivate a person to learn pandas and data manipulation. On the other hand, much of the pandas learning occurs using outside resources, which could be done without the class. On the whole, however, I have gained Data Science skills, knowledge and perspective from taking this class, and will continue with this series.

By Mark N

May 18, 2021

I was really looking forward to this course. The lectures and the readings are great, and I learned a lot from them. I spent an inordinate amount of time on the homework, though. The problem formulation and grading were a fiasco. Even when the questions were stated clearly, you had to contend with hidden assertion tests that offered very little output describing how to correct your answer. Also, I always checked my work against the files in which we were supposed to perform our exercises. Assignment 3 was particularly atrocious, in that the population sums the autograder expected could not be obtained, since the estimates were normalized differently in the actual file. Additionally, the quizzes were not well planned. Once, we were asked for the top 3 ranked individuals in a class (greater than 4), but the answer required by the autograder wanted the fourth rank included; you received credit for answering incorrectly and were dinged for answering correctly. I pointed this out but received no response.

By Anastasios B

Jan 18, 2022

I think the pace of this course started out alright, but by Week 3, the assignment really turns it up to a new level. The same goes for Week 4's assignment, even though there is barely any material covered in Week 4. It almost seems like they rushed through half the course. It's nice that the Jupyter notebooks for the lectures are prepared with commentary, but it can get dull watching a lecture video which is essentially the instructor reading the notebook (and typing it out at speaking pace). Unfortunately, while I learned a few things, I definitely still don't feel very well versed in the Python topics covered. Mostly I have lists of functions and methods in some libraries and classes to reference, with some idea of how to use them. But I would not feel confident working on a Python assignment yet, even if it was intended to require only pandas and NumPy (and maybe a little regex).

By Antonio F

Feb 19, 2017

The course is fast-paced and the videos do not cover everything you need to know to pass the assignments. However, this is not a problem, as all the necessary references are given. The problem is the way the programming assignments are designed and the automatic grader, which should definitely be improved. Sometimes you spend hours figuring out what is wrong and finally find out that you have a precision error because you used one library function instead of another (with the same purpose), or you spend an hour manipulating a pandas DataFrame to meet the specifications, except that all that time is wasted because the grader will accept the first version even if, for instance, the index does not meet the requirement. So, if you want to take it, prepare to fight with the grader. Positive point: excellent help on the forums, you will not be left alone.

By Yatin B

Jul 9, 2020

Well, I would agree with many other low-rated reviews that the course could have been more systematic and less focused on self-learning. But in practice, work won't be as straightforward as question and answer; in some cases there will be no solid answer. Skimming through books and Stack Overflow and looking at things from others' perspectives will make one's project or work really interesting and worthwhile. Plenty of resources are already on the internet; we just have to be more efficient in finding them. I wouldn't recommend this course to a newcomer looking for a very structured course, but rather to those who are already quite familiar with programming, as the course says, and to self-learners. The course could be much better if the instructor provided more tips and tricks or simpler ways of doing things, because in the end improvising is the goal.

By Brian D

Feb 26, 2017

Only 37 minutes of video, average per week. Really nice, pleasant video, but don't expect to learn how to solve the problems, because there is little connection from the problems to the videos.

Teaching assistants are hard-working and knowledgeable, and each has a different way of doing things.

Very little in the way of effective educational design.

But they do create useful questions to answer, if you are stubborn enough not to need actual instruction.

The estimate of time required is woefully inadequate.

The best thing you can do is look up Brandon Rhodes on YouTube. He will actually explain pandas. Expect to invest about four hours in his videos and still have questions. Use the course forum extensively; only there will you get a hint of how to do what they ask.

Google and Stack Overflow: that's their extensive list of references.

By Nathan Z

Nov 16, 2016

The course was challenging, and although I would have liked a bit more information from the video lectures, and a bit more practice on using the basic functions, I learned a lot.

I thought the video lectures were a little sparse. There were only 20-30 minutes worth of video lectures per week and the assignments required you to stretch the knowledge that you gained from the lecture videos quite a bit.

More examples in the video lectures would have helped. I would have also liked to see a few simple problems in each assignment just to get comfortable using the various functions introduced in the lecture.

So overall, the course was challenging and definitely a fair amount of work for someone new to Python (but moderately experienced in C programming), and although I would have liked a bit more guidance, I learned a lot.