Learner Reviews & Feedback for Introduction to Data Science in Python by University of Michigan

4.5
stars
27,057 ratings

About the Course

This course will introduce the learner to the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library. The course will introduce data manipulation and cleaning techniques using the popular Python pandas data science library and introduce the abstraction of the Series and DataFrame as the central data structures for data analysis, along with tutorials on how to use functions such as groupby, merge, and pivot tables effectively. By the end of this course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses. This course should be taken before any of the other Applied Data Science with Python courses: Applied Plotting, Charting & Data Representation in Python, Applied Machine Learning in Python, Applied Text Mining in Python, Applied Social Network Analysis in Python....

Top reviews

YH

Sep 28, 2021

This is a practical course. There are some concepts and assignments like: pandas, DataFrame, merge and time. Assignments 3 and 4 are difficult, but I think they are very useful and improved my ability.

CB

Feb 6, 2023

The assessments, quizzes, and course coverage are quite good. The main points are covered, although it does not cover everything. Additionally, it provides opportunities to learn and conduct research.

3626 - 3650 of 5,948 Reviews for Introduction to Data Science in Python

By Nguyen P A

•

Feb 8, 2020

Teaching an online course is difficult since the audience comes from various backgrounds. The lecturer in this course is very careful in choosing the materials to include so that the course covers the needs of a large audience. The title is quite misleading, though, because it is neither an introductory course nor a data science course; the true title should be: "Getting pro at data handling in Python"

Although the course is pretty well designed and truly at a level much better than many online courses, I think several improvements should be made to avoid learners' frustration:

1. Lecturer speaks too fast. I feel like he was busy reading from his script without truly caring whether people are really following or not.

2. Assignments are poorly written: although the concepts introduced in the assignments are relevant to real Data Science problems, the instructions are unclear and get worse from week 1 to week 4. The most disorganized instructions are in the week 4 assignment, which seems to have been prepared by several different TAs and then hurriedly combined.

3. The prerequisites should clearly state that learners need some real programming experience to be able to debug their own code. Completing one or two courses in Python previously will not be enough to perform well in this course.

4. The instructor should also state who should not take this course; it would be a waste of time for someone with many years of programming experience.

By Mark P

•

Jul 1, 2017

On balance, I'd recommend the course, but use it as only one of many sources you'll use to learn data science. My main complaints: 1. the course concentrates on mechanics of Python rather than on understanding of Data Science as a discipline, 2. It's rushed.

Pros

Good information on how to use the pandas library to manipulate data.

Information on "advanced" Python functionality.

Challenging assignments. Give yourself time (a weekend or several evenings for each assignment) and you'll learn a lot.

Cons

Very little information is given in the way of conceptual understanding of data science as a discipline. Focuses on the "how" but not the "why." Not much context given for the examples, so they seem a bit contrived. (For the opposite extreme, see the Johns Hopkins Data Science specialization.)

Lectures are rushed. Looks like the prof is reading cue cards straight down without pausing for the students to digest anything. (The one or two lectures given by the TA just fly by with nary a pause for breath.) If you're trying to follow along, keep your finger on the pause button.

The Jupyter notebooks are useful, but contain minimal commenting (barely more than a few headers), so if you plan to refer to the notebooks later (for reference), you'll need to put comments in yourself.

Given code is sometimes sloppy (e.g., standard Python naming conventions are not always followed.)

The automated grader is very picky. Read through the discussion board!

By Tony K

•

May 14, 2020

I really did like the course overall. The content and its concept are excellent. Make sure you already have Python fundamentals or you will work a little harder (I already had Python experience under my belt). The final project was a good first hands-on approach to doing something most employers would consider to be a real project. Since an autograder is used, the data sets were already provided, but one could imagine where one would have to acquire the data themselves and then follow similar procedures to accomplish a statistical task. My only complaints are: 1 (Minor): the attitude of some of the instructors in the Discussion Forums is off-putting. I don't know if this is a language barrier problem or if they truly just lack "soft skills" in working with people, but you have to take them with a grain of salt. Not all of them, but some in particular. 2 (Major): the autograder is years behind. The APIs (mostly Pandas) that you will use in real life have features that will work for your assignments, but then fail in the Autograder. I spent 1.5 to 2x as long on this month of the specialization due to how archaic the autograder is. I will definitely be warning my employer about this if other people in my organization want to take the course. Figuring out these issues exist requires going to the Discussion Forums (the warnings are in the assignments), and there you will encounter problem #1 I pointed out.

By Eric N

•

Jan 8, 2020

Most of the course is great. The lecture content is interesting, well explained and backed up by examples. The notebooks for following along with the videos are very good too. The assignment problems are also interesting, applicable and at a reasonable level of difficulty. However, there is one significant problem which, while not big enough to stop me recommending the course, is very frustrating. The assignment auto-grader has so many issues and peculiarities that on the third and fourth assignments I have spent more time getting the autograder to accept my answers, which are already correct, than I spent actually obtaining the answers in the first place. Just one example of such a problem is that the autograder works with an older version of pandas, so there is some syntax which will work perfectly when run in the Jupyter notebook but which will fail when the autograder tries to run it. I know it is not practical to have all assignments reviewed by a real person, but something needs to be done about this when it can take more time to get the assignment submitted than to actually complete it. It would also be good to have access to optimised assignment solutions after the assignment has been passed, because it is certainly possible to pass the assignments without having a very efficient or 'pandorable' solution.

By Robert C

•

Feb 20, 2021

I felt the course material was helpful to improve my skills in data preparation within Pandas. I believe the course is a bit advanced to be considered an "Introduction" but perhaps it only gets much harder from here :). The programming assignments are estimated at 3 hrs each week, but I found myself putting in much more time than this - at least 10 hrs per assignment. I do not consider this a bad thing - the assignments are where you roll up your sleeves and actually become proficient at this stuff. The biggest negative to the course (cost 1 star on my rating) is in the layout and autograder for the assignments. Most of the assignments require significant levels of work that have nothing to do with data science or the course material - e.g., doing web searches on which city a major league team is based out of, interpreting census data acronyms, etc. This should be provided in some form of a summary or cheat sheet so the student can focus their time on learning the course material. On some assignments, the autograder has known issues and will mark a correct answer as wrong. This can be frustrating - and, if known by the staff, should be corrected. In summary - good course, learned a lot, could be great with better assignment planning and a more robust autograder.

By Jim

•

Jul 2, 2017

It was a good course, and I found I learned a good amount from it. I feel the following changes would give future participants even more value:

First, assignment grading is too rigid. There are questions I failed that looked perfectly OK to me. If a question fails, hints on how to get it to pass need to be provided. One great example had something like "incorrect output. Be sure that DataFrame.loc['Texas']['2012q3'] returns a float64" or something like that. I tried that, found out I didn't strip spaces (I had 'Texas '), sorted that out, and got the right answer. I feel spending too much time on that is "hacking the grader" vs. learning the material.
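The kind of mismatch described above can be sketched as follows. This is a minimal illustration, not the course's assignment: the DataFrame, its values, and the column name '2012q3' are hypothetical stand-ins, and the only point is that an index label with a trailing space ('Texas ') will not match a lookup for 'Texas' until the labels are stripped.

```python
import pandas as pd

# Hypothetical data: the index label 'Texas ' carries a trailing space,
# as can happen when names are parsed from a raw text file.
df = pd.DataFrame(
    {"2012q3": [150000.0, 210000.0]},
    index=["Texas ", "Ohio"],
)

# Before cleaning, 'Texas' is not a valid label (the stored label is 'Texas ').
has_label_before = "Texas" in df.index

# Strip surrounding whitespace from every index label.
df.index = df.index.str.strip()

# After cleaning, the lookup succeeds and returns a NumPy float64 scalar.
value = df.loc["Texas", "2012q3"]
print(has_label_before, type(value).__name__, value)
```

Selecting with a single `df.loc[row, column]` call, rather than chaining `df.loc[row][column]`, is generally the recommended pandas style, although both return the same scalar here.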

Second, the course emphasized following along in the notebook, playing with it, and doing the assignments via checking Google / Stack Overflow. That works, but it doesn't lead to as "Pandorable" code as it could otherwise. Some of my solutions were slow, and I started to get into "one-trick pony" mode using .apply() everywhere. That oftentimes isn't fast enough. Pandas, by nature of optimizations under the hood, is a very "meta" language. Teaching the primitives and building to those pieces could use more emphasis.
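As a rough illustration of the ".apply() everywhere" habit mentioned above, the sketch below computes the same column two ways. The column names and values are hypothetical; the comparison only assumes standard pandas behaviour, where row-wise `.apply()` calls a Python function once per row and column arithmetic operates on whole columns at once.

```python
import pandas as pd

# Hypothetical columns, used only for illustration.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "quantity": [1, 2, 3]})

# Row-wise .apply(): a Python function is called once per row.
total_apply = df.apply(lambda row: row["price"] * row["quantity"], axis=1)

# Vectorized column arithmetic: one operation over whole columns.
total_vectorized = df["price"] * df["quantity"]

# Both approaches produce the same values.
same = total_apply.equals(total_vectorized)
print(same)
```

On a three-row frame the difference is invisible, but on large frames the vectorized form is usually much faster, which is one reading of what "Pandorable" code means in these reviews.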

By Sashi B

•

Jun 10, 2017

Overall a good learning experience. The assignments were challenging and time consuming at times which is primarily what I am basing my experience on.

The lectures, on the other hand, fell short on substance and were not so helpful for understanding the philosophy behind using the prescribed Python tools. Most of my learning was self-taught while trying to solve the assignments, which I really enjoyed! The lectures felt rushed and crammed. The instructor focused more on Python tools for Data Science than on why to use Python for data science in the first place and the pros and cons those tools bring to Data Science. I felt I learnt more about why to use Python for Data Science from the specialization "Python for Everybody" taught by Prof. Charles Severance from the same university.

The Jupyter notebook interface was great! I really like the fact that you could play with the code shown on lecture slides/videos.

This course delves deep into Python tools such as pandas. For Python newbies (like myself) I recommend taking some introductory Python course(s), which would greatly help with solving the assignments in this course.

By Nicholas B

•

Oct 22, 2017

The course material is relevant to real-world data science and, while not every topic needed for the assignments is covered in the class, this is also realistic, as you commonly have to search documentation to discover the "right" object or method needed for your particular task/dataset.

The only reason I hold back from a five-star rating is that the assignments are often stated in vague terms and it's not always clear exactly what measure your code should return. For example, one assignment asks students to measure a decline in some value across a date period. This could be a difference, or a ratio, but it wasn't stated in the assignment. Only in the forums was it clarified that we should calculate a ratio.

Finally, the expected time requirements are very underestimated. There's no way that these assignments can be completed within only 2-4 hours unless you already know pandas. Having plenty of Python experience and being new to pandas, I still had to search documentation extensively for the right methods to transform data appropriately.

By Osama L

•

Jul 17, 2017

First of all, I would say that anyone without a strong background, or without a strong functional programming background, will be in for a lot of hard work during this course. Secondly, I would say that this course helps you search documentation yourself instead of being spoon-fed. This matters especially where I come from, where a teacher is considered only as good as how much he spoon-feeds you. So people in the above-mentioned category should definitely take this course, if not to become a data scientist then to learn how one learns at this level, which is definitely not by being spoon-fed. I think this is a good intro to data science in general because it is such a vast field that one always has difficulty deciding where to start.

In the course, I can say it was mostly basic descriptive statistics and just one thing from inferential statistics, which came in the last week's project: the t-test for the means of two independent samples.

So all in all, if you are not strong in your statistics you can still take this course easily.
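For readers unfamiliar with the two-independent-samples t-test mentioned in this review, here is a minimal sketch of the test statistic. It is an assumption-laden illustration rather than the course's method: it uses Welch's form (unequal variances), computes only the t statistic by hand with the standard library, and runs on made-up numbers. In practice a library routine such as `scipy.stats.ttest_ind` would typically be used, since it also returns a p-value.

```python
import math
import statistics


def welch_t_statistic(sample_a, sample_b):
    """Return Welch's t statistic for two independent samples."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance uses the sample (n - 1) denominator.
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    # Standard error of the difference between the two sample means.
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Made-up samples, for illustration only.
group_a = [2.0, 4.0, 6.0, 8.0]
group_b = [1.0, 2.0, 3.0, 4.0]

t_stat = welch_t_statistic(group_a, group_b)
print(round(t_stat, 4))
```

A larger absolute value of the statistic indicates a bigger gap between the sample means relative to their variability; turning it into a p-value additionally requires the degrees of freedom, which this sketch leaves out.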

By Richard D

•

Mar 7, 2017

Felt like this course was trying to pack a lot of information into only four weeks. I find Python to be a very frustrating language - at least that's how Pandas feels. Compared to R the syntax is dreadful. And I found myself digging through help files and FAQs to find ways to do what should be the simplest things - like accessing values in a data frame. The Pandas help documents are not written for beginners, but written in a mathematically precise manner that makes them inaccessible. (Their preference for 1-letter variable names is unhelpful.) Stack Overflow is much better but it's a crap shoot as to whether somebody has addressed the exact issue that is bothering me at any given point.

One final issue: it would be nice to see model solutions. I understand the difficulty in having online solutions available for an online course, but I found myself writing what I considered to be inefficient code fairly often. Would love to see slicker ways to do these tasks.

By Kostadin A

•

Jul 8, 2017

Excellent material - covers basics, shows most used approaches, laid out logically and sequentially. The language was clear. Very important - presenters did not improvise, but read from a script, which ensured that the material was delivered in a succinct and precise way, respecting the time of the student. The teaching team is solid in their grasp of the theory. Good support on the forums by the teaching staff with smart tips and tricks for a more efficient "pythonic" way of coding the algorithms. Very relevant and thought-provoking side reading material.

Downside - submissions could be graded wrong for small technicalities (despite the numbers being correct). In many cases I needed to resubmit just to satisfy some insignificant technical requirement. This passes the message that what is important are the formalities and not getting the answer right - there should have been more flexible/smart auto-grading.

Overall I recommend the course.

By 周玮晨

•

Jun 13, 2018

I NEARLY QUIT, but finally I made it. This was my first time taking a "learn by doing" course, and I did not feel comfortable or confident. While I was occupied with the assignments, I always felt frustrated. However, once I adapted to the course, I felt great and I loved it. After finishing the course, I had learned a lot, not only skills but also how to learn by myself. I also loved the reading work; although my English is broken, I still enjoyed the topic arguments. I liked the article 'end of theory' best, which gave me a lot of insight. I have to mention that the autograder lacks feedback: when I got a wrong answer, I didn't know where I had made mistakes, so it took me plenty of time to find them. Thanks to the enthusiastic participants in the forum, especially 'Sophie Greene', 'Barthold Albrecht' and 'Yusuf Ertas'; without you I couldn't have finished the course. But I hope the teachers can make the autograder give us more feedback.

By Maximilian W

•

Jun 29, 2019

Really good introduction to Data Science. Great lectures and really good exercises to reinforce what you have learnt.

If you are wondering whether to do this course, and have reservations given the many reviews about the lack of spoon-feeding, I would advise you to still consider it. For some other courses, those kinds of reviews have often been spot on.

However, personally, I found the balance to be good in this course. Learning to solve the problems yourself, with the safety net of a great discussion board and a grader to tell you if you are right or wrong, you will get there. The approach really allows personal development for problems without either of those two aids. Some have been disparaging about the over-dependence on Stack Overflow to help finish the exercises. I can see the frustration, but the fundamentals were taught well, and learning to bridge the gap efficiently using Stack Overflow is a big part of the learning.

By David C

•

Jun 30, 2017

This was a good course. The professor (Chris Brooks) was EXCELLENT, although much of the material was presented very quickly and was difficult to follow at times. The real learning came during the exercises, which I generally found to be very difficult but also very good at ensuring the material presented in lecture was actually reinforced through hands-on programming. I would recommend making more exercises, perhaps two per week, rather than one very detailed exercise at the end. The last week's exercise, in particular, was very difficult for me and I would recommend breaking part 5 (of 6) of that exercise into several smaller chunks that could be completed and graded individually. Rather than 10-10-10-10-10-50 as the points awarded in that exercise, I would also recommend changing the scoring to 10-10-10-10-35-25 as the next-to-last part of the exercise was by far the most difficult and time consuming.

By Robert O

•

Apr 26, 2022

First off - Kudos to Yusuf Ertas for all his help and patience getting me through the programming assignments. I really appreciate his willingness to help!

I believe a different textbook would be helpful. I understand it is optional, and Wes McKinney has created a great reference manual, but he uses too many theoretical examples. I really need real-world, applied examples. He seems to get too far into corner cases and shows you 10 different ways to do something, but gives no clear direction for new students.

I would also like to see the lecture topics more closely reflected in the programming assignments. Maybe additional assignments after each lecture. The student should then be able to pull from the lecture assignments to complete the final programming assignment.

Need more direction and guidance on the programming assignments for an Intro course.

By Zhou S

•

Jul 12, 2020

This course is kind of difficult for those who have zero or little computer programming background. But if you have completed the Python for Everybody in Coursera, you can quickly get on it. To take this course, make sure you have basic knowledge in Python, or you may get frustrated quickly. Also, you can refer to Stack Overflow, Github for solutions if you have struggled for a long time on assignments. But, don't just copy the answers, go search the Python/Pandas/Numpy Docs for further explanation, and try your own way to optimize the solutions.

All in all, this is a very good course to deeply explore data science with Python. One problem for me, as an international student, is that what the instructor says sometimes confused me, because I can't understand abstract concepts just by listening. I hope more graphic explanations can be added to the lectures.

By Udit C

•

Jan 24, 2020

'Introduction' is a bit of a misnomer for the course. If you already have even a fairly elementary understanding of analytics/data science, this course adds nothing of value; if you don't have that understanding, you don't really get a great intro to what it is. If you already have at least some programming and/or Python experience, this course does a good job of showing you the kind of tools Python has specifically for dealing with data; if you don't have any programming experience (especially in Python), you may be way out of your depth. The exercises were well designed and were great learning opportunities, but were far more difficult than the course materials implied, especially for Python beginners. In spite of the limited information in the lectures, the activities helped me get a great appreciation of Python, which is the course's saving grace.

By Andrew A

•

Aug 12, 2017

In conjunction with learning from the Python for Data Science book, this course is a nice introduction to the topic. I would say there is too much scope to pass the course with bad code and not learn much. It will take self-discipline to not accept whatever works and to learn the general aspects taught. There is a large degree of independent learning required, which means that getting the best out of the course requires dedicated time exploring. If I didn't have the book I may have felt lost. The discussion forum contained key information a rushed student may not have picked up; however, it is a testament to the mentors that this became readily available along with their support. This course could have done with some peer review to allow code comparisons to check quality, methodology and readability without breaking the honor code.

By Christopher F

•

Oct 22, 2018

Generally very interesting and helpful course. Lectures could have been a bit meatier (I spent a lot more time reading through the docs than in most other Coursera programming classes to complete my homework). It's one of the few Data Science sequences that seem to be offered in Python, rather than R, which was a big motivator for taking this particular class (the transition to working with big data with, for example, pyspark should be significantly easier).

As an aside: one thing that would be helpful is if learners could see the solutions after submitting/getting a grade. Learners would stand to learn a lot by seeing the 'better'/'accepted' answers -- even though I aced the assignments, I _know_ my code isn't as "pandorable" as it could be. That remains one of (a few) big differences between a MOOC and in-person teaching...

By Shah M

•

Jan 30, 2022

This course is fairly difficult in regards to the assignments and the lectures cover a lot of material!

But trust me when I say that the lectures are sufficient to point you in the right direction for the assignments. For the most part I ended up using Stack Overflow when I was working on the assignments; this is nothing new, since most of my past programming assignments consisted of me scrounging through Stack Overflow posts. This course did in fact teach me a lot concerning regex and how to apply it on a pandas DataFrame. I learned a lot when it comes to data cleaning, and for that I think this course is well worth it! The material and how it's presented does add to the difficulty, but honestly it's a fun course if you sufficiently go over the lectures and use Stack Overflow for general aid.

By Shantanu A

•

Mar 29, 2020

The course is excellent. I really enjoyed this course. But I think that the assignments are a bit too tough, because sometimes the concepts required in the assignments are not covered in the lectures and help is needed from the 'Discussion Forums'. Since I approached the discussion forums only when I could not think of anything else, sometimes it was very frustrating for me, because I got stuck on a couple of problems for several days. Thus, assignments should cover only the concepts taught, or it would be better if the lectures could be made more exhaustive so that we can learn more in the lectures themselves. Otherwise, the course is great and the person who helps us on the Discussion Forums is super. His efforts are commendable and he is very knowledgeable, cooperative and active.

By Cole H

•

Apr 13, 2020

The course material was good, but don't expect to have your hand held during this experience. While the lectures are informative, the assignments often leap ahead in complexity. Each assignment does say that you will have to do some research on your own, to be fair, but the amount of self teaching required is, in my opinion, too much. That said, the problems posed by the assignments do build on each other (even if they don't line up well with the lecture material) and if you honestly take the time to learn the things you have to understand in order to complete each assignment, you will walk away with some new skills and a sense of accomplishment. 4 out of 5 for teaching me some new things, minus the one star for being more of a challenge than I was expecting.

By Malik K

•

Nov 8, 2017

It was hard, and I could not have passed nor learned much except frustration if it had not been for Sophie Green's (TA) superb support.

I could not start the course on time, but the first week was easy. So I was surprised by the work expected from the 2nd week. Also, it did not match what was forecast by the instructors. 2h --> 10h, 4h --> 20h... and I'm not counting the nights spent thinking of ways to solve the problem.

I think the difficulty comes from the expectation that the documentation is understandable by newbies.

Also, questions were often tested on type, but the expected output type was not mentioned in the questions.

Finally, I think I personally would need to learn how to properly debug a Python program (going through it step by step).

Hard and challenging. Thank you Sophie

By Max B

•

Dec 29, 2018

This is a good introduction to Python and especially pandas for handling data. However, the course material is not very comprehensive, and you are expected to read online documentation and search Stack Overflow to find answers to most of the required functionalities if you wish to finish the assignments. But, and here comes the but, this is actually how you would proceed the day you are faced with a "real" data science task, so from that perspective it is a good lesson. Also, for learners out there not wishing to pay for similar material, there are plenty of notebooks (on e.g. GitHub) that more or less contain the same (for free!). In summary, Mr. Brooks does a good job explaining the material and the assignments are hands-on and well thought through.

By Matthew S

•

May 2, 2018

Good introduction to Python, with a heavy focus on Pandas. Definitely worth doing if you're struggling a bit with the Pandas documentation. The course assessments were a bit of an uphill battle for me, but I feel more skilled for completing them, so I would encourage others to engage as fully as possible. Same with the readings that are set. In fact, I'd like to see more recommended readings, along the lines of David Donoho's paper. The course uses Jupyter notebooks for assessments, which was refreshing, and has in-video code to work through, which was also much appreciated. All in all, take the course if you're interested in Python and Pandas. It will eat your time quite a bit to do the assessments if you're like me, so be prepared for that.