Machine learning models rely on good data to produce meaningful insights. For that reason, data prep is one of the most critical skills for machine learning.
In this course, you’ll learn how to import and clean data, then fill in missing values using imputation. You’ll use histograms, scatter charts, and box plots to identify trends of interest, then use that analysis to select the most important features. Feature engineering techniques such as one-hot encoding, binning, and scaling will help you transform the structure of your data to produce higher-quality machine learning insights.
This data prep course in Python includes more interactive exercises and challenges than previous BIDA courses. You will also have the opportunity to test your skills on a comprehensive guided Python case study before completing the final exam.
Upon completing this course, you will be able to:
• Import and clean your data in Python
• Apply imputation to estimate missing values in the dataset
• Conduct exploratory data analysis (EDA) to find initial patterns to guide your analysis
• Select features to focus on the most important variables
• Apply feature engineering to make datasets machine learning-friendly
• Select appropriate feature engineering techniques based on the model type
Whether you are a business leader or an aspiring analyst exploring data science, this Data Prep for Machine Learning in Python course will serve as your comprehensive introduction to this fascinating subject. You’ll learn all the key terminology to allow you to talk data science with your teams, begin implementing analysis, and understand how data science can help your business.