Machine learning algorithms power many services in the world today. Here are 10 to know as you look to start your career in machine learning.
At the core of machine learning are algorithms, which are trained to become the machine learning models used to power some of the most impactful innovations in the world today. Read on to learn about 10 of the most popular machine learning algorithms you'll want to know, and explore the different learning styles used to turn machine learning algorithms into functioning machine learning models.
Machine learning (ML) can analyse X-rays, predict stock market prices, and recommend binge-worthy television shows. With such a wide range of applications, it's unsurprising that the global machine learning market is projected to grow from $21.17 billion in 2022 to $209.91 billion by 2029, according to Fortune Business Insights [1].
A machine learning algorithm is like a recipe that allows computers to learn and make predictions from data. Instead of explicitly telling the computer what to do, you provide a large amount of data and let it discover patterns, relationships, and insights independently.
From classification to regression, here are 10 machine learning algorithms you need to know:
Linear regression is a supervised machine learning technique used for predicting and forecasting values that fall within a continuous range, such as sales numbers or housing prices. It is a technique derived from statistics and is commonly used to establish a relationship between an input variable (X) and an output variable (Y) that can be represented by a straight line.
In simple terms, linear regression takes data points with known input and output values and finds the line that best fits those points. This line, known as the "regression line," is a predictive model. We can estimate or predict the output value (Y) for a given input value (X) using this line.
Linear regression is primarily used for predictive modelling rather than categorisation. It is useful to understand how changes in the input variable affect the output variable. By analysing the slope and intercept of the regression line, we can gain insights into the relationship between the variables and make predictions based on this understanding.
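To make this concrete, here is a minimal from-scratch sketch of ordinary least squares in Python. The toy data points are invented for illustration:

```python
def fit_line(xs, ys):
    """Ordinary least squares for a line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data lying exactly on the line y = 2x + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # 2.0 1.0
```

In practice you would reach for a library such as scikit-learn or statsmodels rather than hand-rolling the maths, but the mechanics are the same.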
Logistic regression, or "logit regression," is a supervised learning algorithm primarily used for binary classification tasks. It is commonly employed when determining whether an input belongs to one class or another, such as deciding whether an image contains a cat.
Logistic regression predicts the probability that an input can be categorised into a single primary class. However, it is commonly used to group outputs into two categories: the primary class and not the primary class. To accomplish this, logistic regression creates a threshold or boundary for binary classification. For example, any output value between 0 and 0.49 might be classified as one group, while values between 0.50 and 1.00 would be classified as the other group.
Consequently, logistic regression is typically used for binary categorisation rather than predictive modelling. It enables us to assign input data to one of two classes based on the probability estimate and a defined threshold. This makes logistic regression a powerful tool for tasks such as image recognition, spam email detection, or medical diagnosis, where we need to categorise data into distinct classes.
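As an illustrative sketch (the one-dimensional toy data and the 0.5 threshold are invented for this example), here is a tiny logistic regression trained with stochastic gradient descent:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def predict_proba(x, w, b):
    return sigmoid(w * x + b)        # probability of the primary class

def classify(x, w, b, threshold=0.5):
    return 1 if predict_proba(x, w, b) >= threshold else 0

# Toy 1-D data: inputs below ~2.5 belong to class 0, above to class 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]
w, b, lr = 0.0, 0.0, 0.5
for _ in range(2000):                # stochastic gradient descent on log loss
    for x, y in zip(xs, ys):
        err = predict_proba(x, w, b) - y
        w -= lr * err * x
        b -= lr * err
print(classify(1.0, w, b), classify(4.0, w, b))  # 0 1
```

The threshold is a tunable choice: lowering it below 0.5 trades more false positives for fewer false negatives, which matters in applications like medical screening.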
Naive Bayes is a set of supervised learning algorithms used to create predictive models for binary or multiclass classification tasks. It is based on Bayes' Theorem and operates on conditional probabilities, estimating the likelihood of a classification from a combination of factors while assuming independence between them.
Let's consider a program that identifies plants using a Naive Bayes algorithm. The algorithm considers specific factors such as perceived size, colour, and shape to categorise images of plants. Although each factor is considered independently, the algorithm combines them to assess the probability of an object being a particular plant.
Naive Bayes leverages the assumption of independence among the factors, simplifying the calculations and allowing the algorithm to work efficiently with large datasets. It is particularly well-suited for tasks like document classification, email spam filtering, sentiment analysis, and many other applications where the factors can be considered separately but still contribute to the overall classification.
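Staying with the text-classification use case, here is a minimal Naive Bayes classifier with Laplace smoothing. The tiny spam/ham corpus below is invented for illustration:

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (word_list, label). Count labels and words per label."""
    priors = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for words, label in docs:
        word_counts[label].update(words)
    return priors, word_counts

def classify(words, priors, word_counts, vocab_size):
    best_label, best_score = None, float("-inf")
    total_docs = sum(priors.values())
    for label in priors:
        # log P(label) + sum of log P(word | label), Laplace-smoothed,
        # treating each word as independent (the "naive" assumption)
        score = math.log(priors[label] / total_docs)
        total = sum(word_counts[label].values())
        for word in words:
            score += math.log((word_counts[label][word] + 1)
                              / (total + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

docs = [(["win", "money", "now"], "spam"), (["free", "money"], "spam"),
        (["meeting", "tomorrow"], "ham"), (["project", "meeting", "notes"], "ham")]
vocab = {word for words, _ in docs for word in words}
priors, counts = train(docs)
print(classify(["free", "money", "now"], priors, counts, len(vocab)))  # spam
```

Note that the independence assumption is what lets the per-word probabilities simply be multiplied (added, in log space) rather than modelled jointly.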
A decision tree is a supervised learning algorithm used for classification and predictive modelling tasks. It resembles a flowchart, starting with a root node that asks a specific question about the data. Based on the answer, the data is directed down different branches to subsequent internal nodes, which ask further questions and guide the data to subsequent branches. This process continues until the data reaches an end node, a leaf node, where no further branching occurs.
Decision tree algorithms are popular in machine learning because they can handle complex datasets with ease and simplicity. The algorithm's structure makes understanding and interpreting the decision-making process straightforward. Decision trees enable us to classify or predict outcomes based on the data's characteristics by asking questions and following the corresponding branches.
This simplicity and interpretability make decision trees valuable for various machine learning applications, especially when dealing with complex datasets.
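The flowchart structure described above maps directly onto code. This hand-built tree (the plant questions and labels are invented, not a trained model) routes a sample from the root question down to a leaf label:

```python
# Internal nodes ask a yes/no question; leaves hold a class label.
tree = {
    "question": lambda plant: plant["height_cm"] > 100,
    "yes": {"label": "tree"},
    "no": {
        "question": lambda plant: plant["woody_stem"],
        "yes": {"label": "shrub"},
        "no": {"label": "herb"},
    },
}

def predict(node, sample):
    while "label" not in node:         # descend until we reach a leaf node
        node = node["yes"] if node["question"](sample) else node["no"]
    return node["label"]

print(predict(tree, {"height_cm": 30, "woody_stem": False}))   # herb
print(predict(tree, {"height_cm": 150, "woody_stem": True}))   # tree
```

A learning algorithm such as CART builds this structure automatically by choosing, at each node, the question that best separates the training labels.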
A random forest algorithm is an ensemble of decision trees used for classification and predictive modelling. Instead of relying on a single decision tree, a random forest combines the predictions from multiple decision trees to make more accurate predictions.
In a random forest, numerous decision tree algorithms (sometimes hundreds or even thousands) are individually trained using different random samples from the training dataset. This sampling method is called "bagging." Each decision tree is trained independently on its respective random sample.
Once trained, the random forest feeds the same data into each decision tree. Each tree produces a prediction, and the random forest tallies the results. The most common prediction among all the decision trees is then selected as the final prediction for the dataset.
Random forests address a common "overfitting" issue with individual decision trees. Overfitting happens when a decision tree becomes too closely aligned with its training data, making it less accurate when presented with new data.
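A minimal sketch of bagging and majority voting, using one-split decision stumps as stand-ins for full decision trees (the 1-D toy data is invented):

```python
import random
from collections import Counter

def train_stump(sample):
    """Fit a one-split stump on 1-D points (x, label): predict 1 when x > t."""
    best_t, best_acc = None, -1
    for t, _ in sample:
        acc = sum((x > t) == bool(y) for x, y in sample)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def forest_predict(forest, x):
    """Majority vote across all stumps in the forest."""
    votes = Counter(int(x > t) for t in forest)
    return votes.most_common(1)[0][0]

random.seed(0)
data = [(1, 0), (2, 0), (3, 0), (6, 1), (7, 1), (8, 1)]
# Bagging: each stump trains on its own bootstrap sample of the data
forest = [train_stump(random.choices(data, k=len(data))) for _ in range(25)]
print(forest_predict(forest, 2), forest_predict(forest, 7))
```

Because each stump sees a different resampling of the data, its individual quirks tend to cancel out in the vote, which is the intuition behind the reduced overfitting.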
K-nearest neighbour (KNN) is a supervised learning algorithm for classification and predictive modelling tasks. The name "K-nearest neighbour" reflects the algorithm's approach of classifying an output based on its proximity to other data points on a graph.
Let's say you have a dataset with labelled points, some marked as blue and others as red. When you want to classify a new data point, KNN looks at its nearest neighbours in the graph. The "K" in KNN refers to the number of nearest neighbours considered. For example, if K is set to 5, the algorithm looks at the five closest points to the new data point.
The algorithm assigns the new data point to whichever class holds the majority among its K nearest neighbours. For instance, if most of the nearest neighbours are blue points, the algorithm classifies the new point as belonging to the blue group.
KNN can also be used for prediction tasks: instead of assigning a class label, it can estimate the value of an unknown data point from the average or median of its K nearest neighbours.
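The blue/red example above can be sketched in a few lines of Python; the coordinates are invented toy data:

```python
from collections import Counter

def knn_classify(points, query, k=3):
    """points: list of ((x, y), label). Classify query by majority vote."""
    by_distance = sorted(points,
                         key=lambda p: (p[0][0] - query[0]) ** 2
                                     + (p[0][1] - query[1]) ** 2)
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

points = [((1, 1), "blue"), ((1, 2), "blue"), ((2, 1), "blue"),
          ((6, 6), "red"), ((7, 6), "red"), ((6, 7), "red")]
print(knn_classify(points, (2, 2), k=3))  # blue
print(knn_classify(points, (6, 5), k=3))  # red
```

Note there is no training step: KNN simply stores the dataset and defers all the work to query time, which is why it is sometimes called a "lazy" learner.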
K-means is an unsupervised algorithm commonly used for clustering and pattern recognition tasks. It aims to group data points based on their proximity to one another. Like K-nearest neighbour (KNN), K-means clustering utilises the concept of proximity to identify patterns in data.
Each cluster is defined by a centroid, a real or imaginary centre point for the cluster. K-means scales well to large datasets, though it can falter when handling outliers.
Clustering algorithms are particularly useful for large datasets and can provide insights into the data's inherent structure by grouping similar points. Clustering has applications in various fields, such as customer segmentation, image compression, and anomaly detection.
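The alternating assignment/update loop at the heart of K-means can be sketched as follows; the 2-D points are invented toy data, and the initialisation is deliberately naive:

```python
def kmeans(points, k, iterations=10):
    centroids = points[:k]                   # naive init: first k points
    for _ in range(iterations):
        # assignment step: each point joins its nearest centroid's cluster
        clusters = [[] for _ in range(k)]
        for px, py in points:
            nearest = min(range(k),
                          key=lambda i: (px - centroids[i][0]) ** 2
                                      + (py - centroids[i][1]) ** 2)
            clusters[nearest].append((px, py))
        # update step: move each centroid to the mean of its cluster
        centroids = [(sum(x for x, _ in c) / len(c),
                      sum(y for _, y in c) / len(c)) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

points = [(1, 1), (1, 2), (2, 1), (9, 9), (9, 8), (8, 9)]
print(sorted(kmeans(points, k=2)))  # ≈ [(1.33, 1.33), (8.67, 8.67)]
```

Production implementations add smarter initialisation (such as k-means++) and a convergence check instead of a fixed iteration count.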
A support vector machine (SVM) is a supervised learning algorithm commonly used for classification and predictive modelling tasks. SVM algorithms are popular because they are reliable and can work well even with a small amount of data. SVM algorithms work by creating a decision boundary called a "hyperplane." In two-dimensional space, this hyperplane is like a line that separates two sets of labelled data.
SVM aims to find the best possible decision boundary by maximising the margin between the two labelled data sets. It looks for the widest gap or space between the classes. Any new data point on either side of this decision boundary is classified based on the labels in the training dataset.
It's important to note that in three or more dimensions the hyperplane becomes a plane or a higher-dimensional separating surface, allowing SVM to handle more complex patterns and relationships in the data.
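For illustration, here is a tiny linear SVM trained with sub-gradient descent on the hinge loss; the 2-D toy data and hyperparameters are invented, and a real project would use a library implementation such as scikit-learn's SVC:

```python
def train_linear_svm(data, lam=0.01, lr=0.1, epochs=500):
    """data: list of ((x1, x2), y) with labels y in {-1, +1}."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), y in data:
            w[0] -= lr * lam * w[0]      # regularisation shrinks w,
            w[1] -= lr * lam * w[1]      # which keeps the margin wide
            if y * (w[0] * x1 + w[1] * x2 + b) < 1:  # inside the margin:
                w[0] += lr * y * x1                  # push the boundary
                w[1] += lr * y * x2                  # away from this point
                b += lr * y
    return w, b

data = [((1, 1), -1), ((2, 1), -1), ((1, 2), -1),
        ((5, 5), 1), ((6, 5), 1), ((5, 6), 1)]
w, b = train_linear_svm(data)

def predict(x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1

print(predict(1.5, 1.5), predict(5.5, 5.5))
```

The `< 1` check is the margin condition: points already classified with room to spare contribute nothing to the update, so only the points near the boundary (the support vectors) shape the final model.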
Apriori is an unsupervised learning algorithm used for association rule mining, which in turn supports predictive tasks such as product recommendations.
The Apriori algorithm was initially proposed in the early 1990s to discover association rules between item sets. It is commonly used in pattern recognition and prediction tasks, such as understanding a consumer's likelihood of purchasing one product after another.
The Apriori algorithm examines transactional data stored in a relational database. It identifies frequent itemsets, which are combinations of items that often occur together in transactions. These itemsets are then used to generate association rules. For example, if customers frequently buy products A and B together, an association rule can be generated to suggest that purchasing A increases the likelihood of buying B.
By applying the Apriori algorithm, analysts can uncover valuable insights from transactional data, enabling them to make predictions or recommendations based on observed patterns of item-set associations.
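The core counting step can be sketched as follows. This is a simplification of Apriori: the full algorithm prunes candidate itemsets level by level, using the fact that every subset of a frequent itemset must itself be frequent, whereas this toy version simply counts pairs (the basket data is invented):

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support=2):
    """Count every pair of items appearing together in a transaction."""
    counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    # keep only pairs meeting the minimum support threshold
    return {pair: n for pair, n in counts.items() if n >= min_support}

transactions = [{"bread", "milk"},
                {"bread", "butter", "milk"},
                {"bread", "butter"},
                {"milk", "butter", "bread"}]
print(frequent_pairs(transactions))
# {('bread', 'milk'): 3, ('bread', 'butter'): 3, ('butter', 'milk'): 2}
```

From these counts, a rule such as "bread → butter" can be scored by its confidence, support(bread, butter) divided by support(bread).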
Gradient boosting algorithms employ an ensemble method, which means they create a series of "weak" models that are iteratively improved upon to form a strong predictive model. The iterative process gradually reduces the errors made by the models, leading to the generation of an optimal and accurate final model.
The algorithm starts with a simple, naive model that may make basic assumptions, such as classifying data based on whether it is above or below the mean. This initial model serves as a starting point.
In each iteration, the algorithm builds a new model that focuses on correcting the mistakes made by the previous models. It identifies the patterns or relationships the previous models struggled to capture and incorporates them into the new model.
Gradient boosting is effective in handling complex problems and large datasets. It can capture intricate patterns and dependencies that a single model may miss. By combining the predictions from multiple models, gradient boosting produces a powerful predictive model.
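Here is a minimal regression-flavoured sketch of the idea: start from the mean (the naive model), then repeatedly fit a one-split stump to the current residuals and add a damped copy of it to the ensemble. The toy data and learning rate are invented:

```python
def fit_stump(xs, residuals):
    """Weak learner: one split predicting the mean residual on each side."""
    best = None
    for t in xs:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - (lmean if x <= t else rmean)) ** 2
                  for x, r in zip(xs, residuals))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean

def gradient_boost(xs, ys, rounds=20, lr=0.5):
    base = sum(ys) / len(ys)         # naive starting model: predict the mean
    stumps = []
    def predict(x):
        return base + sum(lr * s(x) for s in stumps)
    for _ in range(rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]  # current errors
        stumps.append(fit_stump(xs, residuals))  # new model fixes the errors
    return predict

xs, ys = [1, 2, 3, 4, 5, 6], [1, 1, 1, 9, 9, 9]
model = gradient_boost(xs, ys)
print(round(model(2), 3), round(model(5), 3))  # 1.0 9.0
```

Each round shrinks the remaining error, mirroring the iterative improvement described above; libraries such as XGBoost and LightGBM apply the same principle with full trees and many refinements.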
Everyone learns differently, including machines! Generally, data scientists use three different learning styles to train machine learning algorithms: supervised, unsupervised, and reinforcement learning. Learn more about each of them in this article.
Machine learning algorithms are the essential components of artificial intelligence, and they are trained using various techniques to become the powerful models driving many innovations today. Explore machine learning concepts to gain a more comprehensive understanding of these algorithms and their transformative potential.
With Machine Learning from DeepLearning.AI on Coursera, you'll have the opportunity to learn practical machine learning concepts and techniques from industry experts. Develop the skills to build and deploy machine learning models, analyse data, and make informed decisions through hands-on projects and interactive exercises. Not only will you build confidence in applying machine learning in various domains, but you could also open doors to exciting career opportunities in data science.
Fortune Business Insights. "The global machine learning (ML) market is expected to grow from $21.17 billion in 2022 to $209.91 billion by 2029," https://www.fortunebusinessinsights.com/machine-learning-market-102226. Accessed April 24, 2024.
Editorial Team
This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.