Descriptive statistics present facts from a data set, while inferential statistics make broad predictions based on a sample data set. Discover the measures of each statistical method, how they differ, and how to pick the right one for your analysis.
Descriptive statistics summarize, describe, and derive facts from a particular data set, while inferential statistics go beyond to make inferences and draw conclusions about broader populations based on sample data.
Before you can determine when to use each type of statistics, you need a solid understanding of basic statistical concepts, the critical differences between descriptive and inferential statistics, and what the measures of each represent.
Statistics is a broad topic and a subset of mathematics. Statistics analyzes data and aims to provide valuable information about the features and structure of a data set. It provides a framework for understanding and making sense of complex information. Statistics helps us uncover patterns, relationships, and trends in data, allowing for informed decision-making and drawing meaningful conclusions.
Using various statistical techniques, such as descriptive and inferential statistics, we can summarize data, test hypotheses, make predictions, and gain insights into the world around us. Ultimately, statistics empowers us to explore, understand, and extract valuable information from data to support research, business, and scientific endeavors.
A population in statistics includes the complete data set for a particular problem. It's the entire group that you want to make inferences about. For example, if you study the average weight of all adults in a country, the population would be the entire adult population of that specific country.
A sample in statistics is a smaller group drawn from within the population. Collecting data from a sample instead of the entire population saves time and resources, and it is often the only feasible option. The goal is to draw a broad conclusion applicable to the larger population from the analysis conducted on the sample, such as polling a representative sample of citizens of a town instead of every person in that location. It is crucial that the sample properly represents the overall population; otherwise, conclusions drawn from the sample are not valid inferences about the population.
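The relationship between a population and a sample can be sketched in a few lines of Python. The population below is simulated and purely hypothetical (10,000 adult weights drawn from an assumed normal distribution); the point is only that the mean of a simple random sample tends to land close to the mean of the full population.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: weights (kg) of 10,000 adults.
# The mean of 70 and spread of 12 are made-up values for illustration.
population = [random.gauss(70, 12) for _ in range(10_000)]

# Simple random sample of 500 people, drawn without replacement.
sample = random.sample(population, 500)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population mean: {population_mean:.2f} kg")
print(f"Sample mean:     {sample_mean:.2f} kg")
```

In practice you rarely have the whole population available, which is exactly why the sample mean is used as an estimate.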
Descriptive statistics is a subset of statistics primarily focused on analyzing and generating valuable insights about a set of data's core trends and relationships. It provides tools and techniques capable of extracting meaning from the data. Descriptive statistics explains various measures that characterize different aspects of the data. These measures summarize the data regarding its central tendency, variability, shape, and distribution.
Descriptive statistics involves the examination of various types of measures to summarize and describe a data set. These measures provide insights into different aspects of the data, allowing researchers to gain a comprehensive understanding of its characteristics. The main types of measures that descriptive statistics look at include:
Measures of central tendency: These measures focus on the center or average of the data. They provide valuable information about the usual or expected value of the data. The standard measures of central tendency are:
Mean: The arithmetic average of the data points, calculated as their sum divided by their count
Median: The middle value separating the upper and lower half of a data set
Mode: The value or values that appear most frequently in the data set
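As a small illustration, the three measures of central tendency can be computed with Python's built-in `statistics` module. The exam scores are hypothetical values chosen for this sketch.

```python
import statistics

# Hypothetical data: exam scores for nine students.
scores = [70, 85, 85, 90, 60, 75, 85, 95, 75]

mean_score = statistics.mean(scores)      # sum of scores / number of scores
median_score = statistics.median(scores)  # middle value of the sorted scores
modes = statistics.multimode(scores)      # list of the most frequent value(s)

print("Mean:  ", mean_score)    # 80
print("Median:", median_score)  # 85
print("Mode:  ", modes)         # [85]
```

Here the mean (80) sits below the median (85) because the single low score of 60 pulls the average down, while the median is unaffected by how extreme that score is.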
Measures of variability: These measures examine the overall spread of data within the entire set. Another term for this is the dispersion of data. These measures aim to quantify how far data points typically deviate from the center. Standard measures of variability include the following:
Range: The spread between the highest and lowest values in the data set
Variance: The average of the squared deviations from the mean
Standard deviation: The square root of the variance, which indicates how far data points typically lie from the mean of the data set
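A minimal sketch of these three measures, again using Python's `statistics` module and hypothetical exam scores. It assumes the data set is treated as a complete population, which matches the definition of variance given above; when the data are a sample, the variance is usually computed by dividing by n - 1 instead of n, shown here as `sample_variance`.

```python
import statistics

# Hypothetical data: exam scores for nine students (mean is 80).
scores = [70, 85, 85, 90, 60, 75, 85, 95, 75]

data_range = max(scores) - min(scores)   # highest minus lowest value
variance = statistics.pvariance(scores)  # average squared deviation (divide by n)
std_dev = statistics.pstdev(scores)      # square root of the variance

# Sample version, which divides by n - 1 rather than n.
sample_variance = statistics.variance(scores)

print("Range:             ", data_range)
print("Variance:          ", round(variance, 2))
print("Standard deviation:", round(std_dev, 2))
print("Sample variance:   ", sample_variance)
```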
Measures of shape and distribution: These measures provide insights into the data's symmetry, kurtosis, and skewness. Some measures of shape and distribution include:
Skewness: The skewness shows the overall asymmetry of a data distribution. If the skewness is positive, the distribution has a longer tail on the right side, while a negative skewness indicates a longer tail on the left.
Kurtosis: This measures how heavy the tails of a data distribution are, often described as its peakedness or flatness, by comparing them to the tails of a normal distribution.
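Python's standard library has no built-in skewness or kurtosis function, so the sketch below computes them from their moment-based definitions. The helper names `skewness` and `excess_kurtosis` and both data sets are assumptions made for this illustration; these are the population (uncorrected) versions, and statistical packages may apply small-sample corrections that give slightly different numbers.

```python
import statistics

def skewness(values):
    """Third standardized moment: positive means a longer right tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    n = len(values)
    return sum((x - mean) ** 3 for x in values) / (n * sd ** 3)

def excess_kurtosis(values):
    """Fourth standardized moment minus 3, so a normal distribution scores 0."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    n = len(values)
    return sum((x - mean) ** 4 for x in values) / (n * sd ** 4) - 3

# Hypothetical incomes (in thousands) with one very high earner.
incomes = [30, 32, 35, 36, 38, 40, 42, 45, 120]
# A perfectly symmetric data set for comparison.
symmetric = [1, 2, 3, 4, 5]

print("Skewness of incomes:  ", round(skewness(incomes), 2))
print("Skewness of symmetric:", round(skewness(symmetric), 2))
print("Excess kurtosis of symmetric:", round(excess_kurtosis(symmetric), 2))
```

The income data produce a positive skewness because the single large value stretches the right tail, while the symmetric data produce a skewness of zero.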
Measures of association: These measures quantify the relationship or association between variables. They reveal the overall direction and strength of a relationship in the data. Common measures of association include:
Correlation coefficient: It measures the linear relationship between two variables and ranges from -1 to +1.
Covariance: It measures the extent to which two variables vary together.
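Both measures can be written out directly from their definitions. The sketch below uses the sample covariance (dividing by n - 1) and the Pearson correlation coefficient; the function names and the study-hours data are hypothetical and chosen only for illustration.

```python
import math

def covariance(xs, ys):
    """Sample covariance: average product of paired deviations (n - 1 divisor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance rescaled to fall between -1 and +1."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data: hours studied and the resulting exam score.
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [52, 60, 61, 70, 77]

print("Covariance: ", covariance(hours_studied, exam_scores))              # 15.0
print("Correlation:", round(correlation(hours_studied, exam_scores), 3))  # 0.981
```

The covariance of 15 shows only that the two variables rise together; its size depends on the units used. The correlation of about 0.98 is unit-free, which makes it easier to read as a strong positive linear relationship.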
Measures of frequency: These measures show the frequency or count of values or categories in a data set. They provide information about the occurrence and relative representation of different values or categories.
Frequency or count: The total number of instances where a particular value appears in the data set
Percent: The proportion or relative frequency in terms of parts per hundred, representing the fraction of a whole expressed as a percentage value
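Counts and percentages are straightforward to tabulate with `collections.Counter` from Python's standard library. The survey responses below are hypothetical.

```python
from collections import Counter

# Hypothetical data: answers to a yes/no/unsure survey question.
responses = ["yes", "no", "yes", "yes", "unsure",
             "no", "yes", "no", "yes", "unsure"]

counts = Counter(responses)  # frequency of each category
total = len(responses)

# Relative frequency of each category, expressed in parts per hundred.
percents = {answer: 100 * count / total for answer, count in counts.items()}

for answer, count in counts.most_common():
    print(f"{answer:<7} count={count}  percent={percents[answer]:.0f}%")
```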
Inferential statistics is a branch of statistics that deals with making inferences or conclusions about a population based on data from a sample. It involves the use of sample data chosen from the larger population to make broad generalizations about the entire data population. Inferential statistics allows researchers to make predictions, test hypotheses, and confidently estimate population parameters.
Inferential statistics goes beyond descriptive statistics by utilizing sample data to draw meaningful insights that pertain to a larger population. It involves several types of measures to support the process of inference:
Confidence intervals: These provide a range of values within which a population parameter is likely to fall. A confidence interval quantifies the uncertainty associated with an estimate and comes with a confidence level (for example, 95 percent), meaning that if you repeated the sampling many times, about 95 percent of the intervals calculated this way would contain the true population parameter.
Hypothesis testing: This process tests hypotheses centered around the population parameters. Hypothesis tests assess the likelihood of observing a particular result under a given hypothesis and help determine whether evidence exists to support or reject the hypothesis.
P-values: The p-value indicates how strongly the evidence counts against the null hypothesis in the test. It is the probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true. The lower the p-value, the stronger the evidence against the null hypothesis.
Regression analysis: This statistical technique models the relationship between one dependent variable and one or more independent variables. It allows for estimating parameters, predicting outcomes, and understanding the direction and strength of associations.
Analysis of variance (ANOVA): This technique tests for differences in the means of multiple groups of data. It assesses whether the observed variation between groups is large enough, relative to the variation within groups, to be statistically significant.
Probability distributions: These provide the basis for making probabilistic inferences and conducting hypothesis tests. Examples include the normal distribution and the chi-square distribution.
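Two of the measures listed above, confidence intervals and p-values, can be sketched with `statistics.NormalDist` from Python's standard library. The summary figures (a sample of 100 adults with a mean weight of 72 kg and a standard deviation of 12 kg) and the null hypothesis (a population mean of 70 kg) are hypothetical. The sketch also assumes a normal approximation, which is reasonable for larger samples; with small samples, a t-distribution is the more common choice.

```python
import math
from statistics import NormalDist

def confidence_interval(mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95 percent
    margin = z * sd / math.sqrt(n)             # z times the standard error
    return mean - margin, mean + margin

def two_sided_p_value(mean, sd, n, null_mean):
    """Probability of a result at least this extreme if the null mean is true."""
    z = (mean - null_mean) / (sd / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical sample summary: 100 adults, mean 72 kg, standard deviation 12 kg.
low, high = confidence_interval(mean=72, sd=12, n=100)
p_value = two_sided_p_value(mean=72, sd=12, n=100, null_mean=70)

print(f"95% confidence interval: {low:.2f} to {high:.2f} kg")
print(f"p-value against a mean of 70 kg: {p_value:.3f}")
```

With these numbers the interval runs from roughly 69.6 to 74.4 kg and the p-value is a little under 0.10. The two results agree: 70 kg lies inside the 95 percent interval, and the p-value is above the conventional 0.05 threshold, so this sample would not give grounds to reject the null hypothesis.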
Descriptive and inferential statistics apply in different situations, depending on the goals and nature of the data analysis. Descriptive statistics summarize and describe the characteristics of a data set, whereas inferential statistics make inferences, generalize findings, test hypotheses, and support decision-making processes.
The objectives of your research and the type of data analysis you aim to run should guide your choice of which is appropriate. For example, if you wanted to research instances of a specific disease, inferential statistics would be most helpful. It allows you to study a sample of individuals rather than trying to gain insights from every medical record available. Provided the sample is representative, the conclusions obtained from it can be generalized to the broader population. Understanding the context of your experiment allows you to determine if you need descriptive or inferential statistics.
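The disease example can be made concrete with a short sketch. Suppose, hypothetically, that 60 of 1,000 randomly sampled medical records show the disease. The function below, whose name and inputs are assumptions for this illustration, estimates the prevalence in the wider population and attaches a simple normal-approximation (Wald) confidence interval; other interval methods exist and may behave better when the proportion is very close to 0 or 1.

```python
import math
from statistics import NormalDist

def prevalence_estimate(cases, sample_size, level=0.95):
    """Estimate a population proportion from a sample, with a Wald interval."""
    p_hat = cases / sample_size                # proportion observed in the sample
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95 percent
    margin = z * math.sqrt(p_hat * (1 - p_hat) / sample_size)
    # Clamp the interval so it stays within the valid range for a proportion.
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)

# Hypothetical data: 60 cases found in a random sample of 1,000 records.
estimate, low, high = prevalence_estimate(cases=60, sample_size=1000)

print(f"Estimated prevalence: {estimate:.1%}")
print(f"95% confidence interval: {low:.1%} to {high:.1%}")
```

Reporting that 6 percent of the sampled records show the disease is descriptive; stating that the prevalence in the whole population is likely between about 4.5 and 7.5 percent is inferential.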
Descriptive statistics describe a data set, including the spread and variability, while inferential statistics allow you to test hypotheses and make predictions. Learning the differences between descriptive and inferential statistics is crucial in using statistical analysis to make informed decisions.
If you're interested in learning more about data analytics, consider the Google Data Analytics Professional Certificate. This program is designed for beginners and teaches in-demand skills for an entry-level data analytics career. Topics covered include foundations of data, data exploration, data visualization, and more.
Editorial Team