7 Statistical Analysis Methods Beginners Should Know

Written by Coursera Staff

Learn about seven statistical analysis methods with examples to better understand statistics’ far-reaching everyday uses and the types of careers you might pursue if it’s something you’re passionate about.

[Featured image] A businessperson wearing glasses gazes at a computer screen choosing between statistical analysis methods to perform their job.

Nearly every social or scientific discipline uses statistics to inform decisions and improve outcomes. Practitioners do this through statistical analysis methods, which summarize data and draw insights from it. In business analytics, statistical analysis reveals patterns in data that can support predictions about the future and inform your decision-making process.

This article explores some basic statistical analysis methods to help you get started using statistics to improve your decision-making. It also examines how statistical analysis compares to data analysis, when to use descriptive or inferential analysis, and some jobs that use statistical analysis.

Statistical analysis vs. data analysis

Statistical analysis and data analysis do similar things and often work together toward similar outcomes, such as behavior predictions. The main difference lies in the techniques each discipline uses to find patterns and make predictions. Let’s examine some differences between statistical analysis and data analysis:

| Statistical analysis | Data analysis |
| --- | --- |
| Data analyzed is from smaller sample sizes | Data analyzed is from large or massive amounts of data |
| Analysis focuses on the use of mathematical techniques, including probability, calculus, and linear algebra | Analysis focuses on data science techniques, including machine learning and computer programming, to identify patterns |
| Uses descriptive and inferential statistics to analyze data | Uses descriptive, diagnostic, predictive, and prescriptive data analysis to inform decisions |
| Looks to understand a particular aspect of a data set | Draws conclusions and finds patterns from the entire data set |

Descriptive statistical analysis methods

Descriptive statistical analysis describes aspects of a set of data. These quantitative methods summarize what a data set contains, such as its typical value and how spread out it is. Graphs and charts help visualize the findings of these methods. Some important beginner descriptive statistical analysis methods to know are:

  • Central tendency (mean, median, mode)

  • Variance

  • Standard deviation

Let’s take a closer look at each method and its application.

Mean

The mean is a measure of central tendency that gives the average value in a data set. To calculate it, divide the sum of all data points by the number of data points in the set. For example, to find the average grade from this series of tests: 89, 99, 100, 75, 86, 95, 86, 73, and 86, start by adding them together to get the sum of the series, which is 789. Then divide by the number of data points (nine) to get approximately 87.67, the mean or average test score.
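If you want to check this arithmetic yourself, a minimal Python sketch of the calculation might look like the following. The variable names are illustrative choices and not part of any standard.

```python
# Test scores from the example above.
scores = [89, 99, 100, 75, 86, 95, 86, 73, 86]

total = sum(scores)          # sum of all data points
count = len(scores)          # number of data points
mean_score = total / count   # arithmetic mean

print(f"sum = {total}, n = {count}, mean = {mean_score:.2f}")
```

Running this should reproduce the sum of 789 and a mean of about 87.67.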

Median

The median is another measure of central tendency that finds the data set’s middle value. To find the median, order the data from the lowest to the highest value. Using the test scores from above, the data set should look as follows: 73, 75, 86, 86, 86, 89, 95, 99, 100. Since this data set contains an odd number of values (nine), the median is the middle (fifth) value, 86.

However, if the data set had one more value, it might look as follows: 73, 75, 86, 86, 86, 88, 89, 95, 99, and 100. With an even number of values, you calculate the mean of the two middle numbers. In this example, you add 86 and 88, which sum to 174, then divide by two to arrive at the new median of 87. In this case, the mean and median are similar. However, the median is sometimes a better indicator of a typical value than the mean when the data set contains large outliers that skew the mean.
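The odd and even cases can be sketched in Python as follows. The helper `median_by_hand` is a hypothetical name used only for illustration, and the outlier score of 10 is an invented value added to show how an outlier pulls the mean but not the median. Python's built-in `statistics.median` does the same job.

```python
import statistics

scores = [89, 99, 100, 75, 86, 95, 86, 73, 86]
scores_plus_one = scores + [88]   # the ten-value variant from the text

def median_by_hand(values):
    """Return the middle value, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the middle pair

odd_median = median_by_hand(scores)
even_median = median_by_hand(scores_plus_one)

# Invented outlier (a score of 10) to compare mean and median.
with_outlier = scores + [10]
outlier_mean = statistics.mean(with_outlier)
outlier_median = statistics.median(with_outlier)

print(odd_median, even_median, outlier_mean, outlier_median)
```

With the invented outlier, the mean drops to about 79.9 while the median stays at 86, which is why the median can be the more representative figure for skewed data.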

Mode

The mode is the last measure of central tendency and is simply the data set’s most common value. With our original data set put in order (73, 75, 86, 86, 86, 89, 95, 99, and 100), the mode reveals itself as 86, the most frequently repeated number in the data set. The mode is valuable for finding patterns in data when you want to predict a common occurrence. In this case, while the median is also 86, the mode indicates there could be something about the test that makes 86 a common score.
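One way to find the mode in Python is to count how often each value occurs. This is a small sketch using the standard library; `statistics.multimode` (available in Python 3.8 and later) is included because a data set can have more than one mode when values tie.

```python
from collections import Counter
import statistics

scores = [73, 75, 86, 86, 86, 89, 95, 99, 100]

counts = Counter(scores)                            # value -> number of occurrences
mode_value, mode_count = counts.most_common(1)[0]   # most frequent value and its count

all_modes = statistics.multimode(scores)            # list of every value tied for most frequent

print(f"mode = {mode_value} (appears {mode_count} times), all modes = {all_modes}")
```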

Standard deviation

The standard deviation is a measure of variability that describes how far, on average, data points fall from the mean. Low values indicate that data points cluster close to the mean, while high values indicate that they are more spread out. The sample standard deviation uses this formula:

$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} $$

Here are the steps to find the standard deviation using the data set from above (73, 75, 86, 86, 86, 89, 95, 99, 100):

  1. Find the mean of the data set. In this example, it is approximately 87.6667.

  2. Subtract the mean from each data point to find its deviation, then square each deviation.

  3. Sum the squared deviations. In this case, the sum is 720.

  4. Divide the sum by n - 1 (here, 8) and take the square root: √(720/8) = √90 ≈ 9.49. The value under the square root, 90, is the sample variance.

The sample standard deviation of the test scores is therefore about 9.49.
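The steps above can be sketched in Python as follows. The variable names are illustrative, and the last line cross-checks the manual result against the standard library's `statistics.stdev`, which also divides by n - 1.

```python
import math
import statistics

scores = [73, 75, 86, 86, 86, 89, 95, 99, 100]

n = len(scores)
mean_score = sum(scores) / n                                  # step 1: the mean
squared_deviations = [(x - mean_score) ** 2 for x in scores]  # step 2: squared deviations
sum_sq = sum(squared_deviations)                              # step 3: sum of squares
sample_variance = sum_sq / (n - 1)                            # step 4a: divide by n - 1
sample_std = math.sqrt(sample_variance)                       # step 4b: square root

print(f"sum of squares = {sum_sq:.2f}")
print(f"variance = {sample_variance:.2f}, standard deviation = {sample_std:.2f}")
print(f"statistics.stdev check = {statistics.stdev(scores):.2f}")
```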

Inferential statistical analysis methods

Inferential statistical analysis methods draw general conclusions and make predictions about populations from smaller samples. These methods build on descriptive statistics and test the quality of the sample results to check whether inferences about the larger population are valid. Some of these essential inferential methods include:

  • Hypothesis testing

  • Confidence intervals

  • Regression analysis

Let’s take a closer look at each method and its application.

Hypothesis testing

In hypothesis testing, you formulate two competing hypotheses about a population and use sample data to decide which one the evidence supports. These two hypotheses are:

  • Null hypothesis: The default claim you are testing, symbolized as H0

  • Alternative hypothesis: The competing claim you accept if you reject the null hypothesis, symbolized as H1

A typical way to test the null hypothesis, which you assume is correct until you reject it, is to analyze a p-value. You can reject the null hypothesis if the p-value is less than or equal to the chosen significance level. The smaller the p-value, the stronger the evidence against the null hypothesis.

Using the data on test scores above, let’s perform a hypothesis test by calculating a p-value and comparing it to a significance level of 0.05. This example uses the more common two-tailed p-value. Let’s say you think the true mean of the test scores is 90 rather than the sample mean of 87.67.

1. State your null and alternative hypotheses.

μ = hypothesized population mean

The two hypotheses for this problem become:

H0: μ = 90   

H1: μ ≠ 90

2. After you state the hypotheses, use a t-test to calculate the test statistic for the data set.

The formula for “t” is

$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} $$

x̄ = 87.67 = data set mean

μ = 90 = hypothesis mean

s = 9.49 = standard deviation

n = 9 = the size of the data set

Plug in the numbers from the sample problem to calculate t: t = (87.67 - 90) / (9.49 / √9) ≈ -0.7366. Then take the absolute value to keep the number positive: |t| = 0.7366.

3. Once you have your t value, consult a t-table with n - 1 = 8 degrees of freedom (or use statistical software) to find the p-value.

In this case, the p-value is 0.482425. Because this value is greater than the significance level of 0.05, you would not reject the null hypothesis H0: μ = 90, because the sample does not provide sufficient evidence against it.
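The whole test can be sketched in Python using only the standard library. This is an illustrative sketch, not a definitive implementation: the helper names `t_pdf` and `two_tailed_p` are hypothetical, and the p-value is approximated by numerically integrating the Student's t density with Simpson's rule instead of reading a t-table. Because the code uses unrounded values for the mean and standard deviation, its results differ slightly in the third decimal place from the hand calculation above.

```python
import math
import statistics

scores = [73, 75, 86, 86, 86, 89, 95, 99, 100]
hypothesized_mean = 90.0   # mu under H0
alpha = 0.05               # significance level

n = len(scores)
df = n - 1                               # degrees of freedom
sample_mean = statistics.mean(scores)
sample_std = statistics.stdev(scores)    # sample standard deviation (n - 1)

# Test statistic: t = (x_bar - mu) / (s / sqrt(n))
t_stat = (sample_mean - hypothesized_mean) / (sample_std / math.sqrt(n))

def t_pdf(x, dof):
    """Probability density of Student's t distribution with dof degrees of freedom."""
    coeff = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coeff * (1 + x * x / dof) ** (-(dof + 1) / 2)

def two_tailed_p(t_value, dof, steps=2000):
    """Approximate the two-tailed p-value with Simpson's rule (steps must be even)."""
    upper = abs(t_value)
    h = upper / steps
    total = t_pdf(0.0, dof) + t_pdf(upper, dof)
    for i in range(1, steps):
        weight = 4 if i % 2 == 1 else 2
        total += weight * t_pdf(i * h, dof)
    area = total * h / 3     # P(0 < T < |t|)
    return 1 - 2 * area      # P(|T| > |t|)

p_value = two_tailed_p(t_stat, df)
reject_null = p_value <= alpha

print(f"t = {t_stat:.4f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

The result should be a t value of roughly -0.74 and a p-value of roughly 0.48, so the null hypothesis is not rejected at the 0.05 level. In practice, a statistics library would usually handle the p-value lookup for you.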

Confidence interval

A confidence interval is a range of values, calculated from a sample, that is likely to contain the true population mean. In the example of test scores above, a confidence interval gives a range around the sample mean together with a confidence level, which describes how often intervals built this way would capture the true mean across repeated samples. The confidence interval is the sample mean plus or minus the margin of error.

In the test score example, you want a 95 percent confidence level.

1. Calculate the margin of error.

The formula for the margin of error is

$$ ME = z^* \cdot \frac{s}{\sqrt{n}} $$

In the margin of error formula, z* is the critical value for your chosen confidence level, which you look up in a z-table. For a 95 percent confidence level, z* = 1.96.

Using the standard deviation from above (s = 9.49) and the number of data points (n = 9), ME = 1.96 × 9.49 / √9 ≈ 6.2.

2. Calculate the confidence interval using the margin of error

Use the sample mean of 87.67 calculated earlier. 

C = 87.67±6.2, or from 81.47 to 93.87. 

With 95 percent confidence, you can say that the true mean test score for the population this sample represents falls between 81.47 and 93.87.
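Here is a minimal Python sketch of the same calculation. It follows the z-based formula used in this article and gets z* from `statistics.NormalDist` (Python 3.8 and later) instead of a table. Note that for small samples like this one, analysts often use a t critical value instead, which would give a wider interval.

```python
import math
import statistics

scores = [73, 75, 86, 86, 86, 89, 95, 99, 100]
confidence = 0.95

n = len(scores)
sample_mean = statistics.mean(scores)
sample_std = statistics.stdev(scores)   # sample standard deviation (n - 1)

# Critical value z* for a two-sided interval: the 97.5th percentile for 95 percent.
z_star = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin_of_error = z_star * sample_std / math.sqrt(n)
lower = sample_mean - margin_of_error
upper = sample_mean + margin_of_error

print(f"z* = {z_star:.2f}, ME = {margin_of_error:.2f}")
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```

Because the code keeps unrounded intermediate values, the upper bound may differ from the hand calculation by about 0.01.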

Regression analysis

Simple regression analysis fits a line of best fit through a scatter plot of data points, positioned to follow the data as closely as possible. This is the regression line. A regression analysis gives you the slope of the line, the correlation between the variables, and a measure of how well the line fits the data based on variation. Simple linear regression uses two variables, while multiple regression uses three or more variables.

Simple regression analysis primarily aims to find the relationship between the dependent and independent variables. The formula for regression analysis is Y = a + b(x). In the formula:

  • Y = the dependent variable

  • x = the independent variable

  • a = the y-intercept

  • b = the slope of the graph
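To see how a and b come out of real numbers, here is a short Python sketch that fits Y = a + b(x) by ordinary least squares. The article does not provide paired data, so the hours-studied values and matching scores below are invented purely for illustration, and the names `hours_studied`, `test_scores`, and `predict` are hypothetical.

```python
import math

# Invented data for illustration only: hours studied and the resulting test score.
hours_studied = [1, 2, 3, 4, 5]      # x: the predictor
test_scores = [72, 78, 83, 90, 94]   # Y: the outcome being predicted

n = len(hours_studied)
mean_x = sum(hours_studied) / n
mean_y = sum(test_scores) / n

# Sums of squares and cross-products around the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours_studied, test_scores))
sxx = sum((x - mean_x) ** 2 for x in hours_studied)
syy = sum((y - mean_y) ** 2 for y in test_scores)

slope = sxy / sxx                          # b: change in Y per unit of x
intercept = mean_y - slope * mean_x        # a: value of Y when x is 0
correlation = sxy / math.sqrt(sxx * syy)   # r: strength of the linear relationship

def predict(x):
    """Predicted Y for a given x using the fitted line Y = a + b(x)."""
    return intercept + slope * x

print(f"Y = {intercept:.1f} + {slope:.1f}(x), r = {correlation:.3f}")
print(f"Predicted score after 6 hours: {predict(6):.1f}")
```

For this invented data set, the slope suggests each additional hour of study is associated with about 5.6 more points. A correlation close to 1 indicates the line fits these points closely, though it does not by itself show that studying causes the higher scores.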

What jobs use statistical analysis methods?

The government, marketing, business, and engineering industries rely on statistical analysis methods. You can also get jobs in data analytics with a background in statistics. Many jobs require a master’s degree, but some entry-level positions accept a bachelor’s degree if your math background is strong enough. The US Bureau of Labor Statistics projects employment of mathematicians and statisticians to grow 11 percent from 2023 to 2033 [1].

To get an idea of the types of roles you might pursue, consult the following list of statistics jobs and their average annual base salaries:

(All salary data is average annual base pay from Glassdoor as of October 2024)

  • Statistician: $99,937 

  • Statistical analyst: $89,552

  • Data analyst: $89,552

  • Business analyst: $92,569

  • Financial analyst: $79,434

  • Market researcher: $61,053

  • Actuarial analyst: $116,501

  • Investment analyst: $115,718

  • Data scientist: $115,691

Getting started with Coursera

If you want to understand statistical analysis methods more deeply, consider an online course or degree to gain in-demand skills. For example, you can try the Introduction to Statistics course from Stanford University to gain beginner skills in statistics. If you want a background in statistical analysis tied to data science, you can try Statistics with Python Specialization from the University of Michigan on Coursera. 

Article sources

  1. US Bureau of Labor Statistics. “Mathematicians and Statisticians: Job Outlook,” https://www.bls.gov/ooh/math/mathematicians-and-statisticians.htm#tab-6. Accessed October 30, 2024.

Written by:

Editorial Team

Coursera’s editorial team is comprised of highly experienced professional editors, writers, and fact...

This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.