Descriptive Statistics in Excel: The Ultimate Guide

If you want to make sense of your data and gain real insights, you’ve come to the right place. Descriptive statistics in Excel is a super easy and convenient way to get a handle on your data and figure out what’s going on. 

And the best part? You don’t need to be a statistician or a data scientist to use it! With Excel’s user-friendly interface and powerful functions, anyone can perform descriptive statistics and get valuable insights into their data.

There are several methods to get descriptive statistics in Excel. The method you choose will depend on your specific needs and the data you are working with, respectively: AVERAGE, MEDIAN, MODE, RANGE, STDEV and VAR functions and the Analysis ToolPak in Excel. In this lesson, we will show you how to use both methods, as well as visualize and interpret the descriptive statistics results.

Hang tight, as we’ll dive into the world of descriptive statistics and show you how to use Excel to make sense of your data. We’ll cover everything from basic stats like mean, median, and mode to more advanced techniques like histograms, box plots, and trend line. 

So, if you’re ready to gain some real insights into your data, grab a coffee and let’s get started!

Method 1: Descriptive Analysis in Excel using Functions

In Excel, there are several built-in functions that allow you to calculate summary statistics directly in a cell. These functions are AVERAGE, MEDIAN, MODE, RANGE, and STDEV.

  • AVERAGE function calculates the average of a range of values
  • MEDIAN function calculates the median of a range of values
  • MODE function calculates the mode of a range of values
  • RANGE function calculates the range of a range of values
  • STDEV function calculates the standard deviation of a range of values
  • VAR function calculates the variation of a range of values

First thing to do is to input our data into an Excel worksheet, with the months in one column and the sales in another like shown in the example below:

Sample data set in Excel. Source: uedufy.com
Sample data set

Alternatively, you can download the Excel sheet I use as an example in this lesson HERE and follow along.

Next, let’s use the following formulas to calculate the mean, median, mode, range, standard deviation, and variance in Excel.

Step 1: Calculate Mean 

In cell D2, type =AVERAGE(B2:B13) and hit the ENTER key. This formula calculates the average (arithmetic mean) sales figures for each month. And the mean for our dataset is 16.25.

Mean formula in Excel. Source: uedufy.com
Mean formula in Excel

The mean of 16.25 in the context of the sales data set indicates the average sales figure for the 12 months. In other words, if you add up all of the sales figures and divide by 12, you will get the mean of 16.25. This value gives you a general idea of what the typical sales figure is for the year.

It’s important to note that the mean can be influenced by outliers and extreme values, so it may not always provide an accurate representation of the typical value. However, in this case, the mean of 16.25 is relatively close to the median sales figure of 16, which indicates that the data is not significantly influenced by outliers or extreme values.

Step 2: Calculate Median

In cell E2, type =MEDIAN(B2:B13) and press ENTER. This formula calculates the median, the middle value of the data set, when the values are arranged in order. The median for this dataset is 16.

Median formula in Excel. Source: uedufy.com
Median Formula in Excel

The median of 16 in the context of the sales data set indicates the middle value of the sales figures for the 12 months. In other words, if you were to order all of the sales figures from smallest to largest, the median would be the value that is exactly in the middle. If there is an even number of data points, the median is the average of the two middle values.

The median is a robust measure of central tendency, which means that it is not greatly affected by outliers or extreme values. For this reason, the median is often used as an alternative to the mean when the data is not normally distributed.

In the case of the sales data set, the median of 16 provides a good indication of the typical sales figure for the year, as the data is not significantly influenced by outliers or extreme values. This means that half of the sales figures are above 16, and half are below 16.

Step 3: Calculate Mode

In the F2 cell, type =MODE(B2:B13). This formula calculates the mode, the most frequently occurring value in the data set. The mode for this dataset is 13.

Mode formula in Excel. Source: uedufy.com
Mode formula in Excel

The mode of 13 in the context of the sales data set indicates the most frequently occurring sales figure for the 12 months. In other words, the mode is the value that appears most frequently in the data.

The mode is a measure of central tendency, which gives you an idea of what the typical sales figure is for the year. Unlike the mean and median, the mode can be influenced by the frequency of the data points. For example, if there are several sales figures that occur more frequently than the others, then there may be multiple modes.

In the case of the sales data set, the mode of 13 indicates that 13 is the most frequently occurring sales figure for the 12 months. This means that 13 was the sales figure that was achieved the most number of times during the year.

Step 4: Calculate Range 

In the cell G2, type =MAX(B2:B13) – MIN(B2:B13). This formula calculates the range, which is the difference between the largest and smallest values in the data set. The range for this dataset is 8.

Range formula in Excel. Source: uedufy.com
Range formula in Excel

The range of 8 in the context of the sales data set indicates the difference between the highest and lowest sales figures for the 12 months. In other words, the range is calculated by subtracting the lowest value from the highest value.

The range provides a rough idea of how spread out the sales figures are and gives you an idea of the variability of the data. A larger range indicates that the sales figures are more spread out, while a smaller range indicates that the sales figures are more clustered together.

In the case of the sales data set, the range of 8 indicates that the difference between the highest and lowest sales figures for the 12 months is 8. This means that the highest sales figure was 21 and the lowest sales figure was 13.

Step 5: Calculate Standard Deviation

In the cell H2, type =STDEV(B2:B13) and hit ENTER. This formula calculates the standard deviation, which is a measure of the spread of the data set. The standard deviation for this dataset is 2.8.

Standard deviation formula in Excel. Source: uedufy.com
Standard deviation formula in Excel

The standard deviation of 2.8 in the context of the sales data set indicates how much the sales figures for the 12 months deviate from the mean. In other words, the standard deviation is a measure of how spread out the data is.

The standard deviation is calculated as the square root of the variance, which is the average of the squared differences between each data point and the mean. The larger the standard deviation, the more spread out the data is. A small standard deviation indicates that the data points are close to the mean, while a large standard deviation indicates that the data points are more spread out.

In the case of the sales data set, the standard deviation of 2.8 indicates that the sales figures for the 12 months deviate from the mean by 2.8 on average. This means that the majority of the sales figures are within 2.8 units of the mean.

Step 6: Calculate Variance 

In cell H8, type =VAR(B2:B13). This formula calculates the variance, which measures the spread of the squared data set. The variance for this dataset is 7.84.

Variance formula in Excel. Source: uedufy.com
Variance formula in Excel

Here is the summary of our descriptive analysis in Excel for the given data set. 

Example of descriptive summary in Excel. Source: uedufy.com
Descriptive summary in Excel

The variance of 7.84 in the context of the sales data set indicates how much the sales figures for the 12 months deviate from the mean. In other words, the variance is a measure of how spread out the data is.

The variance is calculated as the average of the squared differences between each data point and the mean. The larger the variance, the more spread out the data is. A small variance indicates that the data points are close to the mean, while a large variance indicates that the data points are more spread out.

In the case of the sales data set, the variance of 7.84 indicates that the sales figures for the 12 months deviate from the mean by an average of 7.84 units squared. This means that the majority of the sales figures are within 7.84 units squared of the mean.

If you choose to calculate descriptive statistics in Excel manually, don’t leave just yet. Scroll down as we are going to learn how to visualize our results.

Method 2: Descriptive Analysis in Excel using Analysis Toolpak

Now, if you are in a rush and want to just get down to the results without the hassle (but fun) of manually calculating everything, Excel provides you a fast way to calculate descriptive statistics using the Data Analysis Toolpak.

First, make sure you install the Data Analysis Toolpak in Excel – it literally takes 5 seconds. 

Using the same dataset, simply navigate in Excel to the Data tab, click on the Data Analysis icon and select the Descriptive Statistics option. Select all the Sales values in column B, check the Summary Statistics checkbox and click OK.

Descriptive Statistics in Excel using Data Analysis Toolpak. Source: uedufy.com
Descriptive Statistics in Excel using Data Analysis Toolpak

Here you go. We get the descriptive statistics in Excel for our dataset with just a few clicks.

Descriptive statistics results in Excel using Analysis Toolpak. Source: uedufy.com
Descriptive statistics results in Excel using Analysis Toolpak

Visualizing Descriptive Statistics in Excel

Descriptive statistics is all about numbers. But when we visualize them we can get insights otherwise difficult to see without a trained eye. Let’s add some visual magic by learning how to create a histogram, box plot, scatter plot, and trend line to our descriptive statistics in Excel. 

Step 1: Create a histogram

A histogram is a bar graph that represents the distribution of a dataset by grouping the data into a set of bins and showing the frequency of data points in each bin.

  • In the Data Analysis window, select Histogram and click OK.
Generate histogram in Excel. Source: uedufy.com
Generate histogram in Excel
  • Select the Input Range, in our case the sales values for January to December. And a histogram is about visualizing data, let’s also select the Chart Output checkbox. 

NOTE: if you include the column label (i.e., Sales) in your selection, make sure you check the Label box as well.

Click the OK button.

Histogram chart output in Excel. Source: uedufy.com
Histogram chart output in Excel

Excel will generate a histogram table and chart instantly for us on a new tab as shown below.

Histogram result in Excel. Source: uedufy.com
Histogram result in Excel

The height of each bar represents the number of data points in the corresponding bin.

In the case of the dataset provided, the histogram shows the distribution of sales figures for each month. By looking at the histogram, you can quickly see how many sales were made in each range of sales figures. 

For example, if there is a bar in the histogram representing the sales range of 10 to 15, this means that a certain number of sales were made in that specific range.

Step 2: Create a Box Plot 

Go to the Insert tab in the tab in Excel. In the charts section, click on the Statistical icon and select Box and Whisker chart type.

Box and Whisker chart in Excel. Source: uedufy.com
Box and Whisker chart in Excel

In the case of the dataset provided, the Box and Whisker plot shows the distribution of sales figures for each month. The box in the plot represents the interquartile range (IQR), which is the range of the middle 50% of the data. The median of the data is represented by a line within the box. 

The whiskers of the plot represent the minimum and maximum values of the data, excluding any outliers. Any data points outside of the whiskers are considered outliers and are represented as individual dots. 

Box and Whisker result in Excel. Source: uedufy.com
Box and Whisker results in Excel

In the case of the dataset provided, the Box and Whisker plot shows that the median sales figure is 16, and the IQR is 7 (13 to 18). This means that 50% of the sales figures fall between 13 and 18. There are no outliers visible in the plot, indicating that all of the sales figures are relatively close to the median.

Step 3: Create a Scatter Plot 

A scatter plot is a graph that plots individual data points and shows the relationship between two variables. In the case of the dataset provided, the scatter plot shows the relationship between the months and the sales figures. By looking at the scatter plot, you can quickly see if there is a positive or negative relationship between the two variables and the strength of the relationship.

Select both columns of data, go to the Insert tab, and choose the Scatter chart type to visualize the relationship between the months and the sales.

How to scatter plot in Excel. Source: uedufy.com
Scatter Plot in Excel

In the case of the dataset provided, the scatter plot shows a weak positive relationship between the months and the sales figures. This means that as the months progress, the sales figures tend to increase slightly, but the relationship is not strong.

Scatter Plot example in Excel. Source: uedufy.com
Scatter Plot result in Excel

Step 4: Add a Trend Line

The trend line can also provide an estimate of future sales figures. For example, if the trend line is upward-sloping, this means that sales are generally increasing, and you can expect future sales to be higher than the current sales. If the trend line is downward-sloping, this means that sales are generally decreasing, and you can expect future sales to be lower than the current sales.

Right-click on a data point in the scatter plot graph we generated in the previous step, select Add Trendline, and choose the type of trendline that best fits the data to see if there is a pattern in the data.

Add trend line to scatter plot in Excel. Source: uedufy.com
Add trend line in Excel

In the case of the sales data set, the trend line is relatively flat, indicating that there is no significant upward or downward trend in the sales figures. This means that you can expect the sales figures to remain relatively stable in the future.

Conclusion

Calculating descriptive statistics in Excel is a valuable tool for gaining insights into your data. By using the various tools and functions available in Excel, you can quickly calculate important statistics such as the mean, median, mode, range, standard deviation, variance, histogram, Box and Whisker plot, scatter plot, and trend line.

Each of these tools and functions provides valuable insights into your data and can help you make informed decisions about your data. For example, the mean, median, and mode can give you a general idea of the typical value of your data. The range, standard deviation, and variance can help you understand how spread out your data is. The histogram, Box and Whisker plot, scatter plot, and trend line can provide visual representations of your data and help you understand the distribution, outliers, and overall pattern of your data.

In short, descriptive statistics in Excel is a powerful tool for data analysis, and can help you quickly gain valuable insights into your data and make informed decisions about your data. Whether you’re a student, business owner, or data analyst, knowing how to calculate descriptive statistics in Excel is a valuable skill that will serve you well in your data analysis endeavors.

Leonard Cucos Profile Uedufy
Leonard

Leonard is a Ph.D. student in Data Science and holds an MBA and B.Sc. He has an impressive public speaking profile on education, engineering, and research. He loves to help students achieve their academic objectives and believes education is the key to building a better future for mankind.

Leonard
Leonard

Leonard is a Ph.D. student in Data Science and holds an MBA and B.Sc. He has an impressive public speaking profile on education, engineering, and research. He loves to help students achieve their academic objectives and believes education is the key to building a better future for mankind.

Articles: 51