Find definitions and interpretation guidance for every statistic and graph that is provided with display descriptive statistics.
In This Topic
 Boxplot
 Histogram
 Individual value plot
 Q1
 IQR
 Maximum
 Median
 Minimum
 Range
 Q3
 Mean
 SE mean
 TrMean
 CumN
 N*
 N
 Total Count
 CumPct
 Percent
 Kurtosis
 Skewness
 CoefVar
 StDev
 Variance
 Mode
 MSSD
 Sum
 Sum of Squares
Boxplot
A boxplot provides a graphical summary of the distribution of a sample. The boxplot shows the shape, central tendency, and variability of the data.
Interpretation
Use a boxplot to examine the spread of the data and to identify any potential outliers. Boxplots are best when the sample size is greater than 20.
 Skewed data

Examine the spread of your data to determine whether your data appear to be skewed. When data are skewed, the majority of the data are located on the high or low side of the graph. Often, skewness is easiest to detect with a histogram or boxplot.
 Outliers

Outliers, which are data values that are far away from other data values, can strongly affect the results of your analysis. Often, outliers are easiest to identify on a boxplot.
Try to identify the cause of any outliers. Correct any data-entry errors or measurement errors. Consider removing data values for abnormal, one-time events (also called special causes). Then, repeat the analysis. For more information, go to Identifying outliers.
Histogram
A histogram divides sample values into many intervals and represents the frequency of data values in each interval with a bar.
Interpretation
Use a histogram to assess the shape and spread of the data. Histograms are best when the sample size is greater than 20.
 Skewed data

You can use a histogram of the data overlaid with a normal curve to examine the normality of your data. A normal distribution is symmetric and bell-shaped, as indicated by the curve. It is often difficult to evaluate normality with small samples. A probability plot is best for determining the distribution fit.
 Outliers

Outliers, which are data values that are far away from other data values, can strongly affect the results of your analysis. Often, outliers are easiest to identify on a boxplot.
Try to identify the cause of any outliers. Correct any data-entry errors or measurement errors. Consider removing data values for abnormal, one-time events (also called special causes). Then, repeat the analysis. For more information, go to Identifying outliers.
 Multimodal data

Multimodal data have multiple peaks, also called modes. Multimodal data often indicate that important variables are not yet accounted for.
If you have additional information that allows you to classify the observations into groups, you can create a group variable with this information. Then, you can create the graph with groups to determine whether the group variable accounts for the peaks in the data.
Individual value plot
An individual value plot displays the individual values in the sample. Each circle represents one observation. An individual value plot is especially useful when you have relatively few observations and when you also need to assess the effect of each observation.
Interpretation
Use an individual value plot to examine the spread of the data and to identify any potential outliers. Individual value plots are best when the sample size is less than 50.
 Skewed data

Examine the spread of your data to determine whether your data appear to be skewed. When data are skewed, the majority of the data are located on the high or low side of the graph. Often, skewness is easiest to detect with a histogram or boxplot.
 Outliers

Outliers, which are data values that are far away from other data values, can strongly affect the results of your analysis. Often, outliers are easiest to identify on a boxplot.
Try to identify the cause of any outliers. Correct any data-entry errors or measurement errors. Consider removing data values for abnormal, one-time events (also called special causes). Then, repeat the analysis. For more information, go to Identifying outliers.
Q1
Quartiles are the three values that divide a sample of ordered data into four equal parts: the first quartile at 25% (Q1), the second quartile at 50% (Q2 or median), and the third quartile at 75% (Q3).
The first quartile is the 25th percentile and indicates that 25% of the data are less than or equal to this value.
IQR
The interquartile range (IQR) is the distance between the first quartile (Q1) and the third quartile (Q3). The middle 50% of the data are within this range.
Interpretation
Use the interquartile range to describe the spread of the data. As the spread of the data increases, the IQR becomes larger.
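As a rough illustration of how Q1, Q3, and the IQR relate, the following Python sketch computes them for the nine sample values from the Maximum and Minimum examples in this topic. It assumes the "exclusive" interpolation method of the standard library's statistics.quantiles function. Software packages differ in how they interpolate quartiles, so values from another tool may not match exactly.

```python
import statistics

# Sample values from the Maximum and Minimum examples in this topic.
data = [13, 17, 18, 19, 12, 10, 7, 9, 14]

# n=4 splits the ordered data into four parts and returns the three cut points.
# method="exclusive" is an assumption; other quartile conventions exist.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

# The IQR is the distance between the first and third quartiles.
iqr = q3 - q1

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {iqr}")
```

A wider IQR for a second sample measured on the same scale would indicate that its middle half is more spread out.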
Maximum
The maximum is the largest data value.
In these data, the maximum is 19.
13  17  18  19  12  10  7  9  14 
Interpretation
Use the maximum to identify a possible outlier or a data-entry error. One of the simplest ways to assess the spread of your data is to compare the minimum and maximum. If the maximum value is very high, even when you consider the center, the spread, and the shape of the data, investigate the cause of the extreme value.
Median
The median is the midpoint of the data set. This midpoint value is the point at which half the observations are above the value and half the observations are below the value. The median is determined by ranking the observations and finding the observation that is at position [N + 1] / 2 in the ranked order. If the number of observations is even, then the median is the average of the observations that are ranked at positions N / 2 and [N / 2] + 1.
Interpretation
The median and the mean both measure central tendency. But unusual values, called outliers, can affect the median less than they affect the mean. If your data are symmetric, the mean and median are similar.
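The ranking rule in this section can be sketched in a few lines of Python. The function name median_by_rank and the even-sized sample are hypothetical, chosen only for illustration.

```python
def median_by_rank(values):
    """Return the median by ranking the observations, as described in this section."""
    ranked = sorted(values)
    n = len(ranked)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        # Odd N: take the observation ranked (N + 1) / 2 (1-based rank).
        return ranked[(n + 1) // 2 - 1]
    # Even N: average the observations ranked N / 2 and N / 2 + 1.
    return (ranked[n // 2 - 1] + ranked[n // 2]) / 2


odd_sample = [13, 17, 18, 19, 12, 10, 7, 9, 14]  # N = 9
even_sample = [3, 2, 4, 1]                        # N = 4, hypothetical

odd_median = median_by_rank(odd_sample)
even_median = median_by_rank(even_sample)
print(odd_median, even_median)
```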
Minimum
The minimum is the smallest data value.
In these data, the minimum is 7.
13  17  18  19  12  10  7  9  14 
Interpretation
Use the minimum to identify a possible outlier or a data-entry error. One of the simplest ways to assess the spread of your data is to compare the minimum and maximum. If the minimum value is very low, even when you consider the center, the spread, and the shape of the data, investigate the cause of the extreme value.
Range
The range is the difference between the largest and smallest data values in the sample. The range represents the interval that contains all the data values.
Interpretation
Use the range to understand the amount of dispersion in the data. A large range value indicates greater dispersion in the data. A small range value indicates that there is less dispersion in the data. Because the range is calculated using only two data values, it is more useful with small data sets.
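For the sample values shown in the Maximum and Minimum sections, the minimum, maximum, and range can be computed directly. This is only a small sketch using Python built-ins.

```python
# Sample values from the Maximum and Minimum examples in this topic.
data = [13, 17, 18, 19, 12, 10, 7, 9, 14]

minimum = min(data)
maximum = max(data)

# The range is the difference between the largest and smallest values.
data_range = maximum - minimum

print(f"minimum = {minimum}, maximum = {maximum}, range = {data_range}")
```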
Q3
Quartiles are the three values that divide a sample of ordered data into four equal parts: the first quartile at 25% (Q1), the second quartile at 50% (Q2 or median), and the third quartile at 75% (Q3).
The third quartile is the 75th percentile and indicates that 75% of the data are less than or equal to this value.
Mean
The mean is the average of the data, which is the sum of all the observations divided by the number of observations.
For example, the wait times (in minutes) of five customers in a bank are: 3, 2, 4, 1, and 2. The mean waiting time is calculated as follows:
(3 + 2 + 4 + 1 + 2) / 5 = 12 / 5 = 2.4
On average, a customer waits 2.4 minutes for service at the bank.
Interpretation
Use the mean to describe the sample with a single value that represents the center of the data. Many statistical analyses use the mean as a standard measure of the center of the distribution of the data.
The median and the mean both measure central tendency. But unusual values, called outliers, can affect the median less than they affect the mean. If your data are symmetric, the mean and median are similar.
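To see why an outlier pulls the mean more than the median, the sketch below reuses the five bank wait times from the example in this section and adds one hypothetical extreme value (30 minutes, which is not part of the original data).

```python
import statistics

wait_times = [3, 2, 4, 1, 2]       # minutes, from the bank example
with_outlier = wait_times + [30]   # hypothetical unusually long wait

mean_original = statistics.mean(wait_times)
median_original = statistics.median(wait_times)

mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

print(f"original:     mean = {mean_original}, median = {median_original}")
print(f"with outlier: mean = {mean_outlier}, median = {median_outlier}")
```

In this sketch the single extreme value nearly triples the mean, while the median moves by only half a minute.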
SE mean
The standard error of the mean (SE Mean) estimates the variability between sample means that you would obtain if you took repeated samples from the same population. Whereas the standard error of the mean estimates the variability between samples, the standard deviation measures the variability within a single sample.
For example, you have a mean delivery time of 3.80 days, with a standard deviation of 1.43 days, from a random sample of 312 delivery times. These numbers yield a standard error of the mean of 0.08 days (1.43 divided by the square root of 312). If you took multiple random samples of the same size, from the same population, the standard deviation of those different sample means would be around 0.08 days.
Interpretation
Use the standard error of the mean to determine how precisely the sample mean estimates the population mean.
A smaller value of the standard error of the mean indicates a more precise estimate of the population mean. Usually, a larger standard deviation results in a larger standard error of the mean and a less precise estimate of the population mean. A larger sample size results in a smaller standard error of the mean and a more precise estimate of the population mean.
Minitab uses the standard error of the mean to calculate the confidence interval.
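The delivery-time example can be reproduced with a short Python sketch. The function name standard_error_of_mean is hypothetical. The calculation is the one stated in the example: the standard deviation divided by the square root of the sample size.

```python
import math


def standard_error_of_mean(stdev, n):
    """Standard deviation divided by the square root of the sample size."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return stdev / math.sqrt(n)


# Values from the delivery-time example: s = 1.43 days, N = 312.
se = standard_error_of_mean(1.43, 312)
print(f"SE mean = {se:.2f} days")

# With the same standard deviation, a larger sample gives a smaller SE mean.
se_larger_sample = standard_error_of_mean(1.43, 1248)
```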
TrMean
The trimmed mean (TrMean) is the mean of the data, without the highest 5% and lowest 5% of the values.
Use the trimmed mean to eliminate the impact of very large or very small values on the mean. When the data contain outliers, the trimmed mean may be a better measure of central tendency than the mean.
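A minimal sketch of a 5% trimmed mean follows. It assumes that the number of values removed from each end is 5% of the sample size rounded to the nearest integer. That rounding rule is an assumption for illustration, and the sample data are hypothetical.

```python
def trimmed_mean(values, proportion=0.05):
    """Mean after dropping the lowest and highest `proportion` of the values."""
    ranked = sorted(values)
    n = len(ranked)
    # Assumption: trim round(proportion * n) values from each end.
    k = int(n * proportion + 0.5)
    kept = ranked[k:n - k]
    if not kept:
        raise ValueError("no values left after trimming")
    return sum(kept) / len(kept)


# Hypothetical sample: the integers 1 to 19 plus one extreme value.
data = list(range(1, 20)) + [100]

plain_mean = sum(data) / len(data)
tr_mean = trimmed_mean(data)

print(f"mean = {plain_mean}, trimmed mean = {tr_mean}")
```

With 20 values, one value is dropped from each end, so the extreme value of 100 no longer influences the result.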
CumN
Cumulative N is a running total of the number of observations in successive categories. For example, an elementary school records the number of students in grades one through six. The CumN column contains the cumulative count of the student population:
Grade Level  Count  CumN  Calculation 

1  49  49  49 
2  58  107  49 + 58 
3  52  159  49 + 58 + 52 
4  60  219  49 + 58 + 52 + 60 
5  48  267  49 + 58 + 52 + 60 + 48 
6  55  322  49 + 58 + 52 + 60 + 48 + 55 
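The running total in the table above can be reproduced with the standard library's itertools.accumulate. This is only a sketch, and the variable names are illustrative.

```python
from itertools import accumulate

# Student counts for grades one through six, from the table above.
counts = [49, 58, 52, 60, 48, 55]

# accumulate() yields the running total after each grade.
cum_n = list(accumulate(counts))

for grade, (count, running_total) in enumerate(zip(counts, cum_n), start=1):
    print(grade, count, running_total)
```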
N*
The number of missing values in the sample. The number of missing values refers to cells that contain the missing value symbol (*).
In this example, 8 errors occurred during data collection and are recorded as missing values.
Total count  N  N* 

149  141  8 
N
The number of nonmissing values in the sample.
In this example, there are 141 recorded observations.
Total count  N  N* 

149  141  8 
Total Count
The total number of observations in the column. The total count is the sum of the number of missing values (N*) and the number of nonmissing values (N).
In this example, there are 141 valid observations and 8 missing values. The total count is 149.
Total count  N  N* 

149  141  8 
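The relationship between N, N*, and the total count can be sketched in Python. Here a missing value is represented by None instead of the * symbol, and the column contents are hypothetical placeholders that only match the counts in the example.

```python
# Hypothetical column: 141 recorded values and 8 missing values.
# None stands in for the missing value symbol (*).
column = [1.0] * 141 + [None] * 8

n = sum(1 for value in column if value is not None)   # nonmissing values (N)
n_missing = sum(1 for value in column if value is None)  # missing values (N*)
total_count = len(column)

print(f"Total count = {total_count}, N = {n}, N* = {n_missing}")
```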
CumPct
The cumulative percent is the cumulative sum of the percentages for each group of the By variable. In the following example, the By variable has four groups: Line 1, Line 2, Line 3, and Line 4.
Group (by variable)  Percent  CumPct 

Line 1  16  16 
Line 2  20  36 
Line 3  36  72 
Line 4  28  100 
Percent
The percent of observations in each group of the By variable. In the following example, there are four groups: Line 1, Line 2, Line 3, and Line 4.
Group (by variable)  Percent 

Line 1  16 
Line 2  20 
Line 3  36 
Line 4  28 
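The Percent and CumPct columns can be derived from group counts, as in the sketch below. The counts are hypothetical; they were chosen only because they reproduce the percentages in the example tables.

```python
from itertools import accumulate

# Hypothetical observation counts per group of the By variable.
group_counts = {"Line 1": 4, "Line 2": 5, "Line 3": 9, "Line 4": 7}

total = sum(group_counts.values())

# Percent of observations in each group.
percent = {group: 100 * count / total for group, count in group_counts.items()}

# Cumulative percent: running sum of the group percentages.
cum_pct = dict(zip(percent, accumulate(percent.values())))

for group in group_counts:
    print(group, percent[group], cum_pct[group])
```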
Kurtosis
Kurtosis indicates how the tails of a distribution differ from the normal distribution.
Interpretation
Use kurtosis to initially understand general characteristics about the distribution of your data.
Skewness
Skewness is the extent to which the data are not symmetrical.
Interpretation
Use skewness to help you establish an initial understanding of your data.
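As a rough illustration of the sign of skewness, the sketch below uses a simple moment-based formula (the third central moment divided by the 1.5 power of the second central moment). This formula is an assumption for illustration. Statistical packages often apply a sample-size adjustment, so their reported values may differ. The data sets are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (no sample-size adjustment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]        # long tail toward high values
left_skewed = [-x for x in right_skewed]  # mirror image, tail toward low values

print(skewness(symmetric), skewness(right_skewed), skewness(left_skewed))
```

Symmetric data give a value near zero, a long right tail gives a positive value, and a long left tail gives a negative value.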
CoefVar
The coefficient of variation (CoefVar) is a measure of spread that describes the variation in the data relative to the mean. The coefficient of variation is adjusted so that the values are on a unitless scale. Because of this adjustment, you can use the coefficient of variation instead of the standard deviation to compare the variation in data that have different units or that have very different means.
Interpretation
The larger the coefficient of variation, the greater the spread in the data.
For example, you are the quality control inspector at a milk bottling plant that bottles small and large containers of milk. You take a sample of each product and observe that the mean volume of the small containers is 1 cup with a standard deviation of 0.08 cup, and the mean volume of the large containers is 1 gallon (16 cups) with a standard deviation of 0.4 cups. Although the standard deviation of the gallon container is five times greater than the standard deviation of the small container, their coefficients of variation support a different conclusion.
Large container  Small container 

CoefVar = 100 * 0.4 cups / 16 cups = 2.5  CoefVar = 100 * 0.08 cups / 1 cup = 8 
The coefficient of variation of the small container is more than three times greater than that of the large container. In other words, although the large container has a greater standard deviation, the small container has much more variability relative to its mean.
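The milk-container comparison can be checked with a short Python sketch. The function name coef_var is hypothetical. The calculation follows the table above: 100 times the standard deviation divided by the mean.

```python
def coef_var(stdev, mean):
    """Coefficient of variation as a percentage: 100 * stdev / mean."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return 100 * stdev / mean


# Values from the example, both expressed in cups.
large = coef_var(stdev=0.4, mean=16)
small = coef_var(stdev=0.08, mean=1)

print(f"large container: {large:.1f}, small container: {small:.1f}")
print(f"ratio small / large: {small / large:.1f}")
```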
StDev
The standard deviation is the most common measure of dispersion, or how spread out the data are about the mean. The symbol σ (sigma) is often used to represent the standard deviation of a population, while s is used to represent the standard deviation of a sample. Variation that is random or natural to a process is often referred to as noise.
Because the standard deviation is in the same units as the data, it is usually easier to interpret than the variance.
Interpretation
Use the standard deviation to determine how spread out the data are from the mean. A higher standard deviation value indicates greater spread in the data. A good rule of thumb for a normal distribution is that approximately 68% of the values fall within one standard deviation of the mean, 95% of the values fall within two standard deviations, and 99.7% of the values fall within three standard deviations.
The standard deviation can also be used to establish a benchmark for estimating the overall variation of a process.
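The 68%, 95%, and 99.7% rule of thumb can be checked against a standard normal distribution. The sketch below uses the standard library's statistics.NormalDist class. The helper name share_within is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

The rule applies to data that follow a normal distribution. For skewed or heavy-tailed data, the proportions can differ noticeably.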
Variance
The variance measures how spread out the data are about their mean. The variance is equal to the standard deviation squared.
Interpretation
The greater the variance, the greater the spread in the data.
Because variance (σ²) is a squared quantity, its units are also squared, which may make the variance difficult to use in practice. The standard deviation is usually easier to interpret because it's in the same units as the data. For example, a sample of waiting times at a bus stop may have a mean of 15 minutes and a variance of 9 minutes². Because the variance is not in the same units as the data, the variance is often displayed with its square root, the standard deviation. A variance of 9 minutes² is equivalent to a standard deviation of 3 minutes.
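The relationship between the variance and the standard deviation can be sketched with Python's statistics module. The three waiting times below are hypothetical. They were picked so that the sample has the same mean (15 minutes) and variance (9 squared minutes) as the bus stop example.

```python
import statistics

# Hypothetical waiting times in minutes.
waiting_times = [12, 15, 18]

mean = statistics.mean(waiting_times)
variance = statistics.variance(waiting_times)  # sample variance, squared minutes
stdev = statistics.stdev(waiting_times)        # sample standard deviation, minutes

print(f"mean = {mean}, variance = {variance}, standard deviation = {stdev}")
```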
Mode
The mode is the value that occurs most frequently in a set of observations. Minitab also displays how many data points equal the mode.
The mean and median require a calculation, but the mode is determined by counting the number of times each value occurs in a data set.
Interpretation
The mode can be used with mean and median to provide an overall characterization of your data distribution. The mode can also be used to identify problems in your data.
For example, a distribution that has more than one mode may identify that your sample includes data from two populations. If the data contain two modes, the distribution is bimodal. If the data contain more than two modes, the distribution is multimodal.
For example, a bank manager collects wait time data for customers who are cashing checks and for customers who are applying for home equity loans. Because these are two very different services, the wait time data included two modes. The data for each service should be collected and analyzed separately.
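Finding the mode by counting can be sketched with collections.Counter. The helper name modes_with_count and the bimodal sample are hypothetical. The first sample reuses the bank wait times from the Mean section.

```python
from collections import Counter


def modes_with_count(values):
    """Return the most frequent value(s) and how many times each occurs."""
    counts = Counter(values)
    highest = max(counts.values())
    modes = sorted(value for value, count in counts.items() if count == highest)
    return modes, highest


single_mode = modes_with_count([3, 2, 4, 1, 2])
two_modes = modes_with_count([1, 1, 2, 3, 3])  # hypothetical bimodal sample

print(single_mode)
print(two_modes)
```

When more than one value is returned, the sample has several modes, which may be a hint that it mixes data from different populations.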
MSSD
The MSSD is the mean of the squared successive differences. MSSD is an estimate of variance. One possible use of the MSSD is to test whether a sequence of observations is random. In quality control, a possible use of MSSD is to estimate the variance when the subgroup size = 1.
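A minimal sketch of one common way to compute the MSSD follows. It assumes the convention in which the sum of squared successive differences is divided by 2(N - 1), so that the result estimates the variance. That scaling is an assumption here, so check your software's formula documentation before relying on it. The sample values are hypothetical.

```python
def mssd(values):
    """Sum of squared successive differences divided by 2 * (N - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    # Differences between each observation and the one before it, in order.
    squared_diffs = [(b - a) ** 2 for a, b in zip(values, values[1:])]
    # Assumption: halve the mean squared difference so it estimates variance.
    return sum(squared_diffs) / (2 * (n - 1))


observations = [1, 3, 2, 5]  # hypothetical, in time order
result = mssd(observations)
print(result)
```

Because the calculation depends on the order of the observations, sorting the data first would change the result.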
Sum
The sum is the total of all the data values. The sum is also used in statistical calculations, such as the mean and standard deviation.
Sum of Squares
The uncorrected sum of squares is calculated by squaring each value in the column and then summing those squared values. For example, if the column contains x₁, x₂, ... , xₙ, then the sum of squares is (x₁² + x₂² + ... + xₙ²). Unlike the corrected sum of squares, the uncorrected sum of squares includes error. The data values are squared without first subtracting the mean.
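The difference between the uncorrected and corrected sums of squares can be sketched as follows. The column values are hypothetical, and the corrected version shown here subtracts the mean from each value before squaring.

```python
column = [1, 2, 3]  # hypothetical data values

total = sum(column)
mean = total / len(column)

# Uncorrected: square each value as-is, then add the squares.
uncorrected_ss = sum(x ** 2 for x in column)

# Corrected: subtract the mean first, then square and add.
corrected_ss = sum((x - mean) ** 2 for x in column)

print(f"sum = {total}, uncorrected SS = {uncorrected_ss}, corrected SS = {corrected_ss}")
```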
FAQs
Interpret all statistics and graphs for Display Descriptive Statistics? ›
Some of the types of graphs that are used to summarize and organize data are the dot plot, the bar graph, the histogram, the stem-and-leaf plot, the frequency polygon (a type of broken line graph), the pie chart, and the box plot.
How do you interpret descriptive statistics? › Step 1: Describe the size of your sample.
 Step 2: Describe the center of your data.
 Step 3: Describe the spread of your data.
 Step 4: Assess the shape and spread of your data distribution.
 Step 5: Compare data from different groups.
How do you display descriptive statistics in a report? ›Format of descriptive statistics
In tables, you should use separate columns or rows to report the descriptive statistics for each variable or group and use appropriate labels and headings. For example, you can create a table with columns for mean, standard deviation, and range for each variable.
Tips for understanding descriptive statistics results
Describe the center of your data. Describe the spread of your data using the standard deviation. Use individual value plot, histogram and box plot to assess the shape and spread of your data distribution. Compare data that you analyze from different groups.
Examples of descriptive analytics include KPIs such as year-on-year percentage sales growth, revenue per customer and the average time customers take to pay bills. The products of descriptive analytics appear in financial statements, other reports, dashboards and presentations.
What is the main purpose of descriptive statistics? ›The purpose of a descriptive statistic is to summarize data. Descriptive stats only make statements about the set of data from which they were calculated; they never go beyond the data you have.
What are the 5 descriptive statistics? ›Descriptive statistics are broken down into measures of central tendency and measures of variability (spread). Measures of central tendency include the mean, median, and mode, while measures of variability include standard deviation, variance, minimum and maximum variables, kurtosis, and skewness.
How to interpret descriptive statistics mean and standard deviation? ›That is, how data is spread out from the mean. A low standard deviation indicates that the data points tend to be close to the mean of the data set, while a high standard deviation indicates that the data points are spread out over a wider range of values.
What are the 5 commonly used graphs in statistics? ›Types of Graphs in Statistics. The four basic graphs used in statistics include bar, line, histogram and pie charts. These are explained here in brief.
What should you report in descriptive statistics? ›
The mean, the mode, the median, the range, and the standard deviation are all examples of descriptive statistics. Descriptive statistics are used because in most cases, it isn't possible to present all of your data in any form that your reader will be able to quickly interpret.
How do you choose a graph in statistics? ›If you want to compare values, use a pie chart for relative comparison or a bar chart for precise comparison. If you want to compare volumes, use an area chart or a bubble chart. If you want to show trends and patterns in your data, use a line chart, bar chart, or scatter plot.
Are bar graphs descriptive statistics? ›Descriptive statistics
The relative frequencies can be represented graphically by a relative frequency line or bar graph or by a relative frequency polygon.
A horizontal bar chart is great for illustrating survey results with multiple categories. The structure of the graph improves readability because the bars extend along the x-axis instead of up the y-axis.
Which is a type of graph that can be used to analyze data? ›A line graph reveals trends or progress over time, and you can use it to show many different categories of data. You should use it when you chart a continuous data set.