Descriptive statistics

• The term descriptive statistics refers to the act of describing and summarizing data. This can be done using tables, plots, charts and key descriptive values such as the size of the dataset, percentages, average values and measures of spread. Depending on the task at hand, it might be that all that is required from you are descriptive statistics.

Illustrating Descriptive Statistics

Take a look at your lecture notes, reading materials and assignment briefs for examples of which types of data are presented and how they have been presented.

• Tables are an excellent way to summarize key information succinctly. They are often used to capture an overview of the dataset that has been collected and summarized using key descriptive statistics (see Calculating Descriptive Statistics below). When producing a table, be sure to use the conventions of your subject area. Look back at your lecture notes and reading materials for examples. Use publications in your field for additional inspiration.

Different types of tables

An introduction to frequency distribution tables, grouped frequency distributions and cumulative tables and graphs by Maths is Fun.
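As a rough illustration of a frequency table with a cumulative column, the sketch below uses Python's standard library. The data (number of siblings reported by eight hypothetical respondents) and all variable names are invented for this example.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: number of siblings reported by eight respondents
siblings = [0, 1, 1, 2, 0, 3, 1, 2]

# Count how often each value occurs, then order the rows by value
counts = Counter(siblings)
rows = sorted(counts.items())

# Running total of the frequencies gives the cumulative column
cumulative = list(accumulate(freq for _, freq in rows))

print(f"{'Value':<8}{'Frequency':>10}{'Cumulative':>12}")
for (value, freq), running_total in zip(rows, cumulative):
    print(f"{value:<8}{freq:>10}{running_total:>12}")
```

The final cumulative figure should always equal the size of the dataset, which makes it a quick check that no observation has been missed.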

Creating tables using Excel and SPSS

Want to visualize your dataset but are not sure where to start? Take a look at this interactive tool for viewing data graphs (bar, line, dot, pie, histogram) by Maths is Fun. For small datasets consider using a dot plot, pictograph or stem-and-leaf plot.

Pictographs

• Looking to create an image of your data? Try using a pictograph. Examples of pictographs by Maths is Fun.
• Excel can be used to create pictographs. Here is an example of how to create a pictograph in Excel by Tech-Recipes.

Infographics

Infographics can be used to demonstrate a process using step-by-step instructions. Take a look at these 12 incredible infographic ideas by Adobe for inspiration.

Pie charts are useful for comparing proportions. If your data has lots of subgroups consider using an alternative chart.

Bar charts have a wider application and can be used to compare different types of categorical data. A bar chart can be drawn vertically or horizontally, and its bars can be ordered and grouped in different ways. Look for ideas in your subject area on the best way of displaying your type of data.

Pie charts

• Pie Chart - what is it, what can it show and different formats by Maths is Fun.
• How to add a pie chart using Excel by Microsoft support.

Bar charts

Scatter plots and line graphs are useful for exploring potential relationships between variables. Take a look at the pages below for definitions and examples.

Histograms and box and whisker plots (or box plots) are commonly used to explore the distribution of datasets. A box and whisker plot also allows you to compare more than one distribution on a single plot.

The links below describe what these two different types of plot are and how to draw them using Excel.

Box and Whisker plot

Take care to understand how your software package has calculated a box and whisker plot. Different packages have different ways of classifying the whiskers. Be clear which method is being used to represent your data set.

• Box and whisker plots are constructed using quartiles. Take a look at the guide by Maths is Fun for a brief explanation.
• Box and whisker plot - what are they and how to construct them by Maths is Fun.
• Excel box and whisker plot - what they are, and how to create one using Excel by Microsoft support.
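To show why it matters how the whiskers are defined, here is a minimal Python sketch comparing two common conventions: whiskers that reach the minimum and maximum, and whiskers that stop at the most extreme values within 1.5 times the interquartile range of the box. The data are invented, and the 1.5 multiplier is an assumption about a typical default; check the documentation of your own software for the rule it uses.

```python
import statistics

# Illustrative values; 30 is unusually large compared with the rest
data = [1, 2, 3, 4, 5, 6, 7, 8, 30]

# Quartiles (statistics.quantiles uses the 'exclusive' method by default)
q1, _median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Convention A: whiskers reach the smallest and largest values
whiskers_min_max = (min(data), max(data))

# Convention B: whiskers reach the most extreme values that lie
# within 1.5 * IQR of the box; anything beyond is shown as an outlier
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
inside = [x for x in data if lower_fence <= x <= upper_fence]
whiskers_fenced = (min(inside), max(inside))
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("Min/max whiskers:", whiskers_min_max)
print("Fenced whiskers: ", whiskers_fenced, "outliers:", outliers)
```

With these values the first convention draws the upper whisker out to 30, while the second stops it at 8 and marks 30 as an outlier, so the same dataset can produce two quite different-looking plots.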

• Create your own graph paper using this graph paper maker by Maths is Fun.
• Graphics in SPSS - here UCLA Statistical Consulting illustrates how to produce a number of different graphics in SPSS, including the histogram and box and whisker plots.

Calculating Descriptive Statistics

Descriptive statistics provide researchers and readers with insight into a dataset at a glance. Care should be taken to ensure that the descriptive statistics used and/or reported are appropriate for the type of data. Take a look at your lecture notes, reading materials and assignment briefs for examples of which descriptive statistics are calculated and how data have been summarized.

• Population and Samples

A population refers to an entire group that you are hoping to draw conclusions about. This could be all first year undergraduate students in the world.

A sample contains some of the population that you are able to gather data from. This could be all first year undergraduate students at Oxford Brookes University.

One of the first measures to report is the size of the dataset. This can be as simple as reporting how many participants or observations you have. It is also an opportunity to disclose any data points that you decided to omit and to state why.

Want to know more about populations and samples? Take a look at the page ‘Population vs sample: what’s the difference?’ by Pritha Bhandari.

Count and Sum

Spreadsheets and statistical software packages have a range of functions that can count, sum (add up) and quantify datasets for us. Take a look below for some ideas on the types of functions that are available.

Using Excel to count and add up.

• COUNT function - counts the number of cells that contain numbers.
• COUNTIF function - counts the number of cells that meet a specific criterion.
• SUM function - adds up numerical values to produce a total.
• SUMIF function - adds up the values that meet a specific criterion.
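For readers working outside Excel, the four functions above can be approximated in a few lines of Python. This is only a sketch: the values and the "greater than 10" criterion are made up for illustration, and the variable names are not part of any library.

```python
# Hypothetical column of cells, including a blank (None) and a text entry
cells = [12, 7, None, 25, "n/a", 3, 18]

# Keep only the numeric entries, as COUNT and SUM ignore blanks and text
numbers = [c for c in cells
           if isinstance(c, (int, float)) and not isinstance(c, bool)]

count = len(numbers)                                # similar to COUNT
count_over_10 = sum(1 for n in numbers if n > 10)   # similar to COUNTIF(range, ">10")
total = sum(numbers)                                # similar to SUM
total_over_10 = sum(n for n in numbers if n > 10)   # similar to SUMIF(range, ">10")

print(count, count_over_10, total, total_over_10)   # 5 3 65 55
```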

PivotTables are a tool that can be used to calculate and summarize your dataset. For additional information on how to create a PivotTable, take a look at the guidance Make your first PivotTable on the Microsoft Support pages. This page has lots of additional tutorials; scroll down to find the PivotTable options.

Need a refresher on how to calculate percentages by hand? Then these are the pages for you. Want to check your workings? How about using a software package to perform these calculations?

What are percentages?

Take a look at this introduction to percents by Maths is Fun, a brief introduction with interactive aids and worked examples.

Practice percentages with this MathCentre resource.

How to calculate percentages?

Explore these resources from the MathCentre written by Eleanor Lingham.

Changes in percentages

Sometimes we need to capture how much something has changed. Take a look at the following leaflets written by Eleanor Lingham.

How to do percentages in Excel

Take a look at how to do percentages in Excel by the Microsoft 365 team. This page details how to format the cells and how to calculate percentages with illustrated examples.
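If you would like to check your hand calculations, the two most common percentage calculations (a part as a percentage of a whole, and a percentage change) can be sketched in Python as follows. The function names and the example figures are hypothetical.

```python
def percentage(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100 * part / whole


def percentage_change(old, new):
    """Return the change from `old` to `new` as a percentage of `old`."""
    return 100 * (new - old) / old


# 18 out of 60 survey respondents answered "yes"
print(percentage(18, 60))          # 30.0

# A value rises from 80 to 100
print(percentage_change(80, 100))  # 25.0

# A value falls from 100 to 80 (a negative result means a decrease)
print(percentage_change(100, 80))  # -20.0
```

Note that a rise from 80 to 100 is a 25% increase, whereas a fall from 100 to 80 is a 20% decrease, because the change is always expressed relative to the starting value.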

The term average value is used to describe the central tendency of a dataset. When we think about average values we often think about the mean value (where we add all of the numbers up and divide by how many there are). There are many different ways of reporting the average value of a dataset; the one to choose will depend on the type of data you have.

Which average value should I report?

Unless your assignment clearly specifies which average value to calculate (and/or report), the value you report will depend on the type of data you have. Three key average values are the mode, the median and the mean.

Mode: The mode is the most commonly occurring category or number.
It is a good choice for nominal data. Take a look at the following example of the mode by Maths is Fun.

Calculate the mode using Excel

Median: The median is the midpoint of ordered data.
It is a good choice for ordinal data, and for interval or ratio data that are skewed or whose shape is unknown. Take a look at the following example of a median value by Maths is Fun.

Calculate the median using Excel

Mean: The arithmetic mean (or average) of numerical data is the sum of the values divided by how many values there are.
It is a good choice for normally distributed interval or ratio data. Take a look at the following example of a mean value by Maths is Fun.

Calculate the mean using Excel.

• A two page document from the mathcentre with examples of how to perform the calculations by hand with additional exercises.
• Practical examples and illustrations of average values by Laerd statistics.
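The three averages can also be checked with Python's built-in statistics module. The sketch below uses invented data chosen so that one large value (40) skews the distribution, which illustrates why the median is often preferred for skewed data.

```python
import statistics

# Illustrative, right-skewed data: most values are small, one is large
values = [2, 3, 3, 5, 7, 10, 40]

mode_value = statistics.mode(values)      # most commonly occurring value
median_value = statistics.median(values)  # midpoint of the ordered data
mean_value = statistics.mean(values)      # sum of values / number of values

print("Mode:", mode_value)      # 3
print("Median:", median_value)  # 5
print("Mean:", mean_value)      # 10

# The mode also works for nominal (category) data
colours = ["red", "blue", "red", "green"]
print("Most common colour:", statistics.mode(colours))  # red
```

Here the single large value pulls the mean up to 10, even though five of the seven values are below it, while the median stays at 5.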

If you have reported an average value, it is a good idea to report how spread out the dataset is too. This gives both you, the researcher, and the reader better insight into how variable (or spread out) the dataset is.

Which measure of spread should I report?

The measure of spread to use will depend on which average value you have chosen to report for your dataset. As a general guide, if you have reported:

• The mode, then you could report the range of the data.
• The median, then you should report the interquartile range of the data.
• The mean, then you should report the standard deviation of the data.

Details of these measures of spread are provided below.

Range: The range of a data set is the difference between the largest (maximum) and smallest (minimum) values in the data set. Take a look at the following example of the range by Maths is Fun.

Calculate: Range = Maximum - Minimum

Calculate using Excel: Range = MAX(dataset) - MIN(dataset)

Interquartile Range (IQR): The interquartile range captures the middle 50% of ordered data. Take a look at the following example of interquartile range by Maths is Fun.

Calculate:

• Order data from smallest to largest.
• Find the values that are at the first (Q1) and third (Q3) quartile.
• The interquartile range is the difference between the third and first quartile:
•           IQR = Q3 - Q1

Calculate using Excel: IQR = Q3 - Q1

where Q1 and Q3 are found using the quartiles function

Q1 = QUARTILE.EXC(datarange,1)

Q3 = QUARTILE.EXC(datarange,3)
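The range and interquartile range can be cross-checked in Python. This is a sketch on invented data; it assumes that the 'exclusive' method of statistics.quantiles corresponds to the convention used by QUARTILE.EXC. Other quartile methods (for example method='inclusive') can give slightly different values, so state which one you used.

```python
import statistics

# Illustrative data, already ordered from smallest to largest
data = [1, 2, 3, 4, 5, 6, 7, 8]

# Range = Maximum - Minimum
data_range = max(data) - min(data)

# Quartiles: n=4 splits the data into four parts and returns Q1, Q2, Q3
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

# IQR = Q3 - Q1
iqr = q3 - q1

print("Range:", data_range)  # 7
print("Q1:", q1, "Q3:", q3)  # Q1: 2.25 Q3: 6.75
print("IQR:", iqr)           # 4.5
```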

Standard Deviation: The standard deviation quantifies how much, on average, each data point varies from the mean of the data set. Take a look at the following example of standard deviation by Maths is Fun.

How to calculate the population standard deviation using Excel.

How to calculate the sample standard deviation using Excel.
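The difference between the two versions can be seen in a short Python sketch using the standard statistics module, where pstdev divides by the number of values (population) and stdev divides by one fewer (sample). In Excel the corresponding functions are STDEV.P and STDEV.S. The data below are invented for illustration.

```python
import statistics

# Illustrative data with a mean of 5
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Population standard deviation: use when the data are the whole population
population_sd = statistics.pstdev(values)

# Sample standard deviation: use when the data are a sample from a larger population
sample_sd = statistics.stdev(values)

print("Population SD:", round(population_sd, 3))  # 2.0
print("Sample SD:", round(sample_sd, 3))          # 2.138
```

The sample standard deviation is always slightly larger than the population version for the same data, and the gap shrinks as the dataset grows.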

Population and Samples

Want to know more about populations and samples? Take a look at the page ‘Population vs sample: what’s the difference?’ by Pritha Bhandari.

• A short guide from statstutor with examples of how to perform the calculations by hand and in SPSS.
• Practical examples and illustrations of measures of spread by Laerd statistics.

Descriptive Statistics Summary

It is possible to use software packages to produce a summary of key descriptive statistic measures:

• Useful step-by-step instructions for obtaining the descriptive statistics data analysis ToolPak add-on in Excel written by Excel Easy. If you need to add the data analysis ToolPak in your version of Excel, follow these instructions.
• Filled with useful questions, details about the output and links to additional information, Exploring Data by Kent State University Libraries is a great place to begin learning how to perform descriptive statistics in SPSS.
• Annotated SPSS output: examples of descriptive statistics output with additional annotation detailing what values have been presented in a range of tables and plots - by UCLA Institute for Digital Research and Education Statistical Consulting

• Software training

• Introduction to Excel, rows and columns, formulas and functions, tables, charts and pivot tables by Microsoft support.
• IT Training at Oxford Brookes has produced an SPSS self-paced course in 4 parts: introductory; exploring your data and initial statistical analysis; statistical analysis; surveys and survey analysis. Additional resources include YouTube videos and one-to-one support (for those who have done SPSS part 1 introductory).
• Getting started with SPSS - FREE course from The Open University.