Two sets of data can have a similar mean, mode and median but contain completely different values.

e.g. 4, 5, 6, 6, 7, 8 and 1, 2, 6, 6, 9, 12.

The mean, median and mode of both sets of data are all 6, but the second set of numbers is much more spread out than the first.
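The claim above can be checked with a short Python sketch. It uses the standard library `statistics` module; the names `set_a`, `set_b` and `summarise` are chosen here for illustration and are not part of the original page.

```python
import statistics

set_a = [4, 5, 6, 6, 7, 8]
set_b = [1, 2, 6, 6, 9, 12]

def summarise(data):
    """Return (mean, median, mode) for a list of numbers."""
    return (statistics.mean(data),
            statistics.median(data),
            statistics.mode(data))

# Both sets give 6 for all three averages, even though the values differ.
print("first set: ", summarise(set_a))
print("second set:", summarise(set_b))
```

The three averages agree for both sets, so a separate measure is needed to describe how spread out the values are.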

The spread of a set of data can be measured using the range, quartiles, percentiles and standard deviation.

The Working with Data activity provides practice at calculating these measures of spread.

Range

The range of a set of values is the difference between the highest value and the lowest value.

Example: Find the range of 4, 5, 6, 6, 7, 8.

Range = 8 − 4 = 4
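As a small illustration, the range can be computed in Python as follows. The function name `value_range` is an assumption made for this sketch.

```python
def value_range(values):
    """Return the highest value minus the lowest value."""
    return max(values) - min(values)

print(value_range([4, 5, 6, 6, 7, 8]))    # 4
print(value_range([1, 2, 6, 6, 9, 12]))   # 11
```

The two sets from the introduction have the same averages but clearly different ranges.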

Quartiles

The median is the value which splits a set of values into two equal parts.

The quartiles split a set of values into four equal parts.
The lower quartile (LQ or Q1) is the value below which one quarter of the values lie.
The upper quartile (UQ or Q3) is the value below which three quarters of the values lie.

Finding the quartiles

For a set of data with an odd number of values:

Example 1 For the 11 values: 1, 2, 3, 5, 6, 6, 7, 7, 9, 9, 10

LQ = 3rd value = 3
Median = 6th value = 6
UQ = 9th value = 9

For a set of data with an even number of values:

Example 2 For the 10 values: 3, 4, 5, 7, 9, 10, 11, 12, 13, 20

LQ = 3rd value = 5
Median = between the 5th and 6th values = (9 + 10) / 2 = 9.5
UQ = 8th value = 12

Formula for working out the position of the quartiles and median

Quartile depth = (n/2 + 1) / 2, where n is the number of values and, when n/2 is found, any halves are dropped.

For Example 1, n = 11, so n/2 = 5.5, which becomes 5, and the quartile depth = (5 + 1) / 2 = 3.

Therefore, the lower quartile is the 3rd value from the bottom of the set of values.
The upper quartile is the 3rd value from the top of the set of values.

 

Median depth = (n + 1) / 2

For Example 1, n = 11, so the median depth = (11 + 1) / 2 = 6.

Therefore the median is the 6th value from either the top or from the bottom of the set of values.
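The depth rules can be sketched in Python as below. This is an illustration only: the function names are invented here, the quartile depth is taken to be (n/2 with halves dropped, plus 1) divided by 2, and a depth ending in .5 is assumed to mean halfway between two neighbouring values, as the median in Example 2 suggests.

```python
def value_at_depth(ordered, depth):
    """Return the value `depth` places in from the start of `ordered`.

    Depths count from 1. A depth ending in .5 is taken to mean halfway
    between two neighbouring values (an assumption of this sketch).
    """
    whole = int(depth)
    if depth == whole:
        return ordered[whole - 1]
    return (ordered[whole - 1] + ordered[whole]) / 2

def quartiles_and_median(values):
    """Return (LQ, median, UQ) using the depth method."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    quartile_depth = (n // 2 + 1) / 2   # n // 2 drops any half
    median_depth = (n + 1) / 2
    lq = value_at_depth(data, quartile_depth)
    # The upper quartile is at the same depth, counted from the top.
    uq = value_at_depth(data[::-1], quartile_depth)
    median = value_at_depth(data, median_depth)
    return lq, median, uq

print(quartiles_and_median([1, 2, 3, 5, 6, 6, 7, 7, 9, 9, 10]))   # (3, 6, 9)
print(quartiles_and_median([3, 4, 5, 7, 9, 10, 11, 12, 13, 20]))  # (5, 9.5, 12)
```

Both results agree with Example 1 and Example 2. Other textbooks and software use slightly different quartile rules, so results may differ for some data sets.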

Interquartile Range

The interquartile range of a set of values is the difference between the upper quartile and the lower quartile.

In example 1 above the interquartile range is 9 − 3 = 6

In example 2 above the interquartile range is 12 − 5 = 7
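A compact sketch of the interquartile range in Python is shown below. It assumes the quartiles are found by counting in from each end to the quartile depth, and that a depth ending in .5 means halfway between two neighbouring values; the function name is chosen here for illustration.

```python
def interquartile_range(values):
    """Return UQ - LQ, with the quartiles found by counting in from each end."""
    data = sorted(values)
    depth = (len(data) // 2 + 1) / 2    # quartile depth, halves dropped from n/2
    k = int(depth)
    if depth == k:
        lq, uq = data[k - 1], data[-k]
    else:
        # Depth ends in .5: take the midpoint of the two neighbouring values.
        lq = (data[k - 1] + data[k]) / 2
        uq = (data[-k] + data[-k - 1]) / 2
    return uq - lq

print(interquartile_range([1, 2, 3, 5, 6, 6, 7, 7, 9, 9, 10]))   # 6
print(interquartile_range([3, 4, 5, 7, 9, 10, 11, 12, 13, 20]))  # 7
```

Unlike the range, the interquartile range ignores the lowest and highest quarters of the data, so a single extreme value such as the 20 in Example 2 has little effect on it.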


 

Percentiles

The quartiles and the median are special cases of the percentiles.

The lower quartile is the 25th percentile.
The median is the 50th percentile.
The upper quartile is the 75th percentile.

The xth percentile is the value such that x% of the sample is less than or equal to it.

e.g. If the 90th percentile is 84 then 90% of the sample is less than or equal to 84.
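The definition can be illustrated with a short Python sketch. The "nearest rank" rule used here (take the value at position x% of n, rounded up) is one common convention and is an assumption of this sketch rather than a method given on this page; the function names are also invented for illustration. The data are the ten values from Example 2.

```python
import math

def percent_at_or_below(sample, value):
    """Percentage of the sample that is less than or equal to `value`."""
    count = sum(1 for x in sample if x <= value)
    return 100 * count / len(sample)

def nearest_rank_percentile(sample, x):
    """Smallest value with at least x% of the sample at or below it."""
    data = sorted(sample)
    rank = max(1, math.ceil(x * len(data) / 100))
    return data[rank - 1]

scores = [3, 4, 5, 7, 9, 10, 11, 12, 13, 20]
p90 = nearest_rank_percentile(scores, 90)
print(p90, percent_at_or_below(scores, p90))   # 13 90.0
```

Here the 90th percentile is 13, and 90% of the values are less than or equal to 13.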

Standard Deviation

The standard deviation is the best indication of the spread of a set of data as it takes into consideration every value in the sample or population. Scientific calculators will calculate the standard deviation from entered data. The symbol for the standard deviation of a sample is s and for a population is σ. A calculator uses σ in both cases.

Put simply, the standard deviation measures how far, on average, the values in a sample lie from the mean.

Finding standard deviation of a sample using a table

Calculate the standard deviation of a sample 1, 2, 4, 5, 7, 8 without a calculator, and show how it is obtained.

Step 1 Find the mean of the sample: x̄ = (1 + 2 + 4 + 5 + 7 + 8) / 6 = 27 / 6 = 4.5
Step 2 Put the values into a table and calculate each deviation from the mean.
Step 3 Square these deviations to remove the negative signs.
Step 4 Add up the squared deviations.

x        x − x̄               (x − x̄)²
1        1 − 4.5 = −3.5      12.25
2        2 − 4.5 = −2.5      6.25
4        4 − 4.5 = −0.5      0.25
5        5 − 4.5 = 0.5       0.25
7        7 − 4.5 = 2.5       6.25
8        8 − 4.5 = 3.5       12.25
Total    Σ(x − x̄)² =         37.5

Step 5 Find the average of the squared deviations. This number is called the variance.

Variance = 37.5 / 6 = 6.25

Step 6 Take the square root. This number is called the standard deviation.

Standard deviation = √6.25 = 2.5
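The six steps can be followed line by line in Python. This is a sketch of the table method only, dividing by n as in Step 5; the variable names are chosen here for illustration.

```python
import math

sample = [1, 2, 4, 5, 7, 8]
n = len(sample)

mean = sum(sample) / n                      # Step 1: mean of the sample
deviations = [x - mean for x in sample]     # Step 2: deviation from the mean
squared = [d ** 2 for d in deviations]      # Step 3: square each deviation
total = sum(squared)                        # Step 4: add up the squares
variance = total / n                        # Step 5: average of the squares
std_dev = math.sqrt(variance)               # Step 6: square root

print(mean, total, variance, std_dev)       # 4.5 37.5 6.25 2.5
```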

If the values are in the form of a frequency table, use the formula:

Standard deviation = √( Σf(x − x̄)² / Σf ), where f is the frequency of each value x. See example.
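A possible Python sketch for frequency-table data is given below. It assumes the formula divides the frequency-weighted sum of squared deviations by the total frequency, which matches the table method above; the function name and the small frequency table are hypothetical and not taken from this page.

```python
import math

def frequency_std_dev(freq_table):
    """Standard deviation from a dict mapping each value x to its frequency f.

    Assumes the divisor is the total frequency (the sum of f).
    """
    total_f = sum(freq_table.values())
    mean = sum(x * f for x, f in freq_table.items()) / total_f
    sum_sq = sum(f * (x - mean) ** 2 for x, f in freq_table.items())
    return math.sqrt(sum_sq / total_f)

# Hypothetical table: the value 2 occurs once, 4 occurs twice, 6 occurs once.
table = {2: 1, 4: 2, 6: 1}
print(frequency_std_dev(table))
```

For this table the mean is 4, the weighted sum of squared deviations is 8, and the variance is 8 / 4 = 2, so the result is the square root of 2.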

 

Finding standard deviation of a sample using a calculator

The procedure for finding statistical values will vary slightly from calculator to calculator.

For a Casio FX82TL calculator, to find the standard deviation of 1, 2, 4, 5, 7, and 8:

Task: Select the statistical mode
Press: MODE 2
Action: MODE gives three choices: COMP(1), SD(2) or REG(3). Pressing 2 selects SD (statistics) mode.

Task: Clear old data
Press: Y11_Measures_of_Spread_11.gif Y11_Measures_of_Spread_12.gif Y11_Measures_of_Spread_13.gif
Action: Always do this before entering new data.

Task: Enter the data
Press: 1 DT 2 DT 4 DT 5 DT 7 DT 8 DT
Action: Enters the six numbers 1, 2, 4, 5, 7 and 8.

Task: Find the standard deviation
Press: Y11_Measures_of_Spread_11.gif 2 Y11_Measures_of_Spread_13.gif
Display: 2.5
Action: Finds the standard deviation, σ = 2.5.

Task: Clear old data ready for the next calculation
Press: Y11_Measures_of_Spread_11.gif Y11_Measures_of_Spread_12.gif Y11_Measures_of_Spread_13.gif
Action: Always do this after finishing a calculation.

If the values are in the form of a frequency table look up the method in your calculator manual.

Pressing RCL and C on this calculator to find n is a good check that all the values have been entered.
In this case it should give n = 6.

Finding standard deviation on a spreadsheet

Enter the data:

Y11_Measures_of_Spread_19.gif

The function entered in cell B8 to find the standard deviation is =STDEV(A2..A7). This gives a standard deviation of 2.74 (rounded to 3 s.f.).

This value differs from that worked out in the two previous methods. The reason is that the spreadsheet has used (n − 1) instead of n as the denominator in the formula.
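The same difference can be seen in Python's standard library `statistics` module, which provides both versions: `pstdev` divides by n and `stdev` divides by (n − 1). This is offered as a cross-check, not as a description of how any particular spreadsheet works internally.

```python
import statistics

data = [1, 2, 4, 5, 7, 8]

# Divides by n: matches the table and calculator methods.
print(statistics.pstdev(data))             # 2.5

# Divides by (n - 1): matches the spreadsheet result.
print(round(statistics.stdev(data), 2))    # 2.74
```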

Why two standard deviation buttons on a calculator?

σn, which divides by n, is usually used to find the standard deviation of a population, when all of the values are used, as in our example.

σn−1, which divides by (n − 1), is used for a sample taken from a population. Some of the extreme values of the population may be missing from a sample, so dividing by n would tend to under-estimate the standard deviation.

Unless you have taken a sample, use the σn button. The two values are close when n is large.
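Because the two buttons differ only in the divisor, the (n − 1) result is always the n result multiplied by √(n / (n − 1)). The short sketch below shows how this factor changes with the number of values; the function name `divisor_ratio` is invented for this illustration.

```python
import math

def divisor_ratio(n):
    """Ratio of the (n - 1)-divisor result to the n-divisor result."""
    return math.sqrt(n / (n - 1))

for n in (6, 30, 1000):
    print(n, round(divisor_ratio(n), 3))
```

For the six values used on this page the factor is about 1.095, which turns 2.5 into 2.74. For large data sets the factor is very close to 1, so the two buttons give almost the same answer.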