The mean, median and mode of both sets of data are 6, but the second set of numbers is much more spread out than the first.
The spread of a set of data can be measured using the range, quartiles, percentiles, variance and the standard deviation. The Working with Data activity provides practice at finding these measures of spread.
Range
The range of a set of values is the difference between the highest value and the lowest value.
Example: Find the range of 4, 5, 6, 6, 7, 8.
Range = 8 − 4 = 4
Quartiles
The median is the value which splits a set of values into two equal parts.
The quartiles split a set of values into four equal parts.
The lower quartile (LQ or Q1) is the value below which one quarter of the values lie.
The upper quartile (UQ or Q3) is the value below which three quarters of the values lie.
Finding the quartiles
For a set of data with an odd number of values:
Example 1 For the 11 values: 1, 2, 3, 5, 6, 6, 7, 7, 9, 9, 10
Measure | Position | Value
LQ | 3rd value | 3
Median | 6th value | 6
UQ | 9th value | 9
For a set of data with an even number of values:
Example 2 For the 10 values: 3, 4, 5, 7, 9, 10, 11, 12, 13, 20
Measure | Position | Value
LQ | 3rd value | 5
Median | between 5th and 6th value | 9.5
UQ | 8th value | 12
Formula for working out the position of the quartiles and median
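Quartile depth = $\frac{\lfloor n/2 \rfloor + 1}{2}$, where n is the number of values and $\lfloor n/2 \rfloor$ means that any half is dropped when n/2 is found.
In Example 1, n = 11, so n/2 = 5.5 becomes 5 and the quartile depth is (5 + 1)/2 = 3.
Therefore, the lower quartile is the 3rd value from the bottom of the set of values.
The upper quartile is the 3rd value from the top of the set of values.
Median depth = $\frac{n + 1}{2}$ = (11 + 1)/2 = 6
Therefore the median is the 6th value from either the top or from the bottom of the set of values.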
The interquartile range of a set of values is the difference between the upper quartile and the lower quartile.
In example 1 above the interquartile range is 9 − 3 = 6
In example 2 above the interquartile range is 12 − 5 = 7
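The depth rule can be checked against both examples with a short Python sketch. It assumes the quartile depth is (n/2 with any half dropped, plus 1) divided by 2 and the median depth is (n + 1)/2; the function names `depth_value` and `quartiles` are made up for this illustration.

```python
def depth_value(sorted_values, depth):
    """Value at a 1-based depth; a depth ending in .5 averages the two neighbours."""
    lower = int(depth)  # whole part of the depth
    if depth == lower:
        return sorted_values[lower - 1]
    return (sorted_values[lower - 1] + sorted_values[lower]) / 2


def quartiles(values):
    """Return (LQ, median, UQ) using the depth rule described above."""
    data = sorted(values)
    n = len(data)
    median_depth = (n + 1) / 2
    quartile_depth = (n // 2 + 1) / 2  # n // 2 drops any half
    lq = depth_value(data, quartile_depth)        # counted from the bottom
    median = depth_value(data, median_depth)
    uq = depth_value(data[::-1], quartile_depth)  # counted from the top
    return lq, median, uq


example_1 = [1, 2, 3, 5, 6, 6, 7, 7, 9, 9, 10]
example_2 = [3, 4, 5, 7, 9, 10, 11, 12, 13, 20]

for data in (example_1, example_2):
    lq, median, uq = quartiles(data)
    print("range:", max(data) - min(data), "LQ:", lq, "median:", median,
          "UQ:", uq, "IQR:", uq - lq)
```

Other textbooks and software use slightly different quartile rules, so their answers may differ a little from these for some data sets.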
Percentiles
The quartiles and the median are special cases of the percentiles.
The lower quartile is the 25th percentile.
The median is the 50th percentile.
The upper quartile is the 75th percentile.
The xth percentile is the value such that x% of the sample is less than or equal to it.
e.g. If the 90th percentile is 84 then 90% of the sample are less than or equal to 84.
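As an illustration, the Python sketch below finds a percentile by the "nearest rank" method, which is one common convention and an assumption here, since this page does not say which method to use. The function names are made up for the example, and the data are the ten values from Example 2.

```python
import math


def nearest_rank_percentile(values, x):
    """Smallest sample value with at least x% of the sample less than or equal to it."""
    data = sorted(values)
    rank = max(1, math.ceil(x * len(data) / 100))  # 1-based position in sorted data
    return data[rank - 1]


def percent_at_or_below(values, value):
    """Percentage of the sample that is less than or equal to value."""
    return 100 * sum(1 for v in values if v <= value) / len(values)


sample = [3, 4, 5, 7, 9, 10, 11, 12, 13, 20]  # the values from Example 2
p90 = nearest_rank_percentile(sample, 90)
print(p90, percent_at_or_below(sample, p90))  # prints: 13 90.0
```

With this method the 25th and 75th percentiles of these values are 5 and 12, the same as the quartiles found earlier, but the 50th percentile is 9 rather than the median of 9.5. Different percentile conventions can give slightly different values for small samples.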
Standard Deviation
The standard deviation is the best indication of the spread of a set of data as it takes into consideration every value in the sample or population. Scientific calculators will calculate the standard deviation from entered data. The symbol for the standard deviation of a sample is s and for a population is σ. A calculator uses σ in both cases.
In words, and simply put, the standard deviation is a measure of the average of the amounts that each value in a sample varies from the mean.
Finding standard deviation of a sample using a table
Calculate the standard deviation of a sample 1, 2, 4, 5, 7, 8 without a calculator, and show how it is obtained.
Step 1 | Find the mean of the sample. | (1 + 2 + 4 + 5 + 7 + 8) ÷ 6 = 27 ÷ 6 = 4.5
Step 2 | Put values into a table and calculate deviation from mean. | −3.5, −2.5, −0.5, 0.5, 2.5, 3.5
Step 3 | Square these values to remove negative sign. | 12.25, 6.25, 0.25, 0.25, 6.25, 12.25
Step 4 | Add up these values. | 37.5
Step 5 | Find average of these values. This number is called the variance. | 37.5 ÷ 6 = 6.25
Step 6 | Take the square root. This number is called the standard deviation. | √6.25 = 2.5
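The six steps above can be followed line by line in a short Python sketch. It divides by the number of values n in Step 5, which agrees with the calculator answer of 2.5 given later on this page; the variable names are only for illustration.

```python
import math

sample = [1, 2, 4, 5, 7, 8]

mean = sum(sample) / len(sample)          # Step 1: mean of the sample
deviations = [x - mean for x in sample]   # Step 2: deviation of each value from the mean
squared = [d ** 2 for d in deviations]    # Step 3: square to remove negative signs
total = sum(squared)                      # Step 4: add up the squared deviations
variance = total / len(sample)            # Step 5: average of the squares (the variance)
std_dev = math.sqrt(variance)             # Step 6: square root (the standard deviation)

print(mean, total, variance, std_dev)     # prints: 4.5 37.5 6.25 2.5
```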
If the values are in the form of a frequency table, use the formula:
$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$. See example.
Alternative Standard Deviation Formula
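These formulae are lengthy to use because they work with the deviation of each score from the mean, so alternative versions are available.
 | Standard deviation | Alternative formula
For raw data | $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$ | $s = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}$
For a frequency distribution | $s = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$ | $s = \sqrt{\frac{\sum f x^2}{\sum f} - \bar{x}^2}$
Remember that n = Σf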
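A quick way to see that the two versions agree is to compute both for the same raw data. The Python sketch below assumes the alternative formula is the square root of (the mean of the squares minus the square of the mean); the function names are made up for this illustration.

```python
import math


def sd_definition(values):
    """Standard deviation from the squared deviations about the mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_alternative(values):
    """Standard deviation from the mean of the squares minus the squared mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum(x * x for x in values) / n - mean ** 2)


data = [1, 2, 4, 5, 7, 8]
print(sd_definition(data), sd_alternative(data))
```

The alternative version needs only the totals Σx and Σx², so no table of deviations is required.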
Variance
The variance is the square of the standard deviation.
Variance = (standard deviation)²
Finding standard deviation of a sample of raw data using a calculator
The procedure for finding statistical values will vary slightly from calculator to calculator.
For a typical calculator, to find the standard deviation of 1, 2, 4, 5, 7, and 8:
Task | Press | Action
Select the statistical mode | MODE | Gives three choices: COMP(1), SD(2) or REG(3). Select SD (statistics) mode.
Enter the data | DT DT DT DT DT DT | Enters the six numbers 1, 2, 4, 5, 7 and 8 (DT is pressed after each number).
Find standard deviation | 2.5 | Finds the standard deviation, σ = 2.5
Clear old data | | Always do this before entering new data.
Pressing RCL and C on this calculator to find n is a good check that all values have been entered.
In this case it should give n = 6.
Finding standard deviation of a sample of a frequency distribution using a calculator
The procedure for finding statistical values will vary slightly from calculator to calculator.
For a Casio FX82TL calculator, to find the standard deviation of the frequency distribution:
x | f
3 | 2
4 | 7
5 | 8
Task | Press | Action
Select the statistical mode | MODE | Gives three choices: COMP(1), SD(2) or REG(3). Select SD (statistics) mode.
Enter the data | DT DT DT | Enters the seventeen numbers. (To check, press RCL followed by C.)
Find standard deviation | 0.680931582 | Finds the standard deviation, s = 0.68 (to 2 sig. fig.)
Clear old data | | Always do this before entering new data.
Finding standard deviation of a frequency distribution on a spreadsheet
x | f
3 | 2
4 | 7
5 | 8
Enter the data in the spreadsheet:
Now use the formula and values from the spreadsheet to find s = 0.68 (to 2 sig.fig.)
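The same working can be sketched in Python, with each total playing the part of a spreadsheet column sum. This assumes the alternative formula for a frequency distribution (mean of the squares minus the square of the mean); the variable names are only for illustration.

```python
import math

table = [(3, 2), (4, 7), (5, 8)]  # (x, f) rows of the frequency table

n = sum(f for _, f in table)                 # total of the f column
sum_fx = sum(f * x for x, f in table)        # total of an f*x column
sum_fx2 = sum(f * x * x for x, f in table)   # total of an f*x*x column

mean = sum_fx / n
s = math.sqrt(sum_fx2 / n - mean ** 2)

print(n, sum_fx, sum_fx2, round(s, 2))       # prints: 17 74 330 0.68
```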
Why are there two standard deviation buttons on a calculator?
The σn button divides by n. It is used to find the standard deviation of a population, when all of the values are used, as in our example.
The σn−1 button divides by n − 1. It is used for a sample taken from a population, because some of the extreme values of the population may be missing from the sample and the standard deviation would otherwise be under-estimated.
Unless you have taken a sample, use the σn button. The two values are very close when there are many values.
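Python's standard `statistics` module has the same two choices: `pstdev` divides by n and `stdev` divides by n − 1. A minimal sketch using the six values from the earlier example:

```python
import statistics

data = [1, 2, 4, 5, 7, 8]

population_sd = statistics.pstdev(data)  # divides by n, like the σn button
sample_sd = statistics.stdev(data)       # divides by n - 1, like the σn-1 button

print(population_sd, round(sample_sd, 3))  # prints: 2.5 2.739
```

With only six values the two answers differ noticeably; the gap shrinks as the number of values grows.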
Transforming the data
If each value in a set of data is changed in the same way, this may affect the values of the mean and standard deviation.
 | Add c to each value | Multiply each value by d
Mean | increases by c | multiplied by d
Standard deviation | stays same | multiplied by d (by the size of d, ignoring its sign, if d is negative)
e.g. The mean of a set of values is 4.2 and the standard deviation is 5. If each value has 6 added to it, what are the new mean and standard deviation?
The mean would become 10.2 and the standard deviation remains at 5.
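These rules can be checked numerically with a short Python sketch. The data set and the values c = 6 and d = 3 are chosen only for illustration, and the standard deviation is the population version (dividing by n).

```python
import statistics

data = [1, 2, 4, 5, 7, 8]
c, d = 6, 3

shifted = [x + c for x in data]  # add c to each value
scaled = [x * d for x in data]   # multiply each value by d

for name, values in (("original", data), ("add c", shifted), ("multiply by d", scaled)):
    print(name, statistics.mean(values), statistics.pstdev(values))
```

Adding 6 moves the mean from 4.5 to 10.5 and leaves the standard deviation at 2.5, while multiplying by 3 changes the mean to 13.5 and the standard deviation to 7.5.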