## Statistical Investigations

The use of statistics involves the gathering of data. This data is then processed, organised and displayed. Finally, the data is presented and discussed, and conclusions are drawn.

### Designing a Statistical Survey

The collection of data

Many people require statistics for a wide range of purposes.

e.g. A manufacturer may need to know if there is a demand for a certain product.

A government department may need information to help plan for the future.

A political party may want to know how popular it is.

A medical researcher may need to know if a certain drug is effective at fighting a disease.

The collection of data takes time and is therefore expensive.
Much thought must be put into the planning of any statistical investigation.

e.g. The Australian and New Zealand Census, which collects information about the entire population, is held every five years, and thousands of people are employed to collect and analyse the data.

Note that in statistics the term population does not necessarily mean the entire population of a town or country; it refers to the overall group being studied. Most statistical investigations do not set out to obtain data about every item in a population but rely instead on a sample from the population.

Choosing a sample

• The first thing to decide is the size of the sample. It needs to be large enough to be truly representative, but not so large that the survey becomes too expensive and time-consuming.
• A sample must be evenly spread over the population.
• Choosing a random sample helps to remove bias. Bias occurs when a sample does not accurately represent the group from which it is taken.
There are several ways to obtain a random sample.

e.g. Draw names out of a hat.

Choose names at regular intervals from an alphabetical list.

Give every person or item a number and choose the numbers at random, using special tables, a computer or a draw (as in Lotto).
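The "hat" and "numbering" methods can be imitated on a computer. The following is a minimal sketch only: the names are made up, and the sample size of 3 is an arbitrary choice for illustration.

```python
import random

# Hypothetical population: six made-up names, for illustration only.
names = ["Aroha", "Ben", "Chen", "Dina", "Eli", "Fetu"]

rng = random.Random(1)  # fixed seed so the sketch gives the same result each run

# "Names out of a hat": each name has the same chance and none is drawn twice.
hat_sample = rng.sample(names, k=3)

# Numbering the items: give every person a number, then draw numbers at random.
numbered = dict(enumerate(names, start=1))
drawn_numbers = rng.sample(sorted(numbered), k=3)
numbered_sample = [numbered[n] for n in drawn_numbers]

print(hat_sample)
print(numbered_sample)
```

The fixed seed is only there to make the sketch repeatable; a real draw would normally leave it out.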

Questionnaires and Interviews

A questionnaire is a form used to obtain information. Careful preparation of questionnaires is essential and may require special training. Questionnaires can be sent out to people, but this often results in a low return rate. Alternatively, interviewers can stop people in the street or ring them to ask the questions. Surveys using these kinds of techniques can often produce biased samples unless they are well designed.

Organising and displaying the data

When the data has been collected, it is often summarised in tables and graphs.

Pictographs, column graphs and pie graphs are common ways of displaying this data.

Data can also be sorted into a frequency distribution and displayed in a histogram or a stem and leaf diagram. Statistical calculations can then be carried out to find information such as the mean, median and quartiles, and these can be shown on a box and whisker diagram.
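As a rough illustration of these calculations, the sketch below builds a frequency distribution and finds the mean, median and quartiles using Python's standard `statistics` module. The twelve data values are made up. Textbooks and software use slightly different rules for quartiles, so the `method="inclusive"` setting here is an assumption and the quartiles may differ a little from a hand calculation.

```python
import statistics
from collections import Counter

# Made-up data for illustration: hours of TV watched per day by 12 people.
hours = [1.5, 2.0, 0.5, 3.0, 2.0, 1.0, 2.5, 2.0, 4.0, 1.5, 3.0, 2.0]

# Frequency distribution: how many times each value occurs, in ascending order.
frequency = dict(sorted(Counter(hours).items()))

mean = statistics.mean(hours)
median = statistics.median(hours)

# Quartiles: n=4 splits the data into four parts and returns the three cut points.
q1, q2, q3 = statistics.quantiles(hours, n=4, method="inclusive")

for value, count in frequency.items():
    print(f"{value:>4} hours: {count}")
print("mean:", round(mean, 2), "median:", median, "quartiles:", q1, q2, q3)
```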

Analysis of the data

Finally, and most importantly, the data, tables, graphs and calculations can be analysed. The results of the survey are then presented and summarised, and any conclusions are drawn.

### Example of a Statistical Survey

The Principal of a College wished to know the amount of TV watched each day, and the types of programmes being viewed, by her 800 students. There are many ways to approach this task; the method described below is the one chosen by Anna, who carried out the survey.

The collection of data

Two types of information were required:

• To find the hours of TV watched: it was decided to find out the number of hours, to the nearest half hour, that each student watched TV throughout the week. Students would be randomly selected and interviewed to find this information.

• To find out the types of programmes watched: a questionnaire would be designed and distributed to randomly selected students and collected personally to ensure a 100% return.

Choosing a sample

To ensure that a representative sample was chosen, Anna decided to get a list of students' names from the school office. This list was arranged alphabetically within form levels, i.e. Form 3 students, Form 4 students, through to Form 7 students. She then picked every 10th student (a total of 80 students).
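One reason this method gives a representative sample is that picking every 10th name from a list grouped by form level takes about one tenth of each form. The sketch below illustrates this. The roll sizes for each form are hypothetical (the source gives only the total of 800), and placeholder labels stand in for the students' names.

```python
from collections import Counter

# Hypothetical roll sizes per form level (illustrative only; they sum to 800).
roll_sizes = {"Form 3": 200, "Form 4": 180, "Form 5": 170, "Form 6": 140, "Form 7": 110}

# Build the office list: grouped by form level, in order within each form.
# Placeholder labels stand in for the alphabetically sorted names.
student_list = [
    (form, f"{form} student {i}")
    for form, size in roll_sizes.items()
    for i in range(1, size + 1)
]

# Pick every 10th student: the 10th, 20th, 30th, ... entries on the list.
sample = student_list[9::10]

# Count how many sampled students come from each form level.
per_form = Counter(form for form, _ in sample)

print(len(sample))
for form, count in per_form.items():
    print(form, count)
```

With these made-up roll sizes, each form contributes one tenth of its students to the sample of 80.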

Each of the students chosen was then asked the question:

"How many hours of TV, to the nearest half hour, do you watch on average per day?"

The results were recorded.

Questionnaire

Anna decided to design a questionnaire to find out the types of programmes watched by her selected students. To avoid open-ended questions with many different responses, which would have been difficult to analyse, she decided to sort the programmes into types herself and ask the students which of these types they preferred. Her questionnaire looked like this:

Please put numbers from 1 to 7 to show the type of TV programmes you prefer to watch.

(e.g. Of the types of programmes shown 1 would indicate your favourite type and 7 your least preferred type of programme)

| Type of programme | Order of preference (1-7) |
| --- | --- |
| Comedy | |
| Series | |
| Documentary | |
| Crime | |
| Current affairs (e.g. News) | |
| Sport | |
| Game shows | |

Organising and displaying the data

Anna sorted the data on the length of time spent watching TV by form level and found the mean, median and quartiles at each level and for the total number of students. She then displayed each mean on a bar graph, and the median and quartiles at each level using box plots.

The results from the questionnaire were more difficult to analyse. Anna decided to show them in a table indicating the number of times the students gave each rating to each type of programme.
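A table of this kind could be built by tallying the ratings. The sketch below is one possible way to do it, not Anna's actual working: the three responses are made up for illustration, and the layout of the printed table is an assumption.

```python
from collections import Counter

programme_types = [
    "Comedy", "Series", "Documentary", "Crime",
    "Current affairs", "Sport", "Game shows",
]

# Made-up responses from three students, for illustration only.
# Each response gives every programme type a rating from 1 (favourite) to 7.
responses = [
    {"Comedy": 1, "Series": 2, "Documentary": 6, "Crime": 3,
     "Current affairs": 7, "Sport": 4, "Game shows": 5},
    {"Comedy": 1, "Series": 3, "Documentary": 7, "Crime": 2,
     "Current affairs": 6, "Sport": 5, "Game shows": 4},
    {"Comedy": 2, "Series": 4, "Documentary": 5, "Crime": 3,
     "Current affairs": 7, "Sport": 1, "Game shows": 6},
]

# tally[programme][rating] = number of students who gave that rating.
tally = {programme: Counter() for programme in programme_types}
for response in responses:
    for programme, rating in response.items():
        tally[programme][rating] += 1

# Print one row per programme type and one column per rating.
print("Type of programme".ljust(18) + "".join(f"{r:>3}" for r in range(1, 8)))
for programme in programme_types:
    row = "".join(f"{tally[programme][r]:>3}" for r in range(1, 8))
    print(programme.ljust(18) + row)
```

Each column of the table adds up to the number of students who responded, which gives a quick check that no rating has been missed.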

Analysing the data

After organising and displaying the data in the ways described above, Anna was able to provide the Principal with the information she required.