## Careful design

According to HNC/HND Business core unit 5: Quantitative Techniques for Business (p52), “Data is simply a ‘scientific’ term for facts, figures, information and measurements”. Data can be divided into two types: discrete and continuous. Discrete variables can take only a finite or countable number of values within a given range, whilst continuous variables may take any value within a range, because they are measured rather than counted.

Information is data that has been transformed in some way, for example by summarising, tabulating, analysing or presenting it. If the data is ‘raw’ it is still unprocessed: it remains in the format in which it was collected, e.g. a list of numbers. There are two main categories of data: primary and secondary. Primary data is used for the purpose for which it was collected, so the investigator knows exactly where it came from and the circumstances under which it was collected. Secondary data is used for a purpose different from the one for which it was collected. Because the investigator did not actually collect the data, he/she may not know its limitations, and it may not be entirely suitable for the intended purpose.
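As a small illustration of turning raw data into information, the sketch below tabulates and summarises a list of numbers. The figures and the function names `tabulate` and `summarise` are made up for this example and do not come from the course text.

```python
from collections import Counter
from statistics import mean

# Hypothetical raw data: number of items bought by ten customers.
raw_data = [2, 3, 3, 1, 4, 3, 2, 5, 3, 2]


def tabulate(values):
    """Return a frequency table as (value, count) pairs, sorted by value."""
    return sorted(Counter(values).items())


def summarise(values):
    """Return a few summary figures for the data."""
    return {"count": len(values), "total": sum(values), "mean": mean(values)}


frequency_table = tabulate(raw_data)
summary = summarise(raw_data)

for value, count in frequency_table:
    print(f"{value} item(s): {count} customer(s)")
print(summary)
```

The raw list says little on its own, whereas the frequency table and the summary figures are the kind of information a manager could act on.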

Data can be collected by a variety of methods:

1. Direct observation: this can be expensive but is accurate. It also needs to be unobtrusive.
2. Direct inspection: this is a standard procedure carried out by organisations, whether on a permanent or temporary basis.
3. Written questionnaire: this is relatively cheap, but it has a low response rate and needs careful design.
4. Personal interviews: these are expensive, but they can deal with complex issues.
5. Abstract from published statistics: this is cheap and easy to use, but may not be directly relevant to what the organisation wants to know.

To ensure that data is unbiased, random sampling must be used when collecting it. A random sample is one in which each item in the population has an equal chance of being selected. However, samples are sometimes drawn by non-random methods, where randomness is forfeited in the interests of cheapness and administrative simplicity. The larger the sample, the more accurate the results will be; however, there is a point beyond which little is gained from increasing the sample size further.
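A minimal sketch of simple random sampling follows, using Python's standard `random` module. The population of 100 customer IDs and the standard deviation of 10 are hypothetical. The diminishing return from larger samples is illustrated with the standard error of a sample mean (standard deviation divided by the square root of the sample size), a standard result assumed here rather than one taken from the course text.

```python
import math
import random


def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size distinct items; each item is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, sample_size)


def standard_error(std_dev, sample_size):
    """Standard error of a sample mean: std_dev / sqrt(n)."""
    return std_dev / math.sqrt(sample_size)


# Hypothetical population: customer IDs 1 to 100.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=1)
print(sample)

# Assumed population standard deviation of 10 (illustrative only).
for n in (25, 100, 400, 1600):
    print(n, standard_error(10.0, n))
```

Under these assumptions, each fourfold increase in sample size only halves the standard error, which is why there comes a point where a larger sample is not worth the extra cost.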

The level below which x percent of data values fall is called the xth percentile. Three percentiles are commonly used: the 25th percentile, known as the lower quartile and denoted by Q1; the 50th percentile, known as the median and denoted by M; and the 75th percentile, known as the upper quartile and denoted by Q3. Descriptive statistics summarise data as single values, helping to describe it and identify summary measures. There are two kinds of summary measure: measures of location and measures of dispersion. Measures of location comprise three averages: the mean, the median and the mode.
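The quartiles and the three averages can be computed with Python's standard `statistics` module, as in the sketch below. The data set is invented for illustration. Several conventions exist for interpolating quartiles, so other textbooks or software may give slightly different values from the "inclusive" method assumed here.

```python
from statistics import mean, median, mode, quantiles

# Hypothetical data set, already sorted for readability.
data = [2, 4, 4, 5, 7, 9, 10, 12, 19]

# n=4 splits the data into quarters and returns the three cut points.
# The "inclusive" method treats the data as the whole population.
q1, m, q3 = quantiles(data, n=4, method="inclusive")

print("Q1 (25th percentile):", q1)
print("M  (50th percentile):", m)
print("Q3 (75th percentile):", q3)

# The three measures of location.
print("mean:", mean(data))
print("median:", median(data))
print("mode:", mode(data))
```

For this data set the 50th percentile returned by `quantiles` agrees with `median`, as expected from the definition of M.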