Data validation and management: Checking of combined final file

Checking raw data

Once the final data had been collated and all illegitimate responses removed, further checks were undertaken.

Firstly, all variables were checked to ensure that the number and patterns of response tallied with what was expected based on the survey routing.
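A routing check of this kind can be sketched in a few lines. The example below is illustrative only: the variable names (`q1_owns_car`, `q2_car_type`), the routing rule and the records are hypothetical and do not come from the survey itself. It flags cases where a routed question was answered by someone who should have skipped it, or left blank by someone who should have answered it.

```python
def routing_violations(records, filter_var, filter_value, routed_var):
    """Return (serial, problem) pairs where routed_var conflicts with the routing rule.

    Assumed rule (hypothetical): routed_var is asked only when
    filter_var == filter_value.
    """
    problems = []
    for rec in records:
        eligible = rec.get(filter_var) == filter_value
        answered = rec.get(routed_var) is not None
        if eligible and not answered:
            # Routed into the question but no answer recorded.
            problems.append((rec["serial"], "missing"))
        elif answered and not eligible:
            # Answer recorded although routing should have skipped the question.
            problems.append((rec["serial"], "unexpected"))
    return problems


# Hypothetical records for illustration.
records = [
    {"serial": "1001", "q1_owns_car": "yes", "q2_car_type": "petrol"},
    {"serial": "1002", "q1_owns_car": "no", "q2_car_type": None},
    {"serial": "1003", "q1_owns_car": "no", "q2_car_type": "diesel"},
    {"serial": "1004", "q1_owns_car": "yes", "q2_car_type": None},
]

violations = routing_violations(records, "q1_owns_car", "yes", "q2_car_type")
print(violations)
```

In practice, postal questionnaires tend to produce more "unexpected" answers than online ones, because routing on paper relies on the participant following instructions, so how such cases are treated is a separate editing decision.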

Once this was confirmed, further sense checks were conducted to ensure that the broad pattern of responses was plausible.

When postal responses are processed, questionnaire serials are occasionally entered incorrectly. This can happen when a smudged barcode is misread by the scanning software, or when the barcode is completely obscured (sometimes by the participant), so that the serial has to be entered manually and the data entry clerk makes a mistake.

It was necessary to ensure that all cases had a valid serial, so that important demographic information could be matched back into the final datafile. The first step was to compare the final datafile against the original sample to identify cases with invalid serials.

These cases were then rechecked against the sample to find the closest matching ‘genuine’ serial. Following that, a final sense check was carried out to ensure that the suggested matching serial was sufficiently similar to the scanned serial.

In all cases, either digits had been transposed or a typographic error had been made (e.g. a visually similar digit had been entered).
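The serial-matching steps above can be sketched as follows. This is a minimal illustration and not the procedure actually used: the report does not say how "closest matching" was measured, so the sketch assumes an edit distance that counts an adjacent transposition as a single edit (optimal string alignment distance), and it assumes that only sample serials not already claimed by a valid case are candidates. All serials, function names and the `max_distance` threshold are hypothetical. Ambiguous or distant matches return `None` so that they go to manual review, mirroring the final sense check.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance.

    Allowed edits: insertion, deletion, substitution, and transposition
    of two adjacent characters (each counts as one edit).
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def suggest_serial(scanned, candidates, max_distance=1):
    """Return the unique closest candidate within max_distance, else None."""
    scored = sorted((osa_distance(scanned, s), s) for s in candidates)
    if not scored or scored[0][0] > max_distance:
        return None  # nothing close enough: leave for manual review
    if len(scored) > 1 and scored[1][0] == scored[0][0]:
        return None  # tie between candidates: leave for manual review
    return scored[0][1]


# Hypothetical issued sample and returned serials.
sample_serials = {"104523", "208917", "317640"}
returned_serials = ["104523", "209817", "317648", "999999"]

# Step 1: identify returned serials that do not appear in the sample.
invalid = [s for s in returned_serials if s not in sample_serials]

# Step 2: look for the closest genuine serial among those not already used.
unused = sample_serials - set(returned_serials)
suggestions = {s: suggest_serial(s, unused) for s in invalid}
print(suggestions)
```

Here `209817` is matched to `208917` (two digits transposed) and `317648` to `317640` (one digit mistyped), while `999999` has no close match and is left unresolved.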

Checking final survey weights

After the survey weights had been created, the following checks were carried out:

  • that the weighted profile of respondents matched the weighting targets as closely as possible;
  • that the range of weights was not excessive (note that the sample design for this study means that some weights are large);
  • that the weights corresponded correctly to the mode, phase and group for each case;
  • that every case had a weight > 0 for each of the weighting variables where relevant; and
  • that the weighted analysis differed from the unweighted analysis, but only by the scale and in the direction expected.
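Several of the checks listed above lend themselves to simple automation. The sketch below is illustrative only: the field names (`age_group`, `satisfied`, `weight`), the targets, the tolerance and the `max_ratio` threshold are all hypothetical, and the report does not state what thresholds were applied. It checks that all weights are positive, that the weighted profile matches the targets within a tolerance, and that the ratio of largest to smallest weight stays under a chosen limit, then compares a weighted and an unweighted estimate.

```python
from collections import defaultdict


def weighted_profile(cases, var, weight="weight"):
    """Weighted share of each category of var."""
    totals = defaultdict(float)
    for case in cases:
        totals[case[var]] += case[weight]
    grand_total = sum(totals.values())
    return {category: total / grand_total for category, total in totals.items()}


def check_weights(cases, var, targets, tolerance=0.01, max_ratio=10.0):
    """Run basic checks; tolerance and max_ratio are assumed thresholds."""
    weights = [case["weight"] for case in cases]
    profile = weighted_profile(cases, var)
    return {
        "all_positive": all(w > 0 for w in weights),
        "profile_matches": all(
            abs(profile.get(category, 0.0) - target) <= tolerance
            for category, target in targets.items()
        ),
        "range_ok": min(weights) > 0 and max(weights) / min(weights) <= max_ratio,
    }


def weighted_mean(cases, var, weight="weight"):
    """Weighted mean of a numeric variable."""
    return sum(c[var] * c[weight] for c in cases) / sum(c[weight] for c in cases)


# Hypothetical data: under-50s are under-represented, so they get larger weights.
cases = [
    {"age_group": "under_50", "satisfied": 1, "weight": 1.5},
    {"age_group": "under_50", "satisfied": 1, "weight": 1.5},
    {"age_group": "50_plus", "satisfied": 1, "weight": 0.75},
    {"age_group": "50_plus", "satisfied": 0, "weight": 0.75},
    {"age_group": "50_plus", "satisfied": 0, "weight": 0.75},
    {"age_group": "50_plus", "satisfied": 0, "weight": 0.75},
]
targets = {"under_50": 0.5, "50_plus": 0.5}

results = check_weights(cases, "age_group", targets)
unweighted = sum(c["satisfied"] for c in cases) / len(cases)
weighted = weighted_mean(cases, "satisfied")
print(results, unweighted, weighted)
```

In this toy example the weighted estimate is higher than the unweighted one, which is the expected direction: the up-weighted group is also the more satisfied group.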
