Online data
Online data need little editing following fieldwork because the checks and edits are built into the questionnaire.
Where multiple answers are selected on a single-code question, respondents are asked to correct their answer, and the computer will not accept certain impossible answers.
After the data were received in the office, rules were set for defining missing values (to describe the reasons for the missing value) and a number of further edits and imputations were possible.
Missing values
A question in the survey data may be unanswered for various reasons. In an online survey, answer codes for ‘don’t know’ and ‘prefer not to say’ allow respondents to proceed past questions that they cannot or do not wish to answer.
Some questions may also not be applicable because they were not asked of respondents in that group or were not asked during that data collection period. A respondent may also stop completing the survey part way through.
A consistent series of missing value codes has been used in the data to denote where data are missing; these are described in the following table.
Missing values and codes used
| Code | Description | Application |
|---|---|---|
| -99 | Missing, should have been answered | Respondents who are eligible for a question but have not answered it |
| -98 | Not applicable: Survey routing | Respondents who are not eligible for a question or measure |
| -97 | Incorrectly multi-coded, implausible or cannot be derived | Used on impairment number variable where known to have disability but number not known |
| -96 | Outlier on minutes of activity | Minutes of activity set to missing because of extreme number of minutes or activities |
| -95 | Don't know / Cannot give estimate | Respondents who are unable to answer a question |
| -94 | Prefer not to say | Respondents who explicitly chose not to answer a particular question |
| -93 | Question not asked due to the COVID-19 pandemic | Questions skipped due to the COVID-19 pandemic (not used in 2024–25) |
| -92 | Missing: question not asked in this survey year or term | Question not applicable during that data collection period |
| -91 | Missing: reason unspecified | Used to replace system missing values unless recoded to -99 or -98 |
| -1 | Score/value not available in derived variable | Used where missing information prevents a derived value being created |
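As an illustration of how the codes in the table above might be handled in analysis, the following is a minimal Python sketch. The codes and labels are taken from the table; the function names (`is_missing`, `recode_system_missing`, `summarise`) and the use of `None` to represent a system missing value are assumptions made for the example and are not part of the survey's own processing.

```python
# Missing-value codes and descriptions, copied from the table above.
MISSING_CODES = {
    -99: "Missing, should have been answered",
    -98: "Not applicable: Survey routing",
    -97: "Incorrectly multi-coded, implausible or cannot be derived",
    -96: "Outlier on minutes of activity",
    -95: "Don't know / Cannot give estimate",
    -94: "Prefer not to say",
    -93: "Question not asked due to the COVID-19 pandemic",
    -92: "Missing: question not asked in this survey year or term",
    -91: "Missing: reason unspecified",
    -1: "Score/value not available in derived variable",
}


def is_missing(value):
    """True for a documented missing code or a system missing value.

    Assumption: system missing values are represented as None.
    """
    return value is None or value in MISSING_CODES


def recode_system_missing(values):
    """Replace system missing values (None) with -91, 'reason unspecified'."""
    return [-91 if v is None else v for v in values]


def summarise(values):
    """Return the valid answers and a count of each missing code found."""
    valid = [v for v in values if not is_missing(v)]
    counts = {}
    for v in values:
        if v in MISSING_CODES:
            counts[v] = counts.get(v, 0) + 1
    return valid, counts


# Hypothetical answers to a single question.
responses = [3, -95, None, 5, -99, -95]
recoded = recode_system_missing(responses)
valid, counts = summarise(recoded)
```

In this sketch the valid answers can be analysed on their own, while the counts show why the remaining values are missing.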
Wherever possible, the base for questions has been set to all respondents. However, for questions not asked at all for one group, missing values must be used.
For the main activity measures, the base is all respondents. If data are missing for one of the activities, the respondent is treated as not having done that activity.
This is because many different activities are asked about and many different variables feed in (days on which the activity was done, minutes and two intensity questions). If anyone with missing data on one or more of these variables were excluded, there would be a very large number of respondents for whom these key measures could not be calculated.
Furthermore, the questionnaire was designed so that the absence of an answer for having done the activity in the week is treated as not having done the activity for the purpose of the main activity measures, so there are no missing data on whether the activity was done in the last week.
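The rule described above, where missing data on an activity are treated as not having done it, can be sketched as follows. This is an illustrative example only: the calculation of weekly minutes as days multiplied by minutes per day, the omission of the intensity questions, the field names (`days`, `minutes`) and the activities shown are all assumptions, not the survey's actual derivation.

```python
def activity_minutes(days, minutes_per_day):
    """Weekly minutes for one activity; any missing input counts as zero.

    Assumption: missing inputs appear as None or as a negative
    missing-value code (for example -99 or -96).
    """
    if days is None or minutes_per_day is None:
        return 0
    if days < 0 or minutes_per_day < 0:
        return 0
    return days * minutes_per_day


def total_weekly_minutes(activities):
    """Sum weekly minutes over all activities, keeping every respondent in the base."""
    return sum(
        activity_minutes(a.get("days"), a.get("minutes"))
        for a in activities.values()
    )


# Hypothetical respondent with missing data on two of four activities.
respondent = {
    "football": {"days": 2, "minutes": 30},
    "swimming": {"days": -99, "minutes": 45},  # days not answered
    "cycling": {"days": 3, "minutes": -96},    # minutes set to missing as an outlier
    "dance": {"days": 1, "minutes": 20},
}
total = total_weekly_minutes(respondent)
```

Under these assumptions the respondent still receives a total (from football and dance only) rather than being dropped from the measure.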