The new 2011 National Household Survey is of interest for two reasons. The first is the quality of its data. The second, and to me the more important, is whether replacing the long-form census with this survey is sustainable if our objective is to maintain the quality of all social and household data produced by Statistics Canada.
On the quality of NHS data, I fully agree with Statistics Canada's view that the survey "will produce usable and useful data that will meet the needs of many users … It will not, however, provide a level of quality that would have been achieved through a mandatory long-form census."
The NHS has released data for just 75.3 per cent of Canada's 4,567 census subdivisions, compared to a release rate of 96.6 per cent for the 2006 long-form census. The data not released were of particularly poor quality. For Saskatchewan, at the low end, the release rate is just 57.4 per cent.
Another problem is data comparability over time. Given the magnitude of change from the 2006 census, it is not clear whether the NHS data reflect a real change in outcomes or simply a statistical artifact due to the change in methodology. For current and future researchers, the gap in 2011 census information will be a major headache.
Some have suggested, correctly, that a researcher may want to be cautious in using the NHS data if they show changes from the 2006 census that would seem to be unreasonable at first sight. However, the problem is bigger than that, because there is no good anchor with which to compare the NHS data.
Consider a variable for which the 2011 NHS data look reasonable. What if the reasonableness is the result of an offsetting combination of poor-quality survey data and a real, abnormal change in the underlying trend? For example, suppose the average wage of a particular group of workers actually fell by 10 per cent over five years (something that could be known only from a census), while survey response bias added an apparent wage gain of 15 per cent (something that may never be known). The survey would then report a net increase in that group's wage of about 5 per cent over five years, similar to the overall increase in wages. What should a researcher do when data look reasonable but the underlying factors are unknown?
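The arithmetic of that example can be written out in a few lines. This is only a sketch using the hypothetical figures from the paragraph above; the function name `reported_change` and the "ordinary" comparison scenario are assumptions made for illustration, not anything drawn from NHS data.

```python
# Hypothetical illustration: two very different realities can produce
# the same survey figure. None of these numbers are actual NHS results.

def reported_change(true_change, bias_effect):
    """Additive approximation used in the text: what the survey shows
    is the real change plus the distortion from response bias."""
    return true_change + bias_effect

# Scenario A: wages really fell 10 per cent, but bias added 15 points.
masked = reported_change(true_change=-0.10, bias_effect=0.15)

# Scenario B (assumed for comparison): wages really rose 5 per cent, no bias.
ordinary = reported_change(true_change=0.05, bias_effect=0.0)

print(f"Scenario A reported change: {masked:+.1%}")
print(f"Scenario B reported change: {ordinary:+.1%}")
```

Both scenarios print the same 5 per cent gain. Without an independent anchor such as a census, the survey figure alone cannot tell the researcher which scenario is the true one.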
The more important issue in replacing the census with the NHS is the potential for a downward spiral in the quality of social and household data over time. For a statistical agency that ranks among the best in the world, this should be serious cause for concern. The agency has some of the best and brightest statisticians in the world, and their excellent work does show up in the mitigation strategy they have applied to produce the NHS data. But no amount of excellence can overcome the fact that their hands are tied behind their backs.
Statistics Canada collects a considerable amount of social and economic data using a range of surveys. These raw data are affected by response biases. Statistics Canada used to "adjust" these raw survey data by using the long-form census as an anchor. For example, if a population group's survey response rate is low, Statistics Canada would use the group's census weight to generate aggregate wage information.
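To make the idea of a census anchor concrete, here is a minimal sketch of one generic way such an adjustment can work: each group's survey average is weighted by its share of the population in the census rather than by its share of survey respondents. The group names, shares and wages are all invented, and this is not a description of Statistics Canada's actual procedures.

```python
# Hypothetical illustration of anchoring a survey estimate to census shares.
# All groups and figures are invented for the example.

census_share = {"group_a": 0.30, "group_b": 0.70}  # population shares from the census

survey_wages = {
    "group_a": [20.0, 22.0],  # low response: only two respondents
    "group_b": [30.0, 32.0, 34.0, 28.0, 31.0, 31.0, 30.0, 32.0],
}

def unweighted_mean(wages_by_group):
    """Average over all respondents, ignoring who failed to respond."""
    all_wages = [w for wages in wages_by_group.values() for w in wages]
    return sum(all_wages) / len(all_wages)

def census_weighted_mean(wages_by_group, shares):
    """Average of group means, each weighted by its census population share."""
    return sum(
        shares[group] * (sum(wages) / len(wages))
        for group, wages in wages_by_group.items()
    )

raw = unweighted_mean(survey_wages)
adjusted = census_weighted_mean(survey_wages, census_share)

print(f"Raw survey average:      {raw:.2f}")
print(f"Census-anchored average: {adjusted:.2f}")
```

In this invented case group_a makes up 30 per cent of the population but only 20 per cent of respondents, so the raw average overstates the aggregate wage. The correction is only as good as the census shares behind it: if those shares are out of date, the adjusted figure inherits the error.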
A census used to be done every five years to ensure that the anchor provided appropriate, up-to-date information in order to adjust data from other surveys. We are now in a funny upside-down world: We're using the old census data to fix the survey results when the objective was to find a new anchor to fix survey results because the old anchor was out of date.
This is a vicious circle. The 2006 long-form census will continue to be used as an anchor to adjust other surveys and the longer we use it, the less reliable it will become. Presumably, survey information gathered in the future using this flawed process will be used to correct the 2006 census anchor, which would then be used to adjust other surveys. At some stage, it will become a process of garbage guiding garbage.
The only way to avoid this is to restore sanity and bring back the long-form census, but the decision can't wait long. Statistics Canada must start preparing soon if there is to be a long-form census in 2016.
Munir A. Sheikh, former chief statistician of Canada, is a distinguished fellow at the School of Public Policy, Queen's University.