Wednesday marks the initial release of data from Statistics Canada's National Household Survey, the voluntary survey that replaces the mandatory long-form census. In the run-up to this release, many analysts warned that NHS data (call it census lite) cannot have the same reliability as census data. This is most evident for small-area data, given that the main point of a census is accurate data down to the level of neighbourhoods. This may surprise many people, who often hear news reports of census results only for summary statistics, such as Canada's population of 35 million. The fact that Statistics Canada can calculate the response rate to the mandatory short-form census shows that there are more accurate and efficient estimation methods than a census for macro variables like total population or incomes.
The problems with the NHS data at a micro level are already evident in today's release (see Appendix 3 of Chapter 5 on data quality), and will be further revealed over time. One metric will be how much small-area data Statcan had to suppress to avoid publishing flawed estimates. Over time, analysts will undoubtedly unearth anomalies in the data, just as they did for census data: the 96 per cent response rate for the 2006 Census implies that nearly 1.5 million Canadians did not respond at all, and many others did not answer all the questions. Ultimately, however, we will never know the true quality of NHS data.
What is the best way for users to cope with the inevitable degradation of data quality? Forgotten in the recent debate over replacing the census is that this was the second time in recent decades that the census was rescinded. The first was the 1986 census, which was cancelled in a round of budget cutbacks. It was quickly reinstated, partly because of pressure from the business community, especially retailers.
The business community was noticeably silent in the recent debate over replacing the census. The Retail Council of Canada noted that "There's been no outcry from our members," a view echoed by Canadian Manufacturers and Exporters. The Canadian Council of Chief Executives said of the controversy surrounding the census, "We don't have a dog in this hunt." The markedly different reaction of the business community compared with 1986 reflects how it has found alternative ways of getting data on its customers. Firms can now send personalized promotions to your smartphone based on your recent buying behaviour or current location, using Big Data techniques that don't require years-old census data.
It is not just the business community that has adjusted to the changing role of data in our society. The replacement of the census by the NHS initially caused tremors within Statistics Canada. But the agency quickly moved on with the job of making lemonade out of the lemons presented by the NHS. This occurred while it was absorbing a nearly one-third cut to its budget and significant turnover in its senior ranks. Both Statcan and the business community have adapted to the loss of census data. It is time other users followed their lead.
It is not enough to criticize the NHS data and leave it at that. Analysts have to be creative in finding alternatives. There is no shortage of data in our world, most of it much more timely than the census. Most of the National Accounts (like GDP) are based on tax records, which provide a rich source of data on the economy. Stories abound of innovative approaches to data outside of statistical agencies. UPS analyzed the movements of its fleet of trucks, and came up with a policy forbidding left-hand turns that saved it millions of dollars. New York City triangulates power consumption and complaints from renters to identify which overcrowded tenements are likely firetraps. PayPal uses statistical models to identify and stop half the fraud in its transactions. So, while easy (and relatively cheap) one-stop shopping at the Statcan discount census store is over, other outlets are opening up in different organizations all the time. Users should explore the growing number of alternative sources.
Eds: This version clarifies the response rate to the 2006 census.
Philip Cross is Research Co-ordinator at the Macdonald-Laurier Institute and the former Chief Economic Analyst at Statistics Canada.