It's only one week until Canadians get the first taste of the National Household Survey, the voluntary and controversial version of the long-form census. Some economists and statisticians have warned the public about its flaws. Statistics Canada has claimed the data will be "worse" than its predecessor. But how much stock should we put in the data?
Here are four reasons we shouldn't have complete faith in the NHS — and one reason we should.
1. The response rate was low — notably in smaller communities
Statscan distributed the NHS to 4.5 million households and hoped for a response rate around 50 per cent. While the national average was 68.6 per cent (compared to 94 per cent with the mandatory long-form version), Statscan also said nearly 12 per cent of communities had response rates lower than half — mostly in towns with smaller populations. That's a problem. The long-form census was useful precisely because it collected reliable data on smaller communities like these. Low response rates there will amount to bigger holes in the data for those communities.
This is also a problem for smaller provinces and territories without many resources. When the Conservative government announced the end of the mandatory long-form census, the Yukon government said it was stuck with a $1.5-million price tag: the cost of conducting a new survey of its own if it wasn't confident in the NHS's results.
2. Some groups are more likely to be excluded
Next week's release focuses on groups with traditionally unreliable levels of participation: immigrants and aboriginal peoples. Before it conducted the NHS, Statscan ran simulations in a few key communities and compared the results against the 2006 long-form census. The NHS-style survey captured accurate data on some groups, like those with college education and people in management jobs. Who did it get wrong? Visible minorities and registered aboriginals. Results for these groups had the highest margins of error.
We'll likely never know which data is accurate and which is not. The repercussions will also ripple outwards, since the long-form census was previously used as a benchmark for other data. Now, statisticians and economists will use the NHS data at their own risk.
3. We can't compare most of it against other data
Some data from the NHS can be compared to the 2011 short-form census. Data on the labour force and education is, to some extent, tracked by other agencies. Statisticians can use these outside sources to help judge the quality of the new data. But other information captured by the NHS, like religion and language spoken at home, is found only there.
Statscan has also warned the public about comparing the NHS to its long-form ancestor. Since no survey of this scale can be replicated perfectly, statisticians will tread lightly with year-to-year comparisons. But Statscan said there is "a real risk" that changes brought about by the NHS will affect our ability to compare the data over time.
4. Not even Statscan can say how accurate the NHS will be
The agency speculated the results will be less reliable than those of the long-form census. But it stopped short of giving a final verdict.
"We have never previously conducted a survey on the scale of the voluntary National Household Survey, nor are we aware of any other country that has," Statscan said on its website. Since the voluntary survey was introduced quickly with little testing, Statscan said it is too difficult to judge how inaccurate the data could be.
And one reason we should have faith in the NHS:
1. Statscan will make adjustments if they're concerned about data quality
In communities where response rates are particularly low, Statscan will combine their data with that of neighbouring communities to create a more complete picture of the area. It can also withhold data if it doesn't have sufficient confidence in the results. Some values may be suppressed or rounded off where only a few individuals were recorded.
Despite all these reservations, Statscan believes the data will be useful for many users. But it will not, Statscan wrote, "provide a level of quality that would have been achieved through a mandatory long-form census."
Data released on May 8 includes: aboriginal peoples, immigration, citizenship, place of birth, language, ethnic origin, visible minorities and religion. Check back with The Globe next week for an interactive guide to the NHS results.