Jan Kestle has been an active participant in the statistics community in Canada for 50 years as Federal Provincial Statistical Focal Point for Ontario, President of Compusearch, Member of the National Statistics Council and President and Founder of Environics Analytics.
The need to keep personal information private and the need for reliable statistics on which to base good government are colliding again in a raging public and political debate over a Statistics Canada pilot project. This project would test the use of bank records to evaluate whether existing “big data” can replace traditional methods in a rapidly changing world.
The key point is this: Should we base official statistics on a reliable representative sample of Canadians or only on those who decide they want to give their information to the agency? Feels like déjà vu all over again to me.
The government tried this approach when it made a large part of the 2011 Census voluntary and, as a result, much of the data were unusable. There was widespread support among Canadians for the reinstatement of the compulsory census in order to ensure that government has the foundation for evidence-driven decision-making. Obtaining data only from those who are willing to share does not provide an accurate picture of all Canadians.
We need to dig deeper to understand why it’s not a bad idea for Statscan to use personal bank data of Canadians for the development of crucial official statistics.
As part of a modernization initiative, Statscan has designed a pilot project to test the accuracy and viability of using personal bank data as an alternative to the traditional survey that tracks the spending habits of Canadians. The current survey approach requires a sample of households to track and provide a record of their detailed spending for a specified period. This is a time-consuming and, some would say, antiquated process, since all of that information is summarized in an individual’s banking and credit card records.
Access to accurate data is crucial to policy makers and citizens alike, because those data inform cost-of-living increases, house price fluctuations, social programs, economic development and many other socioeconomic issues.
In this pilot, Statscan would access the bank data of a random sample of 500,000 Canadians. This information would be linked with demographic and other data that Statscan has already collected to give a single picture of demographics and spending. Once the link has been made – carried out by government employees under the very strict provisions of the Statistics Act, which also governs the collection of data directly by Statscan – all identifiers will be removed. The data will be anonymized, aggregated and processed.
The suggestion that this approach will result in the government tracking people’s individual spending in “Big Brother” ways is incorrect. These data will be used to produce statistics comparable to those collected in the traditional way to determine whether this new approach meets the agency’s stringent quality measures. The pilot will even test whether the link to individual demographics is required.
Statscan has the legal right to obtain these data without notifying citizens or obtaining their consent. The Chief Statistician, Anil Arora, has said in public statements that this approach was developed in consultation with the Canadian Bankers Association and the Office of the Privacy Commissioner.
Some have argued that although the Statistics Act does ensure that the data will be kept private and confidential, there is still no real need to move to this new approach at this time. This is wrong. It is increasingly difficult to get respondents to complete long paper-based surveys manually. This will only get worse, and the quality of government statistics will become harder to maintain over time.
Now is the time for testing and evaluating new approaches. These experiments are essential. Throughout this process, the government must also disclose the plans for these changes to Canadians, ensuring that they understand the importance of the data to their lives. It is equally important that Canadians understand that Statscan is an independent agency of the federal government and follows strict laws that protect these personal data, whether it collects them directly or accesses them from existing databases.
Recently, the government mandated some changes at Statscan to ensure it has more independence and adheres to best practices from a scientific and methodological point of view. It also mandated modernization to ensure accurate and cost-effective data collection and processing in the face of the changing world of data. And the right to access data and the responsibility to safeguard them are deeply enshrined in the Statistics Act.
But how do we reconcile this with the right of consumers to informed consent relating to use of their personal data? These uses should always be disclosed, and the consumer should always have the right to opt out, with one exception – Statscan.
Our national statistics agency has the right under law to access certain information, provided it handles those personal data under the provisions of the Statistics Act, which offers all of the assurances of privacy and confidentiality included in the Personal Information Protection and Electronic Documents Act.
The new approaches being tested are essential to the future of Statscan, but they must be implemented with great care and transparency, hence the pilots. Effective communication about what Statscan might access from outside data sources, and why, must accompany these changes.
In the end, Canadians once again have to decide. Are we willing to share our personal information with the independent professional agency, Statscan, so that it can develop high-quality data as a foundation for evidence-driven decision-making in government, while ensuring that the data are kept private and confidential? Or should we go with an approach that results in making important decisions based on data from only the subset of the population that agrees to provide it?
Testing, developing new methods and communicating effectively will take time. But there has been talk within the national statistics community for decades about using administrative and other existing data to reduce costs and respondent burden. It’s well past the time to start. Statscan is recognized as a leader, if not the leader, in the world statistical community and deserves the trust and support of Canadians to do its job.
We made the right decision in reinstating the mandatory census. Let’s not go backward.