Michael Wolfson is a former assistant chief statistician at Statistics Canada and current member of the University of Ottawa’s Centre for Health Law, Policy and Ethics.

Recent outrage by some members of Parliament about an alleged secret collection of Canadians’ data by the Public Health Agency of Canada (PHAC) illustrates how misguided many are about the various kinds of personal data – and how they should and should not be used.

Broadly speaking, we should differentiate between two kinds of data that can be collected about people. One is individually identifiable data: In the area of health care, these data typically contain sensitive information – including our names and medical conditions – that we would not want anyone except our health care providers to see.

The other kind of data that pertains to people is known as grouped, or aggregated, data. One example is the number of individuals admitted to hospital who are infected with COVID-19. This kind of data is essential for governments to plan for and provide quality essential services to Canadians.

A second key distinction is how individual data and grouped data should be used. It is vitally important to distinguish “public good” uses from private-sector and commercial uses. Until about a decade ago, private-sector uses were well below the radar, since the means to gather huge amounts of personal data were far more limited. However, with the rapid growth of big tech firms and the deep penetration of smartphones into our personal lives, the private sector has become a giant player in collecting and using sensitive, individually identifiable data.

The problem with the recent loud squawks and ill-informed claims attempting to connect PHAC’s use of cellphone data to an infringement of Canadians’ privacy is that they fail to grasp these distinctions.

One of the most important public health interventions over the past two years in dealing with the COVID-19 pandemic has been the encouragement of physical distancing – if two people are not near each other, the virus cannot spread between them. An obvious way to monitor how well this is working is to check mobility patterns, such as seeing how often people are staying at home and whether areas with more crowding have higher rates of infection.

Collecting this kind of data would have been virtually impossible 10 years ago, but with modern smartphones, mobile-service providers can observe the nearest cell tower for each phone on their network. It is quite easy for them to determine what percentage of each day a phone is near a single tower. Using this information, they can produce aggregated data for a series of neighbourhoods, generate statistics and then determine, for example, the proportion of cellphones that were in their ‘usual place of residence’ for most of the day – or places that appear more crowded and have a likelihood of higher infection rates.

Even if the areas assessed were small – each covering, say, at least 20 phones – there is no way the data could be used to identify any unique individual, since they would be aggregated.
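For readers who want to see the mechanics, the aggregation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method the telcos or PHAC actually used: the input format, the function name `aggregate_home_share`, and the reading of "most of the day" as more than half are all hypothetical. The 20-phone minimum is the example figure used above.

```python
from collections import defaultdict

MIN_PHONES = 20   # example minimum group size ("say, at least 20 phones")
HOME_SHARE = 0.5  # assumption: "most of the day" means more than half of it


def aggregate_home_share(records, min_phones=MIN_PHONES, home_share=HOME_SHARE):
    """Return, per neighbourhood, the proportion of phones that stayed home.

    `records` is a hypothetical input: one (neighbourhood, fraction) pair per
    phone, where fraction is the share of the day spent near its home tower.
    Neighbourhoods with fewer than `min_phones` phones are left out entirely.
    """
    totals = defaultdict(int)
    at_home = defaultdict(int)
    for neighbourhood, fraction in records:
        totals[neighbourhood] += 1
        if fraction > home_share:
            at_home[neighbourhood] += 1
    # Only group-level proportions are returned; no per-phone value survives.
    return {
        n: at_home[n] / totals[n]
        for n in totals
        if totals[n] >= min_phones
    }


# Made-up sample: 20 phones in area "A", only 3 phones in area "B".
sample = [("A", 0.9)] * 15 + [("A", 0.2)] * 5 + [("B", 0.8)] * 3
print(aggregate_home_share(sample))  # {'A': 0.75}
```

Area "B" is dropped because it falls below the minimum, and the only thing reported for area "A" is a single proportion.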

These are the kinds of aggregated phone data provided to PHAC, and they have been vital for informing government policies on how best to target shutdowns and other physical distancing measures to fight the pandemic while minimizing the inevitable harms for businesses and individuals.

These data do not invade anyone’s privacy.

However, the ways in which telcos and big tech firms are collecting and using individually identifiable data are another story.

The kinds of information routinely collected via the use of smartphone apps and social media sites, for example, are being collected lawfully, with “consent,” and often for good and useful purposes. But does anyone really check the legalese in the fine print of these consent statements before clicking “accept” so that they can thoroughly understand how their data will be used?

The outrage over PHAC and data collection highlights a major issue in Canada. On the one hand, we do not really know how all of the data collected by private-sector firms are being used. On the other, unwarranted privacy fears risk hobbling all sorts of highly important and positive public uses of data.

While the risk of reidentification of individuals from aggregated data may be vanishingly small, it would be helpful to add privacy protections to the Criminal Code to make it an offence for anyone, intentionally or maliciously, to reidentify any individual.

Equally important would be for Canada’s Privacy Commissioner, assuming his detailed investigation does not uncover anything more than is already public, to state very clearly that PHAC did not invade anyone’s privacy and that the use of these data was clearly for the public good.

Stronger regulation of big tech firms – at the very least, insisting on greater transparency about their data collection and use – remains important, but must be distinguished as another matter entirely.
