Opinion

Wendy H. Wong is a professor of political science and Principal’s Research Chair at the University of British Columbia. She is also Faculty Affiliate at the Schwartz Reisman Institute at the University of Toronto. Her latest book is We, the Data: Human Rights in the Digital Age.

The ascendancy of artificial intelligence hinges on vast data accrued from our daily activities. Those data, in turn, train advanced algorithms, fuelled by massive amounts of computing power. Together, data, algorithms and computing power form the critical trio driving AI’s capabilities. Because of their human sources, data raise an important question: who owns data, and how do the data add up when they’re about our mundane, routine choices?

It often helps to think through modern problems with historical anecdotes. The case of Henrietta Lacks, a Black woman in Baltimore stricken with cervical cancer, and of her everlasting cells has become well-known through Rebecca Skloot’s book, The Immortal Life of Henrietta Lacks, and a movie starring Oprah Winfrey. Without her knowledge, Lacks’s medical team removed her cancer cells and sent them to a lab to see if they would grow. While Lacks died of cancer in 1951, her cells didn’t. They kept going, in petri dishes in labs, all the way through to the present day.

The unprecedented persistence of Lacks’s cells led to the creation of the HeLa cell line. Her cells underpin various medical technologies, from in-vitro fertilization to polio and COVID-19 vaccines, generating immense wealth for pharmaceutical companies. HeLa is a co-creation. Without Lacks or scientific motivation, there would be no HeLa.

The case raises questions about consent and ownership. That her descendants recently settled a lawsuit against Thermo Fisher Scientific, a biotechnology company that monetized products made from HeLa cells, echoes the continuing debate over data ownership and rights. Until the settlement, just one co-creator was reaping all the financial benefits of that creation.

The Lacks family’s legal battle centred on a human-rights claim. It was rooted in the impact of Lacks’s cells on medical science and in the intertwined racial inequalities that lead to disparate medical outcomes. Since Lacks’s death, her family has struggled while biotech companies have profited.

These “tissue issues” often don’t favour the individuals providing the cells or body parts. In Moore v. Regents of the University of California, the California Supreme Court deemed body parts “garbage” once separated from the individual. The ruling highlights a harsh legal reality: Individuals don’t necessarily retain rights to parts of their body, financial or otherwise. A U.S. federal case, Washington University v. Catalona, likewise rejected ownership claims premised on the “feeling” that tissue belongs to the person it came from.

We can liken this characterization of body parts to how we often think about data taken from people. When we call data “detritus” or “exhaust,” we dehumanize the thoughts, behaviours and choices that generate those data. Do we really want to say that data, once created, are a resource for others to exploit?

The Lacks case illustrates how pieces of ourselves, whether biological or data, link to our humanity. It also underscores the human-rights values of autonomy, dignity, equality and community in our everyday lives. How should we – and pieces of us – be treated and used by others? This “feeling” matters, especially when data have become so granular in chronicling our lives. Lacks’s situation further illustrates how systemic racism compounded the unequal treatment she faced in the society she lived in.

Perhaps physically removed tissues are distant from our identity. But should we view data in the same light? In the digital era, data weave through our existence, capturing life’s ordinary yet important aspects. Data are also co-created: We do the routine things that data collectors turn into data. This everyday quality demands a reimagined perspective, one that recognizes data as integral to shaping our lives. What’s being captured isn’t just economic: it’s social, cultural and political. Consider the interactions you have with your devices, whether scheduling appointments, setting reminders, asking for information or seeking entertainment – all on-demand, travelling with you in your pocket. These are all data-generating activities about our experiences. Acknowledging the human quality of data fosters a more comprehensive, respectful and, frankly, more accurate approach to data rights and ownership.


The inherent humanity in data becomes evident when we use them to create replicas as bots or digital doubles. Digital twins of sports icons Carmelo Anthony and Jack Nicklaus, built from data about them, raise ethical and existential questions. What boundaries can we establish on data usage post-creation? In this panorama, who holds a valid place in the human community: our physical selves, or the algorithms derived from our data? Do these digital selves belong to us, or to the companies that make them?

These questions emphasize the implications of datafication. When data serve to construct bots mirroring our identities, influencing our choices, limiting our opportunities, or training AI systems, the predicament transcends technological concerns. Alterations to our data echo as alterations to our humanity, affecting dignity, autonomy, equality and community.

The global race to regulate AI is on, but regulatory efforts will be grossly inadequate without clear definitions of data ownership. Data shape how we view one another: as mere data sources or data subjects, or as fellow humans whose behaviours and thoughts have value beyond the economic. Our practices reflect social preferences, cultural ties and our very identities. Data are collective, something the European Union is starting to recognize through the Data Governance Act. Here in Canada, the Consumer Privacy Protection Act, part of Bill C-27 and currently making its way through the legislative process, instead treats data as individually held.

Data ownership might seem straightforward to some: Data belong to the individual whose activities generated them. This perspective, endorsed by figures such as hip-hop artist will.i.am, holds that personal data ownership is a human right and that individuals should profit from their data, as one would from property.

Yet, many legal experts do not classify data as material goods, distinguishing them from tangible property such as automobiles or real estate. We do not view data derived from everyday activities as intellectual property either, as they don’t embody creativity akin to a song or poem. Instead, they are mundane records of our everyday, sometimes unconscious, actions – like our purchasing patterns or the way we walk.

The issue transcends mere financial gain, touching the core of our identities within societal structures. The alleged existential crisis in AI doesn’t centre on hypothetical rogue machines. It’s about humans and our computers, and what we want the age of data to look like. From a 21st-century vantage point, a humanity without consideration for human rights is a bleak prospect.

We have to remember that data don’t naturally exist; they are co-created through human innovation, necessitating both data sources and data collectors. We all serve as data sources when we use digital technologies capable of data collection. Data collectors, in turn, determine which data to gather. Absent either party – the source or the collector – data simply don’t exist. Data are collective at inception and collective in their use: no single data source matters on its own, because AI needs many, many data points to make predictions about us all.

How do we take into account this collective quality of data? Glen Weyl, a researcher at Microsoft, proposes data unions that would bargain collectively on behalf of data sources. My work with Jamie Duncan cautions against an excessive focus on individual data claims and harms, especially in democratic contexts. We have yet to embrace the collective nature of data and its implications.

The absence of a straightforward answer to “Whose data?” doesn’t let us evade this pressing question. A human-rights lens applied to data’s qualities and applications could pave the way for addressing significant policy and regulatory voids. Data aren’t going away, and neither are we.
