We just marked the three-year anniversary of Britain’s shocking vote to leave the European Union, shocking because polls told us there would be no Brexit.
Most young Britons supported remaining in the EU. But many of them didn’t vote that day. The behaviour of these quiet, disengaged young people made the crucial difference to the outcome.
How come the polls missed this? What did pollsters ask these young people who did not plan to vote? Not very much: Pollsters did not speak with many of them. As a result, the Brexit polls overrepresented the views of voters and underrepresented the disengaged.
The wrong Brexit call is one notable example of the real risks of relying on data from the usual, louder, engaged voices rather than the most diverse set of voices, including quieter ones. The same thing has happened in numerous referendums and elections since. A recent example is the Australian election in May, in which every single poll called the result wrong. Prime Minister Scott Morrison credited his victory to the “quiet Australians” who voted for him.
The risks of excluding quiet voices apply far beyond predicting referendums or elections, to a wide range of critical business and economic issues. To reliably predict demand or to avoid overreacting to a corporate crisis, businesses need to hear from the most diverse set of voices.
Take, for example, Facebook’s newly announced Libra cryptocurrency. The impact of broad-based adoption could be wide-ranging, including potentially diminishing the role of central banks, boosting other cryptocurrencies and disrupting traditional banking. Banks have a tiny fraction of the almost 2.4 billion customers to which Facebook has access. The effects on the financial system hinge on whether Libra will be used widely. But to get a reliable signal on adoption, we need to understand the full range of potential users, many of whom are left out of traditional data collection.
This includes those without bank accounts, potential Libra users in both emerging and developed economies, and those microbusinesses that are already cashless. Hearing the perspectives of a diverse set of young people – those who are going to be driving the future of money – will be predictive. What would it take for them to start to adopt Libra? Are they concerned about privacy? Do they care whether their money is issued by a technology company or central bank? If we hear mainly from traditional bank customers in North America or analyze the narrow slice of voices on social media, we could significantly misjudge the true likelihood of widespread adoption.
The same principle applies to understanding labour-market trends. Young people and new immigrant groups tend not to participate in the surveys that underlie employment data. But we need more inclusive data that capture these groups or we risk falsely assuming that the economy is stronger than it is (if these groups are suffering), or, on the other hand, weaker than it truly is (if these groups are thriving in the online gig economy).
Traditional tools of business intelligence such as focus groups and panel surveys, and newer ones such as social-media analysis, may amplify our biases when they exclude the diverse set of voices that we need for reliable signals.
Many equity analysts now use machine-learning tools to analyze vast amounts of Twitter and other social-media data in the hope of gaining higher-than-average returns. Social-media analysis is appealing because it provides a large, continuing dataset. But big data can heighten the risk of drawing a wrong conclusion when applied to a narrow group of voices.
The Brexit crisis continues to be a good example of why including the most diverse set of voices is critical. Britain is on the verge of leaving the EU without a deal. Many continue to call for a second referendum, hoping to reverse the 2016 result.
But what would the quiet young voices that did not vote then tell us now? To find out, RIWI Corp. randomly engaged more than 6,000 Britons, from March to June, including more than 3,400 people between the ages of 18 and 40. A slightly larger majority of young people than in the previous referendum said they support staying in the EU. But what we also learn from these quiet young voices is that the anti-Brexit camp can count on only half of this support, since roughly half of these young people told us they would not vote. This translates into a high risk of the same result as in 2016.
To more accurately predict the outcome of the 2020 U.S. election, it will be critical to find better ways to hear the voices of those voters who elected President Donald Trump last time, but were reluctant to openly admit their support before the election.
As we evaluate data for decision-making, we need to ask: Who is left out? What can we learn from the quiet voices? Data bias is not only an ethical issue; it is also a data-quality and business-intelligence issue. Leaving groups out of data collection means we risk getting our predictions and decisions wrong.
Danielle Goldfarb is head of global research at RIWI Corp., a trend-tracking and predictive data firm active in every country of the world.