The news of the day has become a minefield. As the times we live in become more and more fraught, who can blame us for wanting to look to the future, to know what might be coming? How we do that, and how we can get better at it, has been the focus of Philip Tetlock’s work for several decades. Fifteen years ago, when he was a professor at the University of California, Berkeley, the Toronto-born Dr. Tetlock published his first book for a mainstream audience, Expert Political Judgment: How Good Is It? How Can We Know? That study of punditry revealed that so-called experts were hardly better than random chance at forecasting political and economic trends. Ten years later, he teamed up with journalist Dan Gardner for the book Superforecasting: The Art and Science of Prediction, which looked at Dr. Tetlock’s use of large-scale forecasting tournaments to study and measure participants’ predictive abilities. Now a professor at the Wharton School in Philadelphia (and physical-distancing like the rest of us), Dr. Tetlock insists he’s no better than anyone else at predicting what might lie ahead. What he does know is how to improve our forecasting skills, as well as how we use those forecasts, so that, in days to come, we might be better served by predictions of the future.
What do most people get wrong about forecasting?
The first-order mistake is not to recognize how often we’re doing forecasting. We often say things like, “Well, this might happen. I’m expecting this.” If you dissect a typical column in a serious newspaper or magazine, you’ll find people very frequently making claims about how the future is likely to unfold, with lots of words like might, may and could. These are all implicitly probabilistic claims. If I say, “Putin’s next move might be on Estonia,” readers are often unsure about how to translate that into a probability range. “What did Tetlock mean when he said might be?” Tetlock is not saying there’s a 100-per-cent probability of it happening, and Tetlock’s not saying there’s a zero-per-cent probability. He’s saying something in the middle. How wide is that middle range? Somewhere between about 20 and 80 per cent. So, you can do these word-to-number translation exercises, and you can discover just how vague vague-verbiage forecasting is, and how it can lead to serious misinterpretations.
You study and train “superforecasters.” What makes a superforecaster super?
It's the willingness to treat forecasting as a skill that can be cultivated, by moving from vague verbiage to numbers, by keeping score, and by being patient and learning from experience. Being willing to acknowledge when you're on the wrong track.
You said recently that the COVID-19 pandemic is “a product of stunning forecasting failures from Wuhan to D.C. to Rio.” What went wrong?
That's a long list. The journalist David Epstein, author of the wonderful book Range, recently pointed to a warning published back in 2007: "The presence of a large reservoir of SARS-CoV-like viruses in horseshoe bats, together with the culture of eating exotic mammals, is a time bomb." I mean, you can hardly get a clearer warning statement than that, right?
What's wrong with our society that a knowledgeable prediction like that goes unheeded?
This was a chronic low-probability event. You're rolling the dice each year, and each year there's a low probability of a nasty pandemic. Let's say the dice are set up so that the probability of a nasty pandemic coming up in any given year is maybe three or five per cent. The epidemiologists say, “Hey, be careful.” And nothing happens. Next year, “Hey, be careful.” Nothing happens. And people tune them out – “They're crying wolf again.” So, there's a difficulty people have in coping with low-probability chronic risks.
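The dice metaphor rewards a quick calculation: a risk that is small in any one year becomes large over a couple of decades. A minimal sketch in Python (the per-year figures are the ones Dr. Tetlock uses above; the 20-year horizon is an illustrative assumption):

```python
# Chance of at least one pandemic over n years, given a yearly
# probability p: the complement of the event never happening.
def chance_of_at_least_one(p_yearly, years):
    return 1 - (1 - p_yearly) ** years

print(round(chance_of_at_least_one(0.03, 20), 2))  # 0.46
print(round(chance_of_at_least_one(0.05, 20), 2))  # 0.64
```

Even at three per cent a year, the odds of at least one pandemic over 20 years approach a coin flip, which is why tuning out the "crying wolf" warnings is so costly.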
Right now, we're being told by most governments and scientists to “listen to the experts.” But am I correct in saying that your work, in part, provides an alternative to relying on experts?
You really hit an interesting point there. In Expert Political Judgment, the book that preceded Superforecasting, I document that subject-matter experts – political and economic experts in particular – are often over-confident about their knowledge of the future. I think subject-matter experts are essential. But I think they need to be more careful in the claims they make in public forums. And when they do make claims, it behooves them, given their professional responsibilities, to make those claims in ways that are as testable and score-able as possible.
Are you saying that people should live and die by their predictions?
[Chuckling] Live or die. That’s a bit draconian, don't you think?
A bit, yeah.
I think you put your finger on something extremely important. If you exist in a blame-game culture, in which people are always playing gotcha games, and you wind up on the wrong side of maybe, people will say, “You're not an expert at all.” If I exist in a world like that, it’s natural for me, as a subject-matter expert, to retreat into vague-verbiage forecasting. If you're in a culture where people feel their reputational survival hinges on never being caught on the wrong side of maybe on a big issue, they're not going to do the superforecasting routine of making transparent judgements and learning from them. They're going to avoid it like the plague, because it's career suicide.
So, you're saying for people to improve at forecasting, we have to allow them to make mistakes.
Absolutely. It sounds funny, because I do believe in accountability, but I also believe in psychological safety.
So, what gets in the way of accurate forecasting?
Some of it's just rooted in the human mind. We jump to conclusions from the news too quickly. Then we're too slow to change our mind in response to later information. We have a system that encourages people to play blame games and to dodge accountability for their own forecasts, and to stick it to the other side whenever they make forecasts that look wrong. So, it's a combination of the human mind and the reward-punishment incentives of the political system that don't encourage the active open-mindedness that is a critical part of superforecasting.
What happens at one of your forecasting tournaments?
The key thing is that it's a level-playing-field exercise. The 25-year-old analysts get to compete with the 65-year-old analysts and answer the same sets of questions, so we get to see who can more quickly deliver accurate odds estimates of what's going to happen next. It's meritocratic. It's not like the usual status hierarchy where some voices are vastly louder than others.
How many questions are we talking about?
As many as you want. Usually hundreds.
How do you measure their results?
Well, this is one of these tricky things. Nate Silver, of the fivethirtyeight.com site, ran into this very question: How do you measure the accuracy of probability judgements? Nate Silver is a poll aggregator, and a couple of days before the November 2016 election, he aggregated all the polls and made a considered judgement. He said he thought there was about a 70-per-cent chance of Hillary Clinton winning the election, and about a 30-per-cent chance of Trump. Now, we all know what happened. The 30-per-cent probability materialized, and Nate Silver had to explain why the fact that Trump won didn’t necessarily mean his method of aggregating polls was wrong. Because 30-per-cent things happen about 30 per cent of the time. You need to look at all of the predictions that Nate assigned 70-per-cent probabilities to. If it turns out that, when Nate says there’s a 70-per-cent likelihood of things happening, those things happen 70 per cent of the time, then that tells you Nate is pretty well calibrated. To score the accuracy of probability judgements, you need to score the accuracy of many judgements, not just one.
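The calibration check described here is mechanical enough to sketch in a few lines of Python (a hypothetical illustration with made-up data, not Silver's or Dr. Tetlock's actual records): group past forecasts by their stated probability and compare it with how often the predicted events actually occurred.

```python
def observed_frequency(outcomes):
    """outcomes: 1 if the predicted event happened, 0 if it didn't."""
    return sum(outcomes) / len(outcomes)

# Ten hypothetical forecasts, all issued at 70 per cent;
# the event occurred in seven of the ten cases.
stated = 0.70
outcomes = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

print(observed_frequency(outcomes) - stated)  # 0.0 -> well calibrated
```

A gap near zero at every stated probability level is what "pretty well calibrated" means; a forecaster whose 70-per-cent calls come true only half the time would show a gap of minus 0.2 in that bucket.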
Can someone be trained to become a better forecaster?
That’s what we do in our forecasting tournaments. We’ve developed training systems that do indeed improve the accuracy of subjective probability judgements of real-world events by 10 to 15 per cent. If you really want to get the very best out of superforecasters, you want to create a weighted average. If you have a bunch of forecasters, and you average their forecasts, that average will tend to be more accurate than the majority of the forecasters whose judgements were the inputs into the computation of the average. If your goal as a policy maker is to get the best probability estimates on the table, you’re probably well advised not to listen to one superforecaster. You’re much better advised to listen to a group of superforecasters who come from different political points of view, and then average them. Did you ever see the movie Zero Dark Thirty?
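The averaging idea can be made concrete with the Brier score, the standard squared-error measure for probability forecasts (lower is better). A minimal sketch with made-up forecasts, not the tournament code; by a convexity argument, the pooled forecast always scores at least as well as the typical individual forecast feeding into it:

```python
def brier(p, outcome):
    """Brier score for one yes/no question; outcome is 1 or 0. Lower is better."""
    return (p - outcome) ** 2

forecasts = [0.9, 0.6, 0.75, 0.4, 0.8]    # five made-up forecasters
outcome = 1                                # the event happened

pooled = sum(forecasts) / len(forecasts)   # the simple average: 0.69
individual = [brier(p, outcome) for p in forecasts]

print(round(brier(pooled, outcome), 4))              # 0.0961
print(round(sum(individual) / len(individual), 4))   # 0.1265 -- worse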
I did, about killing Osama bin Laden.
Right. In it, the director of the CIA has a bunch of analysts around the table, and he’s asking them whether Osama is or isn’t in that particular compound in Abbottabad. Now, imagine he goes around the table and asks each analyst how likely it is that he’s there or not there. Around the table, they all say, “70 per cent.” What should the director conclude is the true probability of Osama being in the compound? Most people think it’s kind of a stupid question, because the answer has to be 70 per cent, right? But there’s a subtlety here, and it’s one of the most important principles of superforecasting. The director should conclude that the answer is 70 per cent if and only if the analysts around the table are clones of each other, and they’ve drawn on exactly the same information to reach the same conclusion. But if one of the analysts is drawing on cyber intelligence, another on code breaking and another on human intelligence – they have all sorts of different information, and they’re independently arriving at 70 per cent. What should the director conclude is the correct answer? Our models would suggest up to 80 or 85 per cent. It’s materially more probable.
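The jump from 70 per cent to "up to 80 or 85 per cent" can be sketched with the "extremizing" trick from the forecasting literature: average the estimates on the log-odds scale, then scale by an exponent a greater than 1 to reflect the assumption that the analysts drew on independent evidence. The value of a below is an illustrative assumption, not a figure from Dr. Tetlock's models; in practice it is fit to data.

```python
import math

def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1 - p))

def extremize(probs, a):
    """Average forecasts on the log-odds scale, then scale by a > 1.
    a is a tuning parameter, chosen here purely for illustration."""
    mean_log_odds = sum(logit(p) for p in probs) / len(probs)
    return 1 / (1 + math.exp(-a * mean_log_odds))

# Four analysts independently arrive at 70 per cent.
print(round(extremize([0.7, 0.7, 0.7, 0.7], a=2.0), 2))  # 0.84
```

With a = 1 (analysts assumed to be clones sharing the same information), the aggregate stays at 70 per cent, exactly as in the director's naive reading; the independence assumption is what licenses pushing the estimate outward.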
Is there a way that a company wanting to improve its profit outlook could apply the concepts of superforecasting to do that? Beyond getting multiple analysts in the room.
I do work with companies and businesses, and I think the answer is yes.
Can you give us a thumbnail?
Don’t be afraid of disrupting stale status hierarchies. There are companies in which the CEO doesn’t want it to become known that Bob in the mailroom can do as good a job at forecasting some trends as the CEO can. It’s the same problem as in the intelligence community, where baby boomer senior analysts don’t want it to be discovered that junior analysts can do as well as they can. Stale status hierarchies make it harder for organizations to be nimble in fluid business times, like now.
So, companies that get input from all levels are more likely to succeed?
It's how they get the input. They need to collect the input in a disciplined way that incentivizes truth telling, no matter who gets offended. It's a very hard thing to pull off. You've got to create the right atmosphere for it.
When I think of forecasting and business, I sometimes think of Bill Gates, who seems to have gotten a number of things right over the years. He anticipated something like the iPhone, for example. But he never built one. What has to happen to connect the forecast to action?
It’s never going to be deterministic. I think the best that responsible social science advisers to policy makers can do is to lay out the facts. They can’t guarantee that the politicians will be wise enough to act on the facts. And they can’t guarantee that the CEOs will be wise enough to act on the facts. Good forecasting is not a substitute for good decision making. It’s an input into good decision making.