Cubs win the World Series. Donald Trump wins the White House. What do those two epochal events have in common? Both were considered highly unlikely. And both happened.
Many fans didn’t expect the Cubs to come back from a 3-games-to-1 deficit. But they knew from data that it was statistically possible: Five teams in history had done just that.
Most Americans probably didn’t expect Trump to overcome a polling deficit against Hillary Clinton to win the presidency. Polls were all but unanimous: The odds looked daunting. But daunting isn’t impossible.
In the aftermath of the election, many Americans will judge predictions, projections and premonitions with more skepticism. They’ve learned an important, even comforting, lesson about the limits of polling and other measures: Big Data is not destiny.
Algorithms are formulas written by humans to take the guesswork out of what other human beings will do under certain circumstances. Survey responses to pollsters, consumer buying habits, internet site visits, etc., can be plugged into computer models to suggest people’s future behavior.
But computers don’t read minds. Nor do pollsters. People don’t always say what they think. They change their minds. They can be convinced.
“People mistake having a large volume of polling data for eliminating uncertainty,” writes Nate Silver of the website FiveThirtyEight.com, one of many prognosticators who whiffed the election call. “Yes, having more polls helps to a degree … . Before long, however, you start to encounter diminishing returns. Polls tend to replicate one another’s mistakes.”
Big Data can lead to Big Mistakes. Google Flu Trends, for instance, sought to use data from internet searches to estimate when influenza season would peak and at what level. But it drastically overestimated peak flu levels in the 2012-13 season. That failure “doesn’t erase the value of big data,” wrote David Lazer of Northeastern University and Ryan Kennedy of the University of Houston in Wired magazine. “What it does do is highlight a number of problematic practices in its use …”
Should we toss out data and rely only on experience? That would be a dangerous overreaction. If people believe the data cannot be trusted, they may turn instead to “trusting anecdotes from friends, family and tribe,” political blogger Erick Erickson writes in the New York Times. “Policies will be based on what people think are good ideas, not what data show. This will potentially further divide the country,” he warns.
Humans embrace Big Data — more than they would if it were more accurately billed as Big Guesses — because we live in an unpredictable universe that is often capricious. People feel comforted when they think they know what is going to happen. But reality is elastic. Every moment brings new possibilities. That’s why we vote. That’s why we play the games.
FROM AN EDITORIAL IN THE CHICAGO TRIBUNE