How to avoid making the same mistakes the polls made with Trump

Vasant Dhar

Getting the data right is hard work and it isn’t an exact science, but it can be a worthwhile endeavor, especially when the stakes are high.

The morning after the U.S. presidential election, a reporter asked me why “big data blew it on Tuesday.”

Nate Silver’s model had given Trump a 29% chance of victory, and other models tracked by The New York Times had put Trump’s chances at 15%, 8%, 2%, and 1%. Silver pointed to the uncertainty associated with his prediction: polls are not perfect, polling errors were “correlated” across important states, and the share of undecided voters was larger than usual. Indeed, small errors in the estimates for a few states can have a significant impact on the electoral vote.
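To see why correlated errors matter so much, consider a rough simulation sketch. Everything in it is a hypothetical assumption chosen for illustration, not data from any actual poll or model: five swing states, a candidate trailing by 2 points in each, a total polling error of 3 points, and a rule that the candidate needs to carry at least three of the five. The only thing that changes between the two runs is how much of the error is shared across states.

```python
import random

def underdog_win_prob(shared_sd, state_sd, trials=20000, seed=0):
    """Estimate how often a trailing candidate carries a majority of swing states.

    All numbers here are hypothetical assumptions for illustration only.
    """
    rng = random.Random(seed)
    poll_margins = [-2.0] * 5  # assumed: trailing by 2 points in each of 5 states
    wins = 0
    for _ in range(trials):
        # One error term common to every state (e.g. a systematic polling miss).
        shared = rng.gauss(0.0, shared_sd)
        # Count states where the true margin, after both errors, is positive.
        carried = sum(
            1 for margin in poll_margins
            if margin + shared + rng.gauss(0.0, state_sd) > 0
        )
        if carried >= 3:
            wins += 1
    return wins / trials

# Same total error (standard deviation of 3 points) in both cases;
# only the split between shared and state-specific error differs.
independent = underdog_win_prob(shared_sd=0.0, state_sd=3.0)
correlated = underdog_win_prob(shared_sd=2.5, state_sd=(3.0**2 - 2.5**2) ** 0.5)

print(f"independent errors: {independent:.3f}")
print(f"correlated errors:  {correlated:.3f}")
```

Under these assumptions the trailing candidate wins noticeably more often when the errors move together, because a single shared miss can flip several states at once. A model that treats state polls as independent would tend to understate the underdog’s chances.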

But let us not get distracted by the details surrounding the statistics and ask ourselves a simpler question: why were ALL data-driven models wrong in terms of expectation? Considering that policy makers and managers must occasionally make decisions based on expectations despite the uncertainty surrounding them, what can we learn about data-driven prediction for such instances?

Read the full article as published in MarketWatch

Vasant Dhar is a Professor of Information Systems.