Princeton Election Consortium

A first draft of electoral history. Since 2004

Is 99% a reasonable probability?

November 6th, 2016, 11:31pm by Sam Wang


Three sets of data point in the same direction:

  • The state poll-based Meta-Margin is Clinton +2.6%.
  • National polls give a median of Clinton +3.0 +/- 0.9% (10 polls with a start date of November 1st or later).
  • Early voting patterns approximately match 2012, a year when the popular vote was Obama +3.9%.

Based on this evidence, if Hillary Clinton does not win on Tuesday it will be a giant surprise.

There’s been buzz about the Princeton Election Consortium’s win probability for Clinton, which for some time has been in the 98-99% range. Tonight let me walk everyone through how we arrive at this level of confidence. I will also give a caveat on how it is difficult to estimate win probabilities above 90% – and why fine adjustments at this level do not matter for my goals in running this site.

An obvious contrast with PEC’s calculation is the FiveThirtyEight win probability, which has been in the 60-70% range. As a prominent outlier this season, FiveThirtyEight has come under fire for its lack of certainty. Its founder, Nate Silver, has fired back.

Let me start by pointing out that FiveThirtyEight and the Princeton Election Consortium have different goals. One site has the goal of being correct in an academic sense, i.e. mulling over many alternatives and discussing them. The other site is driven by monetary and resource considerations. However, which is which? It’s the opposite of what you may think.

Several weeks ago I visited a major investment company to talk about election forecasting. Many people there had strong backgrounds in math, computer science, and physics. They were highly engaged in the Princeton Election Consortium’s math and were full of questions. I suddenly realized that we did the same thing: estimate the probability of real-world events, and find ways to beat the “market.”

In the case of PEC, the “market” is conventional wisdom about whether a race is in doubt. If a race is a certain win or a certain loss, it is pointless to put in money and effort, assuming that the rest of the market is in the game. On the other hand, if a race is in doubt, then it may be moved by a little extra push. Think of it as “math for activism.” This point of view heavily influences my calculations.

>>>

Now think about the FiveThirtyEight approach. I don’t want to get into too much detail. Although they discuss their model a lot, to my knowledge they have not revealed the dozens of parameters that go into the model, nor have they released their code. Even if they did, it is easy to make errors in evaluating someone else’s model. Recall Nate Silver’s errors in his attempted critique of PEC in 2014. So let me just make a few general comments. I am open to correction.

Their roots are in detail-oriented activities such as fantasy baseball. They score individual pollsters, and they want to predict things like individual-state vote shares. Achieving these goals requires building a model with lots of parameters, and running regressions and other statistical procedures to estimate those parameters. However, every parameter has an uncertainty attached to it. When all those parameters get put together to estimate the overall outcome, the resulting total is highly uncertain.

For this reason, the Huffington Post claim that FiveThirtyEight is biased toward Trump is probably wrong. It’s not that they like Trump – it’s that they are biased away from the frontrunner, whoever that is at any given moment. And this year, the frontrunner happens to be Hillary Clinton.

And then there is the question of why the FiveThirtyEight forecast has been so volatile. This may have to do with their use of national polls to compensate for how slowly state polls arrive. Because state opinion correlates only partially with national opinion, there is a risk of overcorrection. Think of it as oversteering a boat or a car.

With all that prelude (whew!), let me explain how the Princeton Election Consortium achieves such a high level of confidence.

>>>

We start by generating the sharpest possible snapshot, based on state polls. State polls are more accurate than national polls, which at this late date are a source of unnecessary uncertainty.

For each state, my code calculates a median and its standard error, which together give a win probability. This is done for each of 56 contests: the 50 states, the District of Columbia, and the five Congressional districts (in Maine and Nebraska) that award electoral votes under a special rule. Then a compounding procedure is used to calculate the exact distribution of all possibilities, from 0 to 538 electoral votes, with no need for simulation. The median of that distribution is the snapshot of where conditions appear to be today.
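For concreteness, here is a minimal sketch of that compounding step (in Python, and not the actual PEC code): each contest’s win probability is approximated as the normal cumulative probability of its median margin divided by its standard error, and the contests are folded together one at a time to give the exact electoral-vote distribution. The normal approximation and the three-contest example numbers are illustrative assumptions, not values from this post.

    # Illustrative sketch only -- not the PEC code. Per-contest win probabilities
    # are approximated as Phi(median margin / standard error).
    import numpy as np
    from scipy.stats import norm

    def ev_distribution(win_probs, ev_votes):
        """Exact distribution of a candidate's electoral-vote total, built by
        folding in one contest at a time (no simulation needed)."""
        dist = np.zeros(sum(ev_votes) + 1)   # dist[k] = P(exactly k electoral votes)
        dist[0] = 1.0
        for p, ev in zip(win_probs, ev_votes):
            shifted = np.zeros_like(dist)
            shifted[ev:] = dist[:dist.size - ev]   # outcomes in which this contest is won
            dist = dist * (1.0 - p) + shifted * p
        return dist

    # Toy snapshot with three hypothetical contests (margins in percentage points):
    margins = np.array([2.5, -1.0, 6.0])
    sems = np.array([1.5, 2.0, 1.2])
    evs = [29, 18, 10]
    probs = norm.cdf(margins / sems)                   # per-contest win probabilities
    dist = ev_distribution(probs, evs)
    print(int(np.searchsorted(np.cumsum(dist), 0.5)))  # median electoral votes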

Note that in 2008 and 2012, this type of snapshot gave the electoral vote count very accurately – closer than FiveThirtyEight, in fact.

This approach has multiple advantages, not least of which is that it automatically sorts out uncorrelated and correlated changes between states. As the snapshot changes from day to day, unrelated fluctuations between states (such as random sampling error) get averaged out. At the same time, if a change is correlated among states, the whole snapshot moves.

The snapshot gets converted to a Meta-Margin, which is defined as how far all polls would have to move, in the same direction, to create a perfect toss-up. The Meta-Margin is great because it has units that we can all understand: a percentage lead. At the moment, the Meta-Margin is Clinton +2.6%.
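As an illustration of how such a number can be extracted from the snapshot (again a sketch, not the site’s actual code), one can apply the same uniform swing to every state margin and bisect until the median electoral-vote count reaches the 270-vote threshold. The normal approximation, the 270-vote test, and the bracketing values are my assumptions.

    # Illustrative sketch only -- not the PEC code. Uses the same convolution idea
    # as above; "toss-up" is taken to be the median crossing 270 electoral votes.
    import numpy as np
    from scipy.stats import norm

    def median_ev(margins, sems, ev_votes):
        """Median electoral votes implied by the current state margins."""
        dist = np.zeros(sum(ev_votes) + 1)
        dist[0] = 1.0
        for p, ev in zip(norm.cdf(np.asarray(margins) / np.asarray(sems)), ev_votes):
            shifted = np.zeros_like(dist)
            shifted[ev:] = dist[:dist.size - ev]
            dist = dist * (1.0 - p) + shifted * p
        return int(np.searchsorted(np.cumsum(dist), 0.5))

    def meta_margin(margins, sems, ev_votes, tol=0.01):
        """Uniform swing (percentage points) needed to move the median electoral-vote
        count to a toss-up; positive means the candidate is ahead."""
        lo, hi = -20.0, 20.0                      # bracketing swings, in points
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            # move every state against the candidate by `mid` points
            if median_ev(np.asarray(margins) - mid, sems, ev_votes) >= 270:
                lo = mid                          # still winning: a larger swing is needed
            else:
                hi = mid
        return 0.5 * (lo + hi)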

Now, if we want to know what the statistical properties of the Meta-Margin are, we can just follow it over time:

This variation over time automatically tells us the effects of correlated error among all states. Uncorrelated error is cancelled by aggregation under the assumption of independence; what is left is correlated variation. The problem is solved without any regression. Hooray!
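As a small illustration of that point (with made-up numbers standing in for the actual Meta-Margin history), the day-to-day movement of the aggregate can serve as an empirical estimate of correlated change:

    # Hypothetical daily Meta-Margin values (percentage points), for illustration only.
    import numpy as np

    mm_series = np.array([2.2, 2.5, 2.8, 2.4, 2.6, 2.9, 2.6])

    # State-level sampling noise is largely averaged away within each snapshot,
    # so the spread of the aggregate over time reflects correlated (national) movement.
    print(f"typical day-to-day movement: {np.diff(mm_series).std(ddof=1):.2f} points")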

As I have noted, the Presidential Meta-Margin tends to move on a one-to-one basis with the Senate Meta-Margin and the generic House ballot. That suggests that downticket effects are powerful, and also that the snapshot calculation does a good job of separating correlated from uncorrelated change.

To turn the Meta-Margin into a win probability, the final step is to estimate how far the result of Tuesday’s election will be from today’s Meta-Margin. As a community, pollsters have pretty good judgment, but their average estimate of who will vote may be off a little. In past years, the snapshot has been quite good, ending up within a few electoral votes of the final outcome. That is equivalent to an uncertainty of less than one percentage point.

Here is a table for the last few Presidential races:

“Actual threshold margin” is estimated from the states whose margins were just enough to put the winner over the top. Note that these errors are not symmetric: the winner has tended to overperform his final Meta-Margin, so it is not clear that Meta-Margin errors are symmetrically distributed. That means we can’t simply use the average overperformance, since it might overestimate the amount of error that would work against the front-runner. However, the sample is too small to be sure.

Another way to estimate Meta-Margin error is to use Senate polls. Here’s a chart from 2014 (look at the Presidential column only):

The directional median indicates a bonus that favors one party over the other. Over the last six Presidential election cycles, the absolute value of the error (i.e. ignoring whether it favors Democrats or Republicans) is 0.6%, really small.

To turn the Meta-Margin into a hard probability, I had to estimate the likely error on the Meta-Margin. For the home stretch, the likely-error formula in my code assumed an Election Eve error of 0.8% on average, following a t-distribution with 3 degrees of freedom. The t-distribution is a way of allowing for “longer-tail” outcomes than the usual bell-shaped curve.
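As a back-of-the-envelope sketch of that conversion (not the site’s full calculation, so the numbers are only approximate), the win probability can be taken as the t-distribution’s cumulative probability of the Meta-Margin divided by the assumed Election Eve error; the error values scanned below are my own illustrative choices.

    # Rough sketch only; the site's full calculation differs in detail.
    from scipy.stats import t

    def win_probability(meta_margin, sigma, df=3):
        """P(win) if the final outcome is t-distributed (df degrees of freedom)
        around today's Meta-Margin with scale sigma, in percentage points."""
        return t.cdf(meta_margin / sigma, df)

    for sigma in (0.8, 1.1, 1.5, 5.0):   # assumed Election Eve errors, in points
        print(f"error {sigma:.1f} pts -> P(Clinton win) ~ {win_probability(2.6, sigma):.2f}")

Under these simplified assumptions, an assumed error of about 1.1 points lands near the 95% figure mentioned in the update below, and a 5-point error drops the probability into the 60-70% range discussed above; the site’s own numbers may differ somewhat in detail.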

So…there’s only one parameter to estimate. Again, hooray! However, estimating it was an exercise in judgment, to put it mildly. Here are some examples of how the win probability would be affected by various assumptions about final error:

As you can see, a less aggressive approach to estimating the home-stretch error would have given a Clinton win probability of 91-93%. That is about as low as the PEC approach could ever plausibly get.

I have also included the prediction if polls are assumed to be off by 5% in either direction on average. It is at this point that we finally get to a win probability that is as uncertain as the FiveThirtyEight approach. However, a 5% across-the-board error in state polls, going against the front-runner, has no precedent in data that I can see.

Bottom line: Using the Princeton Election Consortium’s methods, even the most conservative assumptions lead to a Clinton win probability of 91%.

>>>

As I said at the top, my motivation in doing these calculations is to help readers allocate their activism properly. Whether the Presidential win probability is 91% or 99%, it is basically settled. Therefore it is a more worthwhile proposition to work in Senate or House campaigns. Get on over to IN/MO/NC/NH/WI, or find a good House district using the District Finder tool in the left sidebar.

Update: see this exchange, which suggests that a more reasonable uncertainty in the Meta-Margin is 1.1%, giving a Clinton win probability of 95%. However, to state the obvious, I am not going to change anything in the calculation at this point.

Tags: 2016 Election · President · Senate

115 Comments so far

  • Scott H

    This method assumes a likelihood distribution whose width is determined solely by drift error (0.8% per day) and that you know MM today with 100% confidence. But there is also polling error in the measurement of MM. The likelihood distribution should be “what is the pdf that we will see on Nov 8 given the probability distribution of MM today?” The likelihood function could take both the error in MM and the estimated drift into account, as others have said, by convolving the distribution from expected drift with the distribution from polling error (from EVs). Then multiply by the prior.

    In the end, it doesn’t change much – Clinton is 97% instead of >99% if nothing else changes.

    http://www.scholarpedia.org/article/Bayesian_statistics#Prediction
    “Prediction in frequentist statistics often involves finding an optimum point estimate of the parameter…and then plugging this estimate into the formula for the distribution of a data point. This has the disadvantage that it does not account for any uncertainty in the value of the parameter, and hence will underestimate the variance of the predictive distribution.”

  • ATF

    “… a more reasonable uncertainty in the Meta-Margin is 1.1%, giving a Clinton win probability of 95%…. I am not going to change anything in the calculation at this point.”

    I understand that changing the code this late in the game would be a bit silly, but you’re also sticking to your guns in part to keep those sweet low Brier scores, aren’t you? ;-)

  • Mike

    Since the predicted probability is so heavily dependent on the error estimate, why not estimate a 90% confidence interval on the error? Then you could publish the resulting probabilities at the top and bottom of the interval (i.e., something like 89%-99%).

    • Brendan

      This is akin to what I was thinking. Even if you can’t actually derive a 90% confidence interval, you could have a few different values for this parameter, creating a few different confidence intervals for the result (e.g., a series of shaded confidence areas in the meta-margin or EV prediction).

  • Edward Allen

    PEC answers the question definitively: the polls show Clinton is ahead. However, while I think you have done a good job of arguing that 538’s approach adds uncertainty, doesn’t their method, which allows for heavily polled states to be used to evaluate poorly polled states, provide a good tool for the activist as well?

  • mike

    “closer than FiveThirtyEight”

    In 2012 you incorrectly called Florida for Romney. 538 got all fifty states correct. How exactly do you spin that as “closer” on the electoral count?

    • Reed

      Sam was talking about final EV count, not state-by-state predictions. I’m not sure what he means with 2012 – 538 was indeed closer here, although FL, from my understanding, was polling so closely that it was a bit of a toss-up.

      2012
      538: 313-225
      PEC: 305-233
      Result: 332-206

      2008
      538: 349-189
      PEC: 364-174
      Result: 365-173

    • Sam Wang

      Yes, I stand corrected on the 2012 state-by-state. I actually found Florida too close to call at the time, but got other states correct individually.

  • PaulC

    I, too, thank you, Sam. In particular I admire the sheer elegance of your work. The simplicity of the meta-margin construct is a thing of beauty.

    But let me also remind you that you have expressed another important goal; namely, to create a scientific basis for responsible horse-race reporting by professional journalists, among whom there is an institutional bias toward cherry-picking individual polls to make races seem closer than they may be. “Poll-aggregation” has come to maturity since 2000 and is now a true force to be reckoned with.

    (Ultimately, by eliminating some number of horse-race noise articles you may be enabling media bandwidth to be refilled with policy signal articles.)

    Nate Silver is and was a pioneer who helped establish poll aggregation. Now, unfortunately, his website seems to benefit more from the noise than from the signal. If Nate’s “way” has a bias, it’s toward retaining an overly complex and proprietary system whose complexity adds no accuracy, and whose added uncertainty is used to fuel the kinds of horse-race noise articles it used to debunk.

  • Michael

    Is it possible that readers struggle to accept your calculations because they seem counter-intuitive? Maybe people are saying, “it just can’t be this simple! It has to be more complicated than this!” As if it weren’t complicated enough the way you do it.

  • Geoff

    Thanks for this post, Dr Wang. Can one argue for your 0.8% (but perhaps a longer tail) by pointing to the “winner bonus” you’ve written about before? As you wrote a few posts ago, on Obama outperforming his poll numbers against McCain, there may be more “emotional reward” in voting for someone you think is ahead, and that this “implies that there is a hidden bonus for whoever is ahead.”

    The (admittedly few) numbers from recent presidential elections do suggest that if there’s error, it’s more likely to favor the leader.

    To engage in some emotional punditry, I think the optics of the final few days favor Clinton too, what with Comey-redux, Beyonce/LeBron/Springsteen. Trump’s not doing himself any favors wearing a baseball hat that shrouds his face in shadow.
