This is a case where several factors have combined to catch our methodology being very reactive to the most recent polls, and perhaps a bit flat-footed too. Here's why. We use a version of LOESS regression to generate what's called a trend estimate of all the available polls. That means it's not only trying to capture the 'average' or upshot of all the polls together; it's trying to show the trend, specifically which way the numbers are moving.
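To make the idea of a trend estimate concrete, here is a minimal sketch of a LOESS-style fit, assuming a local straight-line fit with tricube weights and a bandwidth set by the nearest half of the points. The function names and the `frac` parameter are illustrative assumptions, not our production formula. The sample numbers are Clinton's share in the eight most recent polls discussed in this post, assumed to be in chronological order.

```python
def tricube(u):
    """Tricube kernel: weight is 1 at u = 0 and falls to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def loess_point(xs, ys, x0, frac=0.5):
    """Trend estimate at x0 from a weighted straight-line fit to nearby points.

    frac is the share of points treated as 'nearby' (an assumed setting).
    """
    n = len(xs)
    k = max(2, int(round(frac * n)))           # number of neighbours in the window
    dists = sorted(abs(x - x0) for x in xs)
    h = dists[k - 1] or 1.0                    # bandwidth: distance to k-th nearest point
    w = [tricube((x - x0) / h) for x in xs]
    sw = sum(w)
    if sw == 0:                                # degenerate window: fall back to plain mean
        return sum(ys) / n
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 1e-12 else 0.0
    return my + slope * (x0 - mx)


# Clinton's share in the eight most recent polls, indexed 0..7 in listed order.
days = [0, 1, 2, 3, 4, 5, 6, 7]
clinton = [49, 50, 51, 50, 46, 47, 46, 43]

plain_average = sum(clinton) / len(clinton)
trend_at_end = loess_point(days, clinton, x0=7)
print(plain_average, trend_at_end)
```

Because the fit at the right-hand edge leans on the last few points only, the trend estimate there lands well below the plain average. That is the reactivity to recent polls described above.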
Now there's an additional factor for us. It's been basic to our methodology from the start not to rate or weight different pollsters. There's a good argument for weighting them. But our aim has been to be as agnostic as possible. So that means that for us Rasmussen and UPI and Morning Consult count just as much as NBC/WSJ or ABC/WaPo, even though I would say the latter two are much more credible than the first three. But again, that's a basic part of our approach: not giving pollsters different weights but putting all the pollsters into a standard formula and seeing what we get. It has been a pretty good approach. After the 2012 election, the Obama campaign ran the numbers on each of the top poll aggregators - RCP, TPM and HuffPollster. The Obama campaign's internal numbers came out most accurate. Of the three aggregators, we came out on top.
Two things have happened here that make this result today unrealistically low, I would say, for where the race is. The first is that over the last four years we've had an explosion of lots of different pollsters of very different qualities. That's particularly the case with a lot of the new online polls. What that means is that there are many more polls and the range of quality and reliability between them has grown. That's put our 'agnostic' approach I noted above under some strain.
There's an additional factor on top of that. The lowest-quality pollsters are releasing data most often. So for instance Rasmussen and UPI and Morning Consult have a completely non-duplicative set of data each week. Since they are either robopolls or online polls, they also tend to have higher numbers of undecided voters and thus show both candidates at lower absolute numbers.
The second thing is that we put a lot of emphasis on trend. The trend over the last several days looks like this.
Fox (+8), UPI (+5), NBC (+10), ABC (+4), Franklin Pierce (+5), Tarrance (+8), MC (+5), Rasmussen (+2) ... And it's not just the spread; there's also Clinton's absolute number. Fox (49), UPI (50), NBC (51), ABC (50), FP (46), GWU (47), MC (46), Rasmussen (43).
As you can see, both in the spread and especially in Clinton's absolute number, there appears to be a clear downward trend toward the end. It's particularly clear in Clinton's absolute numbers: four polls at 49 or above and then 46, 47, 46, 43. Now looking at the polls individually, we can see that you have Morning Consult, a relatively low-quality poll that has both candidates at lowish levels, followed by a very low-quality poll in Rasmussen. But again, our regression doesn't look at the quality of the pollster.
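The size of that apparent drop can be checked with simple arithmetic. This is a rough sketch, assuming the eight polls are listed in chronological order; it compares the first four with the last four and fits an ordinary least-squares line through Clinton's numbers. It is a plain straight-line fit for illustration, not the LOESS formula we actually use, and the helper names are made up.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def ols_slope(ys):
    """Least-squares slope of ys against their position 0, 1, 2, ..."""
    xs = list(range(len(ys)))
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Numbers as listed above, in listed order (assumed chronological).
spread = [8, 5, 10, 4, 5, 8, 5, 2]
clinton = [49, 50, 51, 50, 46, 47, 46, 43]

early_clinton, late_clinton = mean(clinton[:4]), mean(clinton[4:])
early_spread, late_spread = mean(spread[:4]), mean(spread[4:])
clinton_slope = ols_slope(clinton)

print(early_clinton, late_clinton)   # 50.0 45.5
print(early_spread, late_spread)     # 6.75 5.0
print(clinton_slope)                 # negative: points lost per poll, on average
```

Clinton's number falls by 4.5 points between the two halves while the spread falls by only 1.75, which is why the drop shows up more sharply in her absolute number than in the margin.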
(The contrary approach is to create a more complex model that smooths out these kinds of variables. Again, there's a lot to be said for that. The counter is that you can end up seeing more of the model than what the polls are actually saying.)
To have a sense of how much difference this makes, here's the trend chart from June 1st to the present ...
Now let's look at the same period with Rasmussen, UPI, Gravis Marketing and Morning Consult removed.
Needless to say, a very dramatic difference.
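The charts cover every poll since June 1st. As a much smaller illustration of the same exclusion, here is a sketch that uses only the eight recent polls listed earlier and compares plain averages with and without the excluded pollsters. Gravis Marketing does not appear among those eight, so excluding it has no effect here. The sixth poll is labeled 'Tarrance' in one list and 'GWU' in the other; the sketch assumes they are the same poll. A plain average stands in for our trend estimate, so the figures are illustrative only.

```python
# (pollster, Clinton spread, Clinton share), in the order listed in this post.
# Assumption: 'Tarrance' and 'GWU' in the two lists refer to the same poll.
polls = [
    ("Fox", 8, 49),
    ("UPI", 5, 50),
    ("NBC", 10, 51),
    ("ABC", 4, 50),
    ("Franklin Pierce", 5, 46),
    ("Tarrance/GWU", 8, 47),
    ("Morning Consult", 5, 46),
    ("Rasmussen", 2, 43),
]

EXCLUDED = {"Rasmussen", "UPI", "Gravis Marketing", "Morning Consult"}


def averages(rows):
    """Return (mean spread, mean Clinton share) for a list of poll rows."""
    n = len(rows)
    return (sum(r[1] for r in rows) / n, sum(r[2] for r in rows) / n)


kept = [row for row in polls if row[0] not in EXCLUDED]

all_spread, all_clinton = averages(polls)
kept_spread, kept_clinton = averages(kept)

print(all_spread, all_clinton)     # 5.875 47.75
print(kept_spread, kept_clinton)   # 7.0 48.6
```

Even on this tiny sample, dropping the three excluded pollsters that appear in it moves the average spread by more than a point, the same direction as the gap between the two charts.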
Now, why not change the methodology? Believe me, on a day like today I'd love to. This result today brings together all the weaknesses of our approach - or shows a wobbliness that makes the race seem less stable than it probably is. But the biggest rule, in my mind, of aggregation is you don't change your approach mid-stream. So that's the story.