This piece is not about finance; it's about politics, which is why I'm making it a page and not a post. I did this just for fun but, since it's concerned with party politics, I'll start off by pointing out that I'm not an American citizen, so I am unable to vote and am not peddling a political agenda. However, I am interested in the political environment I live in and, like many others, follow opinion polling. My experience of the campaigns of this decade leads me to believe that each polling organization tends to exhibit a bias, in one direction or the other, and that it is useful to sample multiple pollers to form an opinion as to the actual state of opinion, so to speak.
This leads me to propose a factor model for opinion. We'll start with the ensemble of polling results for Presidential Approval Ratings, as assembled by Real Clear Politics. I took this raw data and combined pollers that appeared to be the same organization despite differing names (for example “Rasmussen” vs. “Rasmussen Reports”, which is obvious, but also “CBS News” vs. “CBS News/NY Times”, etc.).
We propose the following simple model to describe this data stream.
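Written out under the assumption that the bias enters additively (the later remark about setting the bias to zero suggests an offset rather than a scale factor), the model would take the form:

$$A_{it} = \alpha_i + f_t + \varepsilon_{it}$$

The noise term $\varepsilon_{it}$ is a hypothetical addition for this sketch, standing for the sampling error of poller $i$ on date $t$.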
Here the term $\alpha_i$ represents the poller-specific bias and $f_t$ represents the “true” Presidential favourability factor; $A_{it}$ is the poller's reported approval rating. The dataset also includes $N_{it}$, the poller's sample size, so we construct the following $\chi^2$ statistic for the data.
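One form of this statistic consistent with the description, offered as an assumption rather than a definitive expression, weights each squared residual by the binomial sampling variance of a percentage:

$$\chi^2 = \sum_{i,t} \frac{N_{it}\,\left(A_{it} - s_i\,\alpha_i - f_t\right)^2}{10000\; p_{it}\,\left(1 - p_{it}\right)}, \qquad p_{it} = \frac{A_{it}}{100}$$

The sum runs over the poller/date pairs for which a poll exists. Using the reported share $p_{it}$ in the variance, rather than the modelled one, is a choice made for this sketch.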
In the above, the “10000” merely allows the data to be expressed in percentages. The term $s_i$ is more important. There are pollers in our dataset that have only a single result quoted, and that result is the only one quoted for its date. This prevents the model being solved for those poller/date combinations, as it is bilinear in those parameters: a single observation constrains only the combination of that poller's bias and that date's factor, not each separately. However, we would like to retain this information in some form. To allow this, I introduced $s_i$ as a dummy variable that sets the bias to zero for those pollers. Thus their information can be used to estimate the favourability factor but not the bias.
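To make the role of the dummy variable concrete, here is a minimal Python sketch on made-up numbers. It assumes an additive model (reported rating = flagged bias + factor) and a binomial sampling variance of 10000·p(1−p)/N in squared percentage points. The function names `bias_flags` and `chi_squared`, the toy polls, and the trial parameter values are all hypothetical and are not taken from the original analysis.

```python
from collections import Counter

# Hypothetical toy data, not real polls: (poller, day, approval in %, sample size).
polls = [
    ("A", 0, 52.0, 1000),
    ("A", 1, 51.0, 1000),
    ("B", 0, 48.0, 1500),
    ("B", 1, 47.0, 1500),
    ("C", 2, 50.0, 800),  # C's only poll, and the only poll on day 2
]


def bias_flags(polls):
    """Return the dummy flag per poller: 0 if its single poll is alone on its date, else 1."""
    per_poller = Counter(poller for poller, _, _, _ in polls)
    per_day = Counter(day for _, day, _, _ in polls)
    flags = {poller: 1 for poller in per_poller}
    for poller, day, _, _ in polls:
        if per_poller[poller] == 1 and per_day[day] == 1:
            flags[poller] = 0
    return flags


def chi_squared(polls, alpha, f, s):
    """Sum of squared residuals, each divided by an assumed binomial variance."""
    total = 0.0
    for poller, day, approval, n in polls:
        share = approval / 100.0
        # Assumed sampling variance of a percentage (an assumption of this sketch).
        variance = 10000.0 * share * (1.0 - share) / n
        residual = approval - s[poller] * alpha[poller] - f[day]
        total += residual ** 2 / variance
    return total


s = bias_flags(polls)
alpha = {"A": 2.0, "B": -2.0, "C": 5.0}  # C's entry is ignored because s["C"] == 0
f = {0: 50.0, 1: 49.0, 2: 50.0}
print(s)
print(chi_squared(polls, alpha, f, s))
```

With these trial values every residual vanishes, so the statistic is zero. Poller C's bias entry plays no part because its flag is 0, which is how its single poll can inform the factor on that date without informing a bias.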
Numerically, the data matrix $A_{it}$ is extremely sparse and, even though we are minimizing a quadratic form, the problem is difficult to solve. I used a simplex algorithm to find the approximate minimum and then iterated with a gradient solver, which allows the parameter error bounds to be estimated.
Our bias estimates are presented without comment in the table below. To sum up, I will note that this table does not necessarily mean that a poller with a non-zero bias is wrong. The only way to find out the true favourability is to sample a sufficiently large proportion of the population that polling the rest would not change the answer. The law of large numbers suggests that the underlying favourability series is a better guess than any individual poller, but this need not be so for any given realization.
| Poller | Bias/% | ± | Error bound/% |
| ABC News/Wash Post | 1.4863 | ± | 0.7333 |
| Associated Press/GfK | 1.1232 | ± | 0.7085 |
| Bloomberg | 1.9990 | ± | 1.2920 |
| CBS News/NY Times | 0.3464 | ± | 0.4960 |
| CNN/Opinion Research | 0.6640 | ± | 0.5463 |
| Cook/RT Strategies | -0.7706 | ± | 4.2197 |
| Democracy Corps (D) | -0.3907 | ± | 0.9312 |
| Diageo/Hotline | 0.4527 | ± | 1.3391 |
| FOX News | 0.2500 | ± | 1.3921 |
| USA Today/Gallup | 1.1554 | ± | 0.3498 |
| Ipsos/McClatchy | 0.8675 | ± | 0.9489 |
| Marist | -1.1538 | ± | 1.3842 |
| NBC News/Wall St. Jrnl | -0.2933 | ± | 1.5054 |
| Newsweek | 0.4614 | ± | 4.6034 |
| NPR - POS/GQR | -0.1049 | ± | 5.7825 |
| Pew Research | -0.6175 | ± | 0.7567 |
| Quinnipiac | -1.8310 | ± | 0.5104 |
| Rasmussen Reports | -2.2421 | ± | 0.3293 |
| Time | 0.5007 | ± | 2.6760 |