Freeman’s Data


Steven Freeman, the author of the widely circulated paper “The Unexplained Exit Poll Discrepancy,” has posted exit poll tabulations for 49 of the 50 states, plus DC, on his website. While I disagree with many of Freeman’s conclusions about the discrepancy, his data provide a valuable resource: the only collection of “just-before-poll-closing” tabulations I am aware of in the public domain.

While these data are worthy of attention, we should remember some important limitations. The results posted on CNN.com on election night, as now, did not show the overall vote preference measured by the exit polls in each state. Rather, they showed the preference tabulated separately for a wide variety of demographic subgroups and answers to other questions. The tabulations included a table of the results by gender, as well as the percentage of men and women in the total sample. Consider the following example, which is based on the results posted on CNN.com as of today (November 29, 2004):

To calculate overall support for President Bush in this sample, multiply Bush’s support among women (48%) by the percentage of women in the sample (54%), then multiply Bush’s support among men (55%) by the percentage of men in the sample (46%), and add the two: (0.48 * 0.54) + (0.55 * 0.46) = 0.5122, or about 51%.
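The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the weighted sum described above; the function name `overall_support` and the data layout are my own assumptions, not anything published by CNN, NEP, or Freeman.

```python
def overall_support(subgroups):
    """Combine subgroup results into an overall estimate.

    subgroups: list of (support, share) pairs, both as fractions,
    e.g. (0.48, 0.54) means 48% support among a group that makes up
    54% of the sample. (Hypothetical layout, for illustration only.)
    """
    total_share = sum(share for _, share in subgroups)
    weighted = sum(support * share for support, share in subgroups)
    # Dividing by the total share guards against shares that do not
    # sum to exactly 100% because of rounding in the published tables.
    return weighted / total_share


# Figures from the gender table in the example above:
# women: 48% Bush, 54% of sample; men: 55% Bush, 46% of sample.
bush_by_gender = [(0.48, 0.54), (0.55, 0.46)]
estimate = overall_support(bush_by_gender)
print(f"Estimated overall Bush support: {estimate:.1%}")
```

The same function would work with any other complete breakdown (age, party, and so on), as long as the subgroups cover the whole sample.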

This extrapolation process adds some random rounding error to the tabulations. Freeman reports his tabulations to one decimal place (e.g., 51.2%), but the mathematical principle of “significant digits” tells us that the results of the underlying calculations are only accurate to the nearest whole percentage point. (For an explanation of significant digits, see this site under “multiplying and dividing.”)
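To get a rough sense of how much the rounding matters, here is a small sketch using the gender figures from the example above (48% and 55% Bush support, women 54% of the sample). It assumes, and this is my assumption rather than anything NEP has documented, that each published percentage was rounded to the nearest whole point, so the true value could lie up to half a point on either side. The function name `rounding_bounds` is hypothetical.

```python
from itertools import product

HALF_POINT = 0.005  # assumed rounding error: half a percentage point


def rounding_bounds(support_women, support_men, share_women,
                    half_width=HALF_POINT):
    """Lowest and highest overall estimates consistent with inputs
    that were each rounded to the nearest whole percentage point.

    The men's share is taken as 1 - share_women. The estimate is linear
    in each input separately, so its extremes fall at the corners of
    the ranges, which is why checking the 8 corner cases is enough.
    """
    corners = product(
        (support_women - half_width, support_women + half_width),
        (support_men - half_width, support_men + half_width),
        (share_women - half_width, share_women + half_width),
    )
    values = [sw * w + sm * (1 - w) for sw, sm, w in corners]
    return min(values), max(values)


low, high = rounding_bounds(0.48, 0.55, 0.54)
print(f"Overall Bush support could be anywhere from {low:.1%} to {high:.1%}")
```

Under these assumptions the extrapolated figure for this example could fall anywhere between roughly 50.7% and 51.8%, a spread of about one point, which is why the tenths digit should not be read as meaningful.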

While we do not know for certain that the results posted by CNN on election night were the final “before poll closing” results, the timing of their appearance online strongly suggests it. Those who monitored CNN.com on election night reported that exit poll results did not appear in any state until the polls closed in that state. The data that Freeman produces were taken from screen shots just before or after midnight. Given the differences between Freeman’s data and the actual count, we can safely conclude the results had not yet been fully corrected to match final tallies. However, in every instance I checked, the sample sizes he lists for each state are slightly smaller than those appearing on the site today. Do the sample sizes differ because of missing interviews or precincts or because of the weighting procedure? Without confirmation from NEP, we can only speculate.

Despite the limitations, Freeman’s data have obvious advantages over other exit poll results reported on Election Day. His tabulations are not based on leaked or “stolen” data or on numbers passed from person to person. They were put into the public domain on the official CNN web site on election night and copied (using “screen shots”) onto a computer hard drive.

I believe Freeman’s data are worthy of our attention for two reasons: The most important is the suggestion by Warren Mitofsky (here and here) and others associated with the exit polls that the discrepancy may result from what survey methodologists call “differential non-response.” That is, Republicans were theoretically more likely to refuse to be surveyed than Democrats. That hypothesis, if proven true, could have important consequences for all political surveys.

Another reason is the continuing speculation about problems in the actual count. Whatever we think about the plausibility of the various conspiracy theories, a fuller presentation of the uncorrected exit poll should shed more light on the issue. It might even help restore some confidence in the actual count. I would think that the news organizations that own the data would see the public good that might result from putting the relevant tabulations and analyses into the public domain.

Finally, if Freeman’s tabulations are wrong or misleading, the NEP can easily clear up any confusion by releasing the correct “before-closing-time” tabulations. Similarly, if Freeman is in error in his estimates of the statistical significance of the discrepancies, NEP can tell us more about the appropriate sampling error for the results in each state. Remember, Freeman’s data are derived from results that were publicly released by CNN. Providing more information about the data CNN released and the sampling error associated with it would conform to the spirit (if not the letter) of the principles of disclosure of the National Council of Public Polls (NCPP): “to insure that pertinent information is disclosed concerning methods that were used so that consumers of surveys may assess studies for themselves.”

I have more to say on these data…stay tuned…

Mark Blumenthal

Mark Blumenthal is a political pollster with deep and varied experience across survey research, campaigns, and media. The original "Mystery Pollster" and co-creator of Pollster.com, he explains complex concepts, and how data informs politics and decision-making, to a multitude of audiences. He is a researcher and consultant who crafts effective questions and identifies innovative solutions to deliver results, and an award-winning political journalist who brings insights and crafts compelling narratives from chaotic data.