Okay, we know from Part I that a survey designed to forecast an election should ideally focus on those who actually vote and ignore non-voters. We also know that we cannot select likely voters by simply asking, “will you vote?” The most common technique among pollsters is to use some combination of questions to identify a group of the most likely voters whose size, as a percentage of adults, resembles the level of turnout expected on Election Day. My posts over the next few days will describe the mechanics of what pollsters do in more detail.
For now, however, I want to focus on one issue: Even if we think we know the likely turnout on November 2, can we precisely calibrate the size of the likely voter sample to match it? Much of the recent debate surrounding likely voter models assumes that we can. Let me suggest two reasons why that may be a risky proposition.
Problems with Voting Age Population. Although we typically calculate turnout as a percentage of the Voting Age Population (VAP), there are good reasons to be skeptical of calibrating the size of the “likely voter” sample to match such estimates. Michael McDonald, a political scientist at George Mason University, estimates that roughly one in ten persons included in the VAP was ineligible to vote in 2000, including non-citizens (7.7%), ineligible felons (1.5%) and ineligible persons living abroad (1.4%). When he takes these categories into account, McDonald’s estimate of 2000 turnout increases from 50% to 54% (see also the Washington Post op-ed by Popkin and McDonald from 2000).
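To see the arithmetic, here is a rough back-of-the-envelope sketch. It is my own illustration using the rounded percentages above, not McDonald’s exact calculations:

```python
# Rough back-of-the-envelope illustration of the VAP adjustment.
# The percentages are the rounded figures cited above, not McDonald's exact data.

vap_turnout = 0.50                        # votes as a share of the Voting Age Population
ineligible_share = 0.077 + 0.015 + 0.014  # non-citizens, ineligible felons, ineligible abroad

eligible_share = 1.0 - ineligible_share   # shrink the denominator to eligible adults only
adjusted_turnout = vap_turnout / eligible_share

print(f"Turnout among eligible adults: {adjusted_turnout:.1%}")
# Prints roughly 56%; McDonald's more careful accounting puts the figure near 54%,
# but the direction and rough size of the correction are the same.
```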
This finding is relevant because telephone surveys typically exclude non-English speakers and cannot reach felons living in prisons and those living abroad. Telephone surveys also miss other categories of otherwise eligible citizens that rarely vote, including people in nursing homes, soldiers in barracks and those who are mentally incompetent. As such, we should assume that the percentage of real voters included in telephone surveys of adults will be significantly higher than turnout as a percentage of the voting age population. How much higher? Opinions will vary.
Earlier this week, I spoke to Richard Morin, polling director at the Washington Post, about their likely voter model. He explained that his concerns about problems with the Voting Age Population, combined with greater levels of voter interest measured this year, are why the Post is now using a likely voter model that screens out 40% of all adults rather than a larger number.
Non-response. Another issue that has received surprisingly little attention is the potential impact of lower response rates. If non-voters tend to hang up on pollsters more often than voters, then samples of adults will tend to over-represent likely voters. That might not be a problem if a pollster is simply screening for likely voters. However, it might cause a problem if the pollster first samples adults and then tries to calibrate a likely voter subsample to match expected turnout among all adults. If the adult sample overrepresents voters, then the pollster will define the likely electorate too narrowly, even if they guess right about turnout.
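To make that logic concrete, here is a stylized example with invented response rates. No one has measured these rates precisely; the numbers are purely illustrative:

```python
# Stylized illustration of differential non-response.
# The response rates below are invented for the example, not measured values.

adults = 1000
true_voter_share = 0.55          # suppose 55% of all adults will actually vote
voter_response = 0.25            # voters agree to be interviewed at this rate...
nonvoter_response = 0.15         # ...while non-voters hang up more often

voters_in_sample = adults * true_voter_share * voter_response              # 137.5
nonvoters_in_sample = adults * (1 - true_voter_share) * nonvoter_response  # 67.5
sample_size = voters_in_sample + nonvoters_in_sample

voter_share_of_sample = voters_in_sample / sample_size   # about 67%

# A pollster who keeps only the "top 55%" of this sample to match expected
# turnout among all adults will discard respondents who will in fact vote,
# because the sample already over-represents voters.
print(f"Voters as a share of respondents: {voter_share_of_sample:.0%}")
```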
Is there any evidence that non-voters hang up on pollsters more readily? Unfortunately, academics have devoted surprisingly little attention to this issue. A recent study in Public Opinion Quarterly provides some theoretical support. Survey methodologists Robert Groves, Stanley Presser and Sarah Dipko found that people are more likely to participate in surveys on topics that interest them. They found, for example, that teachers were more likely to respond to a survey about education, that new parents were more likely to respond to a survey about children and their parents and that seniors were more likely to respond to a survey about Medicare and health. Although they looked at political contributors (who were more likely to respond to all surveys), they did not look specifically at voters.
The challenge of researching non-response is that — duh — we cannot interview those who hang up. One approach is to use reluctant respondents (those who relent and participate after initially refusing) as surrogates for those who ultimately refuse. In his 1993 book, The Phantom Respondent, University of Chicago political scientist John Brehm did just that. Looking at election surveys conducted by the University of Michigan in the late 1980s, studies that validated the actual voting behavior of respondents against election records, Brehm found evidence that reluctant respondents tended to be non-voters. Although Brehm derived his findings from some bewilderingly complex statistical models, his conclusion was clear. The surveys he analyzed “oversample voters” due to their “significant levels of non-response” (p. 138).
Keep in mind that Brehm analyzed surveys from the late 1980s with response rates of 60% to 70%. The response rates that news media surveys obtain today are considerably lower. A study in 2003 by Stanford Professor Jon Krosnick, for example, reported that response rates from 20 news media surveys averaged 22% (and ranged from 5% to 39%).
Oddly, I can find no recent academic research looking at whether non-voters hang up on pollsters more often than voters. However, most of the campaign pollsters I know have long assumed just such a pattern. I do not want to overstate this argument, as we have precious little hard evidence either way. However, if pre-election surveys of adults are overestimating probable voters, as seems at least plausible, then some of the “likely voter” sub-samples derived from them may be too narrow.
What is the bottom line? Opinion surveys are very good at measuring current attitudes, but they are imperfect, at best, as predictors of future behavior. When it comes to “modeling” turnout, surveys can separate the most likely voters from the least likely, and help show us the differences between the two. However, this pollster would urge great caution to those who are judging the plausibility of likely voter samples by comparing their size to specific levels of turnout.
More to come…
I’ll need to read these two posts a couple more times before I think I fully understand them, but a quick question pops up that I hope someone can help with:
Why do pollsters try to identify a subgroup rather than apply weighting factors to the entire group?
If voting itself, and not merely candidate fidelity, can be expressed as a continuum from “certain to vote” to “certain to not vote,” then why not apply some sort of fraction to the entire population, and try to fit various guesses on the type of weighting (linear, exponential, bell) to see what makes most sense?
Maybe your next few posts will help me understand this. Thanks for the great blog!
Quick answer to Anthony. A very few do (apply weighting factors to the entire group rather than identifying a subgroup). CBS/NYT is the only example I know of at the national level.
I’ll get there…
Mark
Anthony’s question touches on the most basic problem with all likely voter models that divide the electorate into two groups, those who are likely to vote and those who are unlikely. Inevitably, some “likely voters” will not vote and some “unlikely” voters will vote. Hence, pollsters should assign voting probabilities to members of the electorate, rather than classify them as likely and unlikely.
I have suggested adoption of “probabilistic polling” as a replacement for likely voter models and also to give a quantitative sense of what it means to be an “undecided voter.” Those interested in learning more about this should see the short report “Likely Voters and Voting Likelihood” that I posted on my personal webpage on October 21. The URL for the report is http://faculty.econ.northwestern.edu/faculty/manski/voting_likelihood.pdf
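To give a mechanical sense of the difference, here is a minimal toy sketch with invented numbers; it illustrates the general idea of probability weighting, not the specific analysis in the report linked above:

```python
# Toy comparison of a likely-voter cutoff vs. probability weighting.
# Respondent data are invented for illustration only.

# Each tuple: (estimated probability of voting, candidate preference: 1 = A, 0 = B)
respondents = [
    (0.95, 1), (0.90, 0), (0.80, 1), (0.60, 0),
    (0.40, 1), (0.30, 0), (0.10, 1), (0.05, 0),
]

# Cutoff approach: count only respondents above some probability threshold.
threshold = 0.5
likely = [pref for p, pref in respondents if p >= threshold]
cutoff_share_a = sum(likely) / len(likely)

# Probabilistic approach: weight every respondent by his or her vote probability.
weighted_a = sum(p * pref for p, pref in respondents)
total_weight = sum(p for p, _ in respondents)
prob_share_a = weighted_a / total_weight

print(f"Candidate A, cutoff model:        {cutoff_share_a:.1%}")   # 50.0%
print(f"Candidate A, probability weights: {prob_share_a:.1%}")     # about 55%
```

The two approaches can give different answers from the same interviews, because the cutoff throws away the information carried by low- and mid-probability respondents.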
Economic theory suggests that response rates should fall as incomes rise. The time of a high income person is worth more than that of a low income person. Therefore, the high income person is less likely to waste his time talking to a pollster. Someone with a low income has less to lose.
I have a question that might be related to identifying non-voters (or maybe not): how do pollsters account for respondents who, for whatever reason, lie about their voting intentions?
For example, I know polls here in Germany have great difficulty predicting the outcome for extremist parties, since their voters are prone to lie to pollsters or refuse to respond at all.
Charles: I read your paper and really think that you have something there. I hope you have the time and research resources to extend this paper or do new research that tries to find out what that normalization factor really is (in your paper you started with 51/87), and under what conditions one might consider using different factors, and why.
I would think that any model of swing/undecided voter behavior would use a probabilistic model to great benefit, if the mathematical model can be developed and tested against prior election data.
Thanks!
Bruce: Perhaps, but the high income person may have considerably more free time than a low income person, especially if said low income person has to hold down 2 jobs, or is a single parent or has some such other difficulties.
I wonder whether any consideration has been given to the proliferation of cell phones, particularly among young voters. I am a 23 year old very likely voter and, like many of my friends, I only have a cell phone, and no landline where a pollster might reach me.
Given the overwhelming support of young voters for the Democrats, this would seem to be an important and overlooked factor.
Mark:
I realize this is a much cruder measure of voter interest and ultimate turnout, but has anyone shown or attempted to show a correlation between debate viewership and turnout for Presidential elections? Although the sample size is small, of course, we’ve had debates on a regular basis since 1976.
Did ABC/WaPo change their likely voter definition again?
Does the appearance of undecideds breaking to the incumbent in the latest tracking poll signal anything?
Will Kerry or Bush blink first and drop their October surprise?
Zogby poll, that is.
Bush will win based on 2 historically accurate predictors:
1. The Weekly Reader poll has been correct in predicting the president 100% of the time since 1956.
I understand that its poll shows Bush winning.
2. The Iowa Electronic Markets have correctly predicted the presidential victor every time but twice. It is predicting a Bush victory.
Do you have any comments or analysis?
Thanks
I think you have a very vital point concerning “non-response”, & how it has reduced the usefulness of a great many polls for this electoral season. There are at least 2 reasons why polling as it is currently done is failing to get a proper random sample:
* Effect of Telemarketers. Although the national do-not-call list has dramatically reduced the number of junk calls, people still instinctively avoid calls from people they do not know by one method or another (e.g., use of caller ID, preference for cell phones, etc.).
* Election year fatigue. I live in what has been labelled a “swing state” (Oregon), & the Bush campaign has been running campaign ads since Kerry clinched the nomination — about 6 months now; Kerry’s organization had followed suit within a few weeks.
All of the polls have shown that 80-95% of the potential voters made up their minds months ago about whom they would vote for. All these commercials have done is annoy supporters on either side, who now just want the election to be held so they can be done with the matter. Getting a computer-generated call asking how they would vote only adds to the annoyance, & I’m sure many refuse to participate because of that.
At least Bush has conceded Oregon, & his supporters no longer run his commercials; unfortunately, with several close local races & half a dozen hotly-contested measures on the ballot, there are still enough election-related ads to make up the loss. I expect people will simply want this election over with even more intently.
Geoff