So far, we have been talking about the limitations of polling, a topic we will discuss in great detail going forward. However, with a slew of new polls coming out today, it’s probably worth a quick discussion of my read on where the race stands.
Let’s step back for a moment. What is it that we are trying to measure? Is it what the endless sports metaphors in the media suggest: a “race,” a “marathon,” featuring a “sprint to the finish,” in which candidates may be “neck and neck,” or one might be “surging” or, say it ain’t so, building an “insurmountable lead”?
No, it isn’t.
What we are really following is an ongoing decision-making process involving upwards of a hundred million voters. There are no “points on the board” yet, just millions of individual voters pondering a decision most will not effectuate for another six weeks.
Roughly 65-70% of those who will ultimately cast ballots in November (and probably 99% of MP readers) made up their minds long ago about whom they would support. Most of those decided voters also know that neither rain, snow nor dark of night will deter them from casting a ballot. But the remaining 30-40% of those who will vote are still thinking it over. Roughly two-thirds of these are leaning strongly in one direction right now, and most will likely find ways to reinforce those opinions in the coming weeks. Others may go back and forth and remain unsure up until the final moments in the voting booth. There are also literally millions who are considering casting a ballot but will opt out at the last moment for reasons that materialize on Election Day. And a very big number – I won’t hazard a guess as to how many – have no intention of voting right now, yet will find the motivation to register and vote for the first time.
Polls released today provide empirical support for much of this argument. Look at the follow-up questions that measure strength of support in the longer reports available online. In the Fox poll, for example, roughly three-quarters of Bush and Kerry supporters say they are certain of their vote. Do the math, and you get a total of 35% of the “likely voters” who are less than certain about their choice, including 12% who are completely undecided. A similar calculation on the Marist poll results shows 32% less than certain, including 7% completely undecided.
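For those who want to check that “do the math” step, here is a quick back-of-the-envelope sketch in Python. The topline split is hypothetical – the exact figures are in the full Fox report – and is chosen only to be consistent with the percentages cited above.

```python
# Back-of-the-envelope check of the "less than certain" figure.
# The topline split below is hypothetical, chosen only to match the
# percentages cited above; the exact numbers are in the full Fox report.

committed = 0.88   # assumed share of likely voters backing Bush or Kerry
certain = 0.74     # roughly three-quarters of those supporters are certain
undecided = 0.12   # completely undecided, as cited above

wavering = committed * (1 - certain)      # supporters who could still move
less_than_certain = wavering + undecided  # everyone short of a firm choice

print(f"wavering supporters: {wavering:.0%}")           # ~23%
print(f"less than certain:   {less_than_certain:.0%}")  # ~35%
```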
The challenge for pollsters right now is that they are trying to accomplish three rather difficult tasks at once:
1) Draw a truly representative, unbiased sample of the general population
2) From that sample, identify the appropriate portion of the population that will actually vote
3) Get those future voters to say how they would vote if the election were held today
Most of the debate raging about polls right now concerns how they are doing at tasks 1 & 2. While no one poll or likely voter model has a monopoly on truth, I believe that collectively they are giving us a pretty good sense of where the true likely voters (or registered voters, if you prefer) are leaning at the moment. If you look at the summaries of the polls of likely voters conducted this week provided on realclearpolitics.com here and here, you’ll see an average Bush advantage of 4-6 percentage points. Examine the differences between registered and likely voters in the polls reported on The Polling Report (something I promise to do in a more comprehensive way next week), and you’ll see that the race is a bit closer among all registered voters. If the election were held today, Bush would have a slight advantage.
However, even if the polls are doing all three tasks above perfectly, the election is not being held today! The race is certainly within the common-sense margin of error that contemplates the very real potential for attitude change over the next six weeks…in either direction.
More to come soon…
[…but probably not until Sunday. For the next 24 hours, I will be observing Yom Kippur. L’shana Tova!]
In planning clinical research, we talk about the “funnel,” the source from which we plan to capture the actual test subjects for our research. You need a source of subjects that will give you enough numbers for statistically significant results without being unduly expensive to reach. At the same time, this funnel has to capture a group of subjects that is reasonably representative of the total patient population to which you want your research results to be generalizable.
We are fortunate in clinical medicine that most research questions have a very ready and obvious funnel: the folks who show up in a clinic for treatment. You want to know which of two anti-hypertensives does better at controlling hypertension? Just get the reception desk at a primary care clinic, or clinics, to sign up folks who present to have their hypertension treated. Of course, if your clinic is in the inner city, your results may be generalizable only to Blacks; if in the suburbs, your results may not be generalizable to Blacks at all. If your funnel is a Cardiology or Endocrinology Clinic, your results would be generalizable only to hard-to-control hypertension, since only the difficult cases are usually referred to sub-specialists. If you want your results to be generalizable across all of these groups, you have to do a cooperative multi-center trial to make sure your funnel sweeps in all the groups of interest.
The closest analogy in your line of work would be the exit poll. Your funnel, grabbing voters as they leave the polling places, gets past the sampling-bias hurdles of the temporal question, which your latest blog addresses, and the whole “likely voter” question, which you have talked about in earlier installments. You have your hands on actual voters right after they have cast their votes. There are two problems with this funnel. The information is available too late (sort of like an autopsy, in my line of work!) to do any good, except perhaps to build a fund of knowledge that will let you save future “patients.” And spatial limitations mean you can’t do a random nationwide sample of voters leaving polling places; you have to camp out at pre-selected key precincts, then apply some assumptions about how certain demographic groups will vote together in order to project a national (or state) result.
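To make that projection step concrete, here is a toy sketch; every precinct weight and vote share below is invented for illustration, not drawn from any real exit poll.

```python
# Toy version of projecting a statewide result from a few pre-selected
# precincts, as exit polls must. Every number here is invented.

# Each sampled precinct: (assumed share of the statewide vote cast by
# precincts "like it", candidate A's share among its exiting voters).
precincts = [
    (0.30, 0.61),  # stands in for urban precincts
    (0.45, 0.47),  # stands in for suburban precincts
    (0.25, 0.41),  # stands in for rural precincts
]

# The projection is only as good as the assumed weights: if the
# demographic-cohesion assumptions are wrong, so is the projection.
projected = sum(weight * share for weight, share in precincts)
print(f"projected statewide share for candidate A: {projected:.1%}")  # ~49.7%
```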
Some of the elements of the funnel for predictive polling are obvious and widely discussed. Everybody understands that a poll done September 25, no matter how closely it tracks voter sentiment, can only reflect what a sentiment that may change by November 2 happens to be today. Musharraf and Karzai could be dead of acute lead poisoning by then, oil could be at $60 a barrel, terrorists could have taken out the Hoover Dam – and all of these events, or a hundred others that might occur, could change voter preferences dramatically. Similarly, everyone understands that turnout can vary, both in toto and in its distribution among demographic groups, in ways that make projections from “likely voters” (which, it seems to me, are defined mostly by their recent past voting behavior) dicey. For these two elements of your funnel, it seems to me that you guys do pretty much what we do in clinical research, within the limits of the general public’s (as opposed to the medical profession’s) attention span: we describe our funnel, and let the reader form his or her own qualitative judgment on the generalizability of the results.
This is all by way of introduction to my concerns over the element of your funnel that I don’t understand very well, and not for want of trying. To avoid having to send folks out nationwide walking the block in a random national sample, or to key precincts in a sample that (like the exit polls) would bring in assumptions about demographic cohesion in voter behavior, you guys use the telephone as your funnel. It certainly seems to me that who is reachable by phone (in this era of increasingly sophisticated electronic filters designed to block telemarketers), who will call back (do you take call-backs, or do you yourselves actively call back repeatedly in order to keep as much as possible to your original random sample?), who will take the time to answer your questions even if you do reach them – all of this is a funnel that adds a bias to your sample. How big that potential bias is also goes largely unreported.
Even if it is a big potential bias – because you can’t reach a large percentage of the folks selected at random, or because a large percentage won’t cooperate – it doesn’t, of course, matter to the final results unless it is a differential bias. It doesn’t matter unless the people you can’t reach, or who won’t answer, would have answered in different proportions from the sample you actually do get responses from. It seems to me that your best guarantee that there is no significant differential bias is empirical: the track record of past actual election results going pretty much as the last polls before Election Day called them.
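A toy simulation, with every response rate invented for illustration, shows why only a differential bias moves the numbers:

```python
import random

random.seed(0)

# Toy illustration: non-response alone does not bias a poll;
# differential non-response does. All rates here are invented.

N_A, N_B = 52_000, 48_000             # true preferences: A 52%, B 48%
population = ["A"] * N_A + ["B"] * N_B

def poll(response_rate):
    """Ask everyone; each person responds at their group's assumed rate."""
    responders = [v for v in population if random.random() < response_rate[v]]
    return 100 * responders.count("A") / len(responders)

# Case 1: low but uniform response -- the estimate stays near the truth.
uniform = {"A": 0.30, "B": 0.30}
print(f"uniform 30% response:  A = {poll(uniform):.1f}%")       # ~52%

# Case 2: same overall reach, but B supporters answer more readily.
differential = {"A": 0.25, "B": 0.35}
print(f"differential response: A = {poll(differential):.1f}%")  # ~44%
```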
But what empiricism giveth, it taketh away. It seems to me that the reasonable track record of the “likely voter, reached by telephone” poll is similar in this respect to that of the state of Maine. “As Maine goes, so goes the nation” was a rule that held from the mid-19th century until the 1936 election, whose results gave rise to the tag, “As Maine goes, so goes Vermont.” Maine has been a lousy bellwether since. Maine was a bellwether for almost a century not because it was a microcosm of the electorate, a sort of unbiased sample, but simply because its balance of Republican and Democratic voters sat during this time at about the same quantitative balance as the nation’s. The demographic mixes behind the two balances were different; the two samples just happened to rest at about the same equilibrium for about a century. Then the nation’s and Maine’s demographics changed, and the two samples were no longer at the same balance. I have heard rumblings (but no clear discussion – no numbers!) that voter behavior with respect to answering telephone polls is changing. Is it not likely that these two samples, the national electorate and the folks reachable by phone who agree to answer, are diverging from each other – diverging from what was always as accidental a convergence as the one that once held between the national and Maine electorates? Will we wake up November 3 to find a new conventional wisdom: “As the telephone polls go, so go Utah and Indiana”?
My understanding is that all the large, conventional, national telephone polls had Bush by anywhere from 3 to 13 points in their last pre-Election Day 2000 outings. The actual outcome, of course, was Gore by half a point, a divergence of 7 or 8 points in the Republican direction from the aggregate of the telephone polls (and it seems to me legit to apply simple averaging of raw numbers to polls of such similar methodology, rather than bring out the cumbersome tools of meta-analysis we use for the often more heterogeneous clinical studies we seek to compare). Some tried to explain this as the famous “last-minute surge” that gets trundled out as an explanation for every such variance, but this was a larger variance than I can remember in a Presidential race, and I haven’t heard that there is any internal evidence (e.g., trends within the three days typical of the national polls) of a temporal trend. Was 2000, for the conventional telephone poll, really the cycle analogous to 1936 for the “As Maine goes…” rule, only everybody was too transfixed by the FL controversy, and the chance convergence of the poll prediction with the EC result, to notice? Will the divergence be even greater this year? Gallup and Zogby/Rasmussen have been 10+ points apart some weeks…
It isn’t clear that any new funnel (online polls, “robot” telephone polls) would do any better, would be any sort of gold standard. Instead of a probably vain search for a funnel that would not introduce differential bias, why not just do what clinical research does and admit that no sample is unbiased? We have as a standard a “Table I” as part of a study, which lays out some quick descriptive statistics on the demographic breakdown of the study and control groups. If the study and control groups are skewed badly relative to the population of interest to the reader, at least everybody can gauge the significance qualitatively, or, if appropriate, a sub-group analysis can be done to see if the results are still significant for the population of interest. If randomization fails, as it often does, to produce a balance in pre-test risks between the study and control groups, you can always adjust the results, provided you know about the randomization failure and its magnitude. Shouldn’t pollsters do something similar and report a “Table I” with their polls? Shouldn’t they disclose the demographic breakdown of their samples and make some comparisons with the demographic breakdown of the electorate in recent cycles? Shouldn’t they adjust results on the basis of such comparisons?
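To show mechanically what such an adjustment might look like, here is a minimal reweighting sketch. The demographic cells, the sample mix, and the target mix are all invented; a real adjustment would use the pollster’s actual “Table I” and a defensible estimate of the electorate’s composition.

```python
# Minimal sketch of the adjustment suggested above: compare the sample's
# demographic mix to a target mix (say, the electorate in recent cycles)
# and reweight each cell. All figures are invented for illustration.

sample_mix = {"18-29": 0.10, "30-49": 0.40, "50-64": 0.30, "65+": 0.20}
target_mix = {"18-29": 0.17, "30-49": 0.38, "50-64": 0.27, "65+": 0.18}

# Hypothetical candidate support within each cell of the sample.
support = {"18-29": 0.42, "30-49": 0.50, "50-64": 0.52, "65+": 0.49}

unweighted = sum(sample_mix[g] * support[g] for g in sample_mix)
reweighted = sum(target_mix[g] * support[g] for g in target_mix)

print(f"unweighted support: {unweighted:.1%}")  # what the raw sample says
print(f"reweighted support: {reweighted:.1%}")  # adjusted to the target mix
```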
At least one pollster does exactly what Glen Tomkins suggests – report the internal composition of their sample (by race, age, education level, Party ID and some other stuff), report the raw numbers, and let you, the reader, decide what to do with the data. No pre-weighting, no guessing, just who they got on the telephone and what the people who answered said. That pollster is SurveyUSA, best known for using robocalls instead of live interviewers to ask their poll questions. SUSA did as well as anyone else, I’ve read (at DailyKos), in predicting results for the Dem primaries. As you’d expect from a poll that doesn’t weight, smooth or normalize at all, they’ve produced a number of notable outliers in both directions, though the weirdest ones have favored Bush: in the past two weeks SUSA has had Bush winning New Jersey by four points, a tie in Maryland, and Bush winning Oregon by one. It has also reported Kerry behind by only one in West Virginia and by only two in Missouri, when other pollsters show high-single-digit leads for Bush. Most ominously for Democrats, SUSA shows Bush and Pete Coors winning in Colorado by several points, with Coors over 50, when other recent surveys have had Ken Salazar ahead and the Prez race virtually tied.
What do the other pros think of SUSA?