The apparent problems with Tuesday’s exit polls were obviously topic number one with MP’s readers yesterday. Before jumping into what went "wrong" and why, it is important for all of us to be clear about what we know and what we do not. We might start by considering why the networks do exit polls in the first place. I see three reasons:
1) To provide an analytical tool for journalists and the rest of us who want to interpret and explain the election outcome.
2) To provide data to help the networks project winners, especially in states where one candidate holds a huge lead.
3) To provide networks and other news organizations some advance notice of the likely election outcome well before the polls close so they can plan their coverage accordingly.
The National Election Pool (NEP) designed procedures for conducting the exit polls to fulfill these three missions: Interviewers phone in raw data from completed questionnaires at several intervals during the day. To provide networks guidance on likely outcomes, NEP releases partial data several times during the afternoon. These are the numbers that get leaked and appear all over the Internet. As I wrote Tuesday, my understanding is that unlike the final numbers, these early releases are not weighted to reflect the actual turnout that day.
Shortly before the polls close, the interviewers call in tabulated results, as well as some measure of actual turnout at their precinct. NEP uses the actual turnout data to weight each state’s poll, so that the regional distribution of respondents matches the actual turnout for that day. The last report and turnout data allow for a tabulation, provided to the networks just before the polls close, that they use for projections. If a candidate’s lead is well outside the margin of error (typically 4% for a state), the networks will use the exit poll alone to call the state.
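To make the turnout weighting concrete, here is a minimal sketch. It assumes a very simple scheme in which each precinct counts in proportion to its actual turnout rather than to the number of interviews completed there. The precinct labels, the dictionary keys and every number are hypothetical, and NEP's real procedure is certainly more elaborate than this.

```python
def turnout_weighted_share(precincts, candidate):
    """Estimate a candidate's share, weighting each precinct by actual turnout.

    precincts: list of dicts with hypothetical keys
      'turnout'   - actual number of voters reported at the precinct
      'responses' - dict mapping candidate name to number of respondents
    """
    total_turnout = sum(p["turnout"] for p in precincts)
    estimate = 0.0
    for p in precincts:
        interviews = sum(p["responses"].values())
        share_here = p["responses"].get(candidate, 0) / interviews
        # Each precinct contributes in proportion to its share of real turnout.
        estimate += (p["turnout"] / total_turnout) * share_here
    return estimate


# Hypothetical data: equal numbers of interviews, very unequal turnout.
sample = [
    {"name": "urban", "turnout": 1000, "responses": {"Kerry": 60, "Bush": 40}},
    {"name": "rural", "turnout": 3000, "responses": {"Kerry": 40, "Bush": 60}},
]

unweighted_kerry = (60 + 40) / 200                        # 0.50 from raw interviews
weighted_kerry = turnout_weighted_share(sample, "Kerry")  # about 0.45 after weighting
print(unweighted_kerry, round(weighted_kerry, 4))
```

In this made-up case the raw interviews show a tie, while the turnout-weighted estimate shows a five-point deficit, which is the kind of shift that unweighted early releases cannot reflect.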
As Martin Plissner, a former executive political director of CBS News, wrote yesterday in Slate,
Exit-poll surveys in some 29 states showed margins for George Bush or John Kerry great enough to conclude that the chances the leading candidate losing was essentially zero. On that basis, when the polls closed in those states and before any votes were counted, 16 of them were placed in the president’s corner and 13 in the senator’s. They tended to be places like Kansas and Rhode Island.
Needless to say, if the lead is within (or even close to) statistical sampling error, the networks will not make a projection on the basis of the exit poll alone. To enable projections in these cases, NEP also does tabulations of actual returns obtained for a larger random sample of precincts (often referred to as "key precincts"). They also continue to use and update the exit poll. As returns start to come in, the exit pollsters weight each individual precinct sample by the actual vote cast by all voters at that precinct. Thus, as the night wears on, the accuracy of the exit poll gradually improves.
Why bother with the exit poll when real votes are available? The poll helps analysts determine the size and preferences of key subgroups with increasingly greater precision. What is the vote among Independents? African Americans? Young voters? New registrants? How do those patterns compare with pre-election expectations? Knowing the answers to those questions helps guide those at the network "decision desks" in making projections.
Also, weighting the poll by the actual vote improves its accuracy for its first and most important mission: providing an analytical tool for journalists and the rest of us who want to interpret and explain the election outcome. When a final result for a state is available, the exit pollsters weight the entire sample to match the vote results (there is often a mismatch due to drawing a sample of precincts rather than the entire state). That is the reason bloggers and others noticed that exit poll results posted on CNN and other news sites changed overnight. It was not a conspiracy, just standard practice.
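A rough sketch of that final step shows why the posted subgroup numbers move. It assumes the simplest possible adjustment: one weight per candidate, chosen so the weighted sample reproduces the official result. The function names, the two-candidate setup and all figures are hypothetical; the actual NEP weighting is not documented here.

```python
def reweight_to_result(sample_counts, actual_shares):
    """Weight per respondent, by candidate, so weighted shares match the vote."""
    n = sum(sample_counts.values())
    return {c: actual_shares[c] / (sample_counts[c] / n) for c in sample_counts}


def weighted_crosstab(cells, weights):
    """cells maps (subgroup, candidate) to a respondent count.

    Returns each candidate's weighted share within every subgroup.
    """
    totals = {}
    for (group, cand), count in cells.items():
        totals.setdefault(group, {})[cand] = count * weights[cand]
    return {
        group: {c: v / sum(by_cand.values()) for c, v in by_cand.items()}
        for group, by_cand in totals.items()
    }


# Hypothetical sample of 1,000: Kerry leads 51-49 in the raw interviews,
# but the official count (also hypothetical) is Bush 52, Kerry 48.
sample_counts = {"Kerry": 510, "Bush": 490}
actual_shares = {"Kerry": 0.48, "Bush": 0.52}
cells = {
    ("women", "Kerry"): 290, ("women", "Bush"): 230,
    ("men", "Kerry"): 220, ("men", "Bush"): 260,
}

weights = reweight_to_result(sample_counts, actual_shares)
tab = weighted_crosstab(cells, weights)
print(round(tab["women"]["Kerry"], 3), round(tab["men"]["Kerry"], 3))
```

Every subgroup shifts toward Bush once the weights are applied, so anyone comparing a screenshot taken before the adjustment with the page afterward would see different cross-tabulations drawn from the same interviews.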
So, given this procedure, what can we say about what seemed to go so wrong?
With respect to the second mission, correctly calling outcomes, the exit polls did well. "No wrong projections [of winners] were made; the projections were spot on," said Joe Lenski of Edison Research (the company that conducted the NEP exit poll along with Mitofsky International) to the Washington Post’s Richard Morin.
True enough, but what about all those mid-day numbers everyone saw on the Internet? The official answer from people like Lenski mirrored what you heard from me on Tuesday, that mid-day numbers are less reliable and only reflect the views of those who have voted so far. Actually, they went a bit further. "The leaking of this information without any sophisticated understanding or analysis," said Lenski, "[made] it look inaccurate." They were "about as accurate as they usually are," wrote Plissner, adding "the problem was that…the exit polls were being seen by thousands of people who didn’t know how to read them….like any sophisticated weapon, they are dangerous in the hands of the untrained."
Is that fair?
I went back and looked at the numbers that Jack Shafer posted on Slate at 12:15 p.m. Pacific Time (3:15 Eastern). He posted results for 10 states, but most attention focused on Ohio and Florida, which both showed Kerry one percentage point ahead of George Bush. Of course, both Kerry "leads" were well within sampling error and, given the smaller mid-day sampling, also within an acceptable range of the actual result.
But there is something else interesting about these results: Kerry’s standing against Bush in all ten states surpassed what he received on election night. At 4:28 Pacific Time (7:38 Eastern), Shafer posted more recent numbers for an even larger list of states, 16 in all, plus the national result (51% Kerry, 48% Bush). Again, the same pattern: Kerry’s performance on the partial exit polls surpassed his ultimate performance nationally and in 15 of 16 states. So whatever was happening, it was not just the random variation due to sampling error. If you don’t believe me, try flipping a coin and see how often you can get heads to come up 16 of 17 times.
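For anyone who would rather compute than flip, here is a short sketch of the coin-flip arithmetic. It assumes that, if the errors were pure sampling noise, each of the 17 results would be equally likely to miss in either direction and would do so independently of the others. That is a simplification, and the function name is mine.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Chance that at least 16 of 17 fair coin flips come up heads.
p16 = prob_at_least(16, 17)
print(f"{p16:.6f}")  # 0.000137
```

That works out to 18 chances in 131,072, or roughly 1 in 7,300. The independence assumption is exactly what a shared bias would violate, which is the point: a pattern this lopsided suggests something systematic rather than chance.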
The NEP officials seem to concede there was some Democratic bias in the early numbers. An Associated Press story from yesterday said:
The NEP had enough concerns that its early exit polls were skewing too heavily toward Kerry that it held a conference call with news organizations mid-afternoon urging caution in how that information was used. Early polls in New Hampshire, Pennsylvania, Minnesota and Connecticut were then showing a heavier Kerry vote than anticipated.
Pollsters anticipate a post-mortem to find out why that happened. Some possibilities: Democrats were more eager to speak to pollsters than Republicans, or Kerry supporters tended to go to the polls earlier in the day than Bush voters.
The same AP story reported that after the initial release showing Kerry ahead by three points, "as the day wore on, later waves of exit polling showed the race tightening." You can see that pattern in Jack Shafer’s numbers for Ohio, Florida and Pennsylvania. However, did the final surveys just before the polls closed continue to show a consistent Kerry bias? I cannot answer that question, although it would be easy enough if we had the final results shared with the networks just before the polls closed in each state. I am sure there are reporters among readers of this blog who saw the data releases just before the polls closed. Perhaps someone can email me and set me straight.
The issue is not whether the decision desks at the networks paid any attention to the small Kerry leads in the early Ohio and Florida numbers, but whether news organizations relied on them in planning coverage and discussing the race in the late afternoon. It was not just bloggers. Very serious reporters from very serious media outlets jumped to the conclusion that Kerry was running the table, just like all those "unsophisticated" bloggers.
And then there is the issue of why the networks and NEP gave no consideration to the virtual certainty that these numbers would make their way into the public domain. It ought to be obvious by now that giving exit polls to 500 or so reporters, editors and producers — all of whom have phones and computers — is essentially the same as putting them in the public domain. It was not exactly a surprise that the leaked exit polls would be all over the Internet, yet they had no strategy to help the consumers of leaked numbers understand what they were looking at.
In their post-mortems, the networks need to consider that far more Americans consumed raw exit polls in their partial, dirty, unreliable state than will ever examine the final cross-tabulations now available. Too many came away from the experience convinced that exit polls are biased and unreliable. Confidence in the final numbers, the ones we all rely on to understand the election, has been seriously shaken. We simply cannot blame that on the bloggers.
If partial exit poll data is "dangerous in the hands of the untrained," and we choose to leave it lying around where the "unsophisticated" will play with it, doesn’t it make sense to at least publish a warning label?
However, did the final surveys just before the polls closed continue to show a consistent Kerry bias?
Yes, they did. At least, I was watching Fox, and the numbers they had at 9 PM EST were still very pro-Kerry. Further, they claimed that the reason they were being so slow to call some states for Bush (despite 10+ point leads for Bush with 15%+ precincts in) was that the exit poll numbers they had were pro-Kerry.
Perhaps they still didn’t have access to the latest numbers, but I doubt that.
I’ve got a better suggestion: end “purpose 3”. No one gets the poll data for a state until the polls close.
Ok, I’ll be nice. They can have the data 1 minute before the polls close, which gives them time to decide if they want to call the state as soon as the polls are closed.
They have no real need for the data before the polls are closed. Keeping it secret from everyone (as in, the tabulators use a local computer to send the information to a central computer that is programmed not to release any results until the appropriate time) means no one will try to game it, so it makes it more likely the data will be valid, too.
I have a better idea–if the results are within the margin of error, then no one gets them until they have been weighted against the actual vote count.
It will be interesting to see what the bias is attributed to.
Are we SURE that we can eliminate the possibility that some kind of electoral fraud tipped votes from the Democratic to Republican columns?
I had assumed that the intense scrutiny of this election by media and lawyers had precluded this, but the 16 of 17 pattern on exit polls raises the idea again in my mind at least.
It seems that when observers try to pinpoint election fraud in developing countries, one of the biggest red flags is a significant mismatch between the final exit polls and the actual election results.
The possibility that the large exit poll mismatch was caused by fraud cannot be discounted. The real potential for wide-scale, national fraud very much exists. This is due to the awful design failures in a great number of the current electronic voting machines: design failures that could allow a single rogue hacker to throw an entire nationwide election. The “single hacker theory,” if you will.
Those not familiar with software development likely have no idea how simple it would be to perpetrate such a fraud. Many have read about the controversy over these machines not printing paper receipts. But the truth is far worse than that, as many of the machines have no security and no internal audit trail!
Meaning anyone could re-write the data at nearly any point in the process while leaving absolutely no signature of the tampering. They could also transmit false data back to the servers which gather the data, and in some cases, re-write the data on the central vote collection servers themselves. All these frauds could be designed to require no action by the hacker on election day. All could be coded months in advance, just waiting to activate at the chosen moment.
The discussion linked to below on Slashdot.org describes how just such a fraud could be perpetrated. This is not tin-foil hat stuff. Large-scale electronic election fraud is going to happen, if it hasn’t already.
http://politics.slashdot.org/comments.pl?sid=128331&threshold=0&commentsort=3&tid=103&tid=219&mode=thread&cid=10717046
A Modest Proposal.
I hope that this will be taken as a joke instead of a slash at the blog author, whom I esteem…. :)
I think the discrepancy between the exit polls and the actual vote is suspect, and probably shows a glitch or actual fraud in the counting of the vote rather than any defect in the exit polls.
So…. It seems to follow the spirit of the times (or perhaps the Times) to consider a radical change.
Remember the debate about using statistical sampling methods to estimate the census a few years back, as opposed to ‘an actual enumeration’ of the US population?
In the same spirit I wish to make a humble proposal. That we use the exit polls instead of the actual count of the votes to determine the US President and other major races.
This would have the major advantage of not being subject to bootless debates about hanging chads, optical glitches, late-arriving absentee ballots, or recounts. It would be far more inclusive and diverse in that even people normally not entitled to vote (felons, disgruntled French tourists, illegal immigrants, and children) would be able to participate merely by failing to divulge their failure to pass scrutiny to the exit pollers. Most importantly, the networks would finally be able to call the election early without fear of contradiction by the actual ballots.
I propose this as an intermediate step to a method FAR more scientifically au courant and less expensive which we could proceed to in 20 years perhaps. Selecting the President by forecast. An accurate forecast of the vote could be drawn up taking into account all the relevant data. This model would include opinion polls (properly balanced by Party ID of course), the Incumbent Effect, and all the other demographic factors which we know and love.
Applying all these factors against a database derived from a proper census using the best statistical sampling methods, we could determine the winner of the Presidency and the membership of the Congress without having a vote at all!
C’mon, we need to make this beautiful vision a Reality! What do you think?
Thank you for explaining the difference between the final exit polls and the cross-tabulated exit polls. Can you examine their relative value more deeply? Continue the series on exit polls, so to speak.
So, let’s discount the early exit polls.
And yes we understand that the cross tabulated exit polls are very reliable, see Andrea Moro’s analysis:
http://www.econ.umn.edu/~amoro/Research/presprobs.html
So, please tell us what accounts for the differences between the final exit polls, those released right around poll closing, and the cross-tabulated exit polls.
1. What goes into adjusting the final exit polls into the cross-tabulated exit polls?
2. Do “spoiled” or “provisional” votes play any role in the difference?
Gratefully yours,
Alex
Put another way:
If I understand you correctly, Cross tabulated exit polls are adjusted by REPORTED vote counts. But there is a class of vote that is UNREPORTED, “spoiled” votes (hanging chads) and “provisional” votes.
Has there been any work to evaluate “pre-cross tabulated” exit poll data with an analysis of spoiled/provisional votes?
Gratefully yours,
Alex
I looked at the CNN site at around 8:30 ET, and they had posted their exit poll numbers as of that time. It was easy to compute the estimated national percentages – multiply the candidates’ totals by key subgroups times the weighting of each subgroup (provided on the CNN site). The national result – 48.26% for Bush. Presumably the 8:30 ET numbers were final, and the persistent Dem tilt was still there. Those numbers also showed a smaller margin for Bush among men and a larger Kerry margin among women than the numbers you will see posted now. And, although I am less sure of my memory on this point, I believe it showed more self-identified Dems voting than Rs. The current numbers show a 37-37 draw; I think the 8:30 numbers showed a 39-35 Dem edge. Again, I am positive about my other points, but a little hazy on this one. Perhaps the exit pollsters failed to account for the turnout surge in GOP areas nationwide and underweighted for the rural, pro-GOP areas until the magnitude of the surge became apparent? That would account for virtually every point noted above.
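The back-of-the-envelope calculation described above is a weighted average, and can be sketched as follows. The subgroup shares and percentages here are invented purely for illustration (they are not the figures CNN posted), and the function name is hypothetical.

```python
def national_share(subgroups):
    """Weighted average of a candidate's percentage across subgroups.

    subgroups: list of (share_of_electorate, candidate_pct) pairs.
    """
    total_weight = sum(weight for weight, _ in subgroups)
    return sum(weight * pct for weight, pct in subgroups) / total_weight


# Hypothetical example: men are 46% of respondents and give Bush 52%;
# women are 54% of respondents and give Bush 45%.
by_gender = [(0.46, 52.0), (0.54, 45.0)]
bush = national_share(by_gender)
print(round(bush, 2))  # 48.22
```

Any complete breakdown (by gender, by party, by age) should give nearly the same national figure when run through this calculation, apart from rounding in the posted percentages, which makes it a handy consistency check.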
Where do I get my hands on the cross tabs for particular states? I want to do some in-depth number crunching for Pennsylvania, Maryland, New York (mainly upstate), West Virginia, Ohio, Michigan and Kentucky. Where do I get this info in a spreadsheet or database format? And how much does it normally cost?
Thanks,
Fester
Henry,
I hope your point on the surge of GOP turnout is addressed. This does seem the only plausible argument if true. But I would have questions about it.
Why would a scientific exit poll not catch a GOP surge? Did they not have exit polling in GOP precincts? Precincts were randomly picked after all. And if I understand Mystery Pollster, final exit polls are weighted by actual precinct turnout, so I’m really not sure that a GOP surge shouldn’t have been caught.
I think this debate is really healthy, and I hope the blog creates a series on exit polling.
Experienced politico Dick Morris suspects foul play in the exit polls.
“This was no mere mistake. Exit polls cannot be as wrong across the board as they were on election night. I suspect foul play.”
http://www.thehill.com/morris/110404.aspx
The NEP had enough concerns that its early exit polls were skewing too heavily toward Kerry that it held a conference call with news organizations mid-afternoon urging caution in how that information was used.
Or to look at it another way, NEP called the networks to apologize for publishing numbers that weren’t helping the evening’s script.
Great site…. You lamented in an earlier post that now the election is over nobody will want to visit your site. Well, with the buzz about exit polling discrepancies, those of us who are skeptical of our democracy’s ability to prevent fraud will be dropping by to educate ourselves. If you want to take your posts in that direction, I assure you people will come here.
Myself, knowing a bit about the Repubs’ ability to take any action necessary to win AND the ease with which one person could rig the election, I think the possibility of fraud should be investigated by more than just disaffected lefties and bloggers with too much time on their hands. A major news org should check this out; it would be the story of the decade and would make Watergate look like check kiting.
Reading Dick Morris’ piece in “The Hill” was a hoot btw. He thinks it was the exit polling that is faulty and smells conspiracy. He thinks it was designed to make Kerry look like the winner and suppress Bush turnout. What is more plausible: the news orgs pressuring the exit pollers to weight the results toward Kerry, OR fraud?
I think I know the answer but will ask the question anyway.
Isn’t it very silly for people like Andrea Moro to say that cross tabulated exit polls were very reliable in picking the correct winner considering that “cross tabulated exit poll numbers” are “final exit poll numbers” that are adjusted to match the actual vote count?
http://www.econ.umn.edu/%7Eamoro/Research/exitpolls.html
Does the National Election Pool keep the “final exit poll numbers” where they can be accessed? (I realize they are of little use after votes have been counted, except to provide starting numbers for post-mortem analysis, as Mystery Pollster explained in this post.)
A different question for which I don’t know the answer.
If “cross tabulated exit poll numbers” have taken into account the actual vote count, why not make the adjusted numbers match the actual vote count exactly?
How did the NEP know during the day that their exit poll numbers were wrong? Based on what … other exit polls?
In the NY Times this morning there’s a very unsatisfying article about the exit poll problem which states that the NEP knows the reason for its systematic errors. But the article fails to elucidate any of said reasons. Does anyone know what NEP is telling people?
I’d also like to reiterate what several people above have said about the pro-Kerry (51%) polling numbers being posted on the major news organizations’ web sites until quite late. I can’t say for sure what numbers were up how late, but I can say that at 7:30 PM I was knocking on doors in Toledo, and that I didn’t have a chance to check any web sites (I found the numbers on CNN) until after I had returned to Ann Arbor (about an hour away) and helped put the kids to bed — so it had to be 9:00 at the very earliest.
I’m having a hard time buying the notion that the hand-waving arguments we’ve been given so far — like skewed temporal turnout (i.e., more Dems vote early) or skewed willingness to talk (Dems sought out exit pollers) — can explain the problems. My first reaction was that the NEP misjudged the turnout in rural areas, and that that led them to mis-weight their early numbers.
Mark — please help us out here?
Considering that many of the polls in Ohio were still open well past 8 PM EST on election day, how could final exit polls be released (not accounting for the actual vote count) before all the polls were closed? Was NEP taking into account people who were still waiting in line to vote? Weren’t many of these late-closing polls in urban areas? This raises a non-polling question – if people have to wait much longer in some districts (I’m guessing poorer districts) to vote, is it akin to suppressing/discouraging voting in those districts? I admit it was nice to see in some ways (reminiscent of Afghanistan).
Most of the foregoing analysis is woefully lacking in an understanding of exactly what “polling” can accomplish, and what “margin of error” actually means.
MOE refers to sampling error, and can be exemplified by the following statement: “Four heads in ten flips of a coin is within the margin of error.” In other words, few would be surprised to get 4 heads and 6 tails in 10 flips.
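The coin-flip statement can be checked with the usual textbook formula. The sketch below assumes a simple random sample and the normal approximation at the 95% level (z = 1.96), which is crude for only ten flips. The function name is illustrative.

```python
from math import sqrt


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from n observations."""
    return z * sqrt(p * (1 - p) / n)


moe_10 = margin_of_error(10)            # about 0.31, i.e. 31 points
within = abs(4 / 10 - 0.5) <= moe_10    # True: 4 heads in 10 is unremarkable
moe_600 = margin_of_error(600)          # about 0.04, i.e. 4 points
print(round(moe_10, 3), within, round(moe_600, 3))
```

For comparison, a simple random sample of about 600 yields the roughly four-point margin the post cites for a typical state. Exit polls interview clusters of voters at selected precincts, so they generally need more interviews than that to reach the same precision. None of this arithmetic says anything about non-response, which is the separate problem raised in the paragraphs that follow.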
Polling is completely different. If your sampling protocol (you have a protocol, right?) dictates that you poll the next person you see, and that person refuses to tell you anything, it is _not_ acceptable to use the next person’s information.
Unless the demographic profile of “refusers” exactly matches that of people who gladly spill their electoral secrets, your poll is highly suspect.
Any claim of a MOE in the single digits in this scenario is laughably non-scientific.
Aside: would you tell a stranger all of your ballot selections? Along with your demographic details? I know I never would.
Let’s think about sampling error in surveys. Start at your polling place. The survey researcher is standing with a clipboard at 7 am, asking exiting voters to stop and talk. Who are these voters, at 7 am? People on their way to work, probably. How many want to stop and chat after having stood in line for an hour to vote? I’ll bet none, or maybe a couple of people who have nothing else to do.
At 10 am, who comes out of the voting booth to talk? It’s probably stay-at-home moms and senior citizens. They may have a lot of time to talk to researchers. So the mid-day reports reflect the views of non-working people, and skew toward women. (at least that’s how it was at my polling place).
The late voters are once again the working folks, and perhaps since they’re on their way home they have time to talk to the researcher.
So of course the exit polls were wrong at mid-day. There’s no way the data could have accurately reflected the total population of actual voters.
Samples are just that: samples. They are selected (randomly or otherwise) in the hope of providing data that will project the tendencies of an entire population. Given that this act–voting–spanned 12 hours or so, there’s no way that an accurate sample could have been collected in less than 12 hours.
I think exit polls are terrific–when they are complete. I think exit pollsters should release the data when the polls close, and when it has been weighted to reflect the turnout. That would force news people to report on actual numbers, rather than guesses. Of course, election coverage before the polls close is as dull as dishwater even now. Imagine without exit polls….
It would be great to see some analysis of the final (re-weighted) exit polls, and what they really do indicate (and what, in fact, they may seem to indicate, but actually don’t).
This seems to be the main reason to keep exit polls, and I’m sure the pros are doing all sorts of interesting analysis. It would be great to hear about some of it.
(We’re hearing a lot about how moderates and independents actually went for Kerry; what really happened? Recall all the arguments about whether weighting by party ID was a good thing; how good a job did pollsters do with their weighting? How does this election fit in with previous such trends? Etc.)
Dear Heet,
Read the Dick Morris article again.
http://www.thehill.com/morris/110404.aspx
1. Dick calls for an investigation
2. Investigation can prove FRAUD in polls or provide evidence for faulty reported results.
1. Investigation
The first half of his article argues exactly that final exit polls are trusted, are weighted by gender, are weighted by turnout, and ARE USED TO PREVENT FRAUD IN THE COUNTING OF VOTES in places such as Mexico. All of these are serious reasons to take this discrepancy seriously.
2. Exit Poll FRAUD or Reported Vote error
Notice that all his initial arguments support the idea that an exit poll investigation traditionally reveals shortcomings in the reported vote counts. At the end, he argues that the investigation this time would reveal FRAUD in the exit polls. He doesn’t even argue that he expects to discover error in the exit polls, but that only fraud could explain the unprecedented discrepancy.
This raises serious questions about the willingness to discount exit polls as unreliable. The question is are the exit polls fraudulent. Of course, is it just me, or does Dick sound a little coy when he suspects foul play in the exit polls after having pounded the table that in the past the foul play he would have suspected was in the Reported Vote!
Morris makes extremely good points concerning the validity of exit polls as they apply to the actual vote. He even points out his personal experience in using exit polls to detect and insure against wide scale fraud in Mexican elections.
But then he totally contradicts his own experience by suggesting that in this case, the tampering had to occur in the exit polls, not the actual vote… I can’t fathom how he arrives at this judgement.
Because of electronic voting machines, rigging the actual elections would seem to be a far simpler feat.
Rigging exit polls would require either an army of shill voters visiting the randomly chosen polling places in each of the states affected, or that the Associated Press purposefully tampered with the exit poll counts. I find either of those to be beyond the realm of possibility.
But tampering with electronic voting machines that have neither paper trails nor audit trails… That is a very reasonable possibility. And just as Morris suggests in Mexico, I believe a fraud in the actual voting is a much more likely reason for the huge mismatch in the exit polls and vote.
Replies to Fester and Alex in Los Angeles,
Fester – CNN has all the exit polls by state on its site. Go to the state and click on exit polls. I have never seen state data broken down by portion of the state, but whole state data is easily available for free.
Alex – I don’t know why a scientific exit poll would not catch GOP turnout, but one reason might be that I do not think the precincts are randomly selected. I think the precincts are pre-selected to produce a cross-section that approximates the state’s demographics, and then the voters are randomly selected from those precincts. If the initial precinct selection overweighted Dem areas because historically they have comprised a larger share of the electorate, then the initial results would have been properly weighted based on history but improperly weighted based on 2004 facts. But I checked my emails again and am sure that the numbers have changed on the site re: percentages Bush took for men and women from 8:30 ET to now, and the only reason that would have been done scientifically is if they were re-weighting their results to reflect changes in turnout in different portions of the country.
Michael certainly has a point. A quick check to see if all the mismatches between exit polls and final results were in areas that used the SAME model of electronic voting machines is in order.
We all know what Diebold said in Ohio. But without knowing where the differences were, and the type of machines – if any – that were used, this is all idle speculation.
I bet, though, that you can tell one’s Presidential preference by noting where the fraud is believed to have occurred.
Exit Poll Fraud = Bushie
Vote Count Fraud = Kerryite
Good point about Morris, Alex. Why would he write such a nutty thing? Preemption or distraction? Or, maybe just a silly-assed rebuttal for conspiracy theorists. In any case, it doesn’t make any sense on its merits.
Check out https://www.blackboxvoting.org to see some people who actually thought ahead and did FOIs for tons of audit info the moment the polls closed. They’re confident they’ll find fraud if it happened. Good, oversight is lacking and the electorate is not confident in the results.
Hi Heet,
That you do not think the precincts were randomly selected is not the reason why the GOP turnout was missed; it is the reason why I ask that you read Mystery Pollster’s statement that:
“The exit pollster begins by drawing a random sampling of precincts within a state, selected so that the odds of any precinct being selected are proportionate to the number that typically vote in that precinct.”
Of course, we are both wrong, because prior turnout history does lower the odds that small rural precincts would be selected for exit polling. Good catch, Heet. :)
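The selection rule quoted above, with odds proportionate to typical turnout, can be sketched in a few lines. This is only an illustration under stated assumptions: the precinct names and turnout figures are hypothetical, and the draw is done with replacement for simplicity, whereas a real design would likely sample without replacement and stratify.

```python
import random


def selection_probabilities(typical_turnout):
    """Per-draw chance of selecting each precinct, proportional to turnout."""
    total = sum(typical_turnout.values())
    return {name: votes / total for name, votes in typical_turnout.items()}


def draw_precincts(typical_turnout, k, seed=None):
    """Draw k precincts with probability proportional to typical turnout."""
    rng = random.Random(seed)
    names = list(typical_turnout)
    weights = [typical_turnout[name] for name in names]
    return rng.choices(names, weights=weights, k=k)


# Hypothetical turnout history for three precincts.
history = {"large_urban": 2000, "suburban": 1200, "small_rural": 300}
probs = selection_probabilities(history)
print({name: round(p, 3) for name, p in probs.items()})
print(draw_precincts(history, k=5, seed=1))
```

With these made-up figures the small rural precinct is chosen on fewer than one draw in ten, so a turnout surge concentrated in such places would be represented by very few sampled precincts until the actual turnout data arrived.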
Could someone explain by what method or formula results are revised to bring them more in touch with official results, and why that is supposedly a valid technique in case of conflict, that is, why it would not be more rational to revise the official results to be more in touch with exit-poll results?