It’s now apparently official. Brosnan of the Scripps Howard News Service reports and includes a hint as to how the "revision" was accomplished:
Initial network exit polls on Election Day overestimated President Bush’s support among Hispanic voters, an NBC official said Thursday.
Revised figures show Bush received 40 percent of the Hispanic vote, not 44 percent, said Ana Maria Arumi, elections manager for NBC News. That would still be a 5-percentage point gain for Bush over Democrat John Kerry compared to the 2000 race against Al Gore…
The major television networks used pollster Warren Mitofsky to sample 250 key precincts on Election Day. Arumi said the exit poll over sampled in South Florida where Republicans are strong among Cuban-Americans.
For the revised figures the networks combined 50 state exit polls, which reflected more than 70,000 interviews, Arumi said.
To clarify (what I can) from the above:
The National Election Pool (NEP) sampled far more than 250 precincts on Election Day. That may have been a reference to the number of predominantly Latino precincts. I also assume that the reference to "oversampling" of Cuban voters in South Florida was something NEP did intentionally, and then weighted back appropriately in the previously released Florida results. This would have been a strategy to counteract the problem I described on Wednesday and assure an adequate and more accurate sampling of Florida’s Cuban voters. [Perhaps not; see below.]
The last quoted paragraph is also important: On Election Day, the National Election Pool (NEP) did a separate, stand-alone national exit poll of roughly 14,000 interviews that produced the initial estimate of Bush’s support among Hispanics at 44%. If this report is right, they have subsequently rolled together all of the interviews conducted in the various statewide exit polls (and weighted them so that the full sample is geographically correct). The larger sampling of Latino precincts would have less sampling error; thus the downward revision.
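To put a rough number on the sampling-error point, here is a minimal Python sketch. It assumes simple random sampling, which exit polls are not (they sample clusters of voters within precincts, so real margins of error are wider), and it assumes Hispanics make up the same share of both samples. The function names are mine, purely for illustration; only the 14,000 and 70,000 interview counts come from the reports above.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated
    from a simple random sample of n interviews (an assumption;
    real exit polls are cluster samples)."""
    return z * math.sqrt(p * (1.0 - p) / n)

def moe_ratio(n_small, n_large):
    """How many times wider the margin of error is in the smaller
    sample than in the larger one, for the same proportion."""
    return math.sqrt(n_large / n_small)

national_n = 14_000   # stand-alone national exit poll (from the post)
combined_n = 70_000   # combined state exit polls (from the news reports)

ratio = moe_ratio(national_n, combined_n)
print(f"Margin of error shrinks by a factor of about {ratio:.2f}")
```

Under those assumptions, any subgroup estimate from the combined state samples would carry a bit less than half the sampling error of the same estimate from the national sample alone.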
Still unclear is whether they did 70,000 interviews statewide (the number you get if you add up all the sample sizes reported on sites like CNN.com; see this MIT/Cal Tech report) or the roughly 150,000 that Warren Mitofsky reported on the News Hour. Yet another question for Shuster’s press conference.
UPDATE: Another article on this press conference by Reuters’ Alan Elsner implies that the overrepresentation in Florida was not intentional:
Looking at a larger state-by-state sample of 70,000 voters conducted Election Day, NBC elections manager Ana Maria Arumi said the original exit poll had overrepresented Hispanics in southern Florida, who were more likely to be pro-Bush than Latino voters elsewhere.
"The number now comes out at 58 percent for Kerry and 40 percent for Bush," Arumi said at a news conference.
The William C. Velasquez Institute, the organization that sponsored the forum at which Arumi spoke, also put out a press release with more details. And not surprisingly, Ruy Teixeira is all over this issue.
UPDATE II: Blogger Steve Sailer anticipated the rationale for this revision in some interesting thoughts posted in the comments section of my earlier post.
Mark wrote, and later corrected:
“I also assume that the reference to “oversampling” of Cuban voters in South Florida was something NEP did intentionally, and then weighted back appropriately in the previously released Florida results.”
Intentionally “oversampling” completely contradicts your explanation that precincts are selected randomly.
I understand you believe this didn’t happen, but couldn’t you write that “intentional oversampling” COULDN’T happen, since precincts are randomly selected?
Could you please clarify this confusing point?
Thank you!
I wonder what this means for the theory that Republicans were less likely than Democrats to participate in the exit polls?
Alex,
I’m making these numbers up: Suppose you have a fictional state with two kinds of precincts: Heavily Hispanic and all others. A random sampling of 50 precincts yields exactly 2 of the heavily Hispanic precincts. If we do 100 interviews per precinct this would yield about 200 interviews in the 2 precincts.
We decide we want more interviews in Hispanic precincts, so we “oversample” as follows: We first divide, or “stratify,” the list of precincts, and then randomly select TWO samples: one a sample of 48 non-Hispanic precincts, the second a sample of *4* Hispanic precincts (double the number we should have).
When we tabulate the results, we multiply all of the respondents in the 4 Hispanic precincts by 0.5 so that their value is “weighted back” to the appropriate value.
In reality the numbers are more complex, but the basic principle remains: “Oversampling” makes the sample no less random as long as the results are weighted back to the appropriate probability of selection.
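Here is a minimal Python sketch of that arithmetic, using the same made-up precinct counts. The candidate support figures (40% and 52%) and the function name are also invented for illustration; nothing here reflects how NEP actually tabulates.

```python
def weighted_share(strata):
    """Weighted share of support across strata.

    Each stratum is a tuple: (interviews, share_for_candidate, weight).
    """
    total = sum(n * w for n, _, w in strata)
    supporters = sum(n * w * share for n, share, w in strata)
    return supporters / total

INTERVIEWS_PER_PRECINCT = 100   # made up, as in the example above
HISPANIC_SUPPORT = 0.40         # made-up support level in Hispanic precincts
OTHER_SUPPORT = 0.52            # made-up support level everywhere else

# Proportional design: 2 Hispanic precincts and 48 others, no weighting needed.
proportional = weighted_share([
    (2 * INTERVIEWS_PER_PRECINCT, HISPANIC_SUPPORT, 1.0),
    (48 * INTERVIEWS_PER_PRECINCT, OTHER_SUPPORT, 1.0),
])

# Oversampled design: 4 Hispanic precincts, each respondent weighted by 0.5.
oversampled_weighted = weighted_share([
    (4 * INTERVIEWS_PER_PRECINCT, HISPANIC_SUPPORT, 0.5),
    (48 * INTERVIEWS_PER_PRECINCT, OTHER_SUPPORT, 1.0),
])

# Oversampled design with the weighting step forgotten.
oversampled_unweighted = weighted_share([
    (4 * INTERVIEWS_PER_PRECINCT, HISPANIC_SUPPORT, 1.0),
    (48 * INTERVIEWS_PER_PRECINCT, OTHER_SUPPORT, 1.0),
])

print(proportional, oversampled_weighted, oversampled_unweighted)
```

With the 0.5 weight applied, the oversampled design gives the same statewide estimate as the proportional one, but it rests on twice as many interviews in the Hispanic precincts. Skip the weight and the statewide number drifts toward the Hispanic precincts’ result.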
Make sense?
This gets more bizarre by the day. I guess the situation now is: there is a bitter election in the Ukraine. The final results don’t match the exit polls. So the international community, including the US, puts pressure on the gov’t of the Ukraine to hold a new election.
In the US Presidential election, the final results don’t match the exit polls, so we must avoid conspiracy theories and accept the results.
Post 11/3, I steeled my heart to get over the loss. My intuitive sense that it would be a 5 point victory for Kerry was wrong; the nation had spoken and I must learn to live with the results.
Now, four weeks later? I have less faith in the US electoral process than I did after the 2000 election. None of the numbers seem to add up. Is anyone going to dig into this, or does it just go down the memory hole?
Cranky
It is reassuring that these people can adjust to the facts. Even so, it seems likely that they won’t tell us exactly how their model deviated so far in the first instance that continuing revisions have to be made. This means that the cloud of delegitimization will remain over the election, which one could imagine to be the reason that the left will be raising these issues for some years.
Thank you for clearing that up.
When oversampling is done, generally, how many categories are created within each state? Is it just “Hispanic” precincts that are “oversampled,” or is data for multiple groups within states sought after in this way? Just want to get a feel for how complicated these exit polls truly get.
Does oversampling improve accuracy in type 3 prediction exit polls? Or could oversampling introduce unexpected errors, say if the categorizations were off or the “weighting back” to appropriate probabilities were off? Or maybe if black voters were overrepresented in the Hispanic precinct oversampling?
For prediction purposes, it seems oversampling introduces the *potential* for errors that a truly random sampling would have avoided. Oh well, our loss. Mitofsky was not designing a solely predictive exit poll, after all. I imagine the Ukrainian exit polls and the international AUDIT exit polls probably use straight random precinct selection because their intent is probably strictly vote prediction/verification.
As always great explanations for the public.
Does Teixeira really have credibility on anything anymore?
He writes persuasively, but he’s always wrong.
Surely Teixeira was right that the Hispanic numbers were wrong?
As for “blogger Steve Sailer,” he has an ideological reason for minimizing Bush’s Hispanic showing: see http://www.prospect.org/weblog/archives/2004/12/index.html#004960 for documentation of his, ahem, interesting views on race and, er, even more interesting ideological stablemates of Vdare. He doesn’t want the GOP to think it can win very many Hispanic votes with pro-immigration policies. For that reason, if the exit polls overestimated Bush’s Hispanic showing, Sailer probably underestimated it.
statistics don’t lie — people do — or something like that
Mystery Pollster continues to follow up on the election tracking polls (here, here, and here only for example), and it’s always good stuff. Since the election, if there’s been a polling story in the news, or a new theory/rumor floating around,…