Today’s must read for polling junkies is undoubtedly the exchange between Warren Mitofsky and Mickey Kaus (on the "smoking gun" item I blogged on yesterday). I cannot do it justice with a quick excerpt — you should definitely read it all — but here is the text of Mitofsky’s email reply to Kaus:
The so-called smoking gun you wrote about was in the hands of every subscriber to our national election poll throughout election day. What took you and the others two months to locate it? At least a dozen news organizations have had this smoking gun since 11/2.
Second, the complex displays you ridicule, which were the source used by the leakers for the numbers that got posted by bloggers on election day, are not the tables you and others discovered. I stand by my original statement. Had you asked me I would have told you as much.
Third, if you doubt that we warned the NEP members on election day why don’t you ask one of them? Or is ridicule with your eyes closed your preferred method of sounding smart?
And lastly, if my clients were as misinformed as you seem to think how come none of them announced an incorrect winner from the 120 races we covered that day? It seems that the only ones confused were the leakers and the bloggers. I guess I should include you in that list, but I’ll bet you don’t make mistakes. We have never claimed that all the exit polls were accurate.
Then again, neither is your reporting.
warren mitofsky
Ouch. I definitely have some thoughts on this one, but unfortunately, my day job prevented me from writing more today. I’ll try to update this post later tonight. For now, read the Kaus item in full (and if any of this exchange needs additional "demystifying," please post a comment and I’ll try to clarify).
———
UPDATE: First, because at least one highly valued reader heard it differently than I intended, let me clarify what I meant above by "ouch." It was that pained feeling I had watching someone I admire – and I’m talking about Mitofsky here – do something so obviously inappropriate. It was the way I would imagine Ohio State fans must have felt 26 years ago watching their legendary coach Woody Hayes punch that receiver from Clemson.
I’ll come back to my reaction to Mitofsky’s email, but the heart of this exchange is a question I considered a few weeks ago, "were the exit polls really wrong?" Looking back, I realize that I would have been well served by a good editor on that post, because while I asked a provocative question, I never made it clear where I stood. Moreover, by putting quotation marks around the word wrong (and then using the same phrase as the title of my exit poll FAQ), one could conclude I saw nothing "wrong" at all.
The point I wanted to make then was that the exit polls were obviously wrong in some ways, not so wrong in others. Everyone, even Mitofsky, concedes that the just-before-poll-closing exit polls had an average "error" (or, to some, a "discrepancy") of roughly 2% in Kerry’s favor compared to the actual count.
Where the exit polls were right – or at least, not quite "wrong:" The errors were too small to achieve statistical significance in all but a handful of statewide polls. They were not large enough to give Kerry a lead beyond sampling error in any states that he ultimately lost, and not large enough to result in any wrong calls on election night.
[In an update, Kaus suggests a failing I overlooked: Projections in states like South Carolina might have been called earlier on actual vote returns but for exit poll errors in Kerry’s favor that implied those races would be close. As he writes: "The purpose of exit polls is obviously not simply to prevent the announcement of an ‘incorrect winner.’ It’s also to allow the earlier announcement of the correct winner"].
Where the exit polls were obviously wrong: As the Washington Post‘s Richard Morin put it in November, the errors were "just enough to create an entirely wrong impression about the direction of the race in a number of key states and nationally." And Kaus is right — it wasn’t just bloggers, but sophisticated journalists and political insiders who reached the wrong conclusion looking at those numbers on Election Night.
Of course, supporting the official network projections is only one mission, and arguably the least important. The exit poll subscribers also pay to get (a) some early indication of the outcome on Election Day so they can plan their coverage and (b) data to support analytical stories written on Election Night that explain the outcome and characterize the race among demographic subgroups. Here the exit polls obviously failed. News organizations planned coverage on the assumption that Kerry would win. Some stories based on the early evening cross-tabs apparently had to be rewritten. As John Ellis — a former analyst for both NBC and Fox News — wrote on his blog shortly after the election:
The lost productivity at places like The New York Times and The Washington Post, where literally hundreds of reporters and editors spent the equivalent of an 8-hour work day writing and preparing fiction…all of the consumers of this content have to be asking themselves: "why in the world do we pay for this?"
So who is to blame for that wrong impression? That is the central argument between Mitofsky and Kaus and others. Was it Edison/Mitofsky for how it managed and disseminated the results? Should the networks have spent more to assure better interviewing and coverage? Were they both wrong to resist disclosure of basic methodological details that might have helped reporters and editors and even bloggers better understand the limits of exit polls? Should those editors, reporters, and bloggers have known better? I tend to exempt the consumers, but otherwise, I find it difficult to place all the blame in one place, especially given how little we really know about what went wrong and why.
Having said all this, I will admit that I may have a bit of a blind spot with respect to Warren Mitofsky. He is, deservedly, a living legend in the field of survey research. In the 1960s and 1970s, working with a small group of colleagues at CBS News, he helped invent not only the exit poll, but also the CBS likely voter model and a practical methodology for random digit dial (RDD) telephone surveys that remains in use to this day. He also spearheaded creation of the disclosure standards that explain the ubiquity of the "margin of error" in news stories about polling.
Of course, he also has a notoriously thin skin about criticism. In this regard, unfortunately, the email to Kaus speaks for itself.
I am willing to cut Mitofsky some slack — at least until I know more — about the nuts and bolts of why the exit polls were off. However, I tend to agree with his critics in one respect: The lack of transparency about basic methodology, the instinct to deny obvious problems and then blame the bloggers, and his habit of lashing out in anger at criticism are at odds with someone of Mitofsky’s well-deserved reputation and stature.
Conclusions:
– Kaus is a jerk. (Given the amount of traffic he sends towards this site, I don’t expect this sentiment to be replicated outside comments…)
– Mitofsky is also a jerk.
– Mitofsky’s exit polls are weak.
The first and third conclusions are not anything new.
I’d expect that the weaknesses of Mitofsky’s results are directly tied to his budget. If his clients want better accuracy, I’d expect they’ll have to pony up more dollars.
But low budget or not, given that Mitofsky has had a decades-long monopoly on exit polls, and given that his results are inevitably weak, if I were the client, I’d think it was time to give someone new a shot.
Petey is a jerk! (But an esteemed jerk.)
Mitofsky is “someone new”, even though he goes back a long way in the EP biz. Mitofsky was uninvolved in the VNS era and the 2000 meltdown. (I don’t recall if he was implicated in the abortive 2002 effort to build “Son of VNS”.)
“Mitofsky is “someone new”, even though he goes back a long way in the EP biz. Mitofsky was uninvolved in the VNS era and the 2000 meltdown.”
And perhaps I’m an uninformed jerk as well…
I was under the impression that Mitofsky has been the guy in charge of ALL American national exit polling since he invented the genre over 20 years ago.
I’ll call on the non-jerk Mark Blumenthal to arbitrate this one.
Or the exit polls were RIGHT, Kerry won, but Bush stole another election through voter intimidation and fraud.
Petey, RonK:
You’re both right (though I’ll stay out of the “jerk” exchange): Mitofsky was not formally part of VNS in the last few election cycles. He did help set up the first “network pool” exit poll, “Voter Research and Surveys” which later evolved into VNS. And in 2000 (perhaps earlier, I’m not sure) he and his current partner Joe Lenski served as exit poll/Election Night consultants for CBS and CNN. The leadership at VNS consisted of his principal deputies. See the bio at the Edison/Mitofsky site:
http://www.exit-poll.net/election-night/aboutmitofsky.html
By the way, welcome back Petey…
“You’re both right (though I’ll stay out of the “jerk” exchange)”
It’s actually a lovefest between RonK and me. We bonded in the trenches as the Deanie hordes threatened to break through the lines…
I think there are a couple of real problems with exit polls as they are done in this country, and they are related. One is that the pollsters don’t make a real attempt to sample each state in such a way as to accurately measure the demographics. The second is that they go a long way to hide as many details as possible, and to make the process as opaque as they can imagine. I suppose this is intended to forestall criticism. Good criticism would either improve the polling methodology, or it would make the customers realize what a pathetic piece of crap they are really purchasing.
All this aside, it is my understanding that they use the “Zogby method” of adjusting the demographics of their unweighted sample to either a prior demographic model, or they use the final results from the election to adjust the demographics. (Hence, if you look at the final exit polling numbers, magically they are equal to the final vote percentages.)
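For the curious, here is a minimal sketch of what that kind of reweighting looks like mechanically. All of the figures, and the simple one-variable adjustment, are made up for illustration; the actual NEP weighting procedure is more elaborate and not public:

```python
# Minimal sketch of reweighting an exit poll to match an official tally.
# All figures are hypothetical; the real NEP procedure is more complex.

# Unweighted exit poll: candidate by gender, with respondent counts.
poll = [
    {"candidate": "Kerry", "gender": "F", "n": 280},
    {"candidate": "Kerry", "gender": "M", "n": 230},
    {"candidate": "Bush",  "gender": "F", "n": 220},
    {"candidate": "Bush",  "gender": "M", "n": 270},
]
total = sum(row["n"] for row in poll)  # 1,000 respondents

# Hypothetical official tally: Bush 51%, Kerry 48% (rest third party).
official = {"Kerry": 0.48, "Bush": 0.51}

# Candidate shares in the raw poll (here Kerry 51%, Bush 49%).
raw = {c: sum(r["n"] for r in poll if r["candidate"] == c) / total
       for c in official}

# Weight every respondent so candidate shares match the official result.
for row in poll:
    row["weight"] = official[row["candidate"]] / raw[row["candidate"]]

# The candidate numbers now "magically" equal the tally, and the
# demographic profile shifts along with them.
wtotal = sum(r["n"] * r["weight"] for r in poll)
female = sum(r["n"] * r["weight"] for r in poll if r["gender"] == "F")
print(f"Weighted female share: {female / wtotal:.1%}")  # 49.7%, was 50.0%
```

Note that once this step is applied, the poll can no longer tell you who won; it only redistributes the official tally across subgroups.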
I’ve heard Andrew Kohut of the Pew Research Center discuss the exit poll methodology. It seemed to me in listening to him that he could barely keep from holding his nose while talking about the methodology being used (though he admitted that he didn’t know all of the details due to the lack of transparency).
“… or they use the final results from the election to adjust the demographics”
That would be just … wrong! I don’t see how that could possibly be justified.
Anybody?
Could somebody please explain to me why we need exit polls at all? To me, their results are too shaky and questionable to be worth the huge problems they have caused in the recent elections. When wrong results get out, as just happened, they shake people’s faith in the election results – a far worse consequence, if you ask me, than the wasted time of a few journalists who wrote the wrong stories on Election Night.

When election results are in question, network commentators pretend they can’t tell us what the exit polls show, while winking and mugging through the cameras to make sure we understand the real message. And when polls are correct, as far as they go, but the results get out too early, as happened a couple of election cycles ago, the results of the election itself are compromised when those in Western time zones are discouraged from going to the polls at all.

Why do we need this confusing hassle? Why can’t the media just wait for the actual election results like all the rest of us, and write their stories after the polls close? Seems to me this is what they ought to be doing anyway, since the early results, even when weighted, are so patently unreliable that anybody who leaks them or relies on them is derided as foolish. Why is this waste of time and money considered such a good idea?
Thor’s Hammer writes: “‘… or they use the final results from the election to adjust the demographics.’ That would be just … wrong! I don’t see how that could possibly be justified.”
I’ve made the same point, and asked the same question. The only answer I’ve got back is the rather Stalinist one that the tallies are, ipso facto, correct. Not being a Stalinist, it’s hard for me to swallow that, but evidently other people aren’t similarly burdened.
Beatrix writes: “When wrong results get out, as just happened, they shake people’s faith in the election results”
Personally, I’d rather not rely on ‘faith’ when it comes to elections. I’d rather rely on something more substantial, such an an independent verification that the results are what officials claim they are.
Exit polls are one way to get that verification.
Not when the exit polls are so obviously wrong that they have to be “adjusted” (read “cooked”) to match the demographics of the actual results. I’d agree with you, Mairead, if there were any actual reason to believe that exit polls accurately reflect the actions of voters. But there’s no reason to think they do. The pollsters themselves concede that the polls do not do this when they use this last-minute “adjustment” to repair their results.
“Exit polls are one way to get that verification.”
One thing to remember is that exit polls are just that, polls. They have a margin of error associated with them. While you can use them as rough guides, you have to remember that they can be skewed one way or another due to random chance.
With that in mind, remember that this is not new for the exit polls; please read a previous post at
http://www.mysterypollster.com/main/2004/12/have_the_exit_p.html
With the money quote:
“In short, Mitofsky and Lenski have reported Democratic overstatements to some degree in every election since 1990. Moreover, all of Lenski and Mitofsky’s statements were on the record long before Election Day 2004.”
Mitofsky and Edelman’s review of the 1992 Exit polls also said that there was Democratic bias in 1984 and 1988 as well…
“The difference between that final margin and the VRS estimates (in 1992) was 1.6 percentage points. VRS consistently overstated Clinton’s lead all evening…Overstating the Democratic candidate was a problem that existed in the last two presidential elections” (Mitofsky and Edelman, 1995, pp. 91-92).
So… the “problem” has been in every election since 1984.
Were they off before 1984? Perhaps, but I’d have to go re-read the texts a bit closer.
So… the polls have been overstating the Democratic candidate’s proportion at least since 1984. Does this mean there is fraud in the election tally in every election since 1984? What else could explain such systematic bias?
The issue here is that the average of states or the national exit poll has NEVER been this far off. A casual look at the Freeman data tells you that, while for ~40 of the 50 states they were within the margin of error, the bias of the polls was pretty solidly toward Kerry. And the national popular vote poll was WAY WAY off.
What can explain this? It could be either: 1) sampling error; 2) non-sampling error; or 3) an inaccurate tally.
Given that it is highly improbable (don’t believe the probability calcs you’ve heard from Simon/Baiman and Freeman, but it is still highly improbable) that the discrepancies could have occurred by chance alone, I suspect it’s a combination of #1 and #2. The tally may be inaccurate as well, but how can we tell this from the exit poll? I don’t think you can.
Beatrix,
You are mistaken: the exit polls were not skewed due to random chance, they were skewed due to bad methodology. Exit polls can in fact be done extremely accurately. In fact, BYU did an exit poll in Utah that was extremely accurate.
Hey Mark,
Do you know what plans, if any, there are to improve the exit poll methodology for future elections? I have personally been concerned that I haven’t heard any suggestions from pollsters on improving the exit polls.
I don’t think I said random chance caused the bad results–instead, I questioned the methodology that has the pollsters tinkering with their results to match the elections. Frankly, I have no idea what causes bad results in exit polls; I am no statistician, just a voter who’s watched the messes caused by exit polls in the last three presidential elections in a row. No matter what causes the bad results, the polls seem to be generating them. My actual question was why the polls are accepted as a necessity in the first place. And I’m still wondering.
Beatrix:
The exit polls are the best way to get a sense of the voter profile. Read this blog’s Exit Poll FAQ for the type of data unique to exit polls.
The problem is that while valuable for voter profile data, it is true that exit poll vote counts probably should not be trusted with current methodologies. Of course, we should ask how accurate the voter profiles themselves are in cases of such large discrepancies. This doesn’t mean the vote tallies are super accurate or worthy of your faith, IMO. It is not either/or, both can be wrong, and believing vote tallies in close elections without independent verification is not my cup of tea.
BUT don’t forget that the exit polls CAN be done accurately, as Mairead pointed out. And there may be other ways to audit or otherwise independently verify close election results. Sadly, I think the motivation to do so on a national scale, or even duplicate nationally BYU’s exit poll methodology does not yet exist.
“You are mistaken: the exit polls were not skewed due to random chance, they were skewed due to bad methodology. Exit polls can in fact be done extremely accurately. In fact, BYU did an exit poll in Utah that was extremely accurate.”
Actually, they could be skewed due to random chance AND bad methods (AND vote fraud). Likewise the BYU poll could have “nailed” the election result based on skewed random chance AND bad methods.
The BYU exit poll had some margin of error. I imagine it couldn’t have been better than +/-2%.
That means that 95 of 100 times, the poll will fall within this range – this is due to random chance alone. Therefore, assuming a PERFECT poll methodology (and a perfect vote count), the poll could be +/-2% from the election tally.
Suppose the poll underestimated Bush’s percent by 2% – is this “accurate”? Statistically, yes. According to the methods, it’s the best that could be done.
A little tougher to measure is the effect of the methods. E.g., what if sampling error alone yielded a 2% skew toward Bush, but then there was something wrong with the methods (either coverage error, differential non-response, poor training of pollsters, coding error, etc.) that skewed the result BACK 2% to the center.
The result would be an exit poll that “nailed” the election tally, but could we then say that the exit poll was “extremely accurate”?
The problem is, statistically, we don’t know if the BYU exit poll was “extremely accurate” or if some combination of random sampling error and non-sampling error can explain the “extremely accurate” outcome.
But consider that the exit poll for every presidential election since 1984 (that we know of) has had Democratic bias. That is like saying you flip a fair coin 6 times and get all 6 tails (not considering the significance of the discrepancy here – only the odds that it falls on one side of the distribution or another). Can random chance alone explain this? Sure.
But this is likely a simplistic way to look at this question. Meaning, analysis of the state-by-state or precinct-by-precinct variance has probably led Mitofsky and Edelman to conclude that 100% of the 1992 discrepancy could not be explained by sampling error alone.
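A toy simulation makes the offsetting-error point concrete. Everything here is hypothetical – the true share, the 2-point methodological skew, the sample size – and has nothing to do with the actual BYU design:

```python
import random

random.seed(1)
TRUE_SHARE = 0.715  # hypothetical true Bush share in the state
BIAS = -0.02        # hypothetical systematic skew from bad methods
N = 2400            # interviews per poll
TRIALS = 2000       # simulated polls

def one_poll():
    # Random sampling error, then the systematic bias on top.
    hits = sum(random.random() < TRUE_SHARE for _ in range(N))
    return hits / N + BIAS

# How often does a *biased* poll still land within half a point of truth?
close = sum(abs(one_poll() - TRUE_SHARE) < 0.005 for _ in range(TRIALS))
print(f"{close / TRIALS:.1%} of biased polls still look dead on")  # ~5%
```

So a poll with a built-in 2-point skew will still occasionally hit the tally dead on, purely by luck – which is why "nailing" one election is weak evidence of a sound method.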
Rick writes: “But consider that the exit poll for every presidential election since 1984 (that we know of) has had Democratic bias.”
How do we know that? Because it differs from the tallies. How do we know the tallies are correct? We don’t!
Therefore, we also don’t know that there was any ‘Democratic bias’.
All we *know* is that something in the system is broken somewhere. Everything else is either a pious hope or a tinfoil-hat theory, depending on the state of one’s liver.
To have that be the limit of our knowledge is *not* good enough in a nation that purports to be a democracy. This should be the political equivalent of a ten-alarm fire. That it’s not is a betrayal of democracy on many levels, beginning with those who spin the situation with bland, reassuring words that aren’t supported by the evidence.
Beatrix writes: ‘Not when the exit polls are so obviously wrong that they have to be “adjusted” (read “cooked”) to match the demographics of the actual results. I’d agree with you, Mairead, if there were any actual reason to believe that exit polls accurately reflect the actions of voters. But there’s no reason to think they do. The pollsters themselves concede that the polls do not do this when they use this last-minute “adjustment” to repair their results.’
It’s worse than that, tho, Beatrix–they’re ‘repairing’ their numbers using tally numbers they have no good reason to believe are correct. The exit polls might be much closer to the truth than the tallies–they might well be *reducing* the accuracy when they ‘repair’ the exit numbers. And other evidence suggests that that’s exactly what’s been happening.
Our problem is that assumptions are being pushed that are very servicable to the few in power, but harmful to us–the majority of people whose nation it is. And there are a lot of professionals who, whether innocently or maliciously, are colluding in that and betraying the scientific principles they have an ethical obligation to uphold.
Rick,
First off, the margin of error for the Utah poll was pretty small, .2% or something like that if I recall, because they used HUGE samples. And they got a result pretty close to the money.
Second, the national exit poll also has some small margin of error, like .2%, because it involves a huge sample… but it differed by much more than that… there is no doubt that skew in the exit poll due to random chance was a small and insignificant percentage of the total skew.
I’m not saying they need to do a Utah-size sample exit poll in every state… that would be hugely expensive… I’m saying they need to use methods which remove all the systematic error… so that when they look at the national exit poll, it will be right on the money.
Mairead wrote:
“It’s worse than that, tho, Beatrix–they’re ‘repairing’ their numbers”
Mairead:
They may be repairing their numbers, but not to trick you. The “repairing” is not nefarious and hidden, it is a standard procedure, we all know about it, and Mark discusses it here:
http://www.mysterypollster.com/main/2004/11/the_difference_.html
Also, Mark is not claiming that reweighting exit poll data to match the vote tally in any way proves the vote tally. The claim is that reweighting serves other purposes.
Unfortunately, the NEP does not release the unweighted data, so that confuses many people and makes things seem more sinister than they are.
I point this out to you, because I otherwise agree with the principles in your posts. Modern, national election standards are sorely needed in this country, I agree.
Brian, you wrote:
“First off, the margin of error for the Utah poll was pretty small, .2% or something like that if I recall, because they used HUGE samples. And they got a result pretty close to the money.”
I was just guessing about the CI. BTW – to get +/- .2%, they would have had to have sampled over 125,000 voters assuming SRS (at n this high, the clustering effect is virtually non-existent). Did they? [(s.e. = CI/CL; se = .0002/1.96 = .0010204.) AND (s.e.=1.96*SQRT(.25/n); .0010204^2=1.96*(.25/n); .000001/1.96=.25/n; n=.25/.0000005; n=125,000)] Unless I did the math too quickly, I get 125,000 for +/-.2% at 95% CL. +/-1%, 95% CL is around 16,000 if I’m not mistaken (haven’t calc’d it).
So, if you know the sample size, you can calculate the margin of error using the above formulas (again, assuming SRS; if you know the reported CI, you can calculate an estimate of the design effect!).
“Second, the national exit poll also has some small margin of error, like .2%, because it involves a huge sample… but it differed by much more than that… there is no doubt that skew in the exit poll due to random chance was a small and insignificant percentage of the total skew.”
Go to the NEP web site. The 2004 national exit had a CI of +/-1%. If you assume an SRS, it was +/-.8%. This shows that at around 13,000 interviews, the design effect of the 2004 US exit polls virtually evaporated. But for the Ukrainian exits of similar sample size, the design effect was MUCH larger (I’ve done some analysis of this if you care to read it – it was linked to by The Daou Report).
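For anyone who wants to try that back-of-the-envelope themselves, here is the arithmetic in a few lines, using the figures cited above (treat the numbers as rough illustration; the NEP’s own design-effect math is not published in detail):

```python
from math import sqrt

def srs_moe(n, p=0.5, z=1.96):
    """95% margin of error for a simple random sample of size n."""
    return z * sqrt(p * (1 - p) / n)

n = 13_000          # approximate 2004 national exit poll sample
reported = 0.01     # NEP's stated +/-1% confidence interval
srs = srs_moe(n)    # about +/-0.86% under pure SRS

# Design effect implied by the gap between reported and SRS intervals.
deff = (reported / srs) ** 2
print(f"SRS MOE: +/-{srs:.2%}, implied design effect: {deff:.2f}")
```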
You are correct, the Utah margin of error was like 1% I believe; they claim to have sampled 9,000 voters.
But I guess even when the error is 1%, most of the time your result will be closer than 1% to the true result, which is why they were so close.
Petey’s the only guy I ever saw get troll-rated for an apology (over some obscure errant cite in a Deaniac pyrotechnics exhibition).
There’s always a tide running on Election Day that doesn’t show up in the tide tables.
IOW, “I blame Society”. (H. Simpson, “Plow King” … but originally from Bongo in a Life in Hell episode, IIRC)
Some interesting numbers:
Bay Village – 13,710 registered voters / 18,663 ballots cast
Beachwood – 9,943 registered voters / 13,939 ballots cast
Bedford – 9,942 registered voters / 14,465 ballots cast
Bedford Heights – 8,142 registered voters / 13,512 ballots cast
Brooklyn – 8,016 registered voters / 12,303 ballots cast
Brooklyn Heights – 1,144 registered voters / 1,869 ballots cast
Chagrin Falls Village – 3,557 registered voters / 4,860 ballots cast
Cuyahoga Heights – 570 registered voters / 1,382 ballots cast
Fairview Park – 13,342 registered voters / 18,472 ballots cast
Highland Hills Village – 760 registered voters / 8,822 ballots cast
Independence – 5,735 registered voters / 6,226 ballots cast
Mayfield Village – 2,764 registered voters / 3,145 ballots cast
Middleburg Heights – 12,173 registered voters / 14,854 ballots cast
Moreland Hills Village – 2,990 registered voters / 4,616 ballots cast
North Olmstead – 25,794 registered voters / 25,887 ballots cast
Olmstead Falls – 6,538 registered voters / 7,328 ballots cast
Pepper Pike – 5,131 registered voters / 6,479 ballots cast
Rocky River – 16,600 registered voters / 20,070 ballots cast
Solon (WD6) – 2,292 registered voters / 4,300 ballots cast
South Euclid – 16,902 registered voters / 16,917 ballots cast
Strongsville (WD3) – 7,806 registered voters / 12,108 ballots cast
University Heights – 10,072 registered voters / 11,982 ballots cast
Valley View Village – 1,787 registered voters / 3,409 ballots cast
Warrensville Heights – 10,562 registered voters / 15,039 ballots cast
Woodmere Village – 558 registered voters / 8,854 ballots cast
Bedford (CSD) – 22,777 registered voters / 27,856 ballots cast
Independence (LSD) – 5,735 registered voters / 6,226 ballots cast
Orange (CSD) – 11,640 registered voters / 22,931 ballots cast
Warrensville (CSD) – 12,218 registered voters / 15,822 ballots cast
ca. 88K votes.
Alex in LA writes: “Also, Mark is not claiming that reweighting exit poll data to match the vote tally in any way proves the vote tally. The claim is that reweighting serves other purposes.”
But unless the numbers are known to be good, the only purpose they can *possibly* serve is a propagandistic, soporific one. It’s total junk science to use unverified numbers as though they’re known to be good, and there’s never any legitimate reason to do junk science.
(That should read ‘…is a propagandistic (and in this case soporific) one.’)
Brian, you wrote:
“But I guess even when the error is 1% most of the time your result will be closer then 1% to the true result which is why they were so close.”
True, that’s the nature of a normal distribution! But, statistically, we say that if it’s within the margin, the “true” result could be off by the margin unless we have other reasons to explain the error.
The problem is that we rarely, if ever, know the “true” population mean, and we never take the sample 100 times to get the “mean of means.” So that is why I say that if at 95% confidence it shows +/-X% and you actually hit the “population mean” (the election tally in this instance), that doesn’t necessarily mean your poll was perfect. Know what I’m saying?
Of course, the pollsters will brag about how great their polls are. Take Zogby about his 1996 prediction… (I think it was 1996 that he got super close in a pre-election poll right?)
“+/-1%, 95% CL is around 16,000 if I’m not mistaken (haven’t calc’d it).”
Sorry, that’s at 99% CL.
Okay – I did screw something up. Memo to Rick and rest of MP readers: don’t try calculating sample size on a pad and paper when you can’t remember the correct formula; especially when you have access to this calculator:
http://www.surveysystem.com/sscalc.htm
Assuming a simple random sample, a sample size of 9,000 would give you about +/-1% at 95%CL. So, was there a design effect for the Utah exit? I bet it was small – 9,000 is a large sample size for such a small state.
Also, I was way off on my +/-0.2 sample size. It should be 240,100 at 95%CL.
Mairead, can you please provide a source?
Also, I wrote: “I bet it was small – 9,000 is a large sample size for such a small state.”
Wrong again…uhhh…if the population is over 100,000, it doesn’t really matter what the sample size is relative to the total population.
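To spare the next person the pad-and-paper trouble, the corrected arithmetic fits in a few lines (standard SRS formulas at 95% confidence; Python purely for illustration):

```python
from math import ceil, sqrt

Z95 = 1.96  # z-score for a 95% confidence level

def moe(n, p=0.5):
    """95% margin of error for a simple random sample of size n."""
    return Z95 * sqrt(p * (1 - p) / n)

def needed_n(margin, p=0.5):
    """Sample size needed for a given 95% margin of error."""
    return ceil((Z95 / margin) ** 2 * p * (1 - p))

print(f"n=9,000  gives +/-{moe(9_000):.2%}")   # about +/-1.03%
print(f"+/-1%   needs n={needed_n(0.01):,}")   # 9,604
print(f"+/-0.2% needs n={needed_n(0.002):,}")  # 240,100, as corrected above
```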
Only an obsessed nerd would know, but it wasn’t a Clemson receiver…it was a defensive lineman who intercepted a pass and was tackled out of bounds right in front of the Ohio State bench.
Hi Mairead, you might want to check this out,
http://www.ohio.com/mld/ohio/news/10143328.htm?1c
I think this has been known for a while. You can also check out this web site for the actual number of votes cast in each precint in Cuyahoga county,
http://boe.cuyahogacounty.us/BOE/results/EL45.txt
Mairead:
If your purpose is not to audit an election but to try to improve your voter profile, then there is a scientific justification to argue:
*Assuming* the vote tally is more accurate than the exit poll, we can improve our voter profile by reweighting, etc, etc..
Your charge of propaganda is actually very reasonable, and I agree with it, because it is clear the NEP AND THE NETWORKS did not explain the reweighting to the public, have hidden the unweighted data, and have not even acknowledged the public’s legitimate concerns.
So, in practice the reweighting has played an unfortunate, and who knows, maybe even deliberate, propagandistic role, but that does not discredit the validity of the method for its intended purpose.
That is how I understood Mark’s post explaining the different data types. But let me know how you read it.
Rereading Mairead’s post I realize I haven’t answered his charge that since the vote tally is not known to be good it can’t be used for any scientific purpose. Not sure how to respond to that.
So let me just state that the reweighting procedure is a hypothetical procedure that, if a vote tally is ever shown to be good, would have provided better voter profile data than the exit poll alone. Maybe it can even be considered a hypothesis that is testable.
That is not junk science, even if it turns out to be wrong. It just seems you are arguing against the vote tally and throwing the reweighting procedure out with the bath water, so to speak. I don’t think that is scientifically necessary.
Alex in LA writes “So let me just state that the reweighting procedure is a hypothetical procedure that, if a vote tally is ever shown to be good, would have provided better voter profile data than the exit poll alone. Maybe it can even be considered a hypothesis that is testable.
That is not junk science, even if it turns out to be wrong.”
I can’t imagine how you can conclude it’s not junk science, Alex. Imagine some physician who prescribes without knowing what the chemical does. He says ‘if the medication turns out not to be poison, then that’s better than if I hadn’t prescribed.’ You’d call that good science? I’d call it malpractice!
(Just for the record, ‘Mairead’ is usually rendered in English as ‘Margaret’ 🙂
Mike Tocci writes: “you might want to check this out,http://www.ohio.com/mld/ohio/news/10143328.htm?1c
I think this has been known for a while.”
Oooops, thanks Mike. [busily scrapes egg from face]
Has everyone seen this?
http://www.votersunite.org/info/SnohomishElectionFraudInvestigation.pdf
A brief quote:
“Paper ballots were used for absentee ballots as well as election day provisional ballots (which together were 67.6% of the entire vote) while Sequoia touch screens recorded the remaining 32.4% of all votes on Election Day. After all paper ballots have been counted, Gregoire leads in Snohomish county by 97,044 to 95,228 for a lead of 1,816 votes. However, with only a remaining 32.4% being touch screen ballots, Rossi picks up almost five full percentage points, and wins Snohomish County’s electronic vote 50,400 to 42,143, a net difference of 8,257 touch screen votes, for a net victory for Rossi in Snohomish County of over 6,441 votes.
In winning Snohomish County, Rossi became the only Republican since 1992 to beat a competitive Democrat in Snohomish County.7 For example, in the most recent hot race in Washington state, Democrat Maria Cantwell ousted incumbent Republican Senator Slade Gorton by a narrow margin statewide of only a couple thousand votes.
However, in Snohomish County, Cantwell won easily, taking 132,148 votes to Gorton’s 91,265 votes, or approximately 59% of the vote in both absentee and election day voting.”
Mairead – Interesting precinct numbers. If I follow your reasoning, you feel that the exit polls may be accurate, and that it may be the actual vote tallies which are wrong. You are supporting this by, in part, saying that at least ~88 thousand extra votes were registered in those precincts in Cuyahoga county.
Given that the vote tallies were in Bush’s favor, any “irregularities” such as you imagine would have to be by Republicans.
There are a couple of major problems with your suggestion however.
The first is that the county is Democratic. So how are Republicans going to commit massive voter fraud? Those in charge of the elections in the county will be Democrats!
Second, from the Cuyahoga website furnished by Mike Tocci, Bush only managed 221,000 votes in the county compared to Kerry’s 448,000. Subtract 88,000+ from Bush’s already meager total? That seems farfetched. And let me note that Voinovich, the Republican Senatorial candidate, actually managed to eke out a slight win in Cuyahoga.
I don’t believe there was any significant and/or organized fraud committed. However, if there was any, the Democrats are far, far likelier to have been the culprits.
The real smoking gun which everyone is overlooking in the exit polls is the slanted structure of the questions, designed to produce a false impression that as many voters as possible were motivated to vote primarily by ‘moral values’. For more on this aspect of the issue take a look at http://www.elitistpig.com
Dave
Hi Margaret,
Looks like we’ll have to agree to disagree.
I don’t think it is junk science to perform additional manipulations on data, even if those manipulations turn out to be based on data that is later shown to be bad. If it is not known to be bad at the time, you would be within your scientific rights, so to speak.
But I think your concern is not with violations of the scientific method. So I repeat, the reweighting process is not intended to say, “upon a second look, our exit polls show Bush should have won, after all.” The reweighting is an attempt to improve the quality of the voter profile data by using the “presumably” more accurate vote tally.
I think we all agree with you that networks are wrong to not explain this to the public and to hide the “real” exit polls, which show Kerry ahead.
I hope our dialogue has been useful to you. More useful would probably be Mark’s post I referred to earlier.
The fact is, I don’t understand your assertions about the Freeman paper. Some calculations are in order when you make an assertion about significance. Take the Ohio vote (many states may fall within sampling error). The question to ask is: assuming the final tally – i.e., the Bush win – is correct, what is the likelihood that the percentage favoring Kerry would be found in a sample of the size given by the number of respondents? The variance of the sample, given binomial proportions, would be .49*.51/N = .49*.51/1963 = 1.273*10^-4, with a standard error of .011. On the other hand, the difference between the election proportion and the sample proportion is .036.

The ratio calculated is 3.19 standard errors, with a P value of .0008. A highly significant result. If instead we assume Bush won but that the true proportion was immeasurably different from .5, then the variance is 1.273*10^-4, the standard error is .0113, the difference of proportions is .021, and the ratio calculated is 1.86, which is significant single-tailed at the 3% level. Notice that here I very charitably assume the final tally is not an accurate proportion. Mark, you need to explain your assertions, not just make them. If we were serious about elections (the US never has been), the discrepancies between exit polls and vote data would need to be explained. The argument I have seen made, that exit polls have frequently been beyond the margins, also needs to be explained; for example, it is by no means clear whether there have been many cooked presidential elections in the past – people simply have not cared to investigate. By cooked I don’t necessarily mean fraud; incompetence and intimidation are enough to do the job.

Statistical reliability should be built into vote counting. Given what we now know about so-called election practices in the US, the count is dubious, aside from reassuring ourselves that if the election is at all close, it could equally well have gone the other way. As Jesse Helms used to say in another context, our elections befit “a third rate water Buffalo type of country.”
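Spelling my Ohio arithmetic out so anyone can check it (assumptions as stated: a simple random sample of 1,963 respondents, no design effect, and the official tally treated as the true proportion):

```python
from math import sqrt
from statistics import NormalDist

n = 1963      # Ohio exit poll respondents
p = 0.49      # Kerry's share in the official tally, roughly
diff = 0.036  # exit poll proportion minus tally proportion

se = sqrt(p * (1 - p) / n)            # binomial standard error, ~0.0113
z = diff / se                         # ~3.19 standard errors
p_one_tail = 1 - NormalDist().cdf(z)  # ~0.0007, the ~.0008 cited above
print(f"se={se:.4f}, z={z:.2f}, one-tailed p={p_one_tail:.4f}")
```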
Michael Cohen, I think you are failing to account for the design effect. Also, how do you justify the single tail?
Freeman’s statistics are based on the assumption that there is no bias in the exit poll (sampling error is the only effect on the poll) and that the election result is accurate. He then determines that the OH result (as well as PA and FL) is significant at 95% CL (single-tail).
Given the assumptions, we are concerned with the significance/non-significance question. If there is a significant result, then we reject the null. If it’s not significant, however, we do not reject the null. This finding of non-significance can be a Type II error, but when our threshold is 95% CL, we must refuse to reject the null, even if it is close.
Okay, my point?
Given the assumptions used by Freeman (fair election and perfect exit poll), the question is: was there a significant difference between the exit poll and the election result (i.e., was the difference outside the margin of error)? By definition, a confidence interval is two-tailed (hence +/-X%).
By asking the question “did Kerry’s proportion significantly exceed the election result?”, Freeman injects a hypothesis into his null hypothesis that violates his underlying assumption of no fraud and a sound poll. He is failing to acknowledge that, given a fair election result and a perfect poll (no bias), the odds that the discrepancy could have been significant in the other direction are exactly the same. By using a single tail, he is suggesting something in violation of his assumptions.
Simon and Baiman do this as well.
Another problem that both Freeman and Simon/Baiman have: they apply the confidence interval to the exit poll, and not the election result.
When you have an accurate vote count, you know the population parameter. The confidence interval of a poll is associated with the mean of samples, which means that if you take the same poll 100 times and then take the average of those 100 polls, you will get a mean. This mean has the confidence interval, not a single poll from the 100.
The only time you apply the confidence interval to a single poll is when both the true mean and the mean of samples is not known. Assuming no bias in the poll or fraud in the election result, the true mean is statistically the same thing as the mean of samples. Therefore it is absolutely inappropriate to distort the data and analysis by applying the confidence interval to the exit poll data given their assumptions. Their presentation is illogical, biased, and distorting. This is simple to correct; it’s only a matter of presentation. The use of the single-tail however, has implications for their findings.
I’ve done some preliminary analysis. Given a two-tail test, use of the NEP rough guides for the confidence interval (not the 1996 1.3 factor for design effect), and depending on the rounding procedure, the OH, FL, and PA discrepancies can be significant, or not significant. The data are too fuzzy to make this determination.
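For concreteness, here is a sketch of the two-tailed version with the 1996-style 1.3 inflation factor applied to the Ohio figures Michael posted. The point is not the particular p values (the inputs are rounded and the true design effect is unknown) but how much the inflation factor moves them:

```python
from math import sqrt
from statistics import NormalDist

def two_tail_p(diff, p, n, inflation=1.0):
    """Two-tailed p value for a poll-vs-tally gap, with the SRS
    standard error inflated to account for cluster sampling."""
    se = sqrt(p * (1 - p) / n) * inflation
    z = abs(diff) / se
    return 2 * (1 - NormalDist().cdf(z))

# Ohio figures from the comment above, with and without inflation.
print(f"SRS:      p = {two_tail_p(0.036, 0.49, 1963):.4f}")       # ~0.0014
print(f"1.3 deff: p = {two_tail_p(0.036, 0.49, 1963, 1.3):.4f}")  # ~0.0141
```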
We all know the exit polls are way off. The Simon/Baiman paper executive summary included the following bullets:
* Evidence does not support hypotheses that the discrepancy was produced by problems with the exit poll.
* Widespread breakdown in the fairness of the voting process and accuracy of the vote count are the most likely explanations for the discrepancy.
* In an accurate count of a free and fair election, the strong likelihood is that Kerry would have been the winner of the popular vote.
I think they put forth statistical analysis (terrible and misleading analysis) to give these bullets some sort of credibility with those ignorant of stats. These bullets certainly may be accurate; but their analysis certainly does not support them.
Let me correct myself, Freeman applied the confidence interval correctly (see graph on page 12 of his paper). Simon/Baiman do not.
(I’ll just point out to Alex that my *name* is Mairead, and suggest that his confusion between my name and its conventional rendition into English could possibly have some significance for his similar confusion between knowledge and assumption generally. One can, indeed, be ‘within one’s scientific rights’ to act on what later turns out to be bad data—but only if one has good reason at the time of acting to think it’s good data. The history of voting tells us that there is no such reason in the case of the ballot box, when highly partisan forces are in charge.)
And that’s that.
Thanks, Rick, for your explanation. I agree with some of this but not the rest. First of all, there is no reason to adopt the Mitofsky/Edison guidelines for significance; I adopt the standard 5% level used universally in scientific work. Mitofsky may have to be in the position of never producing a false positive for the networks. He is obviously more concerned about this than about making a correct prediction, i.e., seeing an effect. The four standard deviations he recommends as a confidence interval are far in excess of standard scientific practice. That is in fact a p value of .00003, and would rule out almost all scientific work.
Let me be clear: there are many states that were significant, the way I calculated, at the 5% level, and there were many that weren’t. I did them all on a spreadsheet, which I am happy to post. I don’t see your assertions about the single tail being biased at all. The natural thing, in my view, if you think the Kerry exit poll number is too large relative to the popular vote, is to ask whether it falls outside a one-tailed 95% confidence interval about the election result, which you regard as true. Presumably the same would hold if Bush and Kerry were interchanged, i.e., you could ask whether the Bush total was too large or too small as well. I don’t think this makes any difference in the case of Ohio, even with the variance inflation: I need 1.96 standard deviations two-tailed, and 3.16/1.3 is about 2.4, still significant.
Asking whether the mean of a fixed sample of a certain size agrees with a fixed population mean from a true election – which should be a constant, regardless of the sample size – is exactly what a confidence interval is for. It is in principle no different from asking whether a given spectral line is consistent with the known emission spectrum of hydrogen.
By the way, looking at the Freeman data carefully made me realize some of what he did was a red herring. Kerry won Pennsylvania on Freeman’s account, and Bush won Florida on his account; the only state where there was a discrepancy between the outcome and the maximum likelihood prediction from the poll was Ohio. Ohio was so ridden with irregularities, some of which I personally witnessed, and its election was so poorly conducted, that it should have been thrown out whatever the exit polls said.
I did not apply the 30% variance for the cluster design because I had no solid statistical reference for the correction.
I was standing by a pollster. My intuition was that the voters in the precinct were independent, not correlated, but who knows. The people on line rarely knew each other, a condition I would think necessary for intra-cluster correlation. I would like to see where this variance correction comes from. I could derive something, but if you have a reference it would be great.
When I did the analysis on all the other states, looking for anything remotely resembling statistical significance between the poll results and the outcome proportion, I discovered that all the errors were in favor of Bush. Colorado, Minnesota, Ohio, Pennsylvania, and Florida were significant with what is, in your view, my liberal way of doing things. A simple sign test will show that the likelihood of this happening is one part in 64, or significant at about the 1.5% level.
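The sign test here is just binomial arithmetic; for six independent errors all falling the same, pre-specified way when either side is equally likely:

```python
from math import comb

# P(all 6 of 6 errors on one pre-specified side) under the null.
n = k = 6
p = comb(n, k) * 0.5 ** n  # (1/2)^6 = 1/64
print(f"P = {p:.4f} (about 1.6%)")
```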
It’s really good to have this conversation, because everything else I have seen is assertion, save the Simon, Freeman, and Berkeley papers. There is lots of assertion out there that something is wrong, but without fixing assumptions it’s impossible to know what people are really talking about.
–mike
Mairead, I do apologize for addressing you incorrectly a second time. I am not too familiar with Irish names (and you may not be Irish!)
” It’s total junk science to use unverified numbers as though they’re known to be good,”
With this I agree, in that the NEP and networks do act as if the reweighting to match election results is the “correct” exit poll, when it obviously is no more valid than the pure exit poll for purposes of knowing who won the election. This may not have been clear in my posts.
I wish our exchange had been clearer, and for that I also apologize. My enthusiasm overreached my abilities in attempting to meet the challenge you and Beatrix and others posted on this blog on the reweighting issue.
Good luck.