Zogby Troop Poll: The Random Probability Sample


On the Zogby poll of U.S. troops in Iraq, I need to make one point that was implicit in my comments on Wednesday a bit more explicit.  While much is shrouded in secrecy, one aspect of the methodology is clear from the information that John Zogby has provided on the record:  The survey did not involve a "random probability" sample of all American troops serving in Iraq.

The principle of random sampling is what makes a poll "scientific."  To meet that standard in this case, every member of the U.S. armed services in Iraq should have had some chance of being selected (or, to put it in statistical terms, the probability of selection had to be either equal or known for every member of the population).  As I wrote yesterday, the constraints Zogby faced in gaining access to troops at "undisclosed locations throughout Iraq" made random selection of those locations impossible.

It is also unclear — both from information in the public domain and from what Zogby shared with me in confidence — whether his selection procedures amounted to random probability sampling even at the undisclosed locations.  I did not press Zogby for details on that process, because under the terms of our agreement I could not report the details.  While I could speculate about his procedures, unfortunately, doing so would require disclosure of information I promised not to disclose.

However, as this example provides an opportunity to learn something about the survey process, consider how the Gallup organization went about conducting a "strict probability based" random sample of ordinary Iraqi citizens in 2004.  To grossly oversimplify the design, Gallup randomly selected 350 neighborhoods ("primary sampling units") from a list of over 116,314 in Iraq (the "sample frame"), using population statistics to make sure the probability of selection was proportional to the size of the neighborhood.  In other words, a more populous neighborhood had a proportionally greater chance of being selected.
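For readers who want to see what "probability proportional to size" means mechanically, here is a minimal sketch of one common way to do it (systematic selection from a random start). It is not a description of Gallup's actual procedure, which the summary above grossly oversimplifies. The function name, the frame of 1,000 neighborhoods, and their populations are all made up for illustration.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def pps_systematic_sample(sizes, n, rng):
    """Pick n unit indices with probability proportional to size.

    Lays the units end to end on a line whose length is the total
    population, then takes n equally spaced points from a random start.
    Assumes every size is positive and smaller than the sampling interval,
    so no unit can be hit twice.
    """
    cumulative = list(accumulate(sizes))
    interval = cumulative[-1] / n
    start = rng.random() * interval  # random start in [0, interval)
    picks = []
    for k in range(n):
        point = start + k * interval
        index = bisect_left(cumulative, point)
        picks.append(min(index, len(sizes) - 1))  # guard against float rounding
    return picks


# Hypothetical frame: 1,000 neighborhoods with invented populations.
rng = random.Random(2004)
frame_sizes = [rng.randint(50, 500) for _ in range(1000)]
selected = pps_systematic_sample(frame_sizes, 35, rng)
```

Under this scheme a neighborhood of 400 people covers twice as much of the line as a neighborhood of 200, so it is twice as likely to contain one of the selection points. The probabilities are unequal, but they are known, which is what the definition of a probability sample requires.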

Then, interviewers went to each neighborhood and compiled a list of every family living within every dwelling in that neighborhood and randomly selected families from that list. Within each selected household that agreed to participate, they took an inventory of all family members over 18 years of age and randomly selected one adult to be interviewed in a way that ensured that both genders had an equal chance of inclusion, with no one allowed to self-select into the sample.

Thus, Gallup used a standardized procedure that gave every Iraqi adult (excluding those in institutions and those in two small Kurdish "governates" – see the footnote) a chance of selection, and the selection procedure was random at every step.  That, as Gallup puts it, is a "strict, probability based sample design."  Note also how much detail Gallup was willing to disclose about their sampling methods, despite very real concerns about the safety of their interviewers.

Now consider one of the examples I mentioned on Wednesday, the survey of Katrina evacuees conducted jointly last year by the Washington Post, the Kaiser Family Foundation and the Harvard School of Public Health.  In that case, it was not feasible to use random techniques to try to survey the population of all of those who had to evacuate their homes because of Katrina.  The full population had scattered widely, and the researchers lacked a public list of all evacuees or a count of evacuees by their new geographic locations.  So instead they did a survey limited to some (but not all) of the shelters housing evacuees in the Houston area.  The selection of these locations was neither random nor representative of all evacuees, but the pollsters made no such claims.  Instead they were careful to report the results as representative only of "evacuees living in shelters" in Houston.

But review the procedures the researchers used and you will see the effort made to randomly select respondents at each location. 

For areas where the evacuees either had limited mobility or were non-mobile (for example, cot areas occupied largely by elderly or infirm evacuees, or TV lounge areas), interviewers moved through the respondent population. Specifically, interviewers were given a random number and instructed to count off this number of people before beginning the first/next interview. After an interview was completed (or a refusal obtained), interviewers would again count off using the random interval before selecting the next respondent.

For areas where evacuees were mobile — for example hallways and evacuee service areas — interviewers stayed in one particular spot throughout the interviewing period. They then counted people who passed their defined location and chose the (randomly generated) nth person to interview. This selection criteria was duplicated at the conclusion of each contact attempt, whether it was a completed interview or a refusal.
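The counting rule in the two paragraphs above can be sketched in a few lines. This is a simplified illustration, not the researchers' actual protocol: it treats the evacuees as a single ordered stream and ignores anyone who walks past while an interview is under way. The function name, the 2-to-6 range for the random interval, and the 20 passersby are all assumptions of mine.

```python
import random


def select_every_nth(people, interval):
    """Return every interval-th person from an ordered stream of people."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return [person for count, person in enumerate(people, start=1)
            if count % interval == 0]


# Hypothetical values: the interval range and the passersby are made up.
rng = random.Random(7)
interval = rng.randint(2, 6)  # the interviewer's randomly assigned number
passersby = [f"person_{i}" for i in range(1, 21)]
approached = select_every_nth(passersby, interval)
```

The point of the rule is that the count, not the interviewer, decides who gets approached. That removes the temptation to pick whoever looks friendliest or happens to be standing closest.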

So what can we say about the degree to which the Zogby survey used random probability sampling to survey U.S. troops?  Again, as I wrote earlier in the week, the method Zogby used to gain access to the undisclosed locations constrained his ability to select them.  The selection was not random, but since he will not disclose the locations, we cannot take their identity into account in evaluating the results.  The release specifies a sampling error of plus or minus 3.3% (a statistic that, given the sample size, is based on the assumption of simple random sampling), but that margin is a bit deceiving.  Plus or minus 3.3% compared to what?  All we know for certain is that the poll was not a random sample of the population of all U.S. troops in Iraq.
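For reference, the textbook margin-of-error calculation under simple random sampling looks like the sketch below. The sample size of 1,000 is hypothetical, not the Zogby poll's, and the `design_effect` parameter is my own addition to show how a clustered design would widen the margin.

```python
import math


def margin_of_error(n, p=0.5, z=1.96, design_effect=1.0):
    """Half-width of a confidence interval for a proportion.

    z=1.96 corresponds to 95% confidence. With design_effect=1.0 this is
    the simple-random-sampling formula z * sqrt(p * (1 - p) / n); a design
    effect above 1 widens the margin.
    """
    return z * math.sqrt(design_effect * p * (1 - p) / n)


# Hypothetical sample size, not the Zogby poll's.
srs_margin = margin_of_error(1000)
clustered_margin = margin_of_error(1000, design_effect=2.0)
```

Nothing in the formula knows how the respondents were chosen. It takes random selection as given, so the number it produces is only as meaningful as that assumption.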

As to the selection of respondents at those unspecified locations, we also do not know the procedures used to select respondents.  Again, I did not press Zogby on the details of those procedures in our conversation because I would not have been able to report them here.  I believe MP readers deserve more than "trust me" as an explanation.  Obviously, for me to speculate now about what Zogby might have done would require getting into the details I promised I would not reveal.

Some news organizations (like ABC News) have adopted strict standards that require the use of probability sampling and bar reporting of surveys based on intercept selection techniques in the absence of a "credible sampling frame." Others are obviously less rigorous.

In the business world, commercial market researchers sometimes use non-random sampling (including many Internet-based "panel" surveys) when rigorous probability samples are impractical or prohibitively expensive.  However, the most ethical of these market researchers do not attempt to dress up such "convenience" samples as more than they are.  Their clients pay for such projects on the assumption that the information obtained, while imperfect, is the best available.

John Zogby insists it is enough that those of us who have heard more about his survey’s methodology conclude that it was "honestly and objectively done."  I think he misses an important point. Consumers of Zogby’s Iraq troop poll data also need to understand where it fits on the continuum between strict probability-based sampling and non-random convenience sampling.  Zogby certainly believes that "security concerns" prevent further disclosure, that we do not "need to know" more.  Perhaps.  But without knowing more, it is hard to decide whether to trust the results. 

Mark Blumenthal

Mark Blumenthal is a political pollster with deep and varied experience across survey research, campaigns, and media. The original "Mystery Pollster" and co-creator of Pollster.com, he explains complex concepts, and how data informs politics and decision-making, to a multitude of audiences. He is a researcher and consultant who crafts effective questions and identifies innovative solutions to deliver results, and an award-winning political journalist who brings insights and crafts compelling narratives from chaotic data.