Another ‘Phantom Swing’? Investigating Differential Nonresponse in 2018

Nonresponse

In an era of single-digit response rates, many Americans distrust polls and some say they are inherently skewed. The truth is more complicated. Pollsters have long studied the way people respond to surveys and worked to prevent the errors that nonresponse can cause, mostly with success, but there is growing evidence of very rare “phantom swings” driven by nonresponse. This article reviews an analysis I presented this year at the annual conference of the American Association for Public Opinion Research (AAPOR) of one such episode during the 2018 campaign.

Do lower response rates make polls less accurate? There is little evidence they do. Pew Research finds “little relationship between response rates and accuracy” once their surveys have been weighted to match population demographics. On average, pre-election polls continue to be as accurate (and, sometimes, as inaccurate) as they have always been, even after 2016.

But pollsters and consumers of their data rightly fear the circumstances when the assumptions behind survey weighting break down. The worry? Those who choose to participate in a survey might have different views than those who do not, especially if that difference extends across all of the demographic subgroups used for weighting. The complex-sounding term for this type of polling error is “differential nonresponse bias.” While it is difficult to detect and arguably rare, we have seen evidence of brief episodes of this phenomenon occurring in recent elections.

In both 2012 and 2016, pollsters who conducted “panel” surveys that recontacted previously interviewed respondents found evidence of “phantom swings” in voter preferences caused by changes in the composition of their samples. During the 2012 campaign, Democrats were more likely than Republicans to participate in panel surveys conducted just after the first presidential debate, in which President Barack Obama had fared poorly against challenger Mitt Romney. During the 2016 campaign, Donald Trump’s supporters were less likely than Hillary Clinton’s supporters to participate in panel reinterviews conducted just after the release of the Access Hollywood video.

During the 2018 campaign, something similar happened in SurveyMonkey’s polling following the congressional hearing in which Christine Blasey Ford aired accusations of sexual assault against Supreme Court nominee Brett Kavanaugh.

How We Collected the Data

Over a span of nearly two years, from February 2017 to November 2018, SurveyMonkey conducted ongoing weekly tracking of President Donald Trump’s approval rating. Our typical weekly sample was unusually large compared to other polls, usually over 10,000 interviews, reaching a total of roughly 1.3 million respondents.

Respondents were selected from the more than two million people who take the thousands of surveys fielded every day on the SurveyMonkey platform. After completing one of those surveys, a randomly selected subset of respondents saw an invitation to share their opinion about where “you stand on current events.”

During 2017 and 2018, every respondent who clicked through started with the same two questions: a closed-ended question about the issue that mattered most to them (chosen from a list of issues) and the approval question: “Do you approve or disapprove of the way Donald Trump is handling his job as president?”

Typically, 70% to 75% of those who answered one of these questions continued and completed the full survey (which also included topical questions that varied from week to week). We reported results from those who completed the survey, after weighting the data to match the demographic composition of the U.S. (more details on SurveyMonkey’s methodology here).
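To make the weighting step concrete, here is a minimal sketch of the general idea behind demographic weighting – iterative proportional fitting, or “raking” – written in Python. This is not SurveyMonkey’s production procedure; the variables, categories and population targets are illustrative assumptions only.

```python
# Minimal raking sketch: adjust weights so each variable's weighted shares
# match its population targets. Columns and targets here are hypothetical.
import pandas as pd

def rake(df, targets, n_iter=25):
    """Iterative proportional fitting over one-way margins."""
    df = df.copy()
    df["weight"] = 1.0
    for _ in range(n_iter):
        for var, shares in targets.items():
            current = df.groupby(var)["weight"].sum() / df["weight"].sum()
            factors = {cat: shares[cat] / current[cat] for cat in shares}
            df["weight"] *= df[var].map(factors)
    return df

# Hypothetical respondent records and census-style targets.
sample = pd.DataFrame({
    "gender":    ["F", "F", "M", "M", "F", "M"],
    "education": ["no_college", "college", "no_college",
                  "college", "college", "no_college"],
})
targets = {
    "gender":    {"F": 0.52, "M": 0.48},
    "education": {"no_college": 0.65, "college": 0.35},
}
weighted = rake(sample, targets)
print(weighted.groupby("gender")["weight"].sum() / weighted["weight"].sum())
```

The key assumption, as discussed below, is that respondents and nonrespondents within each weighting cell hold similar views; raking cannot repair differences that cut across every cell.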

Given that we always started with the Trump approval question, the completion data served as a critical tool to evaluate what happened just after the Kavanaugh/Blasey-Ford hearings in late September 2018. 

Trend in Trump Approval

After a decline during the first months of his administration in 2017, Trump’s job approval varied within a narrow range (39% to 46%) between May 2017 and September 2018 in SurveyMonkey tracking. It was even more stable between March and September 2018, when our trend line was consistent with an average rating of 44.2%, plus or minus our typical modeled error estimate of 1.5%.

But in our first weekly roll-up following the Kavanaugh/Blasey-Ford hearing on September 27 – fielded from September 27 to October 3 – Trump’s approval jumped three percentage points, hitting 47% for the first time in our tracking since the earliest days of his administration. It remained at 47% the next week, then faded. Trump’s approval rating for the last two weeks before the election – 44% and 45% – was consistent with the prior average.

Two things stand out about the jump in Trump approval after the Kavanaugh hearing. First, given the very large weekly samples, the post-hearing increase is highly statistically significant. It did not occur by chance alone. Second, the three-point jump, from 44% to 47%, is modest as polling shifts go. It seems big in contrast to the eerie consistency we had measured in Trump’s rating in the preceding months, but had we sampled only a thousand Americans each week, as most pollsters do, instead of ten thousand plus, the shift might have been lost in the sampling noise.
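A back-of-the-envelope calculation illustrates both points. The sketch below applies a standard two-proportion z-test to a 44%-to-47% shift, using rounded, assumed sample sizes of 10,000 and 1,000 per weekly wave rather than our actual weekly counts.

```python
# Is a 44% -> 47% shift distinguishable from sampling noise at n ~ 10,000
# per week, and would it be at the n ~ 1,000 typical of national polls?
from math import sqrt

def two_prop_z(p1, n1, p2, n2):
    """z statistic for the difference between two independent proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se

for n in (10_000, 1_000):
    z = two_prop_z(0.44, n, 0.47, n)
    print(f"n={n:>6,} per wave: z = {z:.2f}")

# n=10,000 per wave: z ~ 4.3  (well beyond the conventional 1.96 threshold)
# n= 1,000 per wave: z ~ 1.3  (indistinguishable from ordinary noise)
```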

That said, other pollsters that produced regular tracking of Trump approval reported similar, though more muted, bumps, as evidenced in the trend lines reported by FiveThirtyEight and HuffPost Pollster.

SurveyMonkey’s polling also showed a nearly identical shift in our tracking of the “generic” U.S. House vote question (which we asked of much smaller samples, mostly on a weekly basis). The 2–3 point shift was just enough to move from a clear lead for the Democrats – within range of what they needed to win the House – to a dead heat with the Republicans.

While the polling averages for the generic House vote did not show a similar bump [FOOTNOTE TK?], a post-Kavanaugh/Ford bump was in evidence in the Upshot/Siena polls in competitive House races, various polls in competitive U.S. Senate races and an NPR/Marist poll showing greater Republican enthusiasm nationally. In other words, to the extent that other polls picked up the same bump, it helped drive some of the media narrative around the campaign in early October.

Real or Phantom? 

Was the shift real? Three pieces of evidence strongly suggest that it was mostly an artifact of changing sample composition. 

The first clue was that responses to the party identification question showed a shift nearly identical to the one we had seen in Trump approval. Between July and September, 38% of our samples identified with or leaned toward the Republicans, while 42% identified with or leaned toward the Democrats. In the two weeks following the Kavanaugh/Blasey-Ford hearing, the Republican percentage jumped three points (to 41%) and the Democratic percentage fell by two (to 40%).

At the same time, Trump’s approval showed virtually no change when tabulated by party.

[TREND IN APPROVAL BY PARTY]

That combination strongly suggests a shift in the kinds of people making up our samples, but since party is an attitude – the question asks, “Do you consider yourself a Republican, Democrat or independent?” – this evidence alone is not conclusive. It is at least theoretically possible that the hearings changed both views on Trump and Americans’ partisan attachments.

The second piece of evidence comes from looking at the demographics of our samples before we apply survey weights [FOOTNOTE to explain that weighting by demographics would conceal differences in the demographics of the unweighted respondent pool?]. In the immediate aftermath of the Kavanaugh/Blasey-Ford hearing, we saw more non-college women and more Americans in rural zip codes.

The differences were slight (between 2 and 3 percentage points) but highly statistically significant given the massive sample sizes, and consistent with the increase we also saw for Republicans and self-described conservatives in the unweighted data.

By the end of October, our unweighted sample composition had returned to an exact match of our July-to-September samples’ representation of these demographic and partisan subgroups.

Finally, and perhaps most important, we saw a three-point jump in the completion rate among Trump approvers (from 75.3% to 78.8%) for the three weeks following the Kavanaugh/Blasey-Ford hearing, a rate that faded back to 76.1% for the two weeks before the election. The corresponding differences among Trump disapprovers were not large enough to be statistically significant.

Remember, we can tabulate these different completion rates because we asked about Trump approval at the very beginning of every survey we conducted during 2018. That said, the completion rate, alone, tells us nothing about respondents who did not click when they saw the survey invitation.
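For readers who want to see the mechanics, the sketch below shows how a completion-rate tabulation of this kind can be computed from respondent-level records. The field names and toy data are hypothetical; they simply stand in for a table with one row per person who answered the opening approval question.

```python
# Sketch of a completion-rate tabulation by week and initial approval answer.
# Fields ("week", "approval", "completed") are assumed for illustration.
import pandas as pd

records = pd.DataFrame({
    "week":      ["2018-09-17", "2018-09-17", "2018-10-01", "2018-10-01"],
    "approval":  ["approve", "disapprove", "approve", "disapprove"],
    "completed": [True, False, True, True],
})

# Share of starters who finished the full questionnaire, by group.
completion = (
    records.groupby(["week", "approval"])["completed"]
           .mean()
           .rename("completion_rate")
           .reset_index()
)
print(completion)
```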

Conclusions

SurveyMonkey’s tracking surveys in late 2018 yield evidence consistent with a brief episode of differential nonresponse bias: Trump supporters were slightly more eager to complete our surveys for the three weeks following the Kavanaugh/Blasey-Ford hearing in late September 2018. As a result, the tracking surveys showed a small, apparently phantom uptick in Trump’s approval rating, a trend that would never have appeared had the willingness to complete our surveys remained constant.

What can pollsters do to identify and mitigate this problem in the future? 

First, make greater use of classic panel surveys, like those that helped detect the phantom shifts of 2012 and 2016, that interview the same respondents before and after key political events. These polls are critical tools for monitoring change among individual voters during predictable major campaign events (such as debates, political conventions or the final weeks of campaigns).

Second, samples drawn from lists of registered voters can often include either the respondents’ actual party registration or their party primary voting history (or modeled scores largely based on these data). Since the same data are available for every voter on these lists, such samples allow for checks on nonresponse by partisanship and for defensible party weighting strategies to correct any such bias.

Third, these episodes highlight the value of weighting by some self-reported measure of partisanship when the first two methods are not feasible. Weighting by self-reported past vote, as described by Rivers and Lauderdale, is one approach that minimizes the problems of weighting on party identification.
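As a rough illustration of that third option, the sketch below adjusts respondent weights so that self-reported 2016 vote matches a chosen target distribution. The targets are assumed for the example, and a real application – including the Rivers and Lauderdale approach – involves considerably more care with recall error, nonvoters and interactions with other weighting variables than this toy shows.

```python
# Toy example of weighting to self-reported past vote: each respondent gets
# a weight equal to target share / observed share for their recalled vote.
# The target shares below are assumed for illustration only.
import pandas as pd

sample = pd.DataFrame({"past_vote": ["clinton", "trump", "clinton",
                                     "trump", "other", "clinton"]})
targets = {"clinton": 0.45, "trump": 0.43, "other": 0.12}  # assumed shares

observed = sample["past_vote"].value_counts(normalize=True)
sample["weight"] = sample["past_vote"].map(lambda v: targets[v] / observed[v])

# Weighted shares now match the targets.
print(sample.groupby("past_vote")["weight"].sum() / sample["weight"].sum())
```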

It is likely that we will see similar phantom swings in the future, as we now have evidence of momentary “differential nonresponse” at critical junctures of three of the last four national elections. The harder issue is knowing when such false trends occur.

Put another way, there is also a danger in making too much of these occasional polling misfires. It would be wrong to dismiss every new trend as another episode of “differential nonresponse,” because the three recent examples had very unusual preconditions. The first presidential debate in 2012, the release of the Access Hollywood tape in 2016 and the Kavanaugh nomination in 2018 were exceptionally high-profile events that fully dominated national news coverage for days or weeks. We should not expect ordinary political stories to have the same impact.

Mark Blumenthal

Mark Blumenthal is the principal at MysteryPollster, LLC. With decades of experience in polling using traditional and innovative online methods, he is uniquely positioned to advise survey researchers, progressive organizations and candidates, and the public at large on how to adapt to polling’s ongoing reinvention. He was previously head of election polling at SurveyMonkey, senior polling editor for The Huffington Post, co-founder of Pollster.com and a long-time campaign consultant who conducted and analyzed political polls and focus groups for Democratic Party candidates.