NYT: We Gave Four Good Pollsters the Same Raw Data. They Had Four Different Results
You’ve heard of the “margin of error” in polling. Just about every article on a new poll dutifully notes that the margin of error due to sampling is plus or minus three or four percentage points.
But in truth, the “margin of sampling error” – basically, the chance that polling different people would have produced a different result – doesn't even come close to capturing the potential for error in surveys.
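For a sense of where the "plus or minus three or four points" comes from, here is a minimal sketch of the textbook margin of sampling error. It assumes a simple random sample and a 95 percent confidence level; the sample sizes and the function name are illustrative choices, not figures from any particular poll.

```python
import math

def margin_of_sampling_error(n, p=0.5, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion.

    Assumes a simple random sample of size n. p=0.5 is the worst case,
    and z=1.96 is the normal critical value for 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Illustrative sample sizes (assumptions, not taken from a specific survey).
for n in (600, 1000):
    moe_points = 100 * margin_of_sampling_error(n)
    print(f"n={n}: +/- {moe_points:.1f} percentage points")
```

Under these assumptions, samples of roughly 600 to 1,000 respondents give margins of about four and three points. The formula covers only the luck of who was sampled, not any of the judgment calls described in this article.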
Polling results rely as much on the judgments of pollsters as on the science of survey methodology. Two good pollsters, both looking at the same underlying data, could come up with two very different results.
How so? Because pollsters make a series of decisions when designing their survey, from determining likely voters to adjusting their respondents to match the demographics of the electorate. These decisions are hard. They usually take place behind the scenes, and they can make a huge difference.
To illustrate this, we decided to conduct a little experiment. On Monday, in partnership with Siena College, the Upshot published a poll of 867 likely Florida voters. Our poll showed Hillary Clinton leading Donald J. Trump by one percentage point.
We shared our raw data with four well-respected pollsters and asked them to estimate the result of the poll themselves.
Here’s who joined our experiment:
• Charles Franklin, of the Marquette Law School Poll, a highly regarded public poll in Wisconsin.
• Patrick Ruffini, of Echelon Insights, a Republican data and polling firm.
• Margie Omero, Robert Green and Adam Rosenblatt, of Penn Schoen Berland Research, a Democratic polling and research firm that conducted surveys for Mrs. Clinton in 2008.
• Sam Corbett-Davies, Andrew Gelman and David Rothschild, of Stanford University, Columbia University and Microsoft Research. They’re at the forefront of using statistical modeling in survey research.
Here’s what they found:
|Pollster|Clinton|Trump|Margin (percentage points)|
|---|---|---|---|
|Charles Franklin|42%|39%|Clinton +3|
|Patrick Ruffini|39%|38%|Clinton +1|
|Omero, Green, Rosenblatt (Penn Schoen Berland Research)|42%|38%|Clinton +4|
|Corbett-Davies, Gelman, Rothschild (Stanford University/Columbia University/Microsoft Research)|40%|41%|Trump +1|
|NYT Upshot/Siena College|41%|40%|Clinton +1|
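The size of the disagreement can be read straight off the table. The short sketch below uses only the rounded percentages reported above and computes each Clinton-minus-Trump margin and the gap between the most and least favorable estimates. The variable names are our own.

```python
# (Clinton %, Trump %) for each estimate, copied from the table above.
estimates = {
    "Charles Franklin": (42, 39),
    "Patrick Ruffini": (39, 38),
    "Omero, Green, Rosenblatt": (42, 38),
    "Corbett-Davies, Gelman, Rothschild": (40, 41),
    "NYT Upshot/Siena College": (41, 40),
}

# Positive margin means Clinton leads; negative means Trump leads.
margins = {name: clinton - trump for name, (clinton, trump) in estimates.items()}

# Distance between the best and worst result for Clinton, in points.
spread = max(margins.values()) - min(margins.values())

for name, margin in margins.items():
    print(f"{name}: {margin:+d}")
print(f"Spread across estimates: {spread} points")
```

From Clinton +4 to Trump +1 is a five-point spread, all of it from the same set of interviews.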
How to make the sample representative?
Pollsters usually make statistical adjustments to make sure that their sample represents the population – in this case, voters in Florida. They usually do so by giving more weight to respondents from underrepresented groups. But this is not so simple.
What source? Most public pollsters try to reach every type of adult at random and adjust their survey samples to match the demographic composition of adults in the census. Most campaign pollsters take surveys from lists of registered voters and adjust their sample to match information from the voter file.
Which variables? What types of characteristics should the pollster weight by? Race, sex and age are very standard. But what about region, party registration, education or past turnout?
How? There are subtly different ways to weight a survey. One of our participants doesn’t actually weight the survey in a traditional sense, but builds a statistical model to make inferences about all registered voters (the same technique that yields our pretty dot maps).
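To see why these choices matter, consider a minimal sketch of the simplest kind of weighting, on a single variable. Everything in it is invented for illustration: the respondents, the two age groups, and both sets of target shares are assumptions, not the Upshot/Siena data, and this is not the method any of our participants used.

```python
from collections import Counter

# Hypothetical sample of 100 respondents: (age group, candidate preference).
# Younger respondents are deliberately underrepresented (20% of the sample).
respondents = (
    [("18-44", "Clinton")] * 12 + [("18-44", "Trump")] * 8
    + [("45+", "Clinton")] * 36 + [("45+", "Trump")] * 44
)

# Two hypothetical answers to "What source?" for the true age mix.
census_targets = {"18-44": 0.40, "45+": 0.60}
voter_file_targets = {"18-44": 0.30, "45+": 0.70}

def cell_weights(respondents, targets):
    """Weight for each group = target share / share of the sample."""
    counts = Counter(group for group, _ in respondents)
    n = len(respondents)
    return {group: targets[group] / (counts[group] / n) for group in counts}

def weighted_shares(respondents, weights):
    """Candidate shares after applying each respondent's group weight."""
    totals = Counter()
    for group, choice in respondents:
        totals[choice] += weights[group]
    total = sum(totals.values())
    return {choice: weight / total for choice, weight in totals.items()}

census_style = weighted_shares(respondents, cell_weights(respondents, census_targets))
voter_file_style = weighted_shares(respondents, cell_weights(respondents, voter_file_targets))

print({c: round(s, 3) for c, s in census_style.items()})
print({c: round(s, 3) for c, s in voter_file_style.items()})
```

In this made-up sample, Trump leads 52 to 48 before weighting. Weighting to the first set of targets puts Clinton ahead by roughly 51 to 49; weighting to the second leaves Trump narrowly ahead. Same interviews, different answer, and that is with only one variable in play.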
Who is a likely voter?
There are two basic ways that our participants selected likely voters:
Self-reported vote intention: Public pollsters often use the self-reported vote intention of respondents to choose who is likely to vote and who is not.
Vote history: Partisan pollsters often use voter file data on the past vote history of registered voters to decide who is likely to cast a ballot, since past turnout is a strong predictor of future turnout.
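The two approaches can be sketched as two different filters applied to the same respondents. The eight respondents below are hypothetical, as are the field names and the "voted in at least two recent elections" cutoff; real likely-voter screens are more elaborate, and this is not any participant's actual rule.

```python
from collections import Counter

# Hypothetical respondents. "says_will_vote" stands in for self-reported
# intention; "past_votes" stands in for a voter-file count of recent
# elections the person voted in.
respondents = [
    {"choice": "Clinton", "says_will_vote": True,  "past_votes": 0},
    {"choice": "Clinton", "says_will_vote": True,  "past_votes": 3},
    {"choice": "Clinton", "says_will_vote": False, "past_votes": 2},
    {"choice": "Trump",   "says_will_vote": True,  "past_votes": 3},
    {"choice": "Trump",   "says_will_vote": False, "past_votes": 3},
    {"choice": "Trump",   "says_will_vote": True,  "past_votes": 1},
    {"choice": "Clinton", "says_will_vote": True,  "past_votes": 0},
    {"choice": "Trump",   "says_will_vote": False, "past_votes": 2},
]

def by_self_report(respondent):
    """Likely voter if the respondent says they will vote."""
    return respondent["says_will_vote"]

def by_vote_history(respondent, min_votes=2):
    """Likely voter if the file shows at least min_votes recent elections."""
    return respondent["past_votes"] >= min_votes

def topline(respondents, is_likely_voter):
    """Count candidate preferences among respondents who pass the screen."""
    return Counter(r["choice"] for r in respondents if is_likely_voter(r))

print("Self-report screen:", dict(topline(respondents, by_self_report)))
print("Vote-history screen:", dict(topline(respondents, by_vote_history)))
```

In this toy example the self-report screen yields a 3-to-2 Clinton lead and the vote-history screen a 3-to-2 Trump lead, because each rule keeps a different set of five respondents.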