# Mind the ‘credibility’ gap

## Colby Cosh finds out what subsets, modelling assumptions and ‘non-probability samples’ have to do with polling these days

(J Pat Carter/AP Photo)

Over the weekend, the estimable David Akin was talking U.S. politics with Ipsos’s Darrell Bricker on Twitter when he noticed an unfamiliar verbal oddity in a Reuters report on the polling firm’s recent survey of early voters.

Obama leads Romney 54 per cent to 39 per cent among voters who already have cast ballots, according to Reuters/Ipsos polling data compiled in recent weeks. The sample size of early voters is 960 people, with a credibility interval of plus or minus 3.5 percentage points.

Huh, what’s this “credibility interval” business? Sounds like a different name for the good old margin of error! But why would we need a different name for that? This question, it turns out, is the pop-top on a can of worms.
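For comparison, here is a minimal sketch of the "good old" margin of error for a sample of 960, assuming a simple random sample and the conventional 95 per cent level. The function name and the worst-case choice of p = 0.5 are illustrative assumptions, not anything taken from the Reuters/Ipsos report.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Classical margin of error for a simple random sample of size n.

    p = 0.5 is the worst case (widest interval); z = 1.96 corresponds
    to the conventional 95 per cent confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)

n_early_voters = 960  # sample size quoted in the Reuters report
moe = margin_of_error(n_early_voters)
print(f"+/- {100 * moe:.1f} percentage points")
```

Under those textbook assumptions the figure comes out near plus or minus 3.2 points, a little tighter than the 3.5 quoted above. The sketch says nothing about how the reported figure was actually calculated.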

The polling business has a problem: when most households had a single land-line telephone, it was relatively easy to sample the population cheaply and well—to estimate quantities like voter intentions in a clean, mathematically uncomplicated way, as one might draw different-coloured balls from a single urn to estimate the amounts of each colour amongst the balls on the inside. That happy state of affairs has, of course, been reduced to chaos by the cell phone.

The cell phone, increasingly, does not just divide the population into two hypothetical urns—which is basically how pollsters originally went about solving the problem. Its overall effect (including the demise of the telephone directory) has affected the math of polling in several ways, all of them constantly intensifying; declining response rates to public surveys (“Get lost, pal, you’re eating up my minutes”) are the most obvious example. Put simply, individual members of the public are no longer necessarily accessible for polite questioning by means of a single randomizable number that everybody pretty much has one of. The problem of sampling from the urn has thus become infinitely more complicated. Pollsters can no longer assume that the balls are more or less evenly distributed inside the urn, and it is getting harder and harder to reach into the urn and rummage around.

So how are they handling this obstacle? Their job, at least when it comes to pre-election polling, is becoming a lot less like drawing balls from an urn and more like flying an aircraft in zero-visibility conditions. The boffins are becoming increasingly reliant on “non-probability samples” like internet panel groups, which give only narrow pictures of biased subsets of the overall population. The good news is that they can take many such pictures and use modern computational techniques to combine them and make pretty decent population inferences. “Obama is at 90 per cent with black voters in Shelbyville; 54 per cent among auto workers; 48 per cent among California epileptics; 62 per cent with people whose surnames start with the letter Z…” Pile up enough subsets of this sort, combined with knowledge of their relative sizes and other characteristics, and you can build models which let you guess at the characteristics of the entire electorate (or, if you’re doing market research, the consumerate).
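The simplest version of "pile up subsets, weighted by their relative sizes" is a weighted average, sometimes called post-stratification. The sketch below assumes the groups do not overlap and together cover the whole electorate, which the whimsical examples above plainly do not; the shares and support figures are made up for illustration, and real models are far more elaborate.

```python
def poststratify(cells):
    """Combine subgroup estimates into one population estimate.

    cells: list of (population_share, estimated_support) pairs, where the
    shares describe non-overlapping groups that together cover everyone.
    """
    total_share = sum(share for share, _ in cells)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    # Each group's estimate counts in proportion to its share of the population.
    return sum(share * support for share, support in cells)

# Hypothetical cells: (share of electorate, support measured in that group).
cells = [
    (0.50, 0.45),
    (0.30, 0.55),
    (0.20, 0.60),
]
estimate = poststratify(cells)
print(f"Estimated overall support: {100 * estimate:.1f} per cent")
```

The point of the design is that a panel can over-represent one group badly and still yield a sensible overall figure, provided the group's true share of the population is known from some other source.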

As a matter of truth in advertising, however, pollsters have concluded that they shouldn’t report the uncertainty of these guesses by using the traditional term “margin of error.” There is an extra layer of inference involved in the new techniques: they offer what one might call a “margin of error, given that the modelling assumptions are correct.” And there’s a philosophical problem, too. The new techniques are founded on what is called a “Bayesian” basis, meaning that sample data must be combined explicitly with a prior state of knowledge to derive both estimates of particular quantities and the uncertainty surrounding them.

A classical pre-election voter survey would neither require nor benefit from ordinary knowledge of the likely range of President Obama’s vote share: such surveys start only with the purely mathematical specification that the share must definitely be somewhere between 0 per cent and 100 per cent. A Bayesian approach might start by specifying that in the real world Obama, for no other reason than that he is a major-party candidate, is overwhelmingly likely to land somewhere between 35 per cent and 65 per cent. And this range would be tightened up gradually, using Bayes’ Law, as new survey information came in.
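A textbook way to sketch this tightening is the Beta-binomial update. Everything specific below is an assumption made for illustration: the Beta(20, 20) prior (chosen because it puts roughly 95 per cent of its weight between 35 and 65 per cent), the hypothetical survey of 1,000 two-way responses, and the normal approximation used for the interval. It is not a description of any polling firm's model.

```python
import math

def beta_update(alpha, beta, successes, failures):
    """Conjugate Bayes update: add observed counts to the prior's parameters."""
    return alpha + successes, beta + failures

def beta_summary(alpha, beta):
    """Mean and approximate 95 per cent interval of a Beta(alpha, beta).

    The interval uses mean +/- 1.96 standard deviations, a normal
    approximation that is reasonable when alpha and beta are both large.
    """
    total = alpha + beta
    mean = alpha / total
    sd = math.sqrt(alpha * beta / (total ** 2 * (total + 1)))
    return mean, (mean - 1.96 * sd, mean + 1.96 * sd)

# Assumed prior: centred on 50 per cent, mostly between 35 and 65 per cent.
prior = (20.0, 20.0)
prior_mean, prior_interval = beta_summary(*prior)

# Hypothetical survey: 540 of 1,000 respondents favour the candidate.
posterior = beta_update(*prior, successes=540, failures=460)
post_mean, post_interval = beta_summary(*posterior)

print(f"prior:     {prior_mean:.3f}, interval {prior_interval}")
print(f"posterior: {post_mean:.3f}, interval {post_interval}")
```

In this toy case the prior acts like 40 imaginary respondents split evenly, so the posterior sits slightly closer to 50 per cent than the raw survey figure, and the interval shrinks from about 30 points wide to about six.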

This is probably the best way, in principle, to make intelligent election forecasts. But you can see the issues with it. Bayesianism explicitly invites some subjectivity into the art of the pollster. (Whose “priors” do we use, and why?) And in making the step from estimating the current disposition of the populace to making positive election forecasts, one has to have a method of letting the influence of old information gradually attenuate as it gets less relevant. Even nifty Bayesian techniques, by themselves, don’t solve that problem.
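One common heuristic for letting old information fade is to weight each poll by an exponentially decaying function of its age. The sketch below is only that, a heuristic bolted on from outside: the seven-day half-life and the poll figures are assumptions chosen for illustration, and choosing the half-life is itself a judgment call of exactly the kind described above.

```python
def decayed_average(polls, half_life_days=7.0):
    """Average poll results, halving a poll's weight every half_life_days.

    polls: list of (age_in_days, support) pairs; age 0 means taken today.
    """
    weights = [0.5 ** (age / half_life_days) for age, _ in polls]
    weighted_sum = sum(w * support for w, (_, support) in zip(weights, polls))
    return weighted_sum / sum(weights)

# Hypothetical polls: today, one week old, two weeks old.
polls = [(0, 0.52), (7, 0.48), (14, 0.50)]
blended = decayed_average(polls)
print(f"Time-weighted support: {100 * blended:.1f} per cent")
```

With these numbers the three polls get weights of 1, 0.5 and 0.25, so today's reading counts for more than the other two combined.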

Pollsters are trying very hard to appear as transparent and up-front about their methods as they were in the landline era. When it comes to communicating with journalists, who are by and large a gang of rampaging innumerates, I don’t really see much hope for this; polling firms may not want their methods to be some sort of mysterious “black box,” but the nuances of Bayesian multilevel modelling, even to fairly intense stat hobbyists, might as well be buried in about a mile of cognitive concrete. Our best hope is likely to be the advent of meta-analysts like (he said through tightly gritted teeth) Nate Silver, who are watching and evaluating polling agencies according to their past performance. That is, pretty much exactly as if they were “black boxes.” In the meantime, you will want to be on the lookout for that phrase “credibility interval.” As the American Association for Public Opinion Research says, it is, in effect, a “[news] consumer beware” reminder.

## Mind the ‘credibility’ gap

1. You may be giving the pollsters more credit than they deserve for “trying hard to be transparent.” You are one of the few people that I know of to mention response rates as a big problem.

2. Very interesting Cosh, I have been wondering why Americans are having such heated discussions about polls and their reliability.

Maybe soon statisticians will create a model for our political preferences – call it a ‘voter prediction’ score – and instead of releasing polls, we will focus on weekly sales of hamburgers, cotton balls, and pillows to find out who is winning/losing.

• “I have been wondering why Americans are having such heated discussions about polls and their reliability.”

A (very) roughly Bayesian approach works well here… Americans get heated about polls for the same reason that people everywhere else get heated about polls; because they show the wrong person winning. Where an election is very close, as this one is, more people get heated more often because the “wrong” guy is in the lead for more people more often.

3. NY Times ~ How Companies Learn Your Secrets:

Andrew Pole was hired by Target to use the same kinds of insights into consumers’ habits to expand Target’s sales. His assignment was to analyze all the cue-routine-reward loops among shoppers and help the company figure out how to exploit them.

The only problem is that identifying pregnant customers is harder than it sounds. Target has a baby-shower registry, and Pole started there, observing how shopping habits changed as a woman approached her due date, which women on the registry had willingly disclosed. He ran test after test, analyzing the data, and before long some useful patterns emerged. Lotions, for example.

As Pole’s computers crawled through the data, he was able to identify about 25 products that, when analyzed together, allowed him to assign each shopper a “pregnancy prediction” score. More important, he could also estimate her due date to within a small window, so Target could send coupons timed to very specific stages of her pregnancy.

http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html?pagewanted=1&_r=1&hp

4. Colby, when are you going to come to the perfectly reasonable conclusion that pre-election polls are a dog’s breakfast, and should be ignored completely? It’s a moving target.

Call it a credibility gap, or assign it a degree of fudge (I know from previous experience that pollsters don’t like this word http://www2.macleans.ca/2011/10/25/our-gerontocracy/ ).

But, what will all the talking heads and strategists talk about if there isn’t some form of polling? Sampling from an urn? Maybe a urinal is a better analogy.

5. I don’t do polls even when they call my land-line.

Not because I don’t want anyone to hear my opinions, but because I’ve been push-polled enough to not want to endure that vileness again.

If someone can figure out a way to do online polling that isn’t completely worthless (somehow…) I’d be happy to let a pollster know my views.