Out of a random sample, how many people need to answer for the result to be statistically relevant?
Weirdly, the short answer is about 1067 (that's the sample size that gives a margin of error of about plus or minus 3 points, 95% of the time). But it's a long way from being that simple.

Every published poll should tell you the sample size and the margin of error. We just ignore them because we are lazy and ill-informed.

Let me insert here that if a mathematics or polling expert weighs in, they can check my numbers. There could be outright mistakes in what I describe, but I imagine the only way to say it a lot more precisely is to do the graphs and explain it in a way that isn't intuitive for non-mathematicians. (Of course we should all understand a little algebra to consider ourselves well educated, but in practice, we don't.)

You can find books - a number of them - that address this simple question, simply because it isn't simple.

An accurate result doesn't have to assume 100% turnout. It depends on whether you want to know that "what if", or whether you want to know how the election is going to turn out in the real world. For the real world, the poll just has to make a calculated assumption about who the likely voters are. Some polls just take a sample from all registered voters. That kind of poll is obviously skewed away from the actual voters. Many times, assuming that older Republicans tend to be the most reliable voters, this means that in a poll of all registered voters the Republicans are shown a couple of percentage points or more behind where they really will be on Election Day. Lots of polls say they are polls of "likely voters". Sometimes this is just the first question on the poll: how likely are you to vote in the election next month?
In my opinion that's a terrible way to identify likely voters. Many people who will end up blowing it off know that they should say "definitely voting", because that's what good people say, when they really aren't, and they really won't.

It turns out, due to math that is way beyond my comprehension (and which I suspect is some sort of dark magic), that whether you are trying to figure out the results of an election with 20,000 voters, or one with 80,000,000 voters, you can pretty much stick with the same size of sample. Seems crazy, doesn't it? True, the more people you ask, the more accurate it is. But after a while, if you go from, say, 2000 people to 5000 people, the change in accuracy is so small (roughly from plus or minus 2.2 points to plus or minus 1.4) that it's usually a waste of money to go to that extreme.

Since the numbers are obviously not factually, predictably, omnisciently accurate, you have to start by understanding the "margin of error". This is all a lot of algebra that you once knew but have forgotten. If a poll with a 3-point margin of error shows a candidate at 65% support, that really means that there is a 95% chance that their real support is somewhere between 62 and 68%. That's not terribly accurate, but this is a sort of bell curve: there is maybe a one-in-four chance that 65% (i.e. 64.5 to 65.49) is exactly right (that's a rough figure, don't hold me to it). But the general idea is right: there's only about a 5% chance the real number is outside that 62 to 68 range, and the odds get better and better as you get closer to the reported result.

In addition to whatever other margins of error are at work, remember that some voters will lie to you.

The sample size means nothing unless your "universe" of respondents is representative of the public at large. If you went to a cosmetics trade show and surveyed 4,000 people, it wouldn't mean much because you would have a non-representative sample. If you put up a poll on your candidate's website, guess what will happen?
If you put up a poll on anybody's website, the results will be limited to whoever chooses to take that poll - so it won't reflect reality.

Firms get paid tens of thousands of dollars for polling, and they obviously can charge more if they have a good track record. In addition to choosing a representative sample, they have to write the questions that will get an accurate result.

The way the questions are worded is crucial. If you're asking about a preference among two or three different candidates, you'd certainly have to rotate the names when asking the question, or too many people will just pick the last one they heard. You have to make sure the callers don't tip their hand; a little change in inflexion will "sell" one of the answers. (And of course you can purposely warp the outcome: Do you support the felon slumlord Smith or the Nobel Prize-winning Saint Dennis Jones? Even more egregious is "Can Dan Goodguy count on your vote next Tuesday?" Most people will say yes.)

In practice, of course, nobody knows who will end up voting, as turnout varies according to any number of variables. It depends, for instance, on whether the candidate has some unique appeal to any group of voters, from women to Poles to hunters to Cornhuskers. Who else is on the ballot is a huge question. If you're polling for a Congressional candidate, you'd have to consider whether a Senate or Presidential candidate further up the ballot is actually going to be the more important turnout determinant. Once the poll is done you can often get a good idea about what sort of voter supports which candidate. With men versus women, that number is fairly accurate even though each group is only about half the size of the whole poll. If you want to know how you're doing with 18-year-old college graduate women, you may find that there's only one of those in the poll, so it (incorrectly) suggests that you will get 100% of that vote.
Unless your candidate is the Duke of Cambridge, that isn't so.

Then there are even less predictable things like the weather, what's on TV, and other big distractions that might come up. There was a NYC election held on September 11, 2001 (to cite an extreme example). Last-minute things like negative revelations on either side may depress the vote significantly.

Early voting is yet another curveball. If the dynamics of the race change at the last minute (your opponent is filmed ripping down your signs - this really happened), you have to remember that some large percentage of the voters may already have voted.

How will you reach all these voters? Many of them use only cell phones, and you can't call random cell phones. This skews the sample away from younger voters, who are more likely to be the ones without a "home" landline phone number.

1067.
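
For anyone who wants to check where 1067 comes from, here is a rough sketch in Python. It assumes the textbook setup, which is my assumption and not something every pollster necessarily uses: a simple random sample, a 95% confidence level (z = 1.96), a margin of error of plus or minus 3 points, and the worst case of a 50/50 split. The function names are made up for illustration. It only covers the sampling math, none of the likely-voter, wording, or cell phone problems above.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level (assumed, textbook value)


def sample_size(margin, p=0.5, z=Z_95):
    """Respondents needed for a given margin of error, ignoring population size."""
    return z ** 2 * p * (1 - p) / margin ** 2


def adjust_for_population(n0, population):
    """Finite population correction: shrinks n0 slightly for small electorates."""
    return n0 / (1 + (n0 - 1) / population)


def margin_of_error(n, p=0.5, z=Z_95):
    """Margin of error (as a fraction) for a simple random sample of n people."""
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    n0 = sample_size(0.03)
    print("Base sample size:", round(n0))  # about 1067

    # The size of the electorate barely changes the answer.
    for voters in (20_000, 80_000_000):
        print(voters, "voters ->", round(adjust_for_population(n0, voters)))

    # Diminishing returns: more respondents, only slightly tighter margins.
    # 533 is roughly one gender's half of a 1067-person poll.
    for n in (533, 1067, 2000, 5000):
        print(n, "respondents -> +/-", round(100 * margin_of_error(n), 1), "points")
```

Running it shows the "dark magic" part: the electorate of 20,000 needs only a few dozen fewer respondents than the one with 80,000,000, and a subgroup that is half the poll still has a margin of roughly 4 points, while a subgroup of one person has no usable margin at all.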