Sampling Error Example

January 13, 2016

Iowa Sampling Error Example

As the Iowa caucuses approach, campaign tittle-tattle begins to focus on the polls and what appears to be conflicting polling results.

The following table contains polling results for Ted Cruz, Donald Trump, and Marco Rubio in the Iowa Republican Presidential Caucus compiled by RealClearPolitics.com as of January 13, 2016. The six polls comprising the RealClearPolitics.com average for January 13 range from a lead of 4 points for Donald Trump to a lead of 4 points for Ted Cruz.

Poll	Date	Cruz	Trump	Rubio	Spread
RealClearPolitics Average	1/2-1/10	26.7	26.2	13.0	Cruz +0.5
DM Register/Bloomberg	1/7-1/10	25	22	12	Cruz +3
PPP	1/8-1/10	26	28	13	Trump +2
Quinnipiac	1/5-1/10	29	31	15	Trump +2
ARG	1/6-1/10	25	29	10	Trump +4
FOX News	1/4-1/7	27	23	15	Cruz +4
NBC/WSJ/Marist	1/2-1/7	28	24	13	Cruz +4

After looking at the polling results above, many would claim that some of the polls must be "outliers" because such a wide range between Cruz and Trump should be impossible. They would be wrong.

Sampling error accounts for the differences.
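As a rough check on that claim, the sketch below estimates how far the Cruz-Trump spread can move from sampling error alone. It assumes simple random sampling, a sample size of 500 for every poll (the actual polls vary), and the standard multinomial variance for the difference between two shares drawn from the same sample. The function name `spread_margin` is illustrative and does not come from the original page.

```python
import math

def spread_margin(p1, p2, n, z=1.96):
    """Approximate 95% margin of error, in points, for the lead p1 - p2.

    Uses the multinomial variance of a difference between two shares
    measured in the same simple random sample of size n.
    """
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return 100.0 * z * math.sqrt(variance)

# Shares from the RealClearPolitics average in the table above.
cruz, trump = 0.267, 0.262
margin = spread_margin(cruz, trump, 500)
lead = 100.0 * (cruz - trump)
print(f"Cruz lead: {lead:+.1f} points, give or take {margin:.1f}")
```

Under these assumptions the margin on the spread is roughly 6.4 points, so a Cruz lead of 0.5 in the population is consistent with individual polls showing anything from Trump +4 to Cruz +4.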

The tables below show the results from 50 random samples based on the RealClearPolitics.com average for January 13, 2016. The second column in the last row shows the average for all 50 samples. The average of all 50 samples should fall closer to the population values than a typical single sample (and that's why averages from polling aggregators like RealClearPolitics.com work).

Your browser randomly assigns a candidate to each member of a population of 125,000 based on the averages above. Your browser then creates 50 random samples of 500 and tabulates the results.

A sample size of 500 was selected to match the sample size of the DM Register/Bloomberg poll and a population of 125,000 was selected based on the approximate Iowa Republican caucus turnout. New samples are created each time this page is loaded. Refresh this page to see new sample results.
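The procedure described above can be sketched in Python. This is not the script the page runs in the browser; it is a minimal reconstruction under stated assumptions: the "Other" category is an assumed catch-all for the 34.1 percent not held by the three listed candidates, samples are drawn without replacement, and the `seed` parameter is a hypothetical addition for reproducibility.

```python
import random
from collections import Counter

# Shares from the RealClearPolitics average in the table above.
# "Other" is an assumed catch-all for the remaining 34.1 percent.
SHARES = {"Cruz": 26.7, "Trump": 26.2, "Rubio": 13.0, "Other": 34.1}
POPULATION_SIZE = 125_000   # approximate Iowa Republican caucus turnout
SAMPLE_SIZE = 500           # matches the DM Register/Bloomberg poll
NUM_SAMPLES = 50

def build_population(rng):
    """Randomly assign a candidate to each member of the population."""
    names = list(SHARES)
    weights = list(SHARES.values())
    return rng.choices(names, weights=weights, k=POPULATION_SIZE)

def draw_sample(population, rng):
    """Draw one sample without replacement and return percentages."""
    counts = Counter(rng.sample(population, SAMPLE_SIZE))
    return {name: 100.0 * counts[name] / SAMPLE_SIZE for name in SHARES}

def run(seed=None):
    """Return the 50 sample results and their candidate-by-candidate average."""
    rng = random.Random(seed)
    population = build_population(rng)
    samples = [draw_sample(population, rng) for _ in range(NUM_SAMPLES)]
    average = {
        name: sum(s[name] for s in samples) / NUM_SAMPLES for name in SHARES
    }
    return samples, average

if __name__ == "__main__":
    samples, average = run()
    for i, s in enumerate(samples, start=1):
        spread = s["Cruz"] - s["Trump"]
        print(f"Sample {i:2d}: Cruz {s['Cruz']:.1f}  Trump {s['Trump']:.1f}  "
              f"Rubio {s['Rubio']:.1f}  Cruz lead {spread:+.1f}")
    print("Average:", {k: round(v, 1) for k, v in average.items()})
```

Running it without a seed gives different sample results each time, much as refreshing the page does.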

The results below reflect sampling error alone. They do not include the additional, unknown non-sampling error caused by the practical difficulties of conducting a poll, which is present in the polls in the RealClearPolitics.com table.

Please note:

The sample results rarely reflect the actual population values. Some sample results may be close, but the population values are better represented by the 50-sample average.

The error statement for the DM Register/Bloomberg poll states:

Questions based on the subsamples of 503 likely Democratic caucus attendees or 500 likely Republican caucus attendees each have a maximum margin of error of plus or minus 4.4 percentage points. This means that if this survey were repeated using the same questions and the same methodology, 19 times out of 20, the findings would not vary from the percentages shown here by more than plus or minus 4.4 percentage points.

This statement is incorrect. The "This means" portion of the statement assumes that the survey results are the actual population values, but there is no way of knowing that, as this example of sampling error illustrates.

The sample results for Cruz and Trump reflect the ranges in the RealClearPolitics.com averages.

Sampling error cannot be avoided. Each of the 50 samples is equally likely to be selected, and therefore even a "high quality" poll with a sample size of 500 is subject to the sampling error shown in the tables and will most likely miss the population values.

The more polls in a poll average, the better, because the average should then better reflect the population.

Sampling error can be reduced by increasing sample sizes, but increasing sample sizes will not eliminate sampling error.
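The last point, and the 4.4-point figure quoted from the DM Register/Bloomberg error statement, can be illustrated with the usual margin-of-error formula for a simple random sample. The sketch below assumes a 95 percent confidence level (z = 1.96), the worst-case share of 50 percent, and no finite population correction, which is negligible when 500 are sampled from 125,000. The function name `margin_of_error` and the list of sample sizes are illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Maximum margin of error, in percentage points, for a sample of size n."""
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n)

# Larger samples shrink the margin, but only in proportion to 1/sqrt(n).
for n in (500, 1000, 2000, 8000):
    print(f"n = {n:5d}: plus or minus {margin_of_error(n):.1f} points")
```

With n = 500 this reproduces the plus or minus 4.4 points in the error statement. Quadrupling the sample to 2,000 only halves the margin, and no finite sample size brings it to zero.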
