Testing Differences Between Proportions

From Displayr

Consider a study showing that 65% of 43 people aged 18 to 24 prefer Coca-Cola, compared to 41% of 39 people aged 25 to 29. If we wish to test whether the difference between these proportions is significant, we need to compute a p-value (see Formal Hypothesis Testing for a general discussion of the logic of statistical testing).

Most introductory statistics courses and textbooks present a standard formula for testing the statistical significance of a difference between proportions. This formula is rarely used in commercial survey analysis, as it makes a series of assumptions that are not consistent with real-world data sets. The rest of this page describes the standard test and then discusses other tests of proportions that are designed to overcome these limitations.

The standard test of proportions

Introductory statistics courses and textbooks present a standard test of the difference between proportions.

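Where $p_1$ and $p_2$ are the two proportions and $n_1$ and $n_2$ are the sample sizes:

$$z = \frac{p_1 - p_2}{\sqrt{p(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \qquad p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$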

z is evaluated using a standard normal distribution.
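
As a rough illustration, the test can be applied to the Coca-Cola example from the top of this page. The sketch below assumes the usual pooled form of the two-proportion z-test found in introductory textbooks and a two-sided alternative; the function name `two_proportion_z_test` is made up for this example and is not taken from any particular package.

```python
import math

def two_proportion_z_test(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    # Pooled proportion: sample-size-weighted average of the two proportions.
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    # Standard error of the difference under the null hypothesis p1 == p2.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal:
    # P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# 65% of 43 people aged 18 to 24 versus 41% of 39 people aged 25 to 29.
z, p_value = two_proportion_z_test(0.65, 43, 0.41, 39)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

With these inputs z comes out at roughly 2.2 and the two-sided p-value at about 0.03, so under the assumptions of the standard test the difference would be judged significant at the 0.05 level.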


The analysis of weighted data

The standard test makes a technical assumption known as i.i.d. (independent and identically distributed observations). When data is weighted, this assumption is violated.

The most straightforward modification of the test in this situation is to replace the sample size with the Effective Sample Size and to compute the proportions (including the pooled proportion) using the weighted sample size. This approach is adopted by most of the widely used commercial market research programs (e.g., SPSS IBM Data Collection Model programs, Uncle, WinCross, CfMC, Quantum), although sometimes with additional minor variations (e.g., Yates's correction). These programs also commonly treat the test statistic as a t-statistic, variously computing the number of degrees of freedom as the sum of the effective sample sizes minus one or minus two.
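
The modification described above might be sketched as follows. This is only an illustration under stated assumptions: the effective sample size is taken to be Kish's approximation (the squared sum of the weights divided by the sum of the squared weights), the statistic is evaluated against the standard normal rather than a t-distribution, and all function names and data are hypothetical. Individual programs differ in the details.

```python
import math

def effective_sample_size(weights):
    # Assumption: Kish's approximation, (sum of w)^2 / (sum of w^2).
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

def weighted_proportion(outcomes, weights):
    # outcomes are 0/1 indicators; returns the weighted share of 1s.
    return sum(w * y for y, w in zip(outcomes, weights)) / sum(weights)

def weighted_two_proportion_z_test(y1, w1, y2, w2):
    """Returns (z, two-sided p-value) for two weighted samples."""
    p1 = weighted_proportion(y1, w1)
    p2 = weighted_proportion(y2, w2)
    # The pooled proportion uses the weighted sample sizes (sums of weights).
    big_w1, big_w2 = sum(w1), sum(w2)
    pooled = (big_w1 * p1 + big_w2 * p2) / (big_w1 + big_w2)
    # The standard error uses effective sample sizes in place of n1 and n2.
    e1 = effective_sample_size(w1)
    e2 = effective_sample_size(w2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / e1 + 1 / e2))
    z = (p1 - p2) / se
    return z, math.erfc(abs(z) / math.sqrt(2))

# Made-up respondent-level data, purely for illustration.
y_young = [1, 1, 1, 0, 1, 0, 1, 1]
w_young = [1.0, 0.5, 2.0, 1.5, 1.0, 0.8, 1.2, 1.0]
y_older = [0, 1, 0, 0, 1, 0, 1, 0]
w_older = [1.1, 0.9, 1.0, 2.0, 0.7, 1.3, 1.0, 1.0]
z_w, p_w = weighted_two_proportion_z_test(y_young, w_young, y_older, w_older)
print(f"z = {z_w:.2f}, p = {p_w:.3f}")
```

When every weight equals 1, the effective sample size equals the actual sample size and the sketch reduces to the standard test. Unequal weights shrink the effective sample size, which widens the standard error and makes the test more conservative.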

A more rigorous approach is to use a special-purpose algorithm designed for comparing proportions.[1]



  1. Wand, Jonathan (2012, conditionally accepted). “Credible Comparisons Using Interpersonally Incomparable Data”. American Journal of Political Science, Table 2.