# Statistical Testing in Crosstabs

Crosstabs uses a z-test to test the null hypothesis that two population parameters are equal. This is commonly referred to as a significance test or hypothesis test. The z-test evaluates two columns of a data table at a time, e.g. two means or two proportions. The z-test assumes the sample is randomly drawn from a normal distribution, and is appropriate for large sample sizes (N ≥ 30).

## 1: Significance Test on Weighted Data

Weighting data alters the distribution of the sample to match the population. Drastic alterations to the sample data can occur and compromise the integrity of the results. A significance test on weighted data needs to account for the distortion that the weighting process introduces.

In Crosstabs, the effective base size is used when stat testing weighted data. Using the effective base size ensures that improper statistical conclusions are not drawn from a sample that has been drastically altered.

### 1.1: Example:

For a z-test of proportions using unweighted data, the following inputs are used:

Group 1: unweighted percentage, unweighted base size

Group 2: unweighted percentage, unweighted base size

With weighted data, the following inputs are used:

Group 1: weighted percentage, effective base size

Group 2: weighted percentage, effective base size

The effective base size is a good indicator of how much the weighting procedure has altered the sample. If weighting drastically inflates the data for a subgroup of the sample, the effective base size will be lower than the unweighted base size. When weighting results in only small alterations to the sample, the effective base size will be close to the unweighted base size.

## 2: Formulas

### 2.1: Significance Testing Proportions:

Ps1 = Larger Percent

Ps2 = Smaller Percent

N1 = Base size of larger percent

N2 = Base size of smaller percent

Pu = Estimated Population Proportion (pooled across both groups)

Pu = [((N1 - 1)*Ps1) + ((N2 - 1)*Ps2)] / ((N1 - 1) + (N2 - 1))

SDS = Std. Deviation of sampling distribution

SDS = [Pu * (1 - Pu)]^(1/2) * [((N1 - 1) + (N2 - 1)) / ((N1 - 1) * (N2 - 1))]^(1/2)

Z = Test Statistic

Z = (Ps1 - Ps2) / SDS

If Z ≥ 1.96, then the Larger Percent is significantly higher than the Smaller Percent at the 95% level.

If Z ≥ 1.645, then the Larger Percent is significantly higher than the Smaller Percent at the 90% level.
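The proportions formulas above can be sketched in Python as follows. This is a minimal illustration, not the Crosstabs implementation: the function name `z_test_proportions` is hypothetical, proportions are assumed to be expressed as fractions between 0 and 1 (0.55 rather than 55), and returning 0.0 when the pooled proportion is exactly 0 or 1 is an assumption made here to avoid dividing by zero.

```python
import math


def z_test_proportions(p_a, n_a, p_b, n_b):
    """Z statistic for two proportions, following the Pu / SDS / Z formulas.

    p_a, p_b: proportions as fractions in [0, 1] (assumption of this sketch).
    n_a, n_b: base sizes (effective base sizes when the data is weighted).
    """
    if n_a <= 1 or n_b <= 1:
        raise ValueError("each base size must be greater than 1")

    # Order the inputs so that Ps1 is the larger proportion.
    if p_a >= p_b:
        ps1, n1, ps2, n2 = p_a, n_a, p_b, n_b
    else:
        ps1, n1, ps2, n2 = p_b, n_b, p_a, n_a

    d1, d2 = n1 - 1, n2 - 1

    # Pu: estimate of the population proportion pooled over both groups.
    pu = (d1 * ps1 + d2 * ps2) / (d1 + d2)

    # SDS: standard deviation of the sampling distribution.
    sds = math.sqrt(pu * (1 - pu)) * math.sqrt((d1 + d2) / (d1 * d2))
    if sds == 0:
        # Pu is 0 or 1, so both proportions are identical (assumption: report 0).
        return 0.0

    return (ps1 - ps2) / sds


z = z_test_proportions(0.55, 201, 0.45, 201)
significant_95 = z >= 1.96    # threshold for the 95% level
significant_90 = z >= 1.645   # threshold for the 90% level
```

With 55% versus 45% on base sizes of 201 each, Pu is 0.5 and SDS is 0.05, so Z is approximately 2.0 and the difference is significant at both levels.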

### 2.2: Significance Testing Means:

X1 = Larger Mean

X2 = Smaller Mean

N1 = Base size of larger mean

N2 = Base size of smaller mean

S1 = Standard Deviation of Larger Mean

S2 = Standard Deviation of Smaller Mean

S1 = [∑(X1i)^2 / (N1 - 1) - (X1bar)^2]^(1/2)

S2 = [∑(X2i)^2 / (N2 - 1) - (X2bar)^2]^(1/2)

SDS = Std. Deviation of sampling distribution

SDS = [((S1)^2 / (N1 - 1)) + ((S2)^2 / (N2 - 1))]^(1/2)

Z = (X1 - X2) / SDS

If Z ≥ 1.96, then the Larger Mean is significantly higher than the Smaller Mean at the 95% level.

If Z ≥ 1.645, then the Larger Mean is significantly higher than the Smaller Mean at the 90% level.
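A comparable sketch for the means test is shown below. The function name `z_test_means` is hypothetical, and the standard deviations are assumed to have been computed already and passed in as inputs. Raising an error when both standard deviations are zero is a choice made for this illustration, not documented Crosstabs behavior.

```python
import math


def z_test_means(x_a, s_a, n_a, x_b, s_b, n_b):
    """Z statistic for two means, following the SDS / Z formulas.

    x_a, x_b: group means.
    s_a, s_b: group standard deviations (assumed precomputed).
    n_a, n_b: base sizes (effective base sizes when the data is weighted).
    """
    if n_a <= 1 or n_b <= 1:
        raise ValueError("each base size must be greater than 1")

    # SDS: standard deviation of the sampling distribution.
    # It is symmetric in the two groups, so ordering does not matter here.
    sds = math.sqrt(s_a ** 2 / (n_a - 1) + s_b ** 2 / (n_b - 1))
    if sds == 0:
        raise ValueError("both standard deviations are zero; Z is undefined")

    # X1 is defined as the larger mean, so the difference is never negative.
    return (max(x_a, x_b) - min(x_a, x_b)) / sds


z = z_test_means(5.0, 3.0, 101, 4.0, 4.0, 101)
significant_95 = z >= 1.96    # threshold for the 95% level
significant_90 = z >= 1.645   # threshold for the 90% level
```

In this example SDS is the square root of 9/100 + 16/100, which is 0.5, so a difference of 1.0 between the means gives Z of approximately 2.0.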

### 2.3: Calculating effective base size:

effective base = (sum of the weights)^2 / (sum of the squared weights)
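As an illustration, the calculation can be written in a few lines of Python. The function name `effective_base` and the sample weights are made up for this sketch; the input is assumed to be one weight per respondent in the group being tested.

```python
def effective_base(weights):
    """Effective base size: (sum of weights)^2 / (sum of squared weights)."""
    weights = list(weights)
    total = sum(weights)
    sum_of_squares = sum(w * w for w in weights)
    if sum_of_squares == 0:
        raise ValueError("at least one weight must be non-zero")
    return total * total / sum_of_squares


# Equal weights leave the base unchanged: four respondents, effective base of 4.
even = effective_base([1, 1, 1, 1])

# One heavily inflated respondent pulls the effective base well below 4.
skewed = effective_base([1, 1, 1, 5])
```

With equal weights the effective base equals the unweighted base size. In the second case the sum of the weights is 8 and the sum of the squared weights is 28, so the effective base is 64 / 28, roughly 2.29, which shows how a drastic weight lowers the base used in the significance test.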