Null or unobserved categories: structural and incidental zeroes

There are two types of unobserved or null categories: structural zeroes and incidental/sampling zeroes.

 

Structural null categories occur when rating scale categories are numbered 10, 20, 30, ... instead of 1, 2, 3. To force Winsteps to eliminate the non-existent categories 11, 12, 13, ..., either rescore the data with IVALUE= or specify STKEEP=NO.
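As an illustration of what rescoring structural zeroes out of the data amounts to, here is a minimal Python sketch. It is not Winsteps code: the function name `rescore_to_consecutive` and the sample responses are hypothetical. Note that it renumbers every unobserved label, so it is only appropriate when all the gaps are structural.

```python
def rescore_to_consecutive(responses, start=1):
    """Renumber the observed category labels as start, start+1, ... (hypothetical helper)."""
    # Sort the distinct labels that actually occur, e.g. 10, 20, 30.
    labels = sorted(set(responses))
    # Map each observed label to its rank, so 10 -> 1, 20 -> 2, 30 -> 3.
    mapping = {label: start + i for i, label in enumerate(labels)}
    return [mapping[r] for r in responses], mapping


# Made-up responses on a scale labelled 10, 20, 30.
data = [10, 30, 20, 20, 10, 30]
rescored, mapping = rescore_to_consecutive(data)
print(mapping)
print(rescored)
```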

 

For intermediate incidental zeroes, imagine this scenario: the Wright & Masters "Liking for Science" data are rescored from 0,1,2 to 0,1,3, leaving a null category at 2. The categories now mean "disagree, neutral, agree-ish, agree". We can imagine that no child in this sample selected the half-smile of agree-ish.

The observed frequencies of categories 0, 1, 2, 3 are 378, 620, 0, 852.

The three Rasch-Andrich threshold parameters are -.89, +infinity, -infinity.

The +infinity is because the second parameter is of the order log(620/0). The -infinity is because the third parameter is of the order log(0/852).
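The arithmetic behind those infinities can be sketched in Python. This assumes only the "of the order log(lower count / upper count)" relationship stated above; the helper `log_ratio` is hypothetical, and the ratio is an order-of-magnitude guide, not the estimate itself (for the first threshold it does not reproduce -.89).

```python
import math


def log_ratio(lower_count, upper_count):
    """log(lower/upper) for adjacent categories, with zero counts mapped to infinities."""
    if lower_count == 0 and upper_count == 0:
        return math.nan        # both categories unobserved: the ratio is undefined
    if upper_count == 0:
        return math.inf        # e.g. log(620/0)
    if lower_count == 0:
        return -math.inf       # e.g. log(0/852)
    return math.log(lower_count / upper_count)


counts = [378, 620, 0, 852]    # observed frequencies of categories 0, 1, 2, 3
orders = [log_ratio(counts[k - 1], counts[k]) for k in range(1, len(counts))]
print(orders)
```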

Mark Wilson's 1991 insight was that the leap from the 2nd to the 4th category is of the order log(620/852). This is all that is needed for immediate item and person estimation, but it is not satisfactory for anchoring rating scales. In practice, however, a large value substitutes satisfactorily for infinity, so a large value such as 40 logits is used for anchoring purposes. The approximated parameters thus become -.89, 40.89, -40.00 for SAFILE=. With these anchored threshold values, the expected category frequencies become 378.8, 619.4, .0, 851.8. None of these is more than 1 score point away from its observed value, and each represents a discrepancy of .2% or less of its category count.
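To see why a large finite value behaves like infinity here, the following Python sketch computes category probabilities from the anchored thresholds. It assumes the standard Rasch-Andrich rating scale formulation, in which the log-odds of category k over category k-1 equal the measure minus threshold k; the function name and the trial measures are illustrative only. Under that assumption the 40 logits cancel in the comparison of categories 1 and 3, leaving 2 x measure - .89, while category 2 keeps a vanishingly small probability.

```python
import math


def category_probabilities(measure, thresholds):
    """Rating scale probabilities for categories 0..m (assumed Rasch-Andrich form).

    measure is the person ability minus the item difficulty, in logits.
    """
    # Log-numerator of category 0 is 0; category k adds (measure - threshold k).
    log_numerators = [0.0]
    for tau in thresholds:
        log_numerators.append(log_numerators[-1] + measure - tau)
    # Subtract the largest value before exponentiating, for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(v - peak) for v in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


anchored = [-0.89, 40.89, -40.00]          # thresholds quoted in the text
leap = math.log(620 / 852)                 # order of the leap from category 1 to category 3
print("leap:", round(leap, 3))
for measure in (-2.0, 0.0, 2.0):           # illustrative measures, not estimates
    probs = category_probabilities(measure, anchored)
    print(measure, [f"{p:.3g}" for p in probs])
```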

 

Extreme incidental null categories (unobserved top or bottom categories) are essentially out of range of the sample, so the sample provides no direct information about their thresholds. Estimating those thresholds requires us to make an assertion about the form of the rating scale structure. The Rasch "Poisson" scale is a good example: all of its infinitely many thresholds are estimable because they are asserted to have a specific form. But see Example 12 for a different approach to this situation.
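As a sketch of what "asserted to have a specific form" can mean, consider the Poisson distribution. The log-odds of count k over count k-1 are log(rate) - log(k), so, if the Poisson scale is written as a rating scale, threshold k takes the fixed value log(k). The Python below checks that identity numerically; the function names and the rate of 2.5 are illustrative assumptions, and this is not a description of how Winsteps itself parameterizes the scale.

```python
import math


def poisson_thresholds(top_category):
    """Thresholds 1..top_category under the asserted form: threshold k = log(k)."""
    return [math.log(k) for k in range(1, top_category + 1)]


def poisson_pmf(k, rate):
    """Poisson probability of observing the count k."""
    return rate ** k * math.exp(-rate) / math.factorial(k)


rate = 2.5                                  # illustrative rate, not an estimate
for k, tau in enumerate(poisson_thresholds(5), start=1):
    # Adjacent-category log-odds should equal log(rate) - threshold k.
    log_odds = math.log(poisson_pmf(k, rate) / poisson_pmf(k - 1, rate))
    print(k, round(tau, 3), round(log_odds, 3), round(math.log(rate) - tau, 3))
```

Because the form is fixed in advance, thresholds for counts that no one in the sample reached are determined just as well as those for observed counts.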

 

Our recommendation is that structural zeroes be rescored out of the data. If categories may be observed next time, then it is better to include a dummy data record in your data file which includes an observation of the missing category and reasonable values for all the other item responses that accord with that missing category. This one data record will have minimal impact on the rest of the analysis.