Global fit statistics


Winsteps reports global fit statistics and an approximate global log-likelihood chi-square statistic in Table 3.1.

 

Example: to compare the fit of a "Rating Scale Model" (RSM) analysis with a "Partial Credit Model" (PCM, ISGROUPS=0) analysis. The number of categories and items is given in the Table heading. The chi-square test is:

chi-square = (global chi-square for the RSM analysis) - (global chi-square for the PCM analysis), with d.f. = (PCM categories - 2*Items) - (RSM categories - 2)
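
For example, here is a minimal sketch of this comparison in Python (scipy). The chi-square values below are placeholders rather than output from any particular analysis; the category and item counts would come from the Table headings of the two runs:

from scipy.stats import chi2

# Global log-likelihood chi-squares from Table 3.1 of each run (placeholder values)
rsm_chi_square = 2650.00     # Rating Scale Model run
pcm_chi_square = 2590.00     # Partial Credit Model run (ISGROUPS=0)

items = 25                   # non-extreme items in both runs
rsm_categories = 3           # categories in the single RSM rating-scale structure
pcm_categories = 25 * 3      # categories summed over all PCM item structures

# Difference chi-square and its degrees of freedom
diff_chi_square = rsm_chi_square - pcm_chi_square
diff_df = (pcm_categories - 2 * items) - (rsm_categories - 2)

p_value = chi2.sf(diff_chi_square, diff_df)
print(f"chi-square = {diff_chi_square:.2f}, d.f. = {diff_df}, p = {p_value:.4f}")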

 

The variance tables report the relative sizes of explained and unexplained variances.

 

The chi-square value is approximate. It is based on the current reported estimates, which may depart noticeably from the "true" maximum likelihood estimates for these data. The degrees of freedom are the number of data points used in the free estimation (i.e., excluding missing data, data in extreme scores, etc.) less the number of free parameters.

For an unanchored analysis, free parameters = non-extreme items + non-extreme persons - 1 + (total categories in the estimated rating-scale structures - 2 * number of rating-scale structures).

 

Thus, for the "Liking for Science" data, 75 children were administered 25 items. There are 74 non-extreme children and 25 non-extreme items. The data are complete, so there are 74 x 25 = 1850 data points. The free parameters are 74 + 25 - 1 + (3-category rating scale - 2 x 1 rating scale) = 99 parameters. So the degrees of freedom are 1850 - 99 = 1751. The log-likelihood chi-square is 2657.91, so the significance is p=.000, i.e., the data exhibit highly significant misfit to the Rasch model, as is nearly always expected.
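
This arithmetic can be reproduced directly. A minimal sketch in Python (scipy); only the chi-square value is taken from the Winsteps output:

from scipy.stats import chi2

non_extreme_persons = 74
non_extreme_items = 25
data_points = non_extreme_persons * non_extreme_items            # 1850 (complete data)

categories = 3                                                   # one 3-category rating scale
free_parameters = (non_extreme_persons + non_extreme_items - 1
                   + (categories - 2 * 1))                       # 99

df = data_points - free_parameters                               # 1751
log_likelihood_chi_square = 2657.91                              # reported in Table 3.1

p_value = chi2.sf(log_likelihood_chi_square, df)
print(f"d.f. = {df}, p = {p_value:.3f}")                         # p = .000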

 

If you wish to compute your own global (or any other) fit test, the response-level probabilities, residuals, etc. are reported in the XFILE=. For instance, for a global fit test, you could add up all the log-probabilities; then the chi-square estimate = -2 * (sum of log-probabilities). A different chi-square estimate is the sum of squared standardized residuals. You can count up the number of free parameters. For complete dichotomous data, it is usually the minimum of (number of different person marginal raw scores, number of different item marginal scores) - 1.
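
For instance, here is a hedged sketch of a do-it-yourself global fit test in Python (pandas and scipy). The file name and the column names "log_probability" and "standardized_residual" are illustrative assumptions, not the actual XFILE= field labels; adjust them to match your own export:

import pandas as pd
from scipy.stats import chi2

xfile = pd.read_csv("xfile.csv")                 # hypothetical CSV export of XFILE=

# Chi-square estimate 1: -2 * (sum of response log-probabilities)
chi_sq_loglik = -2.0 * xfile["log_probability"].sum()

# Chi-square estimate 2: sum of squared standardized residuals
chi_sq_resid = (xfile["standardized_residual"] ** 2).sum()

data_points = len(xfile)
free_parameters = 99                             # count the free parameters for your analysis
df = data_points - free_parameters

print("log-likelihood chi-square:", chi_sq_loglik, " p =", chi2.sf(chi_sq_loglik, df))
print("residual chi-square:      ", chi_sq_resid, " p =", chi2.sf(chi_sq_resid, df))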

 

For a more local fit test, the chi-square estimate is -2 * sum of log-probabilities of relevant data or sum of squared-standardized residuals for the relevant data. The degrees of freedom approximate the count of data points less L'/L for each relevant person parameter and N'/N for each relevant item parameter, where L' is the number of responses by the person included in the local test and L is the total number of responses by the person. N' is the number of responses to the item included in the local test and N is the total number of responses to the item.
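
Continuing the same hedged XFILE= layout (file and column names are assumptions), here is a sketch of a local fit test restricted to a few items, with the d.f. approximated by the L'/L and N'/N fractions:

import pandas as pd
from scipy.stats import chi2

xfile = pd.read_csv("xfile.csv")                   # same hypothetical layout as above
local = xfile[xfile["item"].isin([5, 6, 7])]       # e.g. a local test on items 5, 6, 7

local_chi_square = -2.0 * local["log_probability"].sum()

# d.f. ~= local data points - sum of L'/L over persons - sum of N'/N over items
person_fractions = (local.groupby("person").size()
                    / xfile.groupby("person").size()).dropna()
item_fractions = (local.groupby("item").size()
                  / xfile.groupby("item").size()).dropna()
df = len(local) - person_fractions.sum() - item_fractions.sum()

print(f"local chi-square = {local_chi_square:.2f}, d.f. = {df:.1f}, "
      f"p = {chi2.sf(local_chi_square, df):.4f}")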

 

Deviance statistics are more trustworthy. They are the difference between the chi-squares of two analyses, with d.f. equal to the difference between the numbers of free parameters estimated.

 

The Rasch model is an idealization, never achieved by real data. Accordingly, given enough data, we expect to see statistically significant misfit to the model. If the current data do not misfit, we merely have to collect more data, and they will! In essence, the null hypothesis of this significance test is the wrong one! We learn nothing from testing the hypothesis, "Do the data fit the model (perfectly)?" Or, as usually expressed in social science, "Does the model fit the data (perfectly)?" Perfection is never obtained in empirical data. What we really want to test is the hypothesis "Do the data fit the model usefully?" And, if not, where is the misfit, and what is it? Is it big enough in size (not "statistical significance") to cause trouble? This is the approach used in much of industrial quality-control, and also in Winsteps.

 

The general principle for degrees of freedom is "number of data points used for estimating the parameters - number of free parameters estimated". So, in most Rasch situations, when computing the d.f. (see the sketch after this list):

1. omit extreme scores (persons and items).

2. data points = number of data points in non-extreme scores.

3. number of free person parameters = number of persons.

4. number of free item parameters = number of items - 1, because the items are usually centered at 0 so one parameter is constrained to be the negative of the sum of the other parameters.

5. number of free parameters for each rating scale structure = number of categories in the structure - 2.

6. So dichotomies have no free parameters.
"Rating scale model" = one structure for all the data.
"Partial credit model" = one structure per item.
"Grouped-item model" = one structure per item-group.
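
Here is a short sketch collecting rules 1-6 into a single computation (Python; the function and argument names are mine, not Winsteps terminology):

def degrees_of_freedom(data_points, non_extreme_persons, non_extreme_items,
                       categories_per_structure):
    # d.f. = data points - free parameters, with items centered at 0
    # (one item parameter constrained) and (categories - 2) free
    # parameters for each rating-scale structure.
    free_parameters = (non_extreme_persons
                       + (non_extreme_items - 1)
                       + sum(c - 2 for c in categories_per_structure))
    return data_points - free_parameters

# "Liking for Science": 74 x 25 complete data, one 3-category rating scale (RSM)
print(degrees_of_freedom(1850, 74, 25, [3]))          # 1751
# Same data analyzed as PCM (ISGROUPS=0): one 3-category structure per item
print(degrees_of_freedom(1850, 74, 25, [3] * 25))     # 1727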