Table 28.3 Person subtotal detailed summary statistics
(controlled by PSUBTOT=, UDECIMALS=)
These summarize the measures from the main analysis for all persons selected by PSUBTOT=. Table 28.1 shows one-line summary statistics. Table 28.2 shows bar charts. Table 28.3 shows detailed summary statistics.
TOTAL FOR ALL 34 NON-EXTREME KIDS
-------------------------------------------------------------------------------
| RAW MODEL INFIT OUTFIT |
| SCORE COUNT MEASURE ERROR MNSQ ZSTD MNSQ ZSTD |
|-----------------------------------------------------------------------------|
| MEAN 6.9 14.0 -.19 1.01 .99 -.2 .68 -.1 |
| S.D. 2.1 .0 1.97 .10 .94 1.2 1.29 .7 |
| MAX. 11.0 14.0 3.73 1.11 4.12 2.5 6.07 2.2 |
| MIN. 2.0 14.0 -4.32 .82 .18 -1.5 .08 -.7 |
|-----------------------------------------------------------------------------|
| REAL RMSE 1.18 TRUE SD 1.58 SEPARATION 1.34 KID RELIABILITY .64 |
|MODEL RMSE 1.01 TRUE SD 1.69 SEPARATION 1.67 KID RELIABILITY .74 |
| S.E. OF KID MEAN = .34 |
| MEDIAN = -.26 |
-------------------------------------------------------------------------------
MINIMUM EXTREME SCORE: 1 KIDS
TOTAL FOR ALL 35 EXTREME AND NON-EXTREME KIDS
-------------------------------------------------------------------------------
| RAW MODEL INFIT OUTFIT |
| SCORE COUNT MEASURE ERROR MNSQ ZSTD MNSQ ZSTD |
|-----------------------------------------------------------------------------|
| MEAN 6.7 14.0 -.37 1.03 |
| S.D. 2.4 .0 2.22 .17 |
| MAX. 11.0 14.0 3.73 1.85 |
| MIN. .0 14.0 -6.62 .82 |
|-----------------------------------------------------------------------------|
| REAL RMSE 1.21 TRUE SD 1.86 SEPARATION 1.55 KID RELIABILITY .70 |
|MODEL RMSE 1.05 TRUE SD 1.96 SEPARATION 1.87 KID RELIABILITY .78 |
| S.E. OF KID MEAN = .38 |
| MEDIAN = -.26 |
-------------------------------------------------------------------------------
EXTREME AND NON-EXTREME SCORES | All items with estimated measures
NON-EXTREME SCORES ONLY | Items with non-extreme scores (omits items or persons with 0% and 100% success rates)
ITEM or PERSON COUNT | count of items or persons. "ITEM" is the name assigned with ITEM=; "PERSON" is the name assigned with PERSON=
MEAN MEASURE etc. | average measure of items or persons.
REAL/MODEL ERROR | standard errors of the measures (REAL = inflated for misfit).
REAL/MODEL RMSE | statistical "root-mean-square" average of the standard errors
TRUE S.D. (previously ADJ.SD) | observed S.D. adjusted for measurement error (RMSE). This is an estimate of the measurement-error-free S.D.
REAL/MODEL SEPARATION | the separation coefficient: G = TRUE S.D. / RMSE; Strata = (4*G + 1)/3
REAL/MODEL RELIABILITY | the measure reproducibility
S.E. MEAN | standard error of the mean measure of items or persons
For valid observations used in the estimation,
NON-EXTREME persons or items - summarizes persons (or items) with non-extreme scores (omits zero and perfect scores).
EXTREME AND NON-EXTREME persons or items - summarizes persons (or items) with all estimable scores (includes zero and perfect scores). Extreme scores (zero or minimum possible scores, and perfect or maximum possible scores) have no exact measure under Rasch model conditions. Using a Bayesian technique, however, reasonable measures are reported for each extreme score; see EXTRSC=. Totals including extreme scores are reported, but are necessarily less inferentially secure than totals for non-extreme scores only.
RAW SCORE is the raw score (number of correct responses excluding extreme scores, TOTALSCORE=N).
TOTAL SCORE is the raw score (number of correct responses including extreme scores, TOTALSCORE=Y).
COUNT is the number of responses made.
MEASURE is the estimated measure (for persons) or calibration (for items).
ERROR is the standard error of the estimate.
INFIT is an information-weighted fit statistic, which is more sensitive to unexpected behavior affecting responses to items near the person's measure level.
MNSQ is the mean-square infit statistic with expectation 1. Values substantially below 1 indicate dependency in your data; values substantially above 1 indicate noise.
ZSTD is the infit mean-square fit statistic, t-standardized to approximate a theoretical distribution with mean 0 and variance 1. ZSTD (standardized as a z-score) is used for a t-test result when either the t-test value has effectively infinite degrees of freedom (i.e., approximates a unit normal value) or the Student's t-statistic distribution value has been adjusted to a unit normal value. When LOCAL=Y, then EMP is shown, indicating a local {0,1} standardization. When LOCAL=L, then LOG is shown, and the natural logarithms of the mean-squares are reported.
OUTFIT is an outlier-sensitive fit statistic, more sensitive to unexpected behavior by persons on items far from the person's measure level.
MNSQ is the mean-square outfit statistic with expectation 1. Values substantially less than 1 indicate dependency in your data; values substantially greater than 1 indicate the presence of unexpected outliers.
ZSTD is the outfit mean-square fit statistic, t-standardized to approximate a theoretical distribution with mean 0 and variance 1. ZSTD (standardized as a z-score) is used for a t-test result when either the t-test value has effectively infinite degrees of freedom (i.e., approximates a unit normal value) or the Student's t-statistic distribution value has been adjusted to a unit normal value. When LOCAL=Y, then EMP is shown, indicating a local {0,1} standardization. When LOCAL=L, then LOG is shown, and the natural logarithms of the mean-squares are reported.
MEAN is the average value of the statistic.
S.D. is its sample standard deviation.
MAX. is its maximum value.
MIN. is its minimum value.
MODEL RMSE is computed on the basis that the data fit the model, and that all misfit in the data is merely a reflection of the stochastic nature of the model. This is a "best case" reliability, which reports an upper limit to the reliability of measures based on this set of items for this sample.
REAL RMSE is computed on the basis that misfit in the data is due to departures in the data from model specifications. This is a "worst case" reliability, which reports a lower limit to the reliability of measures based on this set of items for this sample.
RMSE is the square-root of the average error variance. It is the Root Mean Square standard Error computed over the persons or over the items. Here is how RMSE is calculated in Winsteps:
George ability measure = 2.34 logits. Standard error of the ability measure = 0.40 logits.
Mary ability measure = 3.62 logits. Standard error of the ability measure = 0.30 logits.
Error = 0.40 and 0.30 logits.
Square error = 0.40*0.40 = 0.16 and 0.30*0.30 = 0.09
Mean (average) square error = (0.16+0.09) / 2 = 0.25 / 2 = 0.125
RMSE = Root mean square error = sqrt (0.125) = 0.354 logits
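The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the calculation described above, not Winsteps code; the function name rmse is a hypothetical helper chosen for this example.

```python
import math

def rmse(standard_errors):
    """Root mean square of a list of standard errors (in logits)."""
    squared_errors = [se * se for se in standard_errors]
    mean_square_error = sum(squared_errors) / len(squared_errors)
    return math.sqrt(mean_square_error)

# George (S.E. = 0.40 logits) and Mary (S.E. = 0.30 logits)
george_mary_rmse = rmse([0.40, 0.30])
print(round(george_mary_rmse, 3))  # 0.354
```

Note that only the standard errors enter the calculation; the ability measures themselves (2.34 and 3.62 logits) are not used.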
TRUE S.D. is the sample standard deviation of the estimates after subtracting the error variance (attributable to their standard errors of measurement) from their observed variance.
(TRUE S.D.)² = (S.D. of MEASURE)² - (RMSE)²
The TRUE S.D. is an estimate of the unobservable exact sample standard deviation, obtained by removing the bias caused by measurement error.
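As a rough check, the formula can be applied to the rounded values printed in the non-extreme table above (S.D. of MEASURE = 1.97, MODEL RMSE = 1.01, REAL RMSE = 1.18). The sketch below is illustrative only: the function name true_sd is hypothetical, and clamping a negative variance to zero is an assumption made for this example, not something stated here.

```python
import math

def true_sd(observed_sd, rmse):
    """Observed S.D. with the error variance (RMSE squared) removed."""
    true_variance = observed_sd ** 2 - rmse ** 2
    # Assumption for this sketch: a negative difference is treated as zero.
    return math.sqrt(max(true_variance, 0.0))

model_true_sd = true_sd(1.97, 1.01)
real_true_sd = true_sd(1.97, 1.18)
print(round(model_true_sd, 2))  # 1.69
print(round(real_true_sd, 2))   # 1.58
```

Because the inputs are already rounded to two decimal places, a result computed this way can differ from the printed TRUE SD in the last digit.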
SEPARATION coefficient is the ratio of the PERSON (or ITEM) TRUE S.D., the "true" standard deviation, to RMSE, the error standard deviation. It provides a ratio measure of separation in RMSE units, which is easier to interpret than the reliability correlation. (SEPARATION coefficient)² is the signal-to-noise ratio, the ratio of "true" variance to error variance.
RELIABILITY is a separation reliability (separation index). The PERSON (or ITEM) reliability is equivalent to KR-20, Cronbach Alpha, and the Generalizability Coefficient. See much more at Reliability.
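The relationships among TRUE S.D., RMSE, separation, strata and reliability can be sketched as follows, using the rounded MODEL values from the non-extreme table above. The function names are hypothetical. The reliability relation G² / (1 + G²), equivalently true variance divided by observed variance, is the standard Rasch separation-reliability formula and is assumed here; it is not spelled out in this section.

```python
def separation(true_sd, rmse):
    """Separation coefficient G = TRUE S.D. / RMSE."""
    return true_sd / rmse

def strata(g):
    """Strata = (4*G + 1) / 3."""
    return (4 * g + 1) / 3

def reliability(g):
    """Assumed standard relation: reliability = G^2 / (1 + G^2)."""
    return g ** 2 / (1 + g ** 2)

# MODEL values for the 34 non-extreme kids: TRUE SD 1.69, RMSE 1.01
g_model = separation(1.69, 1.01)
print(round(g_model, 2))               # 1.67
print(round(strata(g_model), 2))       # 2.56
print(round(reliability(g_model), 2))  # 0.74
```

Substituting the REAL values (TRUE SD 1.58, RMSE 1.18) gives a separation of about 1.34 and a reliability of about .64, matching the REAL RMSE line of the table.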
S.E. OF MEAN is the standard error of the mean of the person (or item) measures for this sample.
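A minimal sketch of this statistic, assuming the usual formula S.E. of mean = S.D. / sqrt(N). This section does not state the exact formula Winsteps uses, so the formula and the function name se_of_mean are assumptions; the result does agree with the printed tables to two decimal places.

```python
import math

def se_of_mean(sd, n):
    """Assumed formula: standard deviation divided by the square root of N."""
    return sd / math.sqrt(n)

# 34 non-extreme kids: S.D. of MEASURE = 1.97
non_extreme_se = se_of_mean(1.97, 34)
# 35 extreme and non-extreme kids: S.D. of MEASURE = 2.22
all_kids_se = se_of_mean(2.22, 35)
print(round(non_extreme_se, 2))  # 0.34
print(round(all_kids_se, 2))     # 0.38
```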
MEDIAN is the median measure of the sample (in Tables 27, 28).
Message | Meaning for Persons or Items
MAXIMUM EXTREME SCORE | All non-missing responses are scored correct (perfect score) or in the top categories. Measures are estimated.
MINIMUM EXTREME SCORE | All non-missing responses are scored incorrect (zero score) or in the bottom categories. Measures are estimated.
LACKING RESPONSES | All responses are missing. No measures are estimated.
DELETED |
IGNORED BEYOND CAPACITY | Deleted and not reported with entry numbers higher than highest active entry number
VALID RESPONSES | Percentage of non-missing observations. Not shown if 100%
CUTLO= CUTHI= |