Abstract
We compared 5 different statistics (i.e., G index, gamma, d', sensitivity, specificity) used in the social sciences and medical diagnosis literatures to assess calibration accuracy in order to examine the relationships among them and to explore whether one statistic provided a best-fitting general measure of accuracy. College undergraduates completed separate 15-item vocabulary, probability, and paper folding tests by answering a test item and indicating whether or not the item was answered correctly. We computed scores for each of the 5 calibration statistics using the same raw scores for each test and compared 3 theoretical models, including 1-, 2-, and 3-factor confirmatory factor analysis solutions. Results supported the 3-factor model over the 1-factor and 2-factor models with respect to goodness-of-fit indices and the smallest number of estimated parameters. The 3-factor solution was consistent with the hypothesis that the 5 individual calibration scores are related to 2 different types of 2nd-order processes (i.e., accuracy of judgments about correct and incorrect performance), as measured by sensitivity and specificity, that are subsumed under a general 3rd-order discrimination process, as measured by d'. Implications for a theory of calibration accuracy and measurement practice were discussed.
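As an illustration of how the 5 statistics named in the abstract can all be derived from the same raw scores, the sketch below computes each one from a single 2 x 2 table of performance (correct vs. incorrect) by judgment (judged correct vs. judged incorrect). It uses the common textbook definitions, which are an assumption here and may differ in detail from the formulas used in the article. The function name `calibration_statistics`, the cell labels `a`-`d`, and the example counts are hypothetical and are not data from the study.

```python
from statistics import NormalDist


def calibration_statistics(a, b, c, d):
    """Compute 5 calibration statistics from one 2 x 2 table.

    Hypothetical cell labels (assumed, not taken from the article):
      a = item correct,   judged correct   (hit)
      b = item incorrect, judged correct   (false alarm)
      c = item correct,   judged incorrect (miss)
      d = item incorrect, judged incorrect (correct rejection)
    """
    n = a + b + c + d
    if n == 0 or (a + c) == 0 or (b + d) == 0:
        raise ValueError("need at least one correct and one incorrect item")

    # Sensitivity: proportion of correct items judged correct.
    sensitivity = a / (a + c)
    # Specificity: proportion of incorrect items judged incorrect.
    specificity = d / (b + d)
    # G index: (agreements - disagreements) / total judgments.
    g_index = ((a + d) - (b + c)) / n

    # Goodman-Kruskal gamma for a 2 x 2 table; undefined if ad + bc == 0.
    concordant, discordant = a * d, b * c
    if concordant + discordant == 0:
        gamma = None
    else:
        gamma = (concordant - discordant) / (concordant + discordant)

    # d': z(hit rate) - z(false-alarm rate). Rates of exactly 0 or 1 have no
    # finite z score, so d' is left undefined rather than corrected here.
    hit_rate = sensitivity
    false_alarm_rate = b / (b + d)
    if 0 < hit_rate < 1 and 0 < false_alarm_rate < 1:
        z = NormalDist().inv_cdf
        d_prime = z(hit_rate) - z(false_alarm_rate)
    else:
        d_prime = None

    return {
        "g_index": g_index,
        "gamma": gamma,
        "d_prime": d_prime,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Made-up counts for one 15-item test (illustrative only).
example = calibration_statistics(a=8, b=2, c=1, d=4)
for name, value in example.items():
    print(name, value)
```

Because all 5 scores are functions of the same four cell counts, they are expected to be correlated, which is the kind of dependence the factor models in the abstract are meant to capture. In practice, hit or false-alarm rates of 0 or 1 are usually adjusted before computing d'; this sketch leaves that choice open.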
Original language | American English |
---|---|
Journal | Journal of Educational Psychology |
Volume | 106 |
State | Published - Nov 1 2014 |
Keywords
- Accuracy
- Calibration accuracy
- Measurement practice
DC Disciplines
- Educational Methods
- Curriculum and Social Inquiry
- Curriculum and Instruction
- Educational Assessment, Evaluation, and Research