Abstract:
Classical test theory (CTT) and item response theory (IRT) are widely perceived as representing two very different measurement frameworks. Few studies have empirically examined the similarities and differences in the parameters estimated using the two frameworks. The purpose of this study was to examine how item statistics (i.e., item difficulty and item discrimination) and person statistics (i.e., ability estimates) behave under the two measurement frameworks, CTT and IRT. The researchers compared the two models from both theoretical and practical perspectives. For this purpose, first, a theoretical comparison of the two models was carried out; then, a sample of 3000 testees taking part in the English language university entrance exam was used to compare the two models practically. The findings showed that person statistics from CTT were comparable with those from IRT for all three IRT models. Item difficulty indexes from CTT were comparable with those from all IRT models, especially the one-parameter logistic (1PL) model. Item discrimination indexes from CTT were somewhat less comparable with those from IRT.
Machine summary:
This is because the mathematical model used in IRT derives item parameters from the estimated latent trait (θ) rather than from the test taker's total score.
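The dependence on the latent trait can be made concrete. Under standard IRT notation (an assumption here, since the source does not print the formula), the three-parameter logistic (3PL) model gives the probability of a correct response to item i as:

```latex
% 3PL model: probability that an examinee with ability \theta
% answers item i correctly
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}}
```

Here a_i is the item discrimination, b_i the item difficulty, and c_i the pseudo-guessing parameter; the 2PL model fixes c_i = 0, and the 1PL model additionally constrains all a_i to be equal. By contrast, the CTT difficulty index is simply the proportion of examinees answering the item correctly, which is computed directly from observed scores rather than from θ.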
Studies by Courville (2004), Fan (1998), Hwang (2002), Lawson (1991), MacDonald and Paunonen (2002), Skaggs and Lissitz (1986, 1988), and Stage (1998a, 1998b, 1999) have all reported little difference between IRT and CTT estimates.
In Stage's (1999) work with the SweSAT test READ, she states that "the agreement between results from item-analyses performed within the two different frameworks IRT and CTT was very good."
There is no difference between the CTT-based and IRT-based item difficulty statistics (estimates) across the three IRT models.
There is no difference between the CTT-based and IRT-based item discrimination statistics (estimates) across the two IRT models.
The English part of the foreign language university entrance exam contains 70 multiple-choice items forming six subparts: structure (10 items), vocabulary (20 items), word order (5 items), language function (5 items), cloze test (15 items), and reading comprehension (15 items).
There was a high correlation between CTT-based and IRT-based estimates of item discrimination, and the values were the same for the two IRT models.
For the last two subparts, i.e., the cloze test and reading comprehension, there were lower, albeit still strong, correlations between the CTT-based and the 3PL IRT-based item discrimination estimates.
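The CTT statistics compared in this study are straightforward to compute from a scored response matrix. The sketch below, using simulated data (an assumption; the study's actual exam responses are not available here), computes the CTT difficulty index (proportion correct) and a corrected point-biserial discrimination index, then correlates CTT difficulty with the generating IRT difficulty parameter b:

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 3000, 70  # matches the sample and test length in the study

# Simulated 0/1 response matrix under a 1PL (Rasch) model; real data
# would be the scored multiple-choice responses from the exam.
theta = rng.normal(size=(n_persons, 1))          # latent abilities
b = rng.normal(size=(1, n_items))                # IRT item difficulties
p_correct = 1.0 / (1.0 + np.exp(-(theta - b)))
responses = (rng.random((n_persons, n_items)) < p_correct).astype(int)

# CTT item difficulty: proportion of test takers answering each item correctly.
p_values = responses.mean(axis=0)

# CTT item discrimination: corrected point-biserial correlation between
# each item and the total score on the remaining items.
total = responses.sum(axis=1)
disc = np.array([
    np.corrcoef(responses[:, i], total - responses[:, i])[0, 1]
    for i in range(n_items)
])

# CTT p-values should correlate strongly and negatively with IRT b,
# since a larger b means a harder item (fewer correct answers).
r = np.corrcoef(p_values, b.ravel())[0, 1]
print(f"correlation between CTT p-values and IRT b: {r:.2f}")
```

Because the p-value scale is inverted relative to b, comparisons of this kind are typically reported as the absolute magnitude of the correlation.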