Test and item information

Part of the document Statistical Test Theory for the Behavioral Sciences (pages 179-184)

$$P(\theta_k \mid \mathbf{x}) = \frac{P(\mathbf{x} \mid \theta_k)\, g(\theta_k)}{P(\mathbf{x})} = \frac{P(\mathbf{x} \mid \theta_k)\, g(\theta_k)}{\sum_{h=1}^{q} P(\mathbf{x} \mid \theta_h)\, g(\theta_h)}$$

$$\mathrm{EAP}(\theta) = \sum_{k=1}^{q} \theta_k\, P(\theta_k \mid \mathbf{x})$$

In classical test theory the variance of measurement errors was a relevant concept. The inverse of the error variance is an indicator of the precision with which statements about persons can be made. In IRT it appears advantageous to start with the precision of measurements. The two central concepts are test information and item information.

The test information at a level of latent ability θ, I(θ), gives, under some conditions, the precision with which θ can be estimated at this ability level. The conditions are as follows:

1. We have chosen the adequate model.

2. The item parameters are accurately estimated.

3. We use maximum likelihood estimation, that is, we use optimal weights for the estimation of θ. With other estimation methods (e.g., number-right scoring) we speak of the information of a scoring formula (Birnbaum, 1968); this information cannot exceed the test information.

4. The test is not too short.

The test information has a very convenient property. It is the sum of the item informations (Birnbaum, 1968). So, the contribution of each item to the accuracy of a test may be considered apart from the contributions of other items. The test information is

$$I(\theta) = \frac{\left[\sum_i w_i(\theta)\, P_i'(\theta)\right]^2}{\sum_i w_i^2(\theta)\, P_i(\theta)\,[1 - P_i(\theta)]} = \sum_i I_i(\theta) \tag{9.31}$$

where w_i(θ) is the optimal item weight from Equation 9.28, P'_i(θ) is the derivative of P_i(θ) with respect to θ, and I_i(θ) is the item information of item i,

$$I_i(\theta) = \frac{[P_i'(\theta)]^2}{P_i(\theta)\,[1 - P_i(\theta)]} \tag{9.32}$$

The item information is equal to the square of the slope of the ICC at θ divided by the local error variance, which is the item variance given θ. For polytomous items, results are given in Exhibit 9.3.
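As a numerical illustration of Equations 9.31 and 9.32, the following Python sketch computes item and test information for a few dichotomous items. It rests on several assumptions that are not in the text: the 2PL logistic model without a scaling constant, the optimal weight of Equation 9.28 taken to be w_i(θ) = P'_i(θ) / (P_i(θ)[1 − P_i(θ)]), made-up item parameters, and illustrative function names.

```python
import math

def icc_2pl(theta, a, b):
    """Probability of a correct response under the 2PL logistic model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def icc_slope_2pl(theta, a, b):
    """Derivative of the 2PL ICC with respect to theta."""
    p = icc_2pl(theta, a, b)
    return a * p * (1.0 - p)

def item_information(theta, a, b):
    """Equation 9.32: squared ICC slope divided by the item variance given theta."""
    p = icc_2pl(theta, a, b)
    return icc_slope_2pl(theta, a, b) ** 2 / (p * (1.0 - p))

def sum_item_information(theta, items):
    """Right-hand side of Equation 9.31: the sum of the item informations."""
    return sum(item_information(theta, a, b) for a, b in items)

def information_from_weights(theta, items):
    """Middle expression of Equation 9.31, using the assumed optimal weights."""
    numerator = 0.0
    denominator = 0.0
    for a, b in items:
        p = icc_2pl(theta, a, b)
        slope = icc_slope_2pl(theta, a, b)
        w = slope / (p * (1.0 - p))  # assumed form of the optimal weight (Equation 9.28)
        numerator += w * slope
        denominator += w ** 2 * p * (1.0 - p)
    return numerator ** 2 / denominator

# Hypothetical (a, b) pairs, chosen only for illustration.
items = [(1.0, -1.0), (1.5, 0.0), (2.0, 0.5)]
theta = 0.3

print("sum of item informations:", sum_item_information(theta, items))
print("weighted-score expression:", information_from_weights(theta, items))
```

Under these assumptions both expressions give the same value, which is the additivity property stated in the text.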

Exhibit 9.3 Optimal weights and information of polytomous items

Optimal weight (Equation 9.28) and item information (Equation 9.32) cannot directly be generalized to the case of polytomous items. With polytomous items, each option has an optimal option weight. Let P_ik(θ) be the option characteristic curve. Then the optimal weight associated with option k is

$$w_{ik}(\theta) = \frac{\partial \ln P_{ik}(\theta)}{\partial \theta} = \frac{P_{ik}'(\theta)}{P_{ik}(\theta)}$$

The sum of the weights of the chosen options equals zero at the maximum likelihood estimate. The option weights function in quite a different way than the weights for dichotomous items. The reason for the difference is that the weight for a dichotomous item is an item weight: the options correct and incorrect are not weighted separately, as the score for incorrect is set equal to zero. The item weight is the difference between the option weight for correct and the option weight for incorrect.

For the two models in Equation 9.6 and Equation 9.7, the score weight of category k can be written as k plus a factor independent of k. The option score k can be written as

$$k = w_{ik}(\theta) - w_{i0}(\theta)$$

Because k does not depend on the item and option parameters, the polytomous Rasch models have a sufficient statistic for the estimation of θ: the sum of the category numbers of the options chosen.

The item information, the sum of the option informations, is

$$I_i(\theta) = \sum_{k=0}^{m} \frac{[P_{ik}'(\theta)]^2}{P_{ik}(\theta)}$$

Samejima (1969) proved that under the graded response model the item information increases if categories are split into more categories.

It is easily demonstrated that the item information of a dichotomous item, as given by Equation 9.32, is a special case of the item information for polytomous items.
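The special case can be written out as a short worked derivation. It is a sketch that assumes a dichotomous item is coded with m = 1, option k = 1 for correct and k = 0 for incorrect, so that P_i1(θ) = P_i(θ) and P_i0(θ) = 1 − P_i(θ), and hence P'_i0(θ) = −P'_i(θ):

$$
\begin{aligned}
I_i(\theta) &= \sum_{k=0}^{1} \frac{[P_{ik}'(\theta)]^2}{P_{ik}(\theta)}
= \frac{[-P_i'(\theta)]^2}{1 - P_i(\theta)} + \frac{[P_i'(\theta)]^2}{P_i(\theta)} \\
&= [P_i'(\theta)]^2 \, \frac{P_i(\theta) + 1 - P_i(\theta)}{P_i(\theta)\,[1 - P_i(\theta)]}
= \frac{[P_i'(\theta)]^2}{P_i(\theta)\,[1 - P_i(\theta)]}
\end{aligned}
$$

With the same coding, the difference between the two option weights is

$$
w_{i1}(\theta) - w_{i0}(\theta) = \frac{P_i'(\theta)}{P_i(\theta)} + \frac{P_i'(\theta)}{1 - P_i(\theta)} = \frac{P_i'(\theta)}{P_i(\theta)\,[1 - P_i(\theta)]}
$$

which matches the statement that the item weight is the option weight for correct minus the option weight for incorrect.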

In the 3PL model, the item information is

$$I_i(\theta) = a_i^2\, p_i(\theta)\,[1 - p_i(\theta)] \left(1 - \frac{c_i}{P_i(\theta)}\right) \tag{9.33}$$

where p_i(θ) equals the probability of a correct response to the item if c_i were equal to 0. In the 2PL model, the item information equals a_i² times the item variance given θ, and in the Rasch model the item information equals the item variance given θ, the error variance on the true-score scale.

In Figure 9.8 information functions are displayed for three items.

The information of the item with c = 0.0 and a = 2.0 exceeds the information of the item with the lower discrimination parameter for a large range of the latent ability. With increasing a, the information at θ = b increases, while the information at more distant abilities decreases. A perfect Guttman item discriminates at only one point: it discriminates between persons with θ smaller than b and persons with θ larger than b.

The information of the item with c = 0.25 and a = 2.0 is lower than the information of the item with the same value of a and c = 0.0. The reduction is smallest at high values of θ; for low values of θ the reduction is large due to guessing. From the figure, we can infer that the highest information is obtained at θ = b, unless c is larger than 0. When c exceeds 0, the highest information value is obtained for a value of θ somewhat higher than b. Birnbaum (1968) gives the relationship between c and the value of θ at which the highest information is obtained.

Figure 9.8 Information functions for three items (b = 0): a = 1, c = 0; a = 2, c = 0; a = 2, c = 0.25. Horizontal axis: θ (−2.5 to 2.5); vertical axis: I(θ) (0 to 1).
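The behaviour described for the three items of Figure 9.8 can be reproduced numerically. The sketch below evaluates Equation 9.33 on a grid and reports where each information function peaks. It assumes a logistic p_i(θ) without a scaling constant; the grid search and the function names are illustrative choices, not part of the text.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a correct response when c = 0 (logistic form assumed)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def info_3pl(theta, a, b, c):
    """Equation 9.33: a^2 * p * (1 - p) * (1 - c / P), with P = c + (1 - c) * p."""
    p = p_2pl(theta, a, b)
    big_p = c + (1.0 - c) * p
    return a ** 2 * p * (1.0 - p) * (1.0 - c / big_p)

def argmax_on_grid(a, b, c, lo=-4.0, hi=4.0, steps=8000):
    """Return (theta, information) at the grid point with the highest information."""
    best_theta, best_info = lo, info_3pl(lo, a, b, c)
    for n in range(1, steps + 1):
        theta = lo + (hi - lo) * n / steps
        value = info_3pl(theta, a, b, c)
        if value > best_info:
            best_theta, best_info = theta, value
    return best_theta, best_info

# The three items of Figure 9.8, all with b = 0.
for a, c in [(1.0, 0.0), (2.0, 0.0), (2.0, 0.25)]:
    theta_max, info_max = argmax_on_grid(a, 0.0, c)
    print(f"a={a}, c={c}: highest information {info_max:.3f} at theta={theta_max:.3f}")
```

For the two items with c = 0 the peak falls at θ = b; for c = 0.25 it falls slightly above b and the peak is lower, in line with the description of the figure.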

The value of the item information and, consequently, the value of the test information, depend on the choice of the latent scale. In the 2PL model and 3PL model a linear scale transformation is allowed.

For example, we can multiply all θ and b by 2 and divide all a by 2. The item information and test information then decrease by a factor of 4 (see Equation 9.33). When nonlinear scale transformations are also considered, for example a transformation to the true-score scale, the form of the information function can change dramatically.
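The effect of this linear rescaling is easy to check numerically. The sketch below uses the 2PL case of Equation 9.33 (c = 0) with hypothetical parameter values; only the factor of 4 comes from the text.

```python
import math

def info_2pl(theta, a, b):
    """Item information in the 2PL model: a^2 times the item variance given theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a ** 2 * p * (1.0 - p)

# Hypothetical item and ability values on the original scale.
a, b, theta = 1.6, 0.4, -0.3

original = info_2pl(theta, a, b)
# Rescaled latent scale: theta and b doubled, a halved.
rescaled = info_2pl(2 * theta, a / 2, 2 * b)

print("ratio original / rescaled:", original / rescaled)
```

The response probability is unchanged because a(θ − b) is unchanged, so the whole difference comes from the factor a² in the information.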

Information is not an invariant item property. The ratio of the item informations of two items is, however, invariant:

$$\frac{I_i(\theta)}{I_j(\theta)} = \frac{I_i(\theta^*)}{I_j(\theta^*)}$$

for all monotone transformations θ* of θ.

The relative efficiency of two tests, the ratio of the test informations of two tests, remains unchanged with a change of scale. This means that the comparison of the accuracy of two tests does not depend on the chosen latent scale.

The estimated value of I(θ) can be used (asymptotically) for the construction of a confidence interval for θ. The variance of θ̂ equals the inverse of the test information, 1/I(θ), assuming accurate item parameter estimates (otherwise the error variance is larger; see, e.g., De Gruijter, 1988). Under the assumption that θ̂ is normally distributed, the approximate 95% confidence interval is

$$\hat{\theta} - 1.96 / \sqrt{I(\hat{\theta})} < \theta < \hat{\theta} + 1.96 / \sqrt{I(\hat{\theta})} \tag{9.34}$$

This confidence interval can be misleading when a population of abilities is involved. In that case it is better to use the EAP estimator instead of the ML estimator. With EAP we can also compute the posterior variance of θ. This variance is smaller than the inverse of the test information. These results are comparable to those discussed in connection with the application of the Kelley formula within the context of classical test theory. Also, test reliability is of foremost importance within the context of IRT. A test is useful for differentiating between persons insofar as the error variation is relatively small compared to the true variation in θ.
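To make the comparison between the ML-based interval and the EAP estimator concrete, the sketch below computes both for one response pattern. Everything specific in it is an assumption for illustration: the 2PL items, the response pattern, the Newton-Raphson search for the ML estimate, the standard normal prior g(θ), and the equally spaced grid of q = 81 ability points used for the posterior.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL logistic model (assumed)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def total_information(theta, items):
    """Test information as the sum of the 2PL item informations a^2 * p * (1 - p)."""
    return sum(a ** 2 * p_2pl(theta, a, b) * (1.0 - p_2pl(theta, a, b)) for a, b in items)

def ml_estimate(responses, items, start=0.0, iterations=25):
    """Newton-Raphson on the 2PL log-likelihood (needs a mixed response pattern)."""
    theta = start
    for _ in range(iterations):
        score = sum(a * (x - p_2pl(theta, a, b)) for x, (a, b) in zip(responses, items))
        theta += score / total_information(theta, items)
    return theta

def likelihood(theta, responses, items):
    """Probability of the response pattern given theta, assuming local independence."""
    value = 1.0
    for x, (a, b) in zip(responses, items):
        p = p_2pl(theta, a, b)
        value *= p if x == 1 else 1.0 - p
    return value

def eap_estimate(responses, items, q=81, lo=-4.0, hi=4.0):
    """Posterior mean and variance of theta on a grid of q ability points."""
    nodes = [lo + (hi - lo) * k / (q - 1) for k in range(q)]
    prior = [math.exp(-0.5 * t * t) for t in nodes]  # unnormalized N(0, 1), an assumption
    joint = [likelihood(t, responses, items) * g for t, g in zip(nodes, prior)]
    total = sum(joint)
    posterior = [j / total for j in joint]  # P(theta_k | x)
    eap = sum(t * w for t, w in zip(nodes, posterior))
    variance = sum((t - eap) ** 2 * w for t, w in zip(nodes, posterior))
    return eap, variance

# Hypothetical nine-item test (a = 1.5, b from -2 to 2) and response pattern.
items = [(1.5, -2.0 + 0.5 * i) for i in range(9)]
responses = [1, 1, 1, 1, 1, 0, 0, 0, 0]

theta_hat = ml_estimate(responses, items)
info = total_information(theta_hat, items)
half_width = 1.96 / math.sqrt(info)
lower, upper = theta_hat - half_width, theta_hat + half_width
eap, post_var = eap_estimate(responses, items)

print(f"ML estimate {theta_hat:.3f}, 95% interval ({lower:.3f}, {upper:.3f})")
print(f"1 / I(theta_hat) = {1.0 / info:.3f}")
print(f"EAP estimate {eap:.3f}, posterior variance {post_var:.3f}")
```

With this prior the posterior variance comes out below 1/I(θ̂), as the text states; a different prior or a very short test would change the numbers.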
