R: Item Response Analysis by Exploratory Factor Analysis of...

irt.fa {psych}

R Documentation

Item Response Analysis by Exploratory Factor Analysis of tetrachoric/polychoric correlations

Description

Although exploratory factor analysis and Item Response Theory seem to be very different models of binary data, they can provide equivalent parameter estimates of item difficulty and item discrimination. Tetrachoric or polychoric correlations of a data set of dichotomous or polytomous items may be factor analysed using a minimum residual or maximum likelihood factor analysis and the result loadings transformed to item discrimination parameters. The tau parameter from the tetrachoric/polychoric correlations combined with the item factor loading may be used to estimate item difficulties.

Usage

irt.fa(x,nfactors=1,correct=TRUE,plot=TRUE,n.obs=NULL,rotate="oblimin",fm="minres",
        sort=FALSE,...)
irt.select(x,y)
fa2irt(f,rho,plot=TRUE,n.obs=NULL)

Arguments

`x`	A data matrix of dichotomous or discrete items, or the result of `tetrachoric` or `polychoric`
`nfactors`	Defaults to 1 factor
`correct`	If true, then correct the tetrachoric correlations for continuity. (See `tetrachoric`).
`plot`	If TRUE, automatically call the `plot.irt` or `plot.poly` functions.
`y`	the subset of variables to pick from the rho and tau output of a previous irt.fa analysis to allow for further analysis.
`n.obs`	The number of subjects used in the initial analysis if doing a second analysis of a correlation matrix. In particular, if using the fm="minchi" option, this should be the matrix returned by `count.pairwise`.
`rotate`	The default rotation is oblimin. See `fa` for the other options.
`fm`	The default factor extraction is minres. See `fa` for the other options.
`f`	The object returned from `fa`
`rho`	The object returned from `polychoric` or `tetrachoric`. This will include both a correlation matrix and the item difficulty levels.
`sort`	Should the factor loadings be sorted before preparing the item information tables. Defaults to FALSE as this is more useful for scoring items. For tabular output it is better to have sort=TRUE.
`...`	Additional parameters to pass to the factor analysis function

Details

irt.fa combines several functions into one to make the process of item response analysis easier. Correlations are found using either tetrachoric or polychoric. Exploratory factor analyeses with all the normal options are then done using fa. The results are then organized to be reported in terms of IRT parameters (difficulties and discriminations) as well as the more conventional factor analysis output. In addition, because the correlation step is somewhat slow, reanalyses may be done using the correlation matrix found in the first step. In this case, if it is desired to use the fm="minchi" factoring method, the number of observations needs to be specified as the matrix resulting from pairwiseCount.

The tetrachoric correlation matrix of dichotomous items may be factored using a (e.g.) minimum residual factor analysis function fa and the resulting loadings, \lambda_i are transformed to discriminations by \alpha = \frac{\lambda_i}{\sqrt{1-\lambda_i^2}} .

The difficulty parameter, \delta is found from the \tau parameter of the tetrachoric or polychoric function.

\delta_i = \frac{\tau_i}{\sqrt{1-\lambda_i^2}}

Similar analyses may be done with discrete item responses using polychoric correlations and distinct estimates of item difficulty (location) for each item response.

The results may be shown graphically using plot.irt for dichotomous items or plot.poly for polytomous items. These are called by plotting the irt.fa output, see the examples). For plotting there are three options: type = "ICC" will plot the item characteristic response function. type = "IIC" will plot the item information function, and type= "test" will plot the test information function. Invisible output from the plot function will return tables of item information as a function of several levels of the trait, as well as the standard error of measurement and the reliability at each of those levels.

The normal input is just the raw data. If, however, the correlation matrix has already been found using tetrachoric, polychoric, or a previous analysis using irt.fa then that result can be processed directly. Because irt.fa saves the rho and tau matrices from the analysis, subsequent analyses of the same data set are much faster if the input is the object returned on the first run. A similar feature is available in omega.

The output is best seen in terms of graphic displays. Plot the output from irt.fa to see item and test information functions.

The print function will print the item location and discriminations. The additional factor analysis output is available as an object in the output and may be printed directly by specifying the $fa object.

The irt.select function is a helper function to allow for selecting a subset of a prior analysis for further analysis. First run irt.fa, then select a subset of variables to be analyzed in a subsequent irt.fa analysis. Perhaps a better approach is to just plot and find the information for selected items.

The plot function for an irt.fa object will plot ICC (item characteristic curves), IIC (item information curves), or test information curves. In addition, by using the "keys" option, these three kinds of plots can be done for selected items. This is particularly useful when trying to see the information characteristics of short forms of tests based upon the longer form factor analysis.

The plot function will also return (invisibly) the informaton at multiple levels of the trait, the average information (area under the curve) as well as the location of the peak information for each item. These may be then printed or printed in sorted order using the sort option in print.

Value

`irt`	A list of Item location (difficulty) and discrimination
`fa`	A list of statistics for the factor analyis
`rho`	The tetrachoric/polychoric correlation matrix
`tau`	The tetrachoric/polychoric cut points

Note

irt.fa makes use of the tetrachoric or tetrachoric functions. Both of these will use multiple cores if this is an option. To set these use options("mc.cores"=x) where x is the number of cores to use. (Macs default to 2, PCs seem to default to 1).

In comparing irt.fa to the ltm function in the ltm package or to the analysis reported in Kamata and Bauer (2008) the discrimination parameters are not identical, because the irt.fa reports them in units of the normal curve while ltm and Kamata and Bauer report them in logistic units. In addition, Kamata and Bauer do their factor analysis using a logistic error model. Their results match the irt.fa results (to the 2nd or 3rd decimal) when examining their analyses using a normal model. (With thanks to Akihito Kamata for sharing that analysis.)

irt.fa reports parameters in normal units. To convert them to conventional IRT parameters, multiply by 1.702. In addition, the location parameter is expressed in terms of difficulty (high positive scores imply lower frequency of response.)

The results of irt.fa can be used by score.irt for irt based scoring. First run irt.fa and then score the results using a two parameter model using score.irt.

There is also confusion in the literature of how to interpret the tau parameter. Here I treat it the item threshold for a response, which is to say that tau reflects item difficulty.

Further note that the rho parameter in the fa2irt function is the result from a tetrachoric or polychoric analysis and reflects both the correlations and the item thresholds.

Author(s)

William Revelle

References

Kamata, Akihito and Bauer, Daniel J. (2008) A Note on the Relation Between Factor Analytic and Item Response Theory Models Structural Equation Modeling, 15 (1) 136-153.

McDonald, Roderick P. (1999) Test theory: A unified treatment. L. Erlbaum Associates.

Revelle, William. (in prep) An introduction to psychometric theory with applications in R. Springer. Working draft available at https://personality-project.org/r/book/

Examples

## Not run: 
set.seed(17)
d9 <- sim.irt(9,1000,-2.5,2.5,mod="normal") #dichotomous items
test <- irt.fa(d9$items)
test 
op <- par(mfrow=c(3,1))
plot(test,type="ICC")
plot(test,type="IIC")
plot(test,type="test")
par(op)

#compare this result to first finding the correlations, then factoring them, and 
# then applying fa2irt.
R  <- tetrachoric(d9$items,correct=TRUE)
f <- fa(R$rho)
test2 <- fa2irt(f,R)
test2
test2$plot$sumInfo[[1]] - test$plot$sumInfo[[1]]  #identical results

set.seed(17)
items <- sim.congeneric(N=500,short=FALSE,categorical=TRUE) #500 responses to 4 discrete items
d4 <- irt.fa(items$observed)  #item response analysis of congeneric measures
d4    #show just the irt output
d4$fa  #show just the factor analysis output


op <- par(mfrow=c(2,2))
plot(d4,type="ICC")
par(op)


#using the iq data set for an example of real items
#first need to convert the responses to tf
data(iqitems)
iq.keys <- c(4,4,4, 6, 6,3,4,4,  5,2,2,4,  3,2,6,7)

iq.tf <- score.multiple.choice(iq.keys,psychTools::iqitems,score=FALSE)  #just the responses
iq.irt <- irt.fa(iq.tf)
print(iq.irt,short=FALSE) #show the IRT as well as factor analysis output
p.iq <- plot(iq.irt)  #save the invisible summary table
p.iq  #show the summary table of information by ability level
#select a subset of these variables
small.iq.irt <- irt.select(iq.irt,c(1,5,9,10,11,13))
small.irt <- irt.fa(small.iq.irt)
plot(small.irt)
#find the information for three subset of iq items
keys <- make.keys(16,list(all=1:16,some=c(1,5,9,10,11,13),others=c(1:5)))
plot(iq.irt,keys=keys)

## End(Not run)
#compare output to the ltm package or Kamata and Bauer   -- these are in logistic units 
ls <- irt.fa(lsat6)
#library(ltm)
# lsat.ltm <- ltm(lsat6~z1)
#  round(coefficients(lsat.ltm)/1.702,3)  #convert to normal (approximation)
#
#   Dffclt Dscrmn
#Q1 -1.974  0.485
#Q2 -0.805  0.425
#Q3 -0.164  0.523
#Q4 -1.096  0.405
#Q5 -1.835  0.386


#Normal results  ("Standardized and Marginal")(from Akihito Kamata )       
#Item       discrim             tau 
#  1       0.4169             -1.5520   
#  2       0.4333             -0.5999 
#  3       0.5373             -0.1512 
#  4       0.4044             -0.7723  
#  5       0.3587             -1.1966
#compare to ls 

  #Normal results  ("Standardized and conditional") (from Akihito Kamata )   
#item            discrim   tau
#  1           0.3848    -1.4325  
#  2           0.3976    -0.5505 
#  3           0.4733    -0.1332 
#  4           0.3749    -0.7159 
#  5           0.3377    -1.1264 
#compare to ls$fa and ls$tau 

#Kamata and Bauer (2008) logistic estimates
#1   0.826    2.773
#2   0.723    0.990
#3   0.891    0.249  
#4   0.688    1.285
#5   0.657    2.053

[Package psych version 1.9.11 ]