guttman {psych} R Documentation

## Alternative estimates of test reliability

### Description

Eight alternative estimates of test reliability include the six discussed by Guttman (1945), four discussed by ten Berge and Zegers (1978) (μ_0 … μ_3) as well as β (the worst split half, Revelle, 1979), the glb (greatest lower bound) discussed by Bentler and Woodward (1980), and ω_h and ω_t (McDonald, 1999; Zinbarg et al., 2005).

### Usage

guttman(r,key=NULL)
tenberge(r)
glb(r,key=NULL)
glb.fa(r,key=NULL)


### Arguments

| Argument | Description |
|---|---|
| r | A correlation or covariance matrix, or a raw data matrix. |
| key | A vector of -1, 0, 1 to select or reverse key items. |

### Details

Surprisingly, 105 years after Spearman (1904) introduced the concept of reliability to psychologists, there are still multiple approaches for measuring it. Although very popular, Cronbach's α (1951) underestimates the reliability of a test and overestimates the first factor saturation. The guttman function includes the six estimates discussed by Guttman (1945), four of ten Berge and Zegers (1978), as well as Revelle's β (1979) using ICLUST. The companion function, omega, calculates omega hierarchical (ω_h) and omega total (ω_t).

Guttman's first estimate λ_1 assumes that all the variance of an item is error:

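\lambda_1 = 1 - \frac{tr(\vec{V}_x)}{V_x} = \frac{V_x - tr(\vec{V}_x)}{V_x}

where \vec{V}_x is the item covariance matrix and V_x = \vec{1} \vec{V}_x \vec{1}' is the total test variance.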

This is a clear underestimate.
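As a minimal sketch (base R only, not the package's own code), λ_1 can be computed directly from a covariance matrix. The variable names below are illustrative, and the built-in attitude data are used only because they also appear in the Examples.

```r
# Item covariance matrix of the built-in attitude data (7 items).
V  <- cov(attitude)
Vx <- sum(V)                       # total test variance: sum of all elements of V

# lambda_1 treats every item variance (the diagonal of V) as error.
lambda1 <- 1 - sum(diag(V)) / Vx
lambda1
```

Equivalently, λ_1 is the sum of the off-diagonal covariances divided by the total variance.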

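The second bound, λ_2, replaces the diagonal with a function of the square root of the sums of squares of the off diagonal elements. Let C_2 = \vec{1} (\vec{V}_x - diag(\vec{V}_x))^2 \vec{1}' (with the square taken elementwise, so that C_2 is the sum of the squared off diagonal covariances), then

\lambda_2 = \lambda_1 + \frac{\sqrt{\frac{n}{n-1} C_2}}{V_x} = \frac{V_x - tr(\vec{V}_x) + \sqrt{\frac{n}{n-1} C_2}}{V_x}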

Effectively, this is replacing the diagonal with n * the square root of the average squared off diagonal element.
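A short base R sketch of λ_2 follows. It assumes the n/(n-1) scaling of C_2, which is the scaling that matches the "n times the square root of the average squared off diagonal element" description; the names are illustrative only.

```r
V  <- cov(attitude)
n  <- ncol(V)
Vx <- sum(V)

# Zero the diagonal, then sum the squared off-diagonal covariances.
off <- V - diag(diag(V))
C2  <- sum(off^2)

lambda1 <- 1 - sum(diag(V)) / Vx
lambda2 <- lambda1 + sqrt(n / (n - 1) * C2) / Vx

# The correction equals n * sqrt(average squared off-diagonal element).
avg_sq <- C2 / (n * (n - 1))
c(lambda2 = lambda2, correction = sqrt(n / (n - 1) * C2), check = n * sqrt(avg_sq))
```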

Guttman's 3rd lower bound, λ_3, also modifies λ_1 and estimates the true variance of each item as the average covariance between items and is, of course, the same as Cronbach's α.

\lambda_3 = \frac{n}{n-1} \Bigl(1 - \frac{tr(\vec{V}_x)}{V_x}\Bigr) = \frac{n}{n-1} \frac{V_x - tr(\vec{V}_x)}{V_x} = \frac{n \lambda_1}{n-1} = \alpha

This is just replacing the diagonal elements with the average off diagonal elements. λ_2 ≥ λ_3 with λ_2 > λ_3 if the covariances are not identical.
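The identity between λ_3 and α can be checked numerically. This is a sketch in base R with illustrative names; it computes λ_3 from the covariance matrix and α from the usual item-variance form.

```r
V  <- cov(attitude)
n  <- ncol(V)
Vx <- sum(V)

# lambda_3 as a rescaling of lambda_1.
lambda1 <- 1 - sum(diag(V)) / Vx
lambda3 <- n / (n - 1) * lambda1

# Coefficient alpha from item variances and the variance of the sum score.
total <- rowSums(attitude)
alpha <- n / (n - 1) * (1 - sum(apply(attitude, 2, var)) / var(total))

all.equal(lambda3, alpha)
```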

λ_3 and λ_2 are both corrections to λ_1, and this correction may be generalized as an infinite set of successive improvements (ten Berge and Zegers, 1978):

\mu_r = \frac{1}{V_x} \bigl( p_0 + (p_1 + (p_2 + \dots (p_{r-1} + (p_r)^{1/2})^{1/2} \dots )^{1/2})^{1/2} \bigr), \quad r = 0, 1, 2, \dots

where

p_h = \sum_{i \ne j} \sigma_{ij}^{2^h}, \quad h = 0, 1, 2, \dots, r-1

and

p_h = \frac{n}{n-1} \sum_{i \ne j} \sigma_{ij}^{2^h}, \quad h = r.

Clearly μ_0 = λ_3 = α and μ_1 = λ_2. μ_r ≥ μ_{r-1} ≥ … ≥ μ_1 ≥ μ_0, although the series does not improve much after the first two steps.
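The nested square roots are easier to follow when evaluated from the inside out. The sketch below is not the tenberge function itself; `mu_r` is a hypothetical helper, and it assumes that each off diagonal covariance is raised to the power 2^h, which is the reading under which μ_0 = α and μ_1 = λ_2.

```r
# Hypothetical helper: the r-th member of the mu series for a covariance matrix V.
mu_r <- function(V, r) {
  n   <- ncol(V)
  off <- V[row(V) != col(V)]                       # off-diagonal covariances
  p   <- sapply(0:r, function(h) sum(off^(2^h)))   # p_0, ..., p_r
  p[r + 1] <- n / (n - 1) * p[r + 1]               # only the last term is rescaled
  acc <- p[r + 1]
  # Work outwards: p_{h} + sqrt(everything nested inside it).
  if (r > 0) for (h in r:1) acc <- p[h] + sqrt(acc)
  acc / sum(V)
}

V  <- cov(attitude)
mu <- sapply(0:3, function(r) mu_r(V, r))
names(mu) <- paste0("mu", 0:3)
mu
```

With r = 0 the helper reduces to n/(n-1) times the sum of the off diagonal covariances over V_x, which is α.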

Guttman's fourth lower bound, λ_4, was originally proposed as any split half reliability but has been interpreted as the greatest split half reliability. If \vec{X} is split into two parts, \vec{X}_a and \vec{X}_b, with variances V_{X_a} and V_{X_b} and covariance c_{ab} = r_{ab} \sqrt{V_{X_a} V_{X_b}}, then

\lambda_4 = 2 \Bigl(1 - \frac{V_{X_a} + V_{X_b}}{V_x}\Bigr) = \frac{4 c_{ab}}{V_x} = \frac{4 c_{ab}}{V_{X_a} + V_{X_b} + 2 c_{ab}}

which is just the normal split half reliability, but in this case, of the most similar splits.
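For one fixed split, the split half reliability can be computed from the blocks of the covariance matrix. The sketch below uses an arbitrary alternating split chosen only for illustration (it is not claimed to be the best split) and shows that the three algebraic forms agree when c_ab is the covariance between the two half scores.

```r
V  <- cov(attitude)
Vx <- sum(V)

# Arbitrary illustrative split: items 1, 3, 5, 7 versus items 2, 4, 6.
in_a <- c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE)

Va  <- sum(V[in_a, in_a])      # variance of half a
Vb  <- sum(V[!in_a, !in_a])    # variance of half b
cab <- sum(V[in_a, !in_a])     # covariance between the two halves

forms <- c(2 * (1 - (Va + Vb) / Vx),
           4 * cab / Vx,
           4 * cab / (Va + Vb + 2 * cab))
forms
```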

λ_5, Guttman's fifth lower bound, replaces the diagonal values with twice the square root of the maximum (across items) of the sums of squared interitem covariances. Writing \bar{C}_2 for that maximum,

\lambda_5 = \lambda_1 + \frac{2 \sqrt{\bar{C}_2}}{V_x}.

Although superior to λ_1, λ_5 underestimates the correction to the diagonal. A better estimate would be analogous to the correction used in λ_3:

\lambda_{5+} = \lambda_1 + \frac{n}{n-1} \frac{2 \sqrt{\bar{C}_2}}{V_x}.
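A base R sketch of λ_5 and the adjusted λ_5+ is given below. It takes the correction term to be the maximum, across items, of the column sums of squared off diagonal covariances, as the text describes; the object names (`C2bar`, `lambda5p`) are illustrative.

```r
V  <- cov(attitude)
n  <- ncol(V)
Vx <- sum(V)

# For each item, sum the squared covariances with all other items,
# then keep the largest of those sums.
off   <- V - diag(diag(V))
C2bar <- max(colSums(off^2))

lambda1  <- 1 - sum(diag(V)) / Vx
lambda5  <- lambda1 + 2 * sqrt(C2bar) / Vx
lambda5p <- lambda1 + n / (n - 1) * 2 * sqrt(C2bar) / Vx

c(lambda1 = lambda1, lambda5 = lambda5, lambda5plus = lambda5p)
```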

Guttman's final bound considers the amount of variance in each item that can be accounted for by the linear regression of all of the other items (the squared multiple correlation or smc), or more precisely, the variance of the errors, e_j^2, and is

\lambda_6 = 1 - \frac{\sum e_j^2}{V_x} = 1 - \frac{\sum (1 - r_{smc}^2)}{V_x}.

The smc is found from all the items. A modification to Guttman's λ_6, λ_6*, reported by the score.items function, is to find the smc from the entire pool of items given, not just the items on the selected scale.
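As a sketch of the λ_6 computation on a correlation matrix, the squared multiple correlations can be read off the inverse of the matrix. This uses base R only and is not the package's smc function; the regression at the end is just a cross-check for a single item.

```r
R  <- cor(attitude)
Vx <- sum(R)

# Squared multiple correlation of each item with all the others:
# smc_j = 1 - 1 / (R^-1)_jj
smc <- 1 - 1 / diag(solve(R))

# Each item's error variance (in standardized units) is 1 - smc.
lambda6 <- 1 - sum(1 - smc) / Vx
lambda6

# Cross-check: the smc of 'rating' is the R^2 from regressing it on the rest.
r2_rating <- summary(lm(rating ~ ., data = attitude))$r.squared
all.equal(unname(smc["rating"]), r2_rating)
```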

Guttman's λ_4 is the greatest split half reliability. This is found here by combining the output from three different approaches, and seems to work for all test cases yet tried. Lambda 4 is reported as the max of these three algorithms.

The algorithms are

a) Do an ICLUST of the reversed correlation matrix. ICLUST normally forms the most distinct clusters. By reversing the correlations, it will tend to find the most related clusters. Truly a weird approach but tends to work.

b) Alternatively, a kmeans clustering of the correlations (with the diagonal replaced with 0 to make pseudo distances) can produce 2 similar clusters.

c) Clusters identified by assigning items to two clusters based upon their order on the first principal factor. (Highest to cluster 1, next 2 to cluster 2, etc.)

These three procedures will produce key vectors for assigning items to the two splits. The maximum split half reliability is found by taking the maximum of these three approaches. This is not elegant but is fast.
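For a small number of items, the result of such heuristics can be compared against an exhaustive search over every possible split. The sketch below is not one of the three procedures above and is not how the function works; it is a brute-force check that is only feasible for short tests, since the number of splits grows as 2^(n-1). The helper `split_half` and the 1/-1 coding of `best_key` are assumptions made for illustration.

```r
V <- cov(attitude)
n <- ncol(V)

# Hypothetical helper: split half reliability for a logical membership vector.
split_half <- function(V, in_a) {
  Va  <- sum(V[in_a, in_a])
  Vb  <- sum(V[!in_a, !in_a])
  cab <- sum(V[in_a, !in_a])
  4 * cab / (Va + Vb + 2 * cab)
}

best     <- -Inf
best_key <- NULL
# Fix the last item in half b so each split is visited once and
# neither half is ever empty.
for (code in 1:(2^(n - 1) - 1)) {
  in_a <- c((code %/% 2^(0:(n - 2))) %% 2 == 1, FALSE)
  sh   <- split_half(V, in_a)
  if (sh > best) {
    best     <- sh
    best_key <- ifelse(in_a, 1, -1)   # illustrative key: 1 = half a, -1 = half b
  }
}
best
best_key
```

With the 7 attitude items this visits 63 splits, so it runs instantly; with 30 items it would need more than 500 million evaluations, which is why fast heuristics are attractive.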

There are three greatest lower bound functions. One, glb, finds the greatest split half reliability, λ_4. This considers the test as a set of items and examines how best to partition the items into splits. The other two, glb.fa and glb.algebraic, are alternative ways of weighting the diagonal of the matrix.

glb.fa estimates the communalities of the variables from a factor model where the number of factors is the number with positive eigenvalues. Then reliability is found by

glb = 1 - sum(e^2)/Vx = 1-sum(1-h^2)/Vx

This estimate will differ slightly from that found by glb.algebraic, written by Andreas Moeltner, which uses calls to csdp in the Rcsdp package. His algorithm, which more closely matches the description of the glb by Jackson and Woodhouse, seems to have a positive bias (i.e., will overestimate the reliability of some items; they are said to be = 1) for small sample sizes. More exploration of these two algorithms is underway.

Compared to glb.algebraic, glb.fa seems to have less (positive) bias for smallish sample sizes (n < 500) but larger for large (> 1000) sample sizes. This interacts with the number of variables, so the sample size at which the two have equal bias differs as a function of the number of variables. The differences are, however, small. As sample sizes grow, glb.algebraic seems to converge on the population value while glb.fa has a positive bias.

### Value

| Value | Description |
|---|---|
| beta | The normal beta estimate of cluster similarity from ICLUST. This is an estimate of the general factor saturation. |
| tenberge$mu1 | tenBerge mu 1 is functionally alpha |
| tenberge$mu2 | one of the sequence of estimates mu1 ... mu3 |
| beta.factor | For experimental purposes, what is the split half based upon the two factor solution? |
| glb.IC | Greatest split half based upon ICLUST of reversed correlations |
| glb.Km | Greatest split half based upon a kmeans clustering. |
| glb.Fa | Greatest split half based upon the items assigned by factor analysis. |
| glb.max | max of the above estimates |
| glb | glb found from factor analysis |
| keys | scoring keys from each of the alternative methods of forming best splits |

### Author(s)

William Revelle

### References

Cronbach, L.J. (1951) Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334.

Guttman, L. (1945). A basis for analyzing test-retest reliability. Psychometrika, 10 (4), 255-282.

Revelle, W. (1979). Hierarchical cluster-analysis and the internal structure of tests. Multivariate Behavioral Research, 14 (1), 57-74.

Revelle, W. and Zinbarg, R. E. (2009) Coefficients alpha, beta, omega and the glb: comments on Sijtsma. Psychometrika, 2009.

Ten Berge, J. M. F., & Zegers, F. E. (1978). A series of lower bounds to the reliability of a test. Psychometrika, 43 (4), 575-579.

Zinbarg, R. E., Revelle, W., Yovel, I., & Li, W. (2005). Cronbach's α, Revelle's β, and McDonald's ω_h: Their relations with each other and two alternative conceptualizations of reliability. Psychometrika, 70 (1), 123-133.

### See Also

alpha, omega, ICLUST, glb.algebraic

### Examples

data(attitude)
glb(attitude)
glb.fa(attitude)
if(require(Rcsdp)) {glb.algebraic(cor(attitude)) }
guttman(attitude)



[Package psych version 1.3.2 Index]