We have discussed a number of ways of estimating reliability. For this assignment you are to use standard `psych` functions to find the alpha reliability for the `ability` data set. You can use several functions to do this (including `alpha`, `omega`, `scoreItems` and `reliability`). These last two require more specification to use.
library(psych)
library(psychTools)
alpha(ability)
##
## Reliability analysis
## Call: alpha(x = ability)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.83 0.83 0.84 0.23 4.9 0.0064 0.51 0.25 0.21
##
## 95% confidence boundaries
## lower alpha upper
## Feldt 0.82 0.83 0.84
## Duhachek 0.82 0.83 0.84
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## reason.4 0.82 0.82 0.82 0.23 4.5 0.0069 0.0060 0.21
## reason.16 0.82 0.82 0.83 0.24 4.7 0.0067 0.0061 0.22
## reason.17 0.82 0.82 0.82 0.23 4.5 0.0069 0.0058 0.21
## reason.19 0.82 0.82 0.83 0.24 4.6 0.0067 0.0061 0.21
## letter.7 0.82 0.82 0.82 0.23 4.5 0.0068 0.0057 0.21
## letter.33 0.82 0.82 0.83 0.24 4.6 0.0068 0.0060 0.21
## letter.34 0.82 0.82 0.82 0.23 4.5 0.0069 0.0056 0.21
## letter.58 0.82 0.82 0.82 0.23 4.5 0.0069 0.0062 0.21
## matrix.45 0.82 0.83 0.83 0.24 4.7 0.0066 0.0061 0.23
## matrix.46 0.82 0.82 0.83 0.24 4.7 0.0067 0.0060 0.21
## matrix.47 0.82 0.82 0.83 0.24 4.6 0.0067 0.0064 0.21
## matrix.55 0.83 0.83 0.83 0.24 4.8 0.0065 0.0058 0.23
## rotate.3 0.82 0.82 0.82 0.23 4.6 0.0067 0.0045 0.23
## rotate.4 0.82 0.82 0.82 0.23 4.5 0.0068 0.0046 0.22
## rotate.6 0.82 0.82 0.82 0.23 4.5 0.0068 0.0051 0.22
## rotate.8 0.82 0.82 0.83 0.24 4.6 0.0067 0.0048 0.23
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## reason.4 1442 0.58 0.58 0.54 0.50 0.68 0.47
## reason.16 1463 0.50 0.50 0.44 0.41 0.73 0.45
## reason.17 1440 0.57 0.57 0.54 0.49 0.74 0.44
## reason.19 1456 0.53 0.52 0.47 0.43 0.64 0.48
## letter.7 1441 0.57 0.56 0.52 0.48 0.63 0.48
## letter.33 1438 0.54 0.53 0.48 0.44 0.61 0.49
## letter.34 1455 0.58 0.57 0.53 0.49 0.64 0.48
## letter.58 1438 0.58 0.57 0.52 0.48 0.47 0.50
## matrix.45 1458 0.49 0.48 0.42 0.38 0.55 0.50
## matrix.46 1470 0.50 0.49 0.43 0.40 0.57 0.50
## matrix.47 1465 0.53 0.52 0.47 0.43 0.64 0.48
## matrix.55 1459 0.43 0.42 0.35 0.32 0.39 0.49
## rotate.3 1456 0.52 0.54 0.51 0.44 0.20 0.40
## rotate.4 1460 0.56 0.58 0.56 0.48 0.22 0.42
## rotate.6 1456 0.55 0.56 0.53 0.46 0.31 0.46
## rotate.8 1460 0.49 0.51 0.47 0.41 0.19 0.39
##
## Non missing response frequency for each item
## 0 1 miss
## reason.4 0.32 0.68 0.05
## reason.16 0.27 0.73 0.04
## reason.17 0.26 0.74 0.06
## reason.19 0.36 0.64 0.05
## letter.7 0.37 0.63 0.06
## letter.33 0.39 0.61 0.06
## letter.34 0.36 0.64 0.05
## letter.58 0.53 0.47 0.06
## matrix.45 0.45 0.55 0.04
## matrix.46 0.43 0.57 0.04
## matrix.47 0.36 0.64 0.04
## matrix.55 0.61 0.39 0.04
## rotate.3 0.80 0.20 0.05
## rotate.4 0.78 0.22 0.04
## rotate.6 0.69 0.31 0.05
## rotate.8 0.81 0.19 0.04
An alternative to alpha is the pair of estimates \(\omega_h\) and \(\omega_t\). Whereas alpha is based upon assumptions about the item variances, omega produces 'model based' estimates by doing a factor analysis.
omega(ability,4) #we use 4 factors rather than the default of 3
## Loading required namespace: GPArotation
## Omega
## Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
## digits = digits, title = title, sl = sl, labels = labels,
## plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
## covar = covar)
## Alpha: 0.83
## G.6: 0.84
## Omega Hierarchical: 0.66
## Omega H asymptotic: 0.77
## Omega Total 0.86
##
## Schmid Leiman Factor loadings greater than 0.2
## g F1* F2* F3* F4* h2 u2 p2
## reason.4 0.50 0.28 0.35 0.65 0.74
## reason.16 0.42 0.21 0.23 0.77 0.76
## reason.17 0.55 0.46 0.51 0.49 0.59
## reason.19 0.44 0.21 0.25 0.75 0.78
## letter.7 0.51 0.35 0.39 0.61 0.68
## letter.33 0.46 0.31 0.31 0.69 0.69
## letter.34 0.53 0.39 0.43 0.57 0.65
## letter.58 0.47 0.20 0.28 0.72 0.78
## matrix.45 0.40 0.64 0.57 0.43 0.28
## matrix.46 0.40 0.26 0.24 0.76 0.65
## matrix.47 0.43 0.23 0.77 0.79
## matrix.55 0.29 0.13 0.87 0.66
## rotate.3 0.36 0.60 0.50 0.50 0.26
## rotate.4 0.41 0.60 0.53 0.47 0.32
## rotate.6 0.40 0.49 0.41 0.59 0.39
## rotate.8 0.33 0.54 0.41 0.59 0.27
##
## With eigenvalues of:
## g F1* F2* F3* F4*
## 3.05 1.31 0.47 0.40 0.53
##
## general/max 2.34 max/min = 3.23
## mean percent general = 0.58 with sd = 0.2 and cv of 0.35
## Explained Common Variance of the general factor = 0.53
##
## The degrees of freedom are 62 and the fit is 0.05
## The number of observations was 1525 with Chi Square = 70.96 with prob < 0.2
## The root mean square of the residuals is 0.01
## The df corrected root mean square of the residuals is 0.02
## RMSEA index = 0.01 and the 10 % confidence intervals are 0 0.019
## BIC = -383.48
##
## Compare this with the adequacy of just a general factor and no group factors
## The degrees of freedom for just the general factor are 104 and the fit is 0.78
## The number of observations was 1525 with Chi Square = 1179.85 with prob < 9.1e-182
## The root mean square of the residuals is 0.09
## The df corrected root mean square of the residuals is 0.09
##
## RMSEA index = 0.082 and the 10 % confidence intervals are 0.078 0.087
## BIC = 417.56
##
## Measures of factor score adequacy
## g F1* F2* F3* F4*
## Correlation of scores with factors 0.83 0.79 0.54 0.55 0.69
## Multiple R square of scores with factors 0.69 0.63 0.29 0.31 0.48
## Minimum correlation of factor score estimates 0.38 0.26 -0.42 -0.39 -0.04
##
## Total, General and Subset omega for each subset
## g F1* F2* F3* F4*
## Omega total for total scores and subscales 0.86 0.77 0.69 0.64 0.52
## Omega general for total scores and subscales 0.66 0.24 0.52 0.47 0.27
## Omega group for total scores and subscales 0.13 0.53 0.17 0.17 0.25
ICC(ability) #compare the 6 answers. Why are they different?
## Call: ICC(x = ability)
##
## Intraclass correlation coefficients
## type ICC F df1 df2 p lower bound upper bound
## Single_raters_absolute ICC1 0.19 4.8 1524 22875 0 0.18 0.21
## Single_random_raters ICC2 0.20 5.8 1524 22860 0 0.18 0.22
## Single_fixed_raters ICC3 0.23 5.8 1524 22860 0 0.22 0.25
## Average_raters_absolute ICC1k 0.79 4.8 1524 22875 0 0.78 0.81
## Average_random_raters ICC2k 0.80 5.8 1524 22860 0 0.77 0.82
## Average_fixed_raters ICC3k 0.83 5.8 1524 22860 0 0.82 0.84
##
## Number of subjects = 1525 Number of Judges = 16
## See the help file for a discussion of the other 4 McGraw and Wong estimates.
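The six ICCs differ because they make different assumptions about the "raters" (here, the 16 items). As a sketch of where the first row comes from, the single-rater, one-way random-effects case (ICC1) can be computed from a one-way ANOVA in base R; the function name `icc1` is my own for this sketch, not a `psych` function, and it assumes complete data.

```r
# Sketch (base R, complete data only): ICC1 from a one-way random-effects
# ANOVA, treating each column as a "judge" rating each row ("subject").
icc1 <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x); k <- ncol(x)
  long <- data.frame(score   = as.vector(x),
                     subject = factor(rep(seq_len(n), times = k)))
  fit <- anova(aov(score ~ subject, data = long))
  msb <- fit["subject", "Mean Sq"]    # between-subjects mean square
  msw <- fit["Residuals", "Mean Sq"]  # within-subjects mean square
  (msb - msw) / (msb + (k - 1) * msw)
}
```

With perfectly consistent ratings `icc1` returns 1; as within-subject noise grows it shrinks toward 0, which is why the single-rater ICCs above are so much lower than the averaged (ICCk) versions.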
The \(\alpha\) estimate of reliability developed by Cronbach (1951) is identical to the \(\lambda_3\) coefficient of Guttman (1945) and the KR20 reliability of Kuder and Richardson (1937). All of these formulas are shortcuts for estimating the average correlation within a test without actually finding the correlations. They are based upon a variance decomposition of the test.
The variance of the total test (\(V_t\)) is made up of the sum of the inter-item covariances (\(\sigma_{ij}\)) off the diagonal and the item variances \(\sigma_i^2\) on the diagonal. The diagonal elements are a mixture of reliable signal and unreliable noise.
Reliability is the ratio of reliable signal to total signal and may be found by
\[\alpha = \frac{V_t - \sum \sigma_i^2}{V_t} \cdot \frac{k}{k-1}\] where \(k\) is the number of items.
Thus, to find \(\alpha\) you need to write a function to find the total test variance (\(V_t\)) and each of the item variances.
The `var` command will find variances, but first you need to find a total score by adding up the responses for each individual. Given a data.frame or matrix X with elements \(x_{ij}\), this means you need to find a total for each subject
total = rowSums(X)
and then find the variances of total, and for each column of X (hint, use apply)
## Assignment d)
Create a function (`my.alpha`) and then compare your results to the `alpha` function.
my.alpha <- function(x) {
total.score <- apply(x,1,sum,na.rm=TRUE)
total.var <- var(total.score)
item.var <- apply(x,2,var,na.rm=TRUE)
k <- ncol(x)
alpha <- (total.var - sum(item.var))/total.var * (k/(k-1))
return(alpha)
}
my.alpha(attitude)
## [1] 0.8431428
This procedure works for the case of no missing data. But if you have missing data, this way of finding `total.score` will treat a missing response as a zero. We have two alternatives: substitute some value for the missing response (e.g., the item mean or item median from the other subjects who have scores on the item), or find mean scores. The latter imputes the person's mean score to the missing data, rather than the item mean.
If we do the latter (find means), then we need to adjust the formula for alpha, because the variance of the mean scores will be \(k^2\) smaller than that of total scores.
my.alpha.means <- function(x) {
k <- ncol(x)
total.score <- apply(x,1,mean,na.rm=TRUE)
total.var <- var(total.score,na.rm=TRUE) * k^2 #this corrects for the fact that we are using means
item.var <- apply(x,2,var,na.rm=TRUE)
alpha <- (total.var - sum(item.var))/total.var * (k/(k-1))
return(alpha)
}
my.alpha.means(attitude)
## [1] 0.8431428
An alternative way of imputing the missing items is to estimate a total score based upon the mean of the answered responses, scaled up to the full number of items.
my.imputed.alpha <- function(x) {
k <- ncol(x)
mean.score <- apply(x,1,mean,na.rm=TRUE)
total.answered <-apply(x,1,function(x) sum(!is.na(x),na.rm=TRUE))
adjusted.score <- mean.score * k
total.var <- var(adjusted.score,na.rm=TRUE)
item.var <- apply(x,2,var,na.rm=TRUE)
alpha <- (total.var - sum(item.var))/total.var * (k/(k-1))
return(alpha)
}
my.imputed.alpha(attitude)
## [1] 0.8431428
my.alpha(ability)
## [1] 0.8442631
#compare with
my.alpha.means(ability)
## [1] 0.8356268
# and with
my.imputed.alpha(ability)
## [1] 0.8356268
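The other alternative mentioned above, substituting the item mean for each missing response and then applying the same alpha formula, can also be sketched. The name `my.item.imputed.alpha` is my own for this illustration, not a `psych` function.

```r
# Sketch: impute each item's mean for that item's missing responses,
# then compute alpha exactly as my.alpha does on the completed data.
my.item.imputed.alpha <- function(x) {
  x <- as.data.frame(x)
  x[] <- lapply(x, function(item) {
    item[is.na(item)] <- mean(item, na.rm = TRUE)  # item-mean imputation
    item
  })
  k <- ncol(x)
  total.var <- var(rowSums(x))   # no NAs remain after imputation
  item.var <- sapply(x, var)
  (total.var - sum(item.var)) / total.var * (k / (k - 1))
}
```

With complete data this agrees with `my.alpha`; with missing data it will generally differ slightly from the person-mean approach, because the two imputations make different assumptions about what the missing response would have been.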
Split half reliability is the correlation between two random split halves of the test, which is then adjusted for test length using a formula derived by Spearman and Brown in 1910.
\(r_{xx} = \frac{2r}{1+r}\)
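The step-up formula is simple enough to compute directly; `sb` is a throwaway name for this sketch.

```r
# Spearman-Brown step-up: reliability of the full-length test implied by
# the correlation r between its two halves.
sb <- function(r) 2 * r / (1 + r)
sb(0.7)  # a half-test correlation of .70 implies a full-test reliability of about .82
```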
We can specify the splits, or let the `splitHalf` function find all possible splits.
sp <- splitHalf(ability,raw=TRUE) #we specify that we want to keep the results
sp #show them
## Split half reliabilities
## Call: splitHalf(r = ability, raw = TRUE)
##
## Maximum split half reliability (lambda 4) = 0.87
## Guttman lambda 6 = 0.84
## Average split half reliability = 0.83
## Guttman lambda 3 (alpha) = 0.83
## Guttman lambda 2 = 0.83
## Minimum split half reliability (beta) = 0.73
## Average interitem r = 0.23 with median = 0.21
## 2.5% 50% 97.5%
## Quantiles of split half reliability = 0.77 0.83 0.86
hist(sp$raw) #draw a histogram of the distribution of the split half reliabilities
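For a small test, what `splitHalf` does can be brute forced in base R: enumerate every half-split, correlate the two half scores, and apply the Spearman-Brown correction to each. The helper name `all.split.halves` is my own for this sketch; note that each split appears twice (once for each half chosen first), which does not change the maximum, minimum, or mean.

```r
# Sketch of splitHalf's brute-force idea: every way to split the items in
# half, correlate the half scores, and step each r up with Spearman-Brown.
all.split.halves <- function(x) {
  x <- as.matrix(x)
  k <- ncol(x)
  halves <- combn(k, k %/% 2)  # every way to pick half of the items
  apply(halves, 2, function(idx) {
    r <- cor(rowSums(x[, idx, drop = FALSE]),
             rowSums(x[, -idx, drop = FALSE]))
    2 * r / (1 + r)  # Spearman-Brown step-up
  })
}
```

For the 16-item `ability` set there are far too many splits to list, which is why `splitHalf` reports summary statistics (maximum, minimum, average) and quantiles instead.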
`alpha` will find the \(\alpha\) value for one scale. `scoreItems` will do this for one or for many scales. This requires specifying a 'keys' list. `scoreOverlap` will also find reliability estimates, but corrects the correlations between scales for item overlap.
The `reliability` function is a wrapper for `alpha`, `omega` and `splitHalf`. If not given a keys list, it will assume that all items are to be used. If given a keys list, it will score each scale individually.
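A keys list is just a named R list: the names are the scale names, the entries are item names, and a "-" prefix marks an item as reverse scored. This fragment mirrors the `bfi.keys` format shown below, trimmed to two scales for illustration.

```r
# A keys list for scoreItems / reliability: scale names map to item names;
# a "-" prefix reverses that item's scoring.
my.keys <- list(agree       = c("-A1", "A2", "A3", "A4", "A5"),
                neuroticism = c("N1", "N2", "N3", "N4", "N5"))
```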
For the final part of the homework, try all three of these procedures:
bfi.keys #show the keying information
## $agree
## [1] "-A1" "A2" "A3" "A4" "A5"
##
## $conscientious
## [1] "C1" "C2" "C3" "-C4" "-C5"
##
## $extraversion
## [1] "-E1" "-E2" "E3" "E4" "E5"
##
## $neuroticism
## [1] "N1" "N2" "N3" "N4" "N5"
##
## $openness
## [1] "O1" "-O2" "O3" "O4" "-O5"
scales <- scoreItems(bfi.keys,bfi)
summary(scales)
## Call: scoreItems(keys = bfi.keys, items = bfi)
##
## Scale intercorrelations corrected for attenuation
## raw correlations below the diagonal, (unstandardized) alpha on the diagonal
## corrected correlations above the diagonal:
## agree conscientious extraversion neuroticism openness
## agree 0.70 0.36 0.63 -0.245 0.23
## conscientious 0.26 0.72 0.35 -0.305 0.30
## extraversion 0.46 0.26 0.76 -0.284 0.32
## neuroticism -0.18 -0.23 -0.22 0.812 -0.12
## openness 0.15 0.19 0.22 -0.086 0.60
The `reliability` function has multiple options to display the results. Try the various options (I just show one set of output; try others.)
rel <- reliability(bfi.keys,bfi)
rel
## Measures of reliability
## reliability(keys = bfi.keys, items = bfi)
## omega_h alpha omega.tot Uni r.fit fa.fit max.split min.split mean.r med.r n.items
## agree 0.64 0.71 0.75 0.89 0.90 0.99 0.75 0.62 0.33 0.34 5
## conscientious 0.53 0.73 0.77 0.95 0.97 0.98 0.73 0.64 0.35 0.34 5
## extraversion 0.55 0.76 0.82 0.97 0.97 0.99 0.78 0.71 0.39 0.38 5
## neuroticism 0.71 0.81 0.85 0.93 0.95 0.98 0.83 0.73 0.47 0.41 5
## openness 0.39 0.61 0.70 0.85 0.88 0.97 0.66 0.53 0.24 0.23 5
plot(rel)