Slides available at https://personality-project.org/r/tutorials/swissarmy/SMEP.20.html The Rmd file is there as well. If you download the Rmd file and copy it to RStudio you can follow along.
Psych is analogous to a Swiss Army knife. Definitely not the best tool for any specific job but a pretty good tool for many jobs. Today we discuss using psych for some “basic” psychometrics.
It is always necessary to make sure that psych and psychTools are available. The assumption is that you have installed R and the packages. Then just start up R and make sure that the packages you want are available.
library(psych)
library(psychTools)
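If psych and psychTools have not yet been installed, a one-time installation from CRAN will make them available (a minimal sketch, assuming the default CRAN mirror and library location):
install.packages(c("psych","psychTools")) #only needs to be done once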
Psych has been under development for the past 15 years. As it has gotten bigger, it has exceeded Comprehensive R Archive Network (CRAN) guidelines of no more than a 5 MB file. Thus several of the vignettes, the larger data sets, and some convenient tools have been moved to psychTools.
psychTools offers several functions for convenient input. These are read.clipboard and read.file. Both of these functions are described in detail in the introduction to the psych package vignette (http://personality-project.org/r/psych/vignettes/intro.pdf). Look at that vignette and the ones cross linked from there for more information.
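For instance, read.file will read csv, txt, sav, rds and several other formats based upon the file suffix; with no argument it opens a file-chooser dialog. A minimal sketch (the file name is purely hypothetical):
my.data <- read.file() #choose the file interactively
my.data <- read.file("myData.csv") #or give a (hypothetical) file name; the suffix determines how it is read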
Here is a brief example that you can run. Copy the following text to your clipboard:
Sentences 1.00
Vocabulary 0.83 1.00
Sent.Completion 0.78 0.78 1.00
First.Letters 0.44 0.49 0.46 1.00
Four.Letter.Words 0.43 0.46 0.42 0.67 1.00
Suffixes 0.45 0.49 0.44 0.59 0.54 1.00
Letter.Series 0.45 0.43 0.40 0.38 0.40 0.29 1.00
Pedigrees 0.54 0.54 0.53 0.35 0.37 0.32 0.56 1.00
Letter.Group 0.38 0.36 0.36 0.42 0.45 0.32 0.60 0.45 1.00
Then enter
R <- read.clipboard.lower(names=TRUE)
I cannot show this in the html, because that is a dynamic command, but the static version will produce the same output:
R <- Thurstone #a built in data set
lowerMat(R) #neatly print out the data matrix
Sntnc Vcblr Snt.C Frs.L F.L.W Sffxs Ltt.S Pdgrs Ltt.G
Sentences 1.00
Vocabulary 0.83 1.00
Sent.Completion 0.78 0.78 1.00
First.Letters 0.44 0.49 0.46 1.00
Four.Letter.Words 0.43 0.46 0.42 0.67 1.00
Suffixes 0.45 0.49 0.44 0.59 0.54 1.00
Letter.Series 0.45 0.43 0.40 0.38 0.40 0.29 1.00
Pedigrees 0.54 0.54 0.53 0.35 0.37 0.32 0.56 1.00
Letter.Group 0.38 0.36 0.36 0.42 0.45 0.32 0.60 0.45 1.00
We will make use of three of the data sets built into psychTools: ability, bfi, and spi. To see descriptions of these data sets, just ask for help:
?ability
To see all of the data sets available in psych or psychTools:
data(package= "psychTools")
data(package="psych")
The ability dataset contains 16 items taken from the International Cognitive Ability Resource (ICAR) 16-item example (Condon & Revelle, 2014). See also Revelle, Dworak and Condon (2020) and Dworak, Revelle, Doebler and Condon (2020). Data were collected as part of the SAPA project.
The describe and corPlot functions
describe(ability)
vars n mean sd median trimmed mad min max range skew kurtosis se
reason.4 1 1442 0.68 0.47 1 0.72 0 0 1 1 -0.75 -1.44 0.01
reason.16 2 1463 0.73 0.45 1 0.78 0 0 1 1 -1.02 -0.96 0.01
reason.17 3 1440 0.74 0.44 1 0.80 0 0 1 1 -1.08 -0.84 0.01
reason.19 4 1456 0.64 0.48 1 0.68 0 0 1 1 -0.60 -1.64 0.01
letter.7 5 1441 0.63 0.48 1 0.67 0 0 1 1 -0.56 -1.69 0.01
letter.33 6 1438 0.61 0.49 1 0.63 0 0 1 1 -0.43 -1.82 0.01
letter.34 7 1455 0.64 0.48 1 0.68 0 0 1 1 -0.59 -1.65 0.01
letter.58 8 1438 0.47 0.50 0 0.46 0 0 1 1 0.12 -1.99 0.01
matrix.45 9 1458 0.55 0.50 1 0.56 0 0 1 1 -0.20 -1.96 0.01
matrix.46 10 1470 0.57 0.50 1 0.59 0 0 1 1 -0.28 -1.92 0.01
matrix.47 11 1465 0.64 0.48 1 0.67 0 0 1 1 -0.57 -1.67 0.01
matrix.55 12 1459 0.39 0.49 0 0.36 0 0 1 1 0.45 -1.80 0.01
rotate.3 13 1456 0.20 0.40 0 0.13 0 0 1 1 1.48 0.19 0.01
rotate.4 14 1460 0.22 0.42 0 0.15 0 0 1 1 1.34 -0.21 0.01
rotate.6 15 1456 0.31 0.46 0 0.27 0 0 1 1 0.80 -1.35 0.01
rotate.8 16 1460 0.19 0.39 0 0.12 0 0 1 1 1.55 0.41 0.01
corPlot(ability,xlas=3)
How many factors? Who knows? To quote Henry Kaiser, “the number of factors problem is easy, I solve it every day before breakfast. But what is the right answer?”
fa.parallel(ability)
Parallel analysis suggests that the number of factors = 4 and the number of components = 2
#or another way
nfactors(ability)
Number of factors
Call: vss(x = x, n = n, rotate = rotate, diagonal = diagonal, fm = fm,
n.obs = n.obs, plot = FALSE, title = title, use = use, cor = cor)
VSS complexity 1 achieves a maximum of 0.66 with 1 factors
VSS complexity 2 achieves a maximum of 0.73 with 2 factors
The Velicer MAP achieves a minimum of 0.01 with 2 factors
Empirical BIC achieves a minimum of -386.28 with 4 factors
Sample Size adjusted BIC achieves a minimum of -186.53 with 4 factors
Statistics by number of factors
vss1 vss2 map dof chisq prob sqresid fit RMSEA BIC SABIC complex eChisq SRMR eCRMS eBIC
1 0.66 0.00 0.013 104 1.1e+03 9.5e-159 10.4 0.66 0.0778 301 631 1.0 1.6e+03 6.6e-02 0.0705 815
2 0.59 0.73 0.012 89 2.8e+02 2.9e-21 8.1 0.73 0.0373 -375 -92 1.2 3.1e+02 2.9e-02 0.0338 -343
3 0.43 0.68 0.018 75 1.7e+02 3.8e-09 7.6 0.75 0.0286 -381 -143 1.8 1.9e+02 2.3e-02 0.0290 -357
4 0.42 0.64 0.024 62 7.1e+01 2.0e-01 6.8 0.78 0.0097 -383 -187 2.0 6.8e+01 1.4e-02 0.0190 -386
5 0.45 0.67 0.035 50 4.5e+01 6.9e-01 6.0 0.80 0.0000 -322 -163 1.8 3.8e+01 1.0e-02 0.0158 -328
6 0.45 0.65 0.046 39 2.4e+01 9.7e-01 5.9 0.81 0.0000 -262 -138 1.9 2.1e+01 7.6e-03 0.0133 -265
7 0.44 0.63 0.063 29 1.7e+01 9.6e-01 5.8 0.81 0.0000 -196 -103 2.0 1.4e+01 6.2e-03 0.0126 -199
8 0.41 0.56 0.084 20 9.8e+00 9.7e-01 5.6 0.82 0.0000 -137 -73 2.0 7.1e+00 4.4e-03 0.0108 -139
9 0.39 0.55 0.113 12 4.0e+00 9.8e-01 5.5 0.82 0.0000 -84 -46 2.2 2.8e+00 2.8e-03 0.0087 -85
10 0.43 0.59 0.164 5 1.1e+00 9.5e-01 5.0 0.84 0.0000 -36 -20 2.2 7.7e-01 1.4e-03 0.0071 -36
11 0.42 0.57 0.214 -1 1.2e-01 NA 5.3 0.83 NA NA NA 2.2 8.8e-02 4.9e-04 NA NA
12 0.40 0.54 0.307 -6 2.5e-06 NA 5.5 0.82 NA NA NA 2.3 1.6e-06 2.1e-06 NA NA
13 0.42 0.56 0.421 -10 2.5e-07 NA 5.5 0.82 NA NA NA 2.3 1.6e-07 6.6e-07 NA NA
14 0.40 0.54 0.568 -13 8.4e-08 NA 5.5 0.82 NA NA NA 2.3 5.1e-08 3.7e-07 NA NA
15 0.42 0.54 1.000 -15 0.0e+00 NA 5.5 0.82 NA NA NA 2.3 2.3e-12 2.5e-09 NA NA
16 0.42 0.54 NA -16 0.0e+00 NA 5.5 0.82 NA NA NA 2.3 2.3e-12 2.5e-09 NA NA
f4 <- fa(ability,4)
Loading required namespace: GPArotation
fa.diagram(f4)
Yes. Either find the tetrachoric correlations directly using the tetrachoric function, or find them, factor them, and display the results as a 2PL solution.
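For the direct approach, a minimal sketch (the object name is illustrative):
ability.tet <- tetrachoric(ability) #tetrachoric correlations of the dichotomous ability items
lowerMat(ability.tet$rho) #the $rho element holds the correlation matrix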
ability.irt <- irt.fa(ability) #take out a one factor solution
ability.irt.4 <- irt.fa(ability.irt,4) #use the correlations found here, and do it again with 4 factors
om <- omega(ability,4) #show a Schmid Leiman Solution
om #show the results
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
digits = digits, title = title, sl = sl, labels = labels,
plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
covar = covar)
Alpha: 0.83
G.6: 0.84
Omega Hierarchical: 0.66
Omega H asymptotic: 0.77
Omega Total 0.86
Schmid Leiman Factor loadings greater than 0.2
g F1* F2* F3* F4* h2 u2 p2
reason.4 0.50 0.28 0.35 0.65 0.74
reason.16 0.42 0.21 0.23 0.77 0.76
reason.17 0.55 0.46 0.51 0.49 0.59
reason.19 0.44 0.21 0.25 0.75 0.78
letter.7 0.51 0.35 0.39 0.61 0.68
letter.33 0.46 0.31 0.31 0.69 0.69
letter.34 0.53 0.39 0.43 0.57 0.65
letter.58 0.47 0.20 0.28 0.72 0.78
matrix.45 0.40 0.64 0.57 0.43 0.28
matrix.46 0.40 0.26 0.24 0.76 0.65
matrix.47 0.43 0.23 0.77 0.79
matrix.55 0.29 0.13 0.87 0.66
rotate.3 0.36 0.60 0.50 0.50 0.26
rotate.4 0.41 0.60 0.53 0.47 0.32
rotate.6 0.40 0.49 0.41 0.59 0.39
rotate.8 0.33 0.54 0.41 0.59 0.27
With eigenvalues of:
g F1* F2* F3* F4*
3.05 1.31 0.47 0.40 0.53
general/max 2.34 max/min = 3.23
mean percent general = 0.58 with sd = 0.2 and cv of 0.35
Explained Common Variance of the general factor = 0.53
The degrees of freedom are 62 and the fit is 0.05
The number of observations was 1525 with Chi Square = 70.96 with prob < 0.2
The root mean square of the residuals is 0.01
The df corrected root mean square of the residuals is 0.02
RMSEA index = 0.01 and the 10 % confidence intervals are 0 0.019
BIC = -383.48
Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 104 and the fit is 0.78
The number of observations was 1525 with Chi Square = 1179.85 with prob < 9.1e-182
The root mean square of the residuals is 0.09
The df corrected root mean square of the residuals is 0.09
RMSEA index = 0.082 and the 10 % confidence intervals are 0.078 0.087
BIC = 417.56
Measures of factor score adequacy
g F1* F2* F3* F4*
Correlation of scores with factors 0.83 0.79 0.54 0.55 0.69
Multiple R square of scores with factors 0.69 0.63 0.29 0.31 0.48
Minimum correlation of factor score estimates 0.38 0.26 -0.42 -0.39 -0.04
Total, General and Subset omega for each subset
g F1* F2* F3* F4*
Omega total for total scores and subscales 0.86 0.77 0.69 0.64 0.52
Omega general for total scores and subscales 0.66 0.24 0.52 0.47 0.27
Omega group for total scores and subscales 0.13 0.53 0.17 0.17 0.25
omega.diagram(om,sl=FALSE) #show the hierarchical model
omegaDirect(ability,4) #use the Niels Waller approach
Call: omegaDirect(m = ability, nfactors = 4)
Omega from direct Schmid Leiman = 0.71
Call: omegaDirect(m = ability, nfactors = 4)
Standardized loadings (pattern matrix) based upon correlation matrix
g F1* F2* F3* F4* h2 u2
reason.4 0.49 0.06 -0.03 0.05 -0.31 0.35 0.65
reason.16 0.40 0.10 -0.01 0.00 -0.25 0.23 0.77
reason.17 0.49 0.04 0.04 -0.05 -0.51 0.51 0.49
reason.19 0.42 0.11 0.01 0.03 -0.25 0.25 0.75
letter.7 0.47 0.40 0.05 -0.01 -0.09 0.39 0.61
letter.33 0.43 0.34 0.02 0.02 -0.05 0.31 0.69
letter.34 0.49 0.43 0.07 0.00 -0.07 0.43 0.57
letter.58 0.46 0.22 -0.08 0.00 -0.11 0.28 0.72
matrix.45 0.49 -0.08 0.10 0.56 0.03 0.57 0.43
matrix.46 0.42 0.10 0.04 0.22 -0.06 0.24 0.76
matrix.47 0.43 0.15 0.00 0.10 -0.11 0.23 0.77
matrix.55 0.32 0.03 -0.09 0.12 -0.04 0.13 0.87
rotate.3 0.46 -0.01 -0.53 -0.03 0.06 0.50 0.50
rotate.4 0.50 0.04 -0.53 -0.05 0.04 0.53 0.47
rotate.6 0.47 -0.01 -0.43 -0.06 -0.08 0.41 0.59
rotate.8 0.42 -0.08 -0.47 -0.01 -0.01 0.41 0.59
g F1* F2* F3* F4*
SS loadings 3.23 0.59 1.01 0.40 0.54
Proportion Var 0.20 0.04 0.06 0.03 0.03
Cumulative Var 0.20 0.24 0.30 0.33 0.36
Proportion Explained 0.56 0.10 0.17 0.07 0.09
Cumulative Proportion 0.56 0.66 0.84 0.91 1.00
With eigenvalues of:
g F1* F2* F3* F4*
3.23 0.59 1.01 0.40 0.54
The degrees of freedom for the model is 62 and the fit was 0.05
The root mean square of the residuals is 0.01
The df corrected root mean square of the residuals is 0.02
Total, General and Subset omega for each subset
g F1* F2* F3* F4*
Omega total for total scores and subscales 0.86 0.68 0.77 0.53 0.64
Omega general for total scores and subscales 0.71 0.47 0.36 0.35 0.42
Omega group for total scores and subscales 0.12 0.21 0.41 0.19 0.22
See Revelle and Condon (2019) for more than you ever wanted to know about reliability. And if that wasn’t enough, go to the online appendix for R code for doing the analyses.
We just showed three ways of looking at reliability: \(\omega_h\), \(\alpha\), and \(\omega_t\). What about splitHalf?
We can find all split halves for up to about 24 items, or sample 10,000 (or so) splits for bigger problems.
sp <-splitHalf(ability,raw=TRUE)
sp #report basic statistics
Split half reliabilities
Call: splitHalf(r = ability, raw = TRUE)
Maximum split half reliability (lambda 4) = 0.87
Guttman lambda 6 = 0.84
Average split half reliability = 0.83
Guttman lambda 3 (alpha) = 0.83
Guttman lambda 2 = 0.83
Minimum split half reliability (beta) = 0.73
Average interitem r = 0.23 with median = 0.21
2.5% 50% 97.5%
Quantiles of split half reliability = 0.77 0.84 0.86
hist(sp$raw) # show the histogram of all possible splits
The bfi data set IS NOT THE BFI (John and Soto). It is a little set of 25 items taken from SAPA.
Let's find 5 factors (hence the name) and then show the item content. We use the fa function as well as the fa.lookup function. The latter takes the factor output, sorts it, and then merges it with the item descriptions held in the bfi.dictionary.
f5 <- fa(bfi[1:25],5) #do the factoring, specify we want 5 factors
fa.diagram(f5) #show the structure
fa.lookup(f5,dictionary=bfi.dictionary[1:2])
MR2 MR1 MR3 MR5 MR4 com h2 ItemLabel Item
N1 0.81 0.10 0.00 -0.11 -0.05 1.08 0.65 q_952 Get angry easily.
N2 0.78 0.04 0.01 -0.09 0.01 1.04 0.60 q_974 Get irritated easily.
N3 0.71 -0.10 -0.04 0.08 0.02 1.07 0.55 q_1099 Have frequent mood swings.
N5 0.49 -0.20 0.00 0.21 -0.15 1.96 0.35 q_1505 Panic easily.
N4 0.47 -0.39 -0.14 0.09 0.08 2.27 0.49 q_1479 Often feel blue.
E2 0.10 -0.68 -0.02 -0.05 -0.06 1.07 0.54 q_901 Find it difficult to approach others.
E4 0.01 0.59 0.02 0.29 -0.08 1.49 0.53 q_1410 Make friends easily.
E1 -0.06 -0.56 0.11 -0.08 -0.10 1.21 0.35 q_712 Don't talk a lot.
E5 0.15 0.42 0.27 0.05 0.21 2.60 0.40 q_1768 Take charge.
E3 0.08 0.42 0.00 0.25 0.28 2.55 0.44 q_1205 Know how to captivate people.
C2 0.15 -0.09 0.67 0.08 0.04 1.17 0.45 q_530 Continue until everything is perfect.
C4 0.17 0.00 -0.61 0.04 -0.05 1.18 0.45 q_626 Do things in a half-way manner.
C3 0.03 -0.06 0.57 0.09 -0.07 1.11 0.32 q_619 Do things according to a plan.
C5 0.19 -0.14 -0.55 0.02 0.09 1.44 0.43 q_1949 Waste my time.
C1 0.07 -0.03 0.55 -0.02 0.15 1.19 0.33 q_124 Am exacting in my work.
A3 -0.03 0.12 0.02 0.66 0.03 1.07 0.52 q_1206 Know how to comfort others.
A2 -0.02 0.00 0.08 0.64 0.03 1.04 0.45 q_1162 Inquire about others' well-being.
A5 -0.11 0.23 0.01 0.53 0.04 1.49 0.46 q_1419 Make people feel at ease.
A4 -0.06 0.06 0.19 0.43 -0.15 1.74 0.28 q_1364 Love children.
A1 0.21 0.17 0.07 -0.41 -0.06 1.97 0.19 q_146 Am indifferent to the feelings of others.
O3 0.03 0.15 0.02 0.08 0.61 1.17 0.46 q_492 Carry the conversation to a higher level.
O5 0.13 0.10 -0.03 0.04 -0.54 1.21 0.30 q_1964 Will not probe deeply into a subject.
O1 0.02 0.10 0.07 0.02 0.51 1.13 0.31 q_128 Am full of ideas.
O2 0.19 0.06 -0.08 0.16 -0.46 1.75 0.26 q_316 Avoid difficult reading material.
O4 0.13 -0.32 -0.02 0.17 0.37 2.69 0.25 q_1738 Spend time reflecting on things.
This can be done with successive PCAs, correlating the component scores (Goldberg), by using matrix algebra on the components (Waller), or by doing matrix algebra on the factor correlations (Revelle, who doesn't know better).
ba.bfi <- bassAckward(bfi[1:25], c(2,3,5))
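The relationships between the 2, 3, and 5 dimensional levels can also be redrawn from the saved object (a one-line sketch; bassAckward draws this by default):
bassAckward.diagram(ba.bfi) #redraw the hierarchical structure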
In core R, regression is done using the lm function from the raw data. setCor in psych will do it from the raw data or from the correlation matrix. The syntax is meant to be the same.
We will use a data set from Nurit Tal-Or, Jonathan Cohen, Yariv Tsfati, and Albert Gunther (2010) (with their permission).
describe(Tal_Or)
vars n mean sd median trimmed mad min max range skew kurtosis se
cond 1 123 0.47 0.50 0.00 0.46 0.00 0 1 1 0.11 -2.00 0.05
pmi 2 123 5.60 1.32 6.00 5.78 1.48 1 7 6 -1.17 1.30 0.12
import 3 123 4.20 1.74 4.00 4.26 1.48 1 7 6 -0.26 -0.89 0.16
reaction 4 123 3.48 1.55 3.25 3.44 1.85 1 7 6 0.21 -0.90 0.14
gender 5 123 1.65 0.48 2.00 1.69 0.00 1 2 1 -0.62 -1.62 0.04
age 6 123 24.63 5.80 24.00 23.76 1.48 18 61 43 4.71 24.76 0.52
pairs.panels(Tal_Or)
summary(lm(reaction ~ cond + pmi, data=Tal_Or)) #the classic lm approach
Call:
lm(formula = reaction ~ cond + pmi, data = Tal_Or)
Residuals:
Min 1Q Median 3Q Max
-3.07636 -1.06128 -0.06346 0.94573 2.94299
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.52687 0.54968 0.958 0.340
cond 0.25435 0.25582 0.994 0.322
pmi 0.50645 0.09705 5.219 7.66e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.393 on 120 degrees of freedom
Multiple R-squared: 0.2059, Adjusted R-squared: 0.1927
F-statistic: 15.56 on 2 and 120 DF, p-value: 9.83e-07
setCor(reaction ~ cond + pmi, data=Tal_Or,std=FALSE) #the psych approach
Call: setCor(y = reaction ~ cond + pmi, data = Tal_Or, std = FALSE)
Multiple Regression from raw data
DV = reaction
slope se t p lower.ci upper.ci VIF
(Intercept) 0.53 0.55 0.96 3.4e-01 -0.56 1.62 19.15
cond 0.25 0.26 0.99 3.2e-01 -0.25 0.76 1.96
pmi 0.51 0.10 5.22 7.7e-07 0.31 0.70 19.77
Residual Standard Error = 1.39 with 120 degrees of freedom
Multiple Regression
R R2 Ruw R2uw Shrunken R2 SE of R2 overall F df1 df2 p
reaction 0.45 0.21 0.33 0.11 0.19 0.06 15.56 2 120 9.83e-07
mod <- mediate(reaction ~ cond + (pmi), data=Tal_Or) #a mediation model
summary(mod)
Call: mediate(y = reaction ~ cond + (pmi), data = Tal_Or)
Direct effect estimates (traditional regression) (c')
reaction se t df Prob
Intercept 0.53 0.55 0.96 120 3.40e-01
cond 0.25 0.26 0.99 120 3.22e-01
pmi 0.51 0.10 5.22 120 7.66e-07
R = 0.45 R2 = 0.21 F = 15.56 on 2 and 120 DF p-value: 9.83e-07
Total effect estimates (c)
reaction se t df Prob
Intercept 3.25 0.19 17.05 121 5.68e-34
cond 0.50 0.28 1.79 121 7.66e-02
'a' effect estimates
pmi se t df Prob
Intercept 5.38 0.16 33.22 121 1.16e-62
cond 0.48 0.24 2.02 121 4.54e-02
'b' effect estimates
reaction se t df Prob
pmi 0.51 0.1 5.24 121 6.88e-07
'ab' effect estimates (through mediators)
reaction boot sd lower upper
cond 0.24 0.25 0.13 0.01 0.52
When supplying an interaction term, the data are centered before multiplying.
mod2 <- mediate(reaction ~ cond + (pmi) + (import), data=Tal_Or) #a mediation model
mod3 <- mediate(reaction ~ cond + (pmi) * import, data=Tal_Or)
summary(mod2)
Call: mediate(y = reaction ~ cond + (pmi) + (import), data = Tal_Or)
Direct effect estimates (traditional regression) (c')
reaction se t df Prob
Intercept -0.15 0.53 -0.28 119 7.78e-01
cond 0.10 0.24 0.43 119 6.66e-01
pmi 0.40 0.09 4.26 119 4.04e-05
import 0.32 0.07 4.59 119 1.13e-05
R = 0.57 R2 = 0.33 F = 19.11 on 3 and 119 DF p-value: 3.5e-10
Total effect estimates (c)
reaction se t df Prob
Intercept 3.25 0.19 17.05 121 5.68e-34
cond 0.50 0.28 1.79 121 7.66e-02
'a' effect estimates
pmi se t df Prob
Intercept 5.38 0.16 33.22 121 1.16e-62
cond 0.48 0.24 2.02 121 4.54e-02
import se t df Prob
Intercept 3.91 0.21 18.37 121 8.39e-37
cond 0.63 0.31 2.02 121 4.52e-02
'b' effect estimates
reaction se t df Prob
pmi 0.40 0.09 4.28 120 3.75e-05
import 0.32 0.07 4.60 120 1.03e-05
'ab' effect estimates (through mediators)
reaction boot sd lower upper
cond 0.39 0.4 0.16 0.09 0.74
'ab' effects estimates for each mediator for reaction
pmi sd lower upper
cond 0.19 0.1 0.01 0.41
import sd lower upper
cond 0.21 0.12 0.01 0.46
summary(mod3)
Call: mediate(y = reaction ~ cond + (pmi) * import, data = Tal_Or)
Direct effect estimates (traditional regression) (c')
reaction se t df Prob
Intercept 0.00 0.12 0.00 118 9.99e-01
cond 0.10 0.24 0.43 118 6.68e-01
import 0.32 0.07 4.51 118 1.55e-05
pmi*import 0.00 0.05 0.00 118 9.98e-01
pmi 0.40 0.11 3.72 118 3.04e-04
R = 0.57 R2 = 0.33 F = 14.21 on 4 and 118 DF p-value: 1.69e-09
Total effect estimates (c)
reaction se t df Prob
Intercept 0.06 0.13 0.47 119 6.38e-01
cond 0.23 0.25 0.92 119 3.59e-01
import 0.40 0.07 5.60 119 1.42e-07
pmi*import -0.09 0.05 -1.94 119 5.45e-02
'a' effect estimates
pmi se t df Prob
Intercept 0.15 0.10 1.46 119 1.47e-01
cond 0.32 0.20 1.57 119 1.20e-01
import 0.20 0.06 3.42 119 8.56e-04
pmi*import -0.24 0.04 -6.00 119 2.21e-08
'b' effect estimates
reaction se t df Prob
pmi 0.4 0.11 3.74 119 0.000286
'ab' effect estimates (through mediators)
reaction boot sd lower upper
cond 0.13 0.14 0.10 -0.03 0.36
import 0.08 0.08 0.04 0.02 0.17
pmi*import -0.09 -0.10 0.04 -0.18 -0.03
anova(mod2,mod3) #compare the models
Model 1 = mediate(y = reaction ~ cond + (pmi) + (import), data = Tal_Or)
Model 2 = mediate(y = reaction ~ cond + (pmi) * import, data = Tal_Or)
ANOVA Test for Difference Between Models
Res Df Res SS Diff df Diff SS F Pr(F > )
1 119 197.88
2 118 197.88 1 9.6388e-06 0 0.9981
setCor from the correlations. An example from Marian Spengler and Rodica Damian: a longitudinal study of the Project Talent data set.
Spengler.stat #show the basic descriptives of the original data set
Var M SD
1 Race 1.96 0.21
2 Sex 1.52 0.50
3 Age 10.43 1.10
4 Parental 0.00 1.00
5 IQ 0.00 1.00
6 Sociability 6.70 2.91
7 Social 4.66 2.35
8 Impulsiveness 1.98 1.64
9 Vigor 3.67 2.13
10 Calmness 4.26 2.51
11 Tidiness 5.67 2.80
12 Culture 5.21 2.36
13 Leadership 1.29 1.36
14 Self-confidence 5.12 2.47
15 Mature 11.16 5.19
16 Interest 2.28 0.73
17 Reading 2.53 0.82
18 Writing 3.23 0.88
19 Responsible 3.30 0.64
20 Ed.11 3.07 1.17
21 OccPres.11 4.19 1.18
22 Income.11 9.07 0.65
23 Educ.50 3.34 1.25
24 OccPres.50 4.23 1.21
25 Income.50 2.87 0.97
psych::lowerMat(Spengler[psych::cs(IQ,Parental,Ed.11,OccPres.50),
psych::cs(IQ,Parental,Ed.11,OccPres.50)])
IQ Prntl Ed.11 OP.50
IQ 1.00
Parental 0.44 1.00
Ed.11 0.53 0.44 1.00
OccPres.50 0.35 0.27 0.44 1.00
psych::setCor(OccPres.50 ~ IQ + Parental + Ed.11,data=Spengler)
Call: psych::setCor(y = OccPres.50 ~ IQ + Parental + Ed.11, data = Spengler)
Multiple Regression from matrix input
DV = OccPres.50
slope VIF
IQ 0.15 1.50
Parental 0.06 1.34
Ed.11 0.34 1.50
Multiple Regression
R R2 Ruw R2uw
OccPres.50 0.46 0.22 0.44 0.19
#we reduce the number of subjects for faster replication in this example
mod <- psych::mediate(OccPres.50 ~ IQ + Parental + (Ed.11),data=Spengler,
n.iter=50,n.obs=1000) #for speed
The replication data matrices were simulated based upon the specified number of subjects and the observed correlation matrix.
summary(mod)
Call: psych::mediate(y = OccPres.50 ~ IQ + Parental + (Ed.11), data = Spengler,
n.obs = 1000, n.iter = 50)
Direct effect estimates (traditional regression) (c')
OccPres.50 se t df Prob
Intercept 0.00 0.03 0.00 996 1.00e+00
IQ 0.15 0.03 4.24 996 2.41e-05
Parental 0.06 0.03 1.76 996 7.80e-02
Ed.11 0.34 0.03 9.81 996 9.53e-22
R = 0.46 R2 = 0.22 F = 90.94 on 3 and 996 DF p-value: 5.1e-52
Total effect estimates (c)
OccPres.50 se t df Prob
Intercept 0.00 0.03 0.00 997 1.00e+00
IQ 0.29 0.03 8.76 997 8.12e-18
Parental 0.14 0.03 4.40 997 1.22e-05
'a' effect estimates
Ed.11 se t df Prob
Intercept 0.00 0.03 0.00 997 1.00e+00
IQ 0.42 0.03 14.49 997 2.44e-43
Parental 0.26 0.03 8.91 997 2.38e-18
'b' effect estimates
OccPres.50 se t df Prob
Ed.11 0.34 0.03 9.81 997 9.1e-22
'ab' effect estimates (through mediators)
OccPres.50 boot sd lower upper
IQ 0.14 0.11 0.02 0.07 0.16
Parental 0.09 0.08 0.01 0.06 0.10
Unless just finding scores for one scale, it is necessary to specify a scoring key for each scale.
Consider the scoring key for the bfi. This key is supplied as an example of scoring keys.
bfi.keys #show the key information (note how a negative sign implies item reversal)
$agree
[1] "-A1" "A2" "A3" "A4" "A5"
$conscientious
[1] "C1" "C2" "C3" "-C4" "-C5"
$extraversion
[1] "-E1" "-E2" "E3" "E4" "E5"
$neuroticism
[1] "N1" "N2" "N3" "N4" "N5"
$openness
[1] "O1" "-O2" "O3" "O4" "-O5"
score.bfi <- scoreItems(bfi.keys,bfi[1:25])
score.bfi #lots of output
Call: scoreItems(keys = bfi.keys, items = bfi[1:25])
(Unstandardized) Alpha:
agree conscientious extraversion neuroticism openness
alpha 0.7 0.72 0.76 0.81 0.6
Standard errors of unstandardized Alpha:
agree conscientious extraversion neuroticism openness
ASE 0.014 0.014 0.013 0.011 0.017
Average item correlation:
agree conscientious extraversion neuroticism openness
average.r 0.32 0.34 0.39 0.46 0.23
Median item correlation:
agree conscientious extraversion neuroticism openness
0.34 0.34 0.38 0.41 0.22
Guttman 6* reliability:
agree conscientious extraversion neuroticism openness
Lambda.6 0.7 0.72 0.76 0.81 0.6
Signal/Noise based upon av.r :
agree conscientious extraversion neuroticism openness
Signal/Noise 2.3 2.6 3.2 4.3 1.5
Scale intercorrelations corrected for attenuation
raw correlations below the diagonal, alpha on the diagonal
corrected correlations above the diagonal:
agree conscientious extraversion neuroticism openness
agree 0.70 0.36 0.63 -0.245 0.23
conscientious 0.26 0.72 0.35 -0.305 0.30
extraversion 0.46 0.26 0.76 -0.284 0.32
neuroticism -0.18 -0.23 -0.22 0.812 -0.12
openness 0.15 0.19 0.22 -0.086 0.60
In order to see the item by scale loadings and the frequency counts of the data, print with the short option = FALSE.
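For example, with the score.bfi object found above (a one-line sketch):
print(score.bfi, short=FALSE) #item by scale correlations and response frequencies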
#the scores are an object within the score.bfi
pairs.panels(score.bfi$scores)
bfi.irt <- scoreIrt.2pl(bfi.keys,bfi)
cor2(bfi.irt,score.bfi$scores)
agree conscientious extraversion neuroticism openness
agree 0.98 0.25 0.49 -0.16 0.15
conscientious 0.25 0.97 0.25 -0.23 0.18
extraversion 0.43 0.25 0.94 -0.25 0.22
neuroticism -0.19 -0.22 -0.19 0.99 -0.08
openness 0.17 0.21 0.27 -0.11 0.98
A very simple statistical learning model is incorporated into the bestScales function, also known as the BISCUIT algorithm (Best Items Scales that are Cross validated, Unit weighted, Informative, and Transparent). As discussed in Elleman et al. (2020), bestScales chooses the most highly correlated items, unit weights them, cross validates (using either k-fold or bootstrapping), and forms a final scale.
We use the bfi data set to show this.
bestboot <- bestScales(psychTools::bfi,criteria=cs(gender,age,education),
n.iter=100,dictionary=psychTools::bfi.dictionary[1:3])
bestboot
Call = bestScales(x = psychTools::bfi, criteria = cs(gender, age, education),
n.iter = 100, dictionary = psychTools::bfi.dictionary[1:3])
derivation.mean derivation.sd validation.m validation.sd final.valid final.wtd N.wtd
gender 0.32 0.022 0.30 0.027 0.29 0.33 10
age 0.25 0.017 0.23 0.024 0.24 0.25 10
education 0.16 0.025 0.14 0.027 0.14 0.18 10
Best items on each scale with counts of replications
Criterion = gender
Freq mean.r sd.r ItemLabel Item Giant3
N5 100 0.21 0.02 q_1505 Panic easily. Stability
A2 100 0.18 0.02 q_1162 Inquire about others' well-being. Cohesion
A1 100 -0.15 0.02 q_146 Am indifferent to the feelings of others. Cohesion
A3 97 0.14 0.02 q_1206 Know how to comfort others. Cohesion
E1 95 -0.13 0.02 q_712 Don't talk a lot. Plasticity
A4 91 0.13 0.02 q_1364 Love children. Cohesion
Criterion = age
Freq mean.r sd.r ItemLabel Item Giant3
A1 100 -0.16 0.02 q_146 Am indifferent to the feelings of others. Cohesion
C4 99 -0.15 0.02 q_626 Do things in a half-way manner. Stability
A4 98 0.14 0.02 q_1364 Love children. Cohesion
A5 94 0.13 0.02 q_1419 Make people feel at ease. Cohesion
Criterion = education
Freq mean.r sd.r ItemLabel Item Giant3
A1 98 -0.14 0.02 q_146 Am indifferent to the feelings of others. Cohesion
#compare with 10 fold cross validation
tenfold <- bestScales(psychTools::bfi,criteria=cs(gender,age,education),fold=10,
dictionary= psychTools::bfi.dictionary[1:3])
Number of iterations set to the number of folds = 10
tenfold
Call = bestScales(x = psychTools::bfi, criteria = cs(gender, age, education),
folds = 10, dictionary = psychTools::bfi.dictionary[1:3])
derivation.mean derivation.sd validation.m validation.sd final.valid final.wtd N.wtd
gender 0.32 0.0099 0.30 0.066 0.31 0.33 10
age 0.25 0.0070 0.23 0.052 0.24 0.25 10
education 0.15 0.0117 0.13 0.074 0.14 0.18 10
Best items on each scale with counts of replications
Criterion = gender
Freq mean.r sd.r ItemLabel Item Giant3
N5 10 0.21 0.00 q_1505 Panic easily. Stability
A2 10 0.18 0.00 q_1162 Inquire about others' well-being. Cohesion
A1 10 -0.16 0.01 q_146 Am indifferent to the feelings of others. Cohesion
A3 10 0.14 0.01 q_1206 Know how to comfort others. Cohesion
A4 10 0.13 0.01 q_1364 Love children. Cohesion
E1 10 -0.13 0.01 q_712 Don't talk a lot. Plasticity
N3 10 0.12 0.01 q_1099 Have frequent mood swings. Stability
Criterion = age
Freq mean.r sd.r ItemLabel Item Giant3
A1 10 -0.16 0.00 q_146 Am indifferent to the feelings of others. Cohesion
C4 10 -0.15 0.01 q_626 Do things in a half-way manner. Stability
A4 10 0.14 0.00 q_1364 Love children. Cohesion
A5 10 0.13 0.00 q_1419 Make people feel at ease. Cohesion
E5 10 0.11 0.01 q_1768 Take charge. Plasticity
A2 10 0.11 0.01 q_1162 Inquire about others' well-being. Cohesion
N3 9 -0.11 0.01 q_1099 Have frequent mood swings. Stability
Criterion = education
Freq mean.r sd.r ItemLabel Item Giant3
A1 10 -0.14 0.01 q_146 Am indifferent to the feelings of others. Cohesion
#Then, to display the results graphically
#Note that we scale the two graphs with the same x.lim values
error.dots(tenfold,eyes=TRUE,pch=16,xlim=c(0,.4),main="Comparing boot strap (bars) and k-fold (eyes)")
error.dots(bestboot,add=TRUE,xlim=c(0,.4),main="")
We can show Cohen d effects with their confidence intervals using the cohen.d and error.dots functions.
cd <- cohen.d(bfi ~ gender, dictionary=bfi.dictionary[,2,drop=FALSE])
error.dots(cd, head=13,tail=13,main="Cohen d for gender: bfi items")
abline(v=0)
We can also show scatter plots with densities by a grouping variable using the scatterHist function. We do this for the bfi scores that we found before.
scores.irt.df <- data.frame(bfi.irt,gender=bfi$gender)
scatterHist(neuroticism ~ agree + gender, data=scores.irt.df,legend=1)
Although psych is not the best package for any particular problem, it is a useful tool for many problems. For more information about using psych, visit the various tutorials available at the personality-project.
To see all the functions in psych, use the objects function.
objects(package:psych)
Warning in objects(package:psych): 'package:psych' converted to character string
[1] "%+%" "acs" "alpha" "alpha.ci"
[5] "anova.psych" "AUC" "autoR" "bassAckward"
[9] "bassAckward.diagram" "Bechtoldt" "Bechtoldt.1" "Bechtoldt.2"
[13] "bestItems" "bestScales" "bfi" "bfi.keys"
[17] "bi.bars" "bifactor" "biplot.psych" "biquartimin"
[21] "biserial" "block.random" "bock.table" "cattell"
[25] "char2numeric" "Chen" "chi2r" "circ.sim"
[29] "circ.sim.plot" "circ.simulation" "circ.tests" "circadian.cor"
[33] "circadian.F" "circadian.linear.cor" "circadian.mean" "circadian.phase"
[37] "circadian.reliability" "circadian.sd" "circadian.stats" "circular.cor"
[41] "circular.mean" "cluster.cor" "cluster.fit" "cluster.loadings"
[45] "cluster.plot" "cluster2keys" "cohen.d" "cohen.d.by"
[49] "cohen.d.ci" "cohen.kappa" "comorbidity" "con2cat"
[53] "congeneric.sim" "cor.ci" "cor.plot" "cor.plot.upperLowerCi"
[57] "cor.smooth" "cor.smoother" "cor.wt" "cor2"
[61] "cor2cov" "cor2dist" "corCi" "corFiml"
[65] "corPlot" "corPlotUpperLowerCi" "corr.p" "corr.test"
[69] "correct.cor" "cortest" "cortest.bartlett" "cortest.jennrich"
[73] "cortest.mat" "cortest.normal" "cosinor" "cosinor.period"
[77] "cosinor.plot" "count.pairwise" "crossValidation" "cs"
[81] "cta" "cta.15" "d.ci" "d.robust"
[85] "d2r" "d2t" "Damian" "densityBy"
[89] "describe" "describe.by" "describeBy" "describeData"
[93] "describeFast" "dia.arrow" "dia.cone" "dia.curve"
[97] "dia.curved.arrow" "dia.ellipse" "dia.ellipse1" "dia.rect"
[101] "dia.self" "dia.shape" "dia.triangle" "diagram"
[105] "directSl" "draw.cor" "draw.tetra" "dummy.code"
[109] "Dwyer" "eigen.loadings" "ellipses" "equamax"
[113] "error.bars" "error.bars.by" "error.bars.tab" "error.crosses"
[117] "error.dots" "errorCircles" "esem" "esem.diagram"
[121] "extension.diagram" "fa" "fa.congruence" "fa.diagram"
[125] "fa.extend" "fa.extension" "fa.graph" "fa.lookup"
[129] "fa.multi" "fa.multi.diagram" "fa.organize" "fa.parallel"
[133] "fa.parallel.poly" "fa.plot" "fa.poly" "fa.random"
[137] "fa.rgraph" "fa.sapa" "fa.sort" "fa.stats"
[141] "fa2irt" "faBy" "fac" "faCor"
[145] "factor.congruence" "factor.fit" "factor.minres" "factor.model"
[149] "factor.pa" "factor.plot" "factor.residuals" "factor.rotate"
[153] "factor.scores" "factor.stats" "factor.wls" "factor2cluster"
[157] "faRotate" "fisherz" "fisherz2r" "fparse"
[161] "fromTo" "g2r" "Garcia" "geometric.mean"
[165] "glb" "glb.algebraic" "glb.fa" "Gleser"
[169] "Gorsuch" "guttman" "Harman.5" "Harman.8"
[173] "Harman.Burt" "Harman.Holzinger" "Harman.political" "harmonic.mean"
[177] "headtail" "headTail" "het.diagram" "histBy"
[181] "Holzinger" "Holzinger.9" "ICC" "iclust"
[185] "ICLUST" "ICLUST.cluster" "iclust.diagram" "ICLUST.graph"
[189] "ICLUST.rgraph" "iclust.sort" "ICLUST.sort" "interbattery"
[193] "interp.boxplot" "interp.median" "interp.q" "interp.qplot.by"
[197] "interp.quantiles" "interp.quart" "interp.quartiles" "interp.values"
[201] "irt.0p" "irt.1p" "irt.2p" "irt.discrim"
[205] "irt.fa" "irt.item.diff.rasch" "irt.person.rasch" "irt.responses"
[209] "irt.se" "irt.select" "irt.stats.like" "irt.tau"
[213] "isCorrelation" "isCovariance" "item.dichot" "item.lookup"
[217] "item.sim" "kaiser" "keys.lookup" "keys2list"
[221] "keysort" "KMO" "kurtosi" "lavaan.diagram"
[225] "levels2numeric" "logistic" "logistic.grm" "logit"
[229] "lookup" "lookupFromKeys" "lookupItems" "lowerCor"
[233] "lowerMat" "lowerUpper" "lsat6" "lsat7"
[237] "m2t" "make.congeneric" "make.hierarchical" "make.irt.stats"
[241] "make.keys" "manhattan" "mardia" "mat.regress"
[245] "mat.sort" "matPlot" "matReg" "matSort"
[249] "mediate" "mediate.diagram" "minkowski" "mixed.cor"
[253] "mixedCor" "mlArrange" "mlPlot" "mlr"
[257] "moderate.diagram" "mssd" "multi.arrow" "multi.curved.arrow"
[261] "multi.hist" "multi.rect" "multi.self" "multilevel.reliability"
[265] "nchar2numeric" "nfactors" "omega" "omega.diagram"
[269] "omega.graph" "omegaDirect" "omegaFromSem" "omegah"
[273] "omegaSem" "outlier" "p.rep" "p.rep.f"
[277] "p.rep.r" "p.rep.t" "paired.r" "pairs.panels"
[281] "pairwiseCount" "pairwiseDescribe" "pairwiseImpute" "pairwisePlot"
[285] "pairwiseReport" "parcels" "partial.r" "pca"
[289] "phi" "phi.demo" "phi.list" "phi2poly"
[293] "phi2poly.matrix" "phi2tetra" "Pinv" "plot.irt"
[297] "plot.poly" "plot.poly.parallel" "plot.psych" "plot.residuals"
[301] "polar" "poly.mat" "polychoric" "polydi"
[305] "polyserial" "predict.psych" "principal" "print.psych"
[309] "Procrustes" "progressBar" "Promax" "psych"
[313] "psych.misc" "quickView" "r.con" "r.test"
[317] "r2c" "r2chi" "r2d" "r2t"
[321] "radar" "rangeCorrection" "reflect" "Reise"
[325] "rescale" "resid.psych" "residuals.psych" "response.frequencies"
[329] "responseFrequency" "reverse.code" "rmssd" "sat.act"
[333] "scaling.fits" "scatter.hist" "scatterHist" "schmid"
[337] "Schmid" "schmid.leiman" "score.alpha" "score.irt"
[341] "score.irt.2" "score.irt.poly" "score.items" "score.multiple.choice"
[345] "scoreFast" "scoreIrt" "scoreIrt.1pl" "scoreIrt.2pl"
[349] "scoreItems" "scoreOverlap" "scoreVeryFast" "scoreWtd"
[353] "scree" "scrub" "SD" "selectFromKeys"
[357] "sem.diagram" "sem.graph" "set.cor" "setCor"
[361] "setCor.diagram" "setCorLookup" "shannon" "sim"
[365] "sim.anova" "sim.bonds" "sim.circ" "sim.congeneric"
[369] "sim.correlation" "sim.dichot" "sim.general" "sim.hierarchical"
[373] "sim.irt" "sim.item" "sim.minor" "sim.multi"
[377] "sim.multilevel" "sim.npl" "sim.npn" "sim.omega"
[381] "sim.parallel" "sim.poly" "sim.poly.ideal" "sim.poly.ideal.npl"
[385] "sim.poly.ideal.npn" "sim.poly.mat" "sim.poly.npl" "sim.poly.npn"
[389] "sim.rasch" "sim.simplex" "sim.spherical" "sim.structural"
[393] "sim.structure" "sim.VSS" "simCor" "simulation.circ"
[397] "skew" "smc" "Spengler" "Spengler.stat"
[401] "spider" "splitHalf" "statsBy" "statsBy.boot"
[405] "statsBy.boot.summary" "structure.diagram" "structure.graph" "structure.list"
[409] "structure.sem" "summary.psych" "super.matrix" "superMatrix"
[413] "t2d" "t2r" "table2df" "table2matrix"
[417] "tableF" "Tal_Or" "Tal.Or" "target.rot"
[421] "TargetQ" "TargetT" "tenberge" "test.all"
[425] "test.irt" "test.psych" "testRetest" "tetrachoric"
[429] "thurstone" "Thurstone" "Thurstone.33" "Thurstone.9"
[433] "topBottom" "tr" "Tucker" "unidim"
[437] "varimin" "vgQ.bimin" "vgQ.targetQ" "vgQ.varimin"
[441] "violin" "violinBy" "vss" "VSS"
[445] "VSS.parallel" "VSS.plot" "VSS.scree" "VSS.sim"
[449] "VSS.simulate" "West" "winsor" "winsor.mean"
[453] "winsor.means" "winsor.sd" "winsor.var" "withinBetween"
[457] "wkappa" "Yule" "Yule.inv" "Yule2phi"
[461] "Yule2phi.matrix" "Yule2poly" "Yule2poly.matrix" "Yule2tetra"
[465] "YuleBonett" "YuleCor"
A minor comment on formatting in Rmarkdown: note that in the r setup chunk, if we specify knitr::opts_chunk$set(comment="", echo = TRUE), then we suppress the ## marks at the beginning of output. This is useful when trying to get text to copy.
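In the Rmd source this amounts to a first (setup) chunk along these lines (a sketch; the chunk label and include option are illustrative):
```{r setup, include=FALSE}
knitr::opts_chunk$set(comment = "", echo = TRUE)
```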