library(psych) #make it active
library(psychTools)
sessionInfo() #psych should be 2.4.4 or higher
## R version 4.4.0 (2024-04-24)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sonoma 14.4.1
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: America/Chicago
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] psychTools_2.4.4 psych_2.4.4
##
## loaded via a namespace (and not attached):
## [1] nlme_3.1-164 cli_3.6.2 knitr_1.46 rlang_1.1.3 xfun_0.43 jsonlite_1.8.8
## [7] rtf_0.4-14.1 htmltools_0.5.8.1 sass_0.4.9 rmarkdown_2.26 grid_4.4.0 evaluate_0.23
## [13] jquerylib_0.1.4 fastmap_1.1.1 yaml_2.3.8 lifecycle_1.0.4 compiler_4.4.0 rstudioapi_0.16.0
## [19] R.oo_1.26.0 lattice_0.22-6 digest_0.6.35 R6_2.5.1 foreign_0.8-86 mnormt_2.1.1
## [25] parallel_4.4.0 bslib_0.7.0 R.methodsS3_1.8.2 tools_4.4.0 cachem_1.0.8
packageDate("psych") #to see when created
## [1] "2024-05-03"
#should be "2024-05-03"
If the version is not at least 2.4.4 or the date is not 2024-05-03, you should update the package, restart R, and examine the news file.
# install.packages("psych",repos="https://personality-project.org/r",type="source")
news(Version > "2.4.3",package="psych") #to see changes
## Changes in version 2.4.4 (2024-04-03)
##
## Introduction
##
## o The psych package includes functions and data sets to do
## classic and modern psychometrics and to analyze personality and
## experimental psychological data sets. The psych package has
## been developed as a supplement to courses in research methods
## in psychology, personality research, and graduate level
## psychometric theory courses. The functions are a supplement to
## the text (in progress): An introduction to psychometric theory
## with applications in R. The functions are also written to
## further research in the Personality-Motivation-Cognition
## Laboratory at Northwestern University.
##
## o Additional functions are added sporadically.
##
## o This news file reports changes that have been made as the
## package has been developed.
##
## o To report bugs, send email to <mailto:revelle@northwestern.edu>
## using bug.report.
##
## o Version 2.4.4 is the development release of the psych package.
## It is available as a source file for Macs or PCs in the
## repository at <https://personality-project.org/r/>. The
## released version on CRAN is 2.4.3 (Added March 15, 2024). The
## second digit reflects the year (i.e., 2024), the third set the
## month (i.e., 2.3.12 was released in December of 2023 the last
## two digits of development versions reflect either an minor
## change or the day of any modifications, e.g. 1.8.3.3 was the
## third attempt to get 1.8.3 released. 1.7.8 was released in
## August, 2017.
##
## o To install this development version, use the command:
## install.packages("psych",repos="https://personality-project.org/r",type="source").
##
## Remember to restart R and library(psych) to make the new
## version active. It is also useful to check the date of the
## development package, as I will use the same version number but
## change the dates (e.g., packageDate("psych")). The date of the
## CRAN version reflects when it was submitted.
##
## Still to do
##
## o Change examples to dontrun rather than commented out ## mostly
## done
##
## o Add the ability to get factor scores for higher order factors
## (requested by Brandon Vaughn). Partly done in omega, but there
## are problems associated with the entire problem of factor
## scores.
##
## o Fix a bug in alpha when finding average R (need to use cor
## rather than cov2cor when treating missing data)
##
## o Add weight option to statsBy (but the question is how to handle
## missing data)
##
## o Adjust the se for correlations (in bestScales) to take into
## account the pairwise count, rather than the n.obs
##
## o Improve documentation for thurstone. Would help if we had an
## example of scaling.
##
## o scoreIrt.2pl needs to be able to adjust for group differences
## correctly
##
## o Possible problem in pooled correlations for statsBy
##
## o Think about improvements in itemLookup to show means if desired
##
## o Fix bug in polychoric where it doesn't handle empty cells very
## well. (reported by Björn Büdenbender). Or at least improve the
## documentation about why this is a problem.
##
## o Fix bug in mediate for multiple dependent variables (reported
## by Martin Zeschke)
##
## o scoreItem should find the mean item correlation with na.rm=TRUE
## (Liz D)
##
## o create a scoreFastBy
##
## o improve the help menu for sim.multi and return the latent
## values
##
## o think about linking person scores to item difficulty scores in
## IRT
##
## o Consider using the Jennrich approach to standard errors of
## rotated loadings
##
## o Still need to fix mediate to handle more than 1 DVs
##
## o Possible problem with the use of try in several functions.
##
## o Need to fix omega so that alpha is correctly calculated in case
## of no general factor
##
## o It would be nice to add an option to display confidence
## intervals of correlations in error.dots
##
## o allow omega style hierarchical structure for x (and y) in esem
##
## Additions
##
## o Finally added Left justification to dictionary content in
## lookupItems
##
## o Added the ability to show probability stars in corPlot.
##
## o Added a short function to convert r to p values.
##
## o Added the pval option to cor2
##
## o Added the score option to scoreOverlap so that if given raw
## data, it will find the scale scores as well as the corrected
## for overlap statistics. Note that the scale scores are not
## corrected for overlap.
##
## o Minor tweaks to scoreIrt.2pl to check for bad data (missing
## cells ). This was leading to the function hanging forever in
## the case of pairwise counts of 0.
##
## o Added fix.dlpyr (as a local function) to statsBy and describeBy
## to not choke on Tibbles. Had added this to alpha several years
## ago.
##
## o Added itemSort to combine item means and item content and then
## sort them.
##
## o Added the median option to describe with the fast=TRUE option
## (requested by David Condon)
##
## o Parallelized describe. Now will describe 644K subjects with
## 6633 variables in 64 seconds (with fast=TRUE option).
## Improvement in speed varies by numbers of cores.
##
## o Parallelized bigCor for very significant savings.
##
## o Added the hist.border option to pairs.panels following a
## request by Jordan Adamson.
##
## o Added the w.exp (weighting exponent to cohen.kappa following a
## request from Julius Pfadt)
##
## o Added n.obs to unidim to properly find statistics from
## correlation matrices
##
## o Added a number of fit statistics to unidim
##
## o Added the ability to find Spearman correlations to bigCor.
## Added a check that the data are a data.frame rather than a
## data.table for bigCor.
##
## o Changed the class of describe output to be c("psych",
## "Pdescribe" ,"data.frame") to avoid choking if Hmisc is in use
##
## o Added the ability to use mean.weights in statsBy (still under
## testing)
##
## o Added paSelect to allow parallel analyses of subsets of items
## using a keys list.
##
## o Added vssSelect to allow VSS analyses of subsets of items using
## a kes list.
##
## Bugs fixed
##
## o changed the number of elements in the unit.Circle in
## spider/radar to get more precision
##
## o Added a conversion to numeric in mediate to get around a
## problem with variables that are factors. (Problem reported by
## Lynn Bueeler )
##
## o Finally fixed bug in alpha where it finds R from cov2cor
## instead of directly (reported by Karl Ove Hufthammer )
##
## o Fixed item.lookup so that it will report item means as well as
## factor loadings.
##
## o Strange bug introduced into score.multiple.choice by qgraph
## reported by Michael Truong. Fixed by changing class of
## describe object used in score.multiple.choice to just be a
## data.frame.
One of the more exciting changes in the way that we conduct psychological research is the introduction of intensive longitudinal modeling using Experience Sampling Methodology (ESM), which allows us to collect state data over time. Although originally done with various diary procedures, it is now typically done with smartphones and apps. At the same time, the procedures for evaluating longitudinal data have undergone a computational revolution as multilevel modeling approaches have been introduced.
Here we go through the R procedures for doing this analysis.
See the accompanying slides at https://personality-project.org/courses/350/350.dynamics.pdf and the accompanying article, "Analyzing dynamic data: A tutorial".
It is convenient, when testing new procedures, to simulate data. We do this using the sim.multi function.
We set the random seed to a specific number in order to produce replicable results. If the seed is not specified, a different pseudo random sequence will be generated for each simulation.
The simulation is set to create data for four subjects, with two measures of each of two factors (V1 and V2 for the first factor, V3 and V4 for the second).
Items are assumed to have loadings of .6 on one factor and 0 on the other. The factor intercorrelation at the overall level varies randomly around 0, but differs for each subject, with within-person factor intercorrelations of -.7, 0, 0, and .7, respectively.
Although normally we would not round the results of the simulation, we do so here to create the Fat and Wide data tables shown below. In addition, the normal output of sim.multi is already in wide form. We convert it to Fat form and then back again as demonstrations of the reshape and dfOrder functions.
As with all code, try to find the three sections: preliminaries and defaults, processing, summary.
library(psych)
sim.multi
## function (n.obs = 4, nvar = 2, nfact = 2, ntrials = 96, days = 16,
## mu = 0, sigma = 1, fact = NULL, loading = 0.9, phi = 0, phi.i = NULL,
## beta.i = 0, mu.i = 0, sigma.i = 1, sin.i = 0, cos.i = 0,
## AR1 = 0, f.i = NULL, plot = TRUE)
## {
## if (missing(n.obs))
## n.obs = 4
## X <- list()
## Xjk <- matrix(NA, ncol = nvar + nfact, nrow = ntrials)
## if (missing(mu))
## mu <- rep(0, nvar)
## if (missing(sigma))
## sigma <- rep(1, nvar)
## if (missing(beta.i)) {
## beta.i <- matrix(0, ncol = nvar, nrow = n.obs)
## }
## else {
## if (length(beta.i) < n.obs) {
## beta.i <- matrix(beta.i, ncol = nvar, nrow = n.obs,
## byrow = TRUE)
## }
## }
## if (missing(mu.i))
## mu.i <- matrix(0, ncol = nvar, nrow = n.obs)
## if (missing(sigma.i)) {
## sigma.i <- matrix(1, ncol = nvar, nrow = n.obs)
## }
## else {
## if (length(sigma.i) < n.obs) {
## sigma.i <- matrix(sigma.i, ncol = nvar, nrow = n.obs,
## byrow = TRUE)
## }
## }
## if (missing(sin.i)) {
## sin.i <- matrix(0, ncol = nvar, nrow = n.obs)
## }
## else {
## if (length(sin.i) < n.obs) {
## sin.i <- matrix(sin.i, ncol = nvar, nrow = n.obs,
## byrow = TRUE)
## }
## }
## if (missing(cos.i)) {
## cos.i <- matrix(0, ncol = nvar, nrow = n.obs)
## }
## else {
## if (length(cos.i) < n.obs) {
## cos.i <- matrix(cos.i, ncol = nvar, nrow = n.obs,
## byrow = TRUE)
## }
## }
## if (missing(AR1)) {
## AR1 <- matrix(0, ncol = nvar, nrow = n.obs)
## }
## else {
## if (length(AR1) < n.obs) {
## AR1 <- matrix(AR1, ncol = nfact, nrow = n.obs, byrow = TRUE)
## }
## }
## if (is.null(phi)) {
## phi <- diag(1, nfact)
## }
## else {
## phi <- matrix(phi, ncol = nfact, nrow = nfact)
## diag(phi) <- 1
## }
## if (!is.null(phi.i)) {
## if (length(phi.i) < n.obs) {
## phi.i <- rep(phi.i, n.obs/length(phi.i))
## }
## }
## if (nfact > 1) {
## if (is.null(fact)) {
## fact <- matrix(0, nrow = nvar, ncol = nfact)
## for (j in 1:nfact) {
## fact[((j - 1) * nvar/nfact + 1):(j * nvar/nfact),
## j] <- loading
## }
## fact <- (fact %*% phi)
## }
## }
## else {
## fact <- matrix(loading, ncol = nfact, nrow = nvar)
## }
## if (is.null(f.i)) {
## f.i <- list()
## for (i in 1:n.obs) {
## f.i[[i]] <- fact
## }
## }
## trials.day <- ntrials/days
## hours <- 24/trials.day
## time <- seq(hours, days * trials.day * hours, hours)
## t.radian <- time * pi/12
## for (i in 1:n.obs) {
## xij <- rnorm((nvar + nfact), mu, sigma)
## for (j in 1:nfact) {
## error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## lagerror <- c(0, error[1:(ntrials - 1)])
## Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials +
## sin(t.radian) * sin.i[i, j] + cos(t.radian) *
## cos.i[i, j] + error + AR1[i, j] * lagerror
## }
## if (is.null(phi.i)) {
## phi.ind <- diag(1, nfact)
## }
## else {
## phi.ind <- matrix(phi.i[i], nfact, nfact)
## diag(phi.ind) <- 1
## }
## Xjk[, 1:nfact] <- Xjk[, 1:nfact] %*% phi.ind
## for (k in 1:nvar) {
## uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## uniq.err <- rnorm(ntrials, 0, uniq)
## score <- 0
## for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## Xjk[, nfact + k] <- score + uniq.err
## }
## X[[i]] <- Xjk
## }
## DV <- unlist(X)
## dv.a <- array(DV, dim = c(ntrials, nvar + nfact, n.obs))
## dv.m <- NULL
## for (i in 1:(nvar + nfact)) {
## dv.m <- cbind(dv.m, as.vector(dv.a[, i, ]))
## }
## dv.df <- data.frame(dv.m, time = rep(time, n.obs), id = rep(1:n.obs,
## each = ntrials))
## colnames(dv.df)[1:(nfact + nvar)] <- c(paste0("F", 1:nfact),
## paste0("V", 1:nvar))
## if (plot) {
## IV <- NULL
## vars <- c(paste0("F", 1:nfact), paste0("V", 1:nvar))
## select <- rep(NA, nvar * ntrials * n.obs)
## kount <- ntrials * nfact
## for (i in 1:n.obs) {
## select[(1:(ntrials * nvar) + (i - 1) * ntrials *
## nvar)] <- kount + 1:(ntrials * nvar)
## kount <- kount + ntrials * (nvar + nfact)
## }
## vars <- paste0("V", 1:nvar)
## X.df <- data.frame(DV = DV[select], time = rep(time,
## (n.obs * (nvar))), id = rep(1:n.obs, each = (ntrials *
## (nvar))), IV = rep(rep(vars, each = ntrials), n.obs))
## plot1 <- xyplot(DV ~ time | id, group = IV, data = X.df,
## type = "b", as.table = TRUE, strip = strip.custom(strip.names = TRUE,
## strip.levels = TRUE), col = c("blue", "red",
## "black", "grey"))
## print(plot1)
## }
## invisible(dv.df)
## }
## <bytecode: 0x1146f5ab8>
## <environment: namespace:psych>
library(psych) #activate the psych package
#create the data
set.seed(42)
x <- sim.multi(n.obs=4,nvar=4,nfact=2,days=6,ntrials=6,plot=TRUE,
phi.i=c(-.7,0,0,.7),loading=.6)
debugonce(sim.multi)
set.seed(42)
x <- sim.multi(n.obs=4,nvar=4,nfact=2,days=6,ntrials=6,plot=TRUE,
phi.i=c(-.7,0,0,.7),loading=.6)
## debugging in: sim.multi(n.obs = 4, nvar = 4, nfact = 2, days = 6, ntrials = 6,
## plot = TRUE, phi.i = c(-0.7, 0, 0, 0.7), loading = 0.6)
## debug: {
## if (missing(n.obs))
## n.obs = 4
## X <- list()
## Xjk <- matrix(NA, ncol = nvar + nfact, nrow = ntrials)
## if (missing(mu))
## mu <- rep(0, nvar)
## if (missing(sigma))
## sigma <- rep(1, nvar)
## if (missing(beta.i)) {
## beta.i <- matrix(0, ncol = nvar, nrow = n.obs)
## }
## else {
## if (length(beta.i) < n.obs) {
## beta.i <- matrix(beta.i, ncol = nvar, nrow = n.obs,
## byrow = TRUE)
## }
## }
## if (missing(mu.i))
## mu.i <- matrix(0, ncol = nvar, nrow = n.obs)
## if (missing(sigma.i)) {
## sigma.i <- matrix(1, ncol = nvar, nrow = n.obs)
## }
## else {
## if (length(sigma.i) < n.obs) {
## sigma.i <- matrix(sigma.i, ncol = nvar, nrow = n.obs,
## byrow = TRUE)
## }
## }
## if (missing(sin.i)) {
## sin.i <- matrix(0, ncol = nvar, nrow = n.obs)
## }
## else {
## if (length(sin.i) < n.obs) {
## sin.i <- matrix(sin.i, ncol = nvar, nrow = n.obs,
## byrow = TRUE)
## }
## }
## if (missing(cos.i)) {
## cos.i <- matrix(0, ncol = nvar, nrow = n.obs)
## }
## else {
## if (length(cos.i) < n.obs) {
## cos.i <- matrix(cos.i, ncol = nvar, nrow = n.obs,
## byrow = TRUE)
## }
## }
## if (missing(AR1)) {
## AR1 <- matrix(0, ncol = nvar, nrow = n.obs)
## }
## else {
## if (length(AR1) < n.obs) {
## AR1 <- matrix(AR1, ncol = nfact, nrow = n.obs, byrow = TRUE)
## }
## }
## if (is.null(phi)) {
## phi <- diag(1, nfact)
## }
## else {
## phi <- matrix(phi, ncol = nfact, nrow = nfact)
## diag(phi) <- 1
## }
## if (!is.null(phi.i)) {
## if (length(phi.i) < n.obs) {
## phi.i <- rep(phi.i, n.obs/length(phi.i))
## }
## }
## if (nfact > 1) {
## if (is.null(fact)) {
## fact <- matrix(0, nrow = nvar, ncol = nfact)
## for (j in 1:nfact) {
## fact[((j - 1) * nvar/nfact + 1):(j * nvar/nfact),
## j] <- loading
## }
## fact <- (fact %*% phi)
## }
## }
## else {
## fact <- matrix(loading, ncol = nfact, nrow = nvar)
## }
## if (is.null(f.i)) {
## f.i <- list()
## for (i in 1:n.obs) {
## f.i[[i]] <- fact
## }
## }
## trials.day <- ntrials/days
## hours <- 24/trials.day
## time <- seq(hours, days * trials.day * hours, hours)
## t.radian <- time * pi/12
## for (i in 1:n.obs) {
## xij <- rnorm((nvar + nfact), mu, sigma)
## for (j in 1:nfact) {
## error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## lagerror <- c(0, error[1:(ntrials - 1)])
## Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials +
## sin(t.radian) * sin.i[i, j] + cos(t.radian) *
## cos.i[i, j] + error + AR1[i, j] * lagerror
## }
## if (is.null(phi.i)) {
## phi.ind <- diag(1, nfact)
## }
## else {
## phi.ind <- matrix(phi.i[i], nfact, nfact)
## diag(phi.ind) <- 1
## }
## Xjk[, 1:nfact] <- Xjk[, 1:nfact] %*% phi.ind
## for (k in 1:nvar) {
## uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## uniq.err <- rnorm(ntrials, 0, uniq)
## score <- 0
## for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## Xjk[, nfact + k] <- score + uniq.err
## }
## X[[i]] <- Xjk
## }
## DV <- unlist(X)
## dv.a <- array(DV, dim = c(ntrials, nvar + nfact, n.obs))
## dv.m <- NULL
## for (i in 1:(nvar + nfact)) {
## dv.m <- cbind(dv.m, as.vector(dv.a[, i, ]))
## }
## dv.df <- data.frame(dv.m, time = rep(time, n.obs), id = rep(1:n.obs,
## each = ntrials))
## colnames(dv.df)[1:(nfact + nvar)] <- c(paste0("F", 1:nfact),
## paste0("V", 1:nvar))
## if (plot) {
## IV <- NULL
## vars <- c(paste0("F", 1:nfact), paste0("V", 1:nvar))
## select <- rep(NA, nvar * ntrials * n.obs)
## kount <- ntrials * nfact
## for (i in 1:n.obs) {
## select[(1:(ntrials * nvar) + (i - 1) * ntrials *
## nvar)] <- kount + 1:(ntrials * nvar)
## kount <- kount + ntrials * (nvar + nfact)
## }
## vars <- paste0("V", 1:nvar)
## X.df <- data.frame(DV = DV[select], time = rep(time,
## (n.obs * (nvar))), id = rep(1:n.obs, each = (ntrials *
## (nvar))), IV = rep(rep(vars, each = ntrials), n.obs))
## plot1 <- xyplot(DV ~ time | id, group = IV, data = X.df,
## type = "b", as.table = TRUE, strip = strip.custom(strip.names = TRUE,
## strip.levels = TRUE), col = c("blue", "red",
## "black", "grey"))
## print(plot1)
## }
## invisible(dv.df)
## }
## debug: if (missing(n.obs)) n.obs = 4
## debug: X <- list()
## debug: Xjk <- matrix(NA, ncol = nvar + nfact, nrow = ntrials)
## debug: if (missing(mu)) mu <- rep(0, nvar)
## debug: mu <- rep(0, nvar)
## debug: if (missing(sigma)) sigma <- rep(1, nvar)
## debug: sigma <- rep(1, nvar)
## debug: if (missing(beta.i)) {
## beta.i <- matrix(0, ncol = nvar, nrow = n.obs)
## } else {
## if (length(beta.i) < n.obs) {
## beta.i <- matrix(beta.i, ncol = nvar, nrow = n.obs, byrow = TRUE)
## }
## }
## debug: beta.i <- matrix(0, ncol = nvar, nrow = n.obs)
## debug: if (missing(mu.i)) mu.i <- matrix(0, ncol = nvar, nrow = n.obs)
## debug: mu.i <- matrix(0, ncol = nvar, nrow = n.obs)
## debug: if (missing(sigma.i)) {
## sigma.i <- matrix(1, ncol = nvar, nrow = n.obs)
## } else {
## if (length(sigma.i) < n.obs) {
## sigma.i <- matrix(sigma.i, ncol = nvar, nrow = n.obs,
## byrow = TRUE)
## }
## }
## debug: sigma.i <- matrix(1, ncol = nvar, nrow = n.obs)
## debug: if (missing(sin.i)) {
## sin.i <- matrix(0, ncol = nvar, nrow = n.obs)
## } else {
## if (length(sin.i) < n.obs) {
## sin.i <- matrix(sin.i, ncol = nvar, nrow = n.obs, byrow = TRUE)
## }
## }
## debug: sin.i <- matrix(0, ncol = nvar, nrow = n.obs)
## debug: if (missing(cos.i)) {
## cos.i <- matrix(0, ncol = nvar, nrow = n.obs)
## } else {
## if (length(cos.i) < n.obs) {
## cos.i <- matrix(cos.i, ncol = nvar, nrow = n.obs, byrow = TRUE)
## }
## }
## debug: cos.i <- matrix(0, ncol = nvar, nrow = n.obs)
## debug: if (missing(AR1)) {
## AR1 <- matrix(0, ncol = nvar, nrow = n.obs)
## } else {
## if (length(AR1) < n.obs) {
## AR1 <- matrix(AR1, ncol = nfact, nrow = n.obs, byrow = TRUE)
## }
## }
## debug: AR1 <- matrix(0, ncol = nvar, nrow = n.obs)
## debug: if (is.null(phi)) {
## phi <- diag(1, nfact)
## } else {
## phi <- matrix(phi, ncol = nfact, nrow = nfact)
## diag(phi) <- 1
## }
## debug: phi <- matrix(phi, ncol = nfact, nrow = nfact)
## debug: diag(phi) <- 1
## debug: if (!is.null(phi.i)) {
## if (length(phi.i) < n.obs) {
## phi.i <- rep(phi.i, n.obs/length(phi.i))
## }
## }
## debug: if (length(phi.i) < n.obs) {
## phi.i <- rep(phi.i, n.obs/length(phi.i))
## }
## debug: if (nfact > 1) {
## if (is.null(fact)) {
## fact <- matrix(0, nrow = nvar, ncol = nfact)
## for (j in 1:nfact) {
## fact[((j - 1) * nvar/nfact + 1):(j * nvar/nfact),
## j] <- loading
## }
## fact <- (fact %*% phi)
## }
## } else {
## fact <- matrix(loading, ncol = nfact, nrow = nvar)
## }
## debug: if (is.null(fact)) {
## fact <- matrix(0, nrow = nvar, ncol = nfact)
## for (j in 1:nfact) {
## fact[((j - 1) * nvar/nfact + 1):(j * nvar/nfact), j] <- loading
## }
## fact <- (fact %*% phi)
## }
## debug: fact <- matrix(0, nrow = nvar, ncol = nfact)
## debug: for (j in 1:nfact) {
## fact[((j - 1) * nvar/nfact + 1):(j * nvar/nfact), j] <- loading
## }
## debug: fact[((j - 1) * nvar/nfact + 1):(j * nvar/nfact), j] <- loading
## debug: fact[((j - 1) * nvar/nfact + 1):(j * nvar/nfact), j] <- loading
## debug: fact <- (fact %*% phi)
## debug: if (is.null(f.i)) {
## f.i <- list()
## for (i in 1:n.obs) {
## f.i[[i]] <- fact
## }
## }
## debug: f.i <- list()
## debug: for (i in 1:n.obs) {
## f.i[[i]] <- fact
## }
## debug: f.i[[i]] <- fact
## debug: f.i[[i]] <- fact
## debug: f.i[[i]] <- fact
## debug: f.i[[i]] <- fact
## debug: trials.day <- ntrials/days
## debug: hours <- 24/trials.day
## debug: time <- seq(hours, days * trials.day * hours, hours)
## debug: t.radian <- time * pi/12
## debug: for (i in 1:n.obs) {
## xij <- rnorm((nvar + nfact), mu, sigma)
## for (j in 1:nfact) {
## error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## lagerror <- c(0, error[1:(ntrials - 1)])
## Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials +
## sin(t.radian) * sin.i[i, j] + cos(t.radian) * cos.i[i,
## j] + error + AR1[i, j] * lagerror
## }
## if (is.null(phi.i)) {
## phi.ind <- diag(1, nfact)
## }
## else {
## phi.ind <- matrix(phi.i[i], nfact, nfact)
## diag(phi.ind) <- 1
## }
## Xjk[, 1:nfact] <- Xjk[, 1:nfact] %*% phi.ind
## for (k in 1:nvar) {
## uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## uniq.err <- rnorm(ntrials, 0, uniq)
## score <- 0
## for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## Xjk[, nfact + k] <- score + uniq.err
## }
## X[[i]] <- Xjk
## }
## debug: xij <- rnorm((nvar + nfact), mu, sigma)
## debug: for (j in 1:nfact) {
## error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## lagerror <- c(0, error[1:(ntrials - 1)])
## Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials +
## sin(t.radian) * sin.i[i, j] + cos(t.radian) * cos.i[i,
## j] + error + AR1[i, j] * lagerror
## }
## debug: error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## debug: lagerror <- c(0, error[1:(ntrials - 1)])
## debug: Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials + sin(t.radian) *
## sin.i[i, j] + cos(t.radian) * cos.i[i, j] + error + AR1[i,
## j] * lagerror
## debug: error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## debug: lagerror <- c(0, error[1:(ntrials - 1)])
## debug: Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials + sin(t.radian) *
## sin.i[i, j] + cos(t.radian) * cos.i[i, j] + error + AR1[i,
## j] * lagerror
## debug: if (is.null(phi.i)) {
## phi.ind <- diag(1, nfact)
## } else {
## phi.ind <- matrix(phi.i[i], nfact, nfact)
## diag(phi.ind) <- 1
## }
## debug: phi.ind <- matrix(phi.i[i], nfact, nfact)
## debug: diag(phi.ind) <- 1
## debug: Xjk[, 1:nfact] <- Xjk[, 1:nfact] %*% phi.ind
## debug: for (k in 1:nvar) {
## uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## uniq.err <- rnorm(ntrials, 0, uniq)
## score <- 0
## for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## Xjk[, nfact + k] <- score + uniq.err
## }
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: X[[i]] <- Xjk
## debug: xij <- rnorm((nvar + nfact), mu, sigma)
## debug: for (j in 1:nfact) {
## error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## lagerror <- c(0, error[1:(ntrials - 1)])
## Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials +
## sin(t.radian) * sin.i[i, j] + cos(t.radian) * cos.i[i,
## j] + error + AR1[i, j] * lagerror
## }
## debug: error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## debug: lagerror <- c(0, error[1:(ntrials - 1)])
## debug: Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials + sin(t.radian) *
## sin.i[i, j] + cos(t.radian) * cos.i[i, j] + error + AR1[i,
## j] * lagerror
## debug: error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## debug: lagerror <- c(0, error[1:(ntrials - 1)])
## debug: Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials + sin(t.radian) *
## sin.i[i, j] + cos(t.radian) * cos.i[i, j] + error + AR1[i,
## j] * lagerror
## debug: if (is.null(phi.i)) {
## phi.ind <- diag(1, nfact)
## } else {
## phi.ind <- matrix(phi.i[i], nfact, nfact)
## diag(phi.ind) <- 1
## }
## debug: phi.ind <- matrix(phi.i[i], nfact, nfact)
## debug: diag(phi.ind) <- 1
## debug: Xjk[, 1:nfact] <- Xjk[, 1:nfact] %*% phi.ind
## debug: for (k in 1:nvar) {
## uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## uniq.err <- rnorm(ntrials, 0, uniq)
## score <- 0
## for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## Xjk[, nfact + k] <- score + uniq.err
## }
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: X[[i]] <- Xjk
## debug: xij <- rnorm((nvar + nfact), mu, sigma)
## debug: for (j in 1:nfact) {
## error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## lagerror <- c(0, error[1:(ntrials - 1)])
## Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials +
## sin(t.radian) * sin.i[i, j] + cos(t.radian) * cos.i[i,
## j] + error + AR1[i, j] * lagerror
## }
## debug: error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## debug: lagerror <- c(0, error[1:(ntrials - 1)])
## debug: Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials + sin(t.radian) *
## sin.i[i, j] + cos(t.radian) * cos.i[i, j] + error + AR1[i,
## j] * lagerror
## debug: error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## debug: lagerror <- c(0, error[1:(ntrials - 1)])
## debug: Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials + sin(t.radian) *
## sin.i[i, j] + cos(t.radian) * cos.i[i, j] + error + AR1[i,
## j] * lagerror
## debug: if (is.null(phi.i)) {
## phi.ind <- diag(1, nfact)
## } else {
## phi.ind <- matrix(phi.i[i], nfact, nfact)
## diag(phi.ind) <- 1
## }
## debug: phi.ind <- matrix(phi.i[i], nfact, nfact)
## debug: diag(phi.ind) <- 1
## debug: Xjk[, 1:nfact] <- Xjk[, 1:nfact] %*% phi.ind
## debug: for (k in 1:nvar) {
## uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## uniq.err <- rnorm(ntrials, 0, uniq)
## score <- 0
## for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## Xjk[, nfact + k] <- score + uniq.err
## }
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: X[[i]] <- Xjk
## debug: xij <- rnorm((nvar + nfact), mu, sigma)
## debug: for (j in 1:nfact) {
## error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## lagerror <- c(0, error[1:(ntrials - 1)])
## Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials +
## sin(t.radian) * sin.i[i, j] + cos(t.radian) * cos.i[i,
## j] + error + AR1[i, j] * lagerror
## }
## debug: error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## debug: lagerror <- c(0, error[1:(ntrials - 1)])
## debug: Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials + sin(t.radian) *
## sin.i[i, j] + cos(t.radian) * cos.i[i, j] + error + AR1[i,
## j] * lagerror
## debug: error <- rnorm(ntrials, mu.i[i, j], sigma.i[i, j])
## debug: lagerror <- c(0, error[1:(ntrials - 1)])
## debug: Xjk[, j] <- xij[j] + mu[j] + beta.i[i, j] * time/ntrials + sin(t.radian) *
## sin.i[i, j] + cos(t.radian) * cos.i[i, j] + error + AR1[i,
## j] * lagerror
## debug: if (is.null(phi.i)) {
## phi.ind <- diag(1, nfact)
## } else {
## phi.ind <- matrix(phi.i[i], nfact, nfact)
## diag(phi.ind) <- 1
## }
## debug: phi.ind <- matrix(phi.i[i], nfact, nfact)
## debug: diag(phi.ind) <- 1
## debug: Xjk[, 1:nfact] <- Xjk[, 1:nfact] %*% phi.ind
## debug: for (k in 1:nvar) {
## uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## uniq.err <- rnorm(ntrials, 0, uniq)
## score <- 0
## for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## Xjk[, nfact + k] <- score + uniq.err
## }
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: uniq <- sqrt(1 - sum(f.i[[i]][k, ]^2))
## debug: uniq.err <- rnorm(ntrials, 0, uniq)
## debug: score <- 0
## debug: for (j in 1:nfact) {
## score <- score + Xjk[, j] * f.i[[i]][k, j]
## }
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: score <- score + Xjk[, j] * f.i[[i]][k, j]
## debug: Xjk[, nfact + k] <- score + uniq.err
## debug: X[[i]] <- Xjk
## debug: DV <- unlist(X)
## debug: dv.a <- array(DV, dim = c(ntrials, nvar + nfact, n.obs))
## debug: dv.m <- NULL
## debug: for (i in 1:(nvar + nfact)) {
## dv.m <- cbind(dv.m, as.vector(dv.a[, i, ]))
## }
## debug: dv.m <- cbind(dv.m, as.vector(dv.a[, i, ]))
## debug: dv.m <- cbind(dv.m, as.vector(dv.a[, i, ]))
## debug: dv.m <- cbind(dv.m, as.vector(dv.a[, i, ]))
## debug: dv.m <- cbind(dv.m, as.vector(dv.a[, i, ]))
## debug: dv.m <- cbind(dv.m, as.vector(dv.a[, i, ]))
## debug: dv.m <- cbind(dv.m, as.vector(dv.a[, i, ]))
## debug: dv.df <- data.frame(dv.m, time = rep(time, n.obs), id = rep(1:n.obs,
## each = ntrials))
## debug: colnames(dv.df)[1:(nfact + nvar)] <- c(paste0("F", 1:nfact),
## paste0("V", 1:nvar))
## debug: if (plot) {
## IV <- NULL
## vars <- c(paste0("F", 1:nfact), paste0("V", 1:nvar))
## select <- rep(NA, nvar * ntrials * n.obs)
## kount <- ntrials * nfact
## for (i in 1:n.obs) {
## select[(1:(ntrials * nvar) + (i - 1) * ntrials * nvar)] <- kount +
## 1:(ntrials * nvar)
## kount <- kount + ntrials * (nvar + nfact)
## }
## vars <- paste0("V", 1:nvar)
## X.df <- data.frame(DV = DV[select], time = rep(time, (n.obs *
## (nvar))), id = rep(1:n.obs, each = (ntrials * (nvar))),
## IV = rep(rep(vars, each = ntrials), n.obs))
## plot1 <- xyplot(DV ~ time | id, group = IV, data = X.df,
## type = "b", as.table = TRUE, strip = strip.custom(strip.names = TRUE,
## strip.levels = TRUE), col = c("blue", "red", "black",
## "grey"))
## print(plot1)
## }
## debug: IV <- NULL
## debug: vars <- c(paste0("F", 1:nfact), paste0("V", 1:nvar))
## debug: select <- rep(NA, nvar * ntrials * n.obs)
## debug: kount <- ntrials * nfact
## debug: for (i in 1:n.obs) {
## select[(1:(ntrials * nvar) + (i - 1) * ntrials * nvar)] <- kount +
## 1:(ntrials * nvar)
## kount <- kount + ntrials * (nvar + nfact)
## }
## debug: select[(1:(ntrials * nvar) + (i - 1) * ntrials * nvar)] <- kount +
## 1:(ntrials * nvar)
## debug: kount <- kount + ntrials * (nvar + nfact)
## debug: select[(1:(ntrials * nvar) + (i - 1) * ntrials * nvar)] <- kount +
## 1:(ntrials * nvar)
## debug: kount <- kount + ntrials * (nvar + nfact)
## debug: select[(1:(ntrials * nvar) + (i - 1) * ntrials * nvar)] <- kount +
## 1:(ntrials * nvar)
## debug: kount <- kount + ntrials * (nvar + nfact)
## debug: select[(1:(ntrials * nvar) + (i - 1) * ntrials * nvar)] <- kount +
## 1:(ntrials * nvar)
## debug: kount <- kount + ntrials * (nvar + nfact)
## debug: vars <- paste0("V", 1:nvar)
## debug: X.df <- data.frame(DV = DV[select], time = rep(time, (n.obs *
## (nvar))), id = rep(1:n.obs, each = (ntrials * (nvar))), IV = rep(rep(vars,
## each = ntrials), n.obs))
## debug: plot1 <- xyplot(DV ~ time | id, group = IV, data = X.df, type = "b",
## as.table = TRUE, strip = strip.custom(strip.names = TRUE,
## strip.levels = TRUE), col = c("blue", "red", "black",
## "grey"))
## debug: print(plot1)
## debug: invisible(dv.df)
## exiting from: sim.multi(n.obs = 4, nvar = 4, nfact = 2, days = 6, ntrials = 6,
## plot = TRUE, phi.i = c(-0.7, 0, 0, 0.7), loading = 0.6)
set.seed(42)
x <- sim.multi(n.obs=4,nvar=4,nfact=2,days=6,ntrials=6,plot=TRUE,
phi.i=c(-.7,0,0,.7),loading=.6)
raw <- round(x[3:8])  #drop the latent factor scores (F1, F2); keep V1-V4, time, and id
raw[1:4] <- raw[1:4] + 6  #add a constant so the rounded item scores are positive
#make a 'Fat' version
XFat <- reshape(raw,idvar="id",timevar="time",times=1:4,direction="wide")
#show it
XFat
## id V1.24 V2.24 V3.24 V4.24 V1.48 V2.48 V3.48 V4.48 V1.72 V2.72 V3.72 V4.72 V1.96 V2.96 V3.96 V4.96 V1.120 V2.120
## 1 1 7 10 4 3 8 7 6 4 8 8 5 2 5 5 5 6 8 8
## 7 2 6 6 6 5 7 8 5 5 7 7 6 7 7 7 6 6 7 7
## 13 3 5 6 4 4 6 5 5 4 5 6 6 7 6 6 9 6 4 4
## 19 4 4 5 4 4 6 4 4 5 5 7 5 6 3 4 5 5 5 4
## V3.120 V4.120 V1.144 V2.144 V3.144 V4.144
## 1 5 5 11 9 1 2
## 7 4 4 8 7 6 6
## 13 6 7 5 4 7 7
## 19 4 3 5 4 4 5
#now make it wide
XWide <- reshape(XFat,idvar="id",varying=2:25,direction="long")
Xwide <- dfOrder(XWide,"id")
Xwide #show the data in typical wide format
## id time V1 V2 V3 V4
## 1.24 1 24 7 10 4 3
## 1.48 1 48 8 7 6 4
## 1.72 1 72 8 8 5 2
## 1.96 1 96 5 5 5 6
## 1.120 1 120 8 8 5 5
## 1.144 1 144 11 9 1 2
## 2.24 2 24 6 6 6 5
## 2.48 2 48 7 8 5 5
## 2.72 2 72 7 7 6 7
## 2.96 2 96 7 7 6 6
## 2.120 2 120 7 7 4 4
## 2.144 2 144 8 7 6 6
## 3.24 3 24 5 6 4 4
## 3.48 3 48 6 5 5 4
## 3.72 3 72 5 6 6 7
## 3.96 3 96 6 6 9 6
## 3.120 3 120 4 4 6 7
## 3.144 3 144 5 4 7 7
## 4.24 4 24 4 5 4 4
## 4.48 4 48 6 4 4 5
## 4.72 4 72 5 7 5 6
## 4.96 4 96 3 4 5 5
## 4.120 4 120 5 4 4 3
## 4.144 4 144 5 4 4 5
#note that there are 4 subjects, with 6 observations per subject on each of 4 variables
#add in the trait information
traits <- data.frame(id = 1:4,extraversion =c(5,10,15,20),
neuroticism =c(10,5, 15,10))
traits #this would be some data collected separately
## id extraversion neuroticism
## 1 1 5 10
## 2 2 10 5
## 3 3 15 15
## 4 4 20 10
Xwide.traits <- merge(Xwide,traits, by ="id") #this merges two data.frames using the id
Xwide.traits #show the merged data
## id time V1 V2 V3 V4 extraversion neuroticism
## 1 1 24 7 10 4 3 5 10
## 2 1 48 8 7 6 4 5 10
## 3 1 72 8 8 5 2 5 10
## 4 1 96 5 5 5 6 5 10
## 5 1 120 8 8 5 5 5 10
## 6 1 144 11 9 1 2 5 10
## 7 2 24 6 6 6 5 10 5
## 8 2 48 7 8 5 5 10 5
## 9 2 72 7 7 6 7 10 5
## 10 2 96 7 7 6 6 10 5
## 11 2 120 7 7 4 4 10 5
## 12 2 144 8 7 6 6 10 5
## 13 3 24 5 6 4 4 15 15
## 14 3 48 6 5 5 4 15 15
## 15 3 72 5 6 6 7 15 15
## 16 3 96 6 6 9 6 15 15
## 17 3 120 4 4 6 7 15 15
## 18 3 144 5 4 7 7 15 15
## 19 4 24 4 5 4 4 20 10
## 20 4 48 6 4 4 5 20 10
## 21 4 72 5 7 5 6 20 10
## 22 4 96 3 4 5 5 20 10
## 23 4 120 5 4 4 3 20 10
## 24 4 144 5 4 4 5 20 10
Basic descriptive statistics may be found by using the appropriate functions in the psych package. We find overall descriptives (describe), descriptives by group (describeBy), and then a variety of multilevel statistics, including the pooled within-person correlations, the between-person correlations, and the separate within-person correlations for each individual, using the statsBy function. We plot the data using the mlPlot function.
help(describe) #get help on the input commands for the describe function.
#first the overall descriptives
describe(Xwide.traits)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## id 1 24 2.50 1.14 2.5 2.50 1.48 1 4 3 0.00 -1.49 0.23
## time 2 24 84.00 41.87 84.0 84.00 53.37 24 144 120 0.00 -1.41 8.55
## V1 3 24 6.17 1.74 6.0 6.10 1.48 3 11 8 0.61 0.44 0.35
## V2 4 24 6.17 1.74 6.0 6.05 1.48 4 10 6 0.28 -0.91 0.35
## V3 5 24 5.08 1.47 5.0 5.05 1.48 1 9 8 -0.06 1.81 0.30
## V4 6 24 4.92 1.50 5.0 5.00 1.48 2 7 5 -0.31 -0.89 0.31
## extraversion 7 24 12.50 5.71 12.5 12.50 7.41 5 20 15 0.00 -1.49 1.17
## neuroticism 8 24 10.00 3.61 10.0 10.00 3.71 5 15 10 0.00 -1.16 0.74
describeBy(Xwide.traits, group="id")
##
## Descriptive statistics by group
## id: 1
## vars n mean sd median trimmed mad min max range skew kurtosis se
## id 1 6 1.00 0.00 1.0 1.00 0.00 1 1 0 NaN NaN 0.00
## time 2 6 84.00 44.90 84.0 84.00 53.37 24 144 120 0.00 -1.80 18.33
## V1 3 6 7.83 1.94 8.0 7.83 0.74 5 11 6 0.19 -1.06 0.79
## V2 4 6 7.83 1.72 8.0 7.83 1.48 5 10 5 -0.38 -1.32 0.70
## V3 5 6 4.33 1.75 5.0 4.33 0.74 1 6 5 -0.98 -0.66 0.71
## V4 6 6 3.67 1.63 3.5 3.67 2.22 2 6 4 0.21 -1.86 0.67
## extraversion 7 6 5.00 0.00 5.0 5.00 0.00 5 5 0 NaN NaN 0.00
## neuroticism 8 6 10.00 0.00 10.0 10.00 0.00 10 10 0 NaN NaN 0.00
## ------------------------------------------------------------------------------------------
## id: 2
## vars n mean sd median trimmed mad min max range skew kurtosis se
## id 1 6 2.0 0.00 2.0 2.0 0.00 2 2 0 NaN NaN 0.00
## time 2 6 84.0 44.90 84.0 84.0 53.37 24 144 120 0.00 -1.80 18.33
## V1 3 6 7.0 0.63 7.0 7.0 0.00 6 8 2 0.00 -0.92 0.26
## V2 4 6 7.0 0.63 7.0 7.0 0.00 6 8 2 0.00 -0.92 0.26
## V3 5 6 5.5 0.84 6.0 5.5 0.00 4 6 2 -0.85 -1.17 0.34
## V4 6 6 5.5 1.05 5.5 5.5 0.74 4 7 3 0.00 -1.57 0.43
## extraversion 7 6 10.0 0.00 10.0 10.0 0.00 10 10 0 NaN NaN 0.00
## neuroticism 8 6 5.0 0.00 5.0 5.0 0.00 5 5 0 NaN NaN 0.00
## ------------------------------------------------------------------------------------------
## id: 3
## vars n mean sd median trimmed mad min max range skew kurtosis se
## id 1 6 3.00 0.00 3.0 3.00 0.00 3 3 0 NaN NaN 0.00
## time 2 6 84.00 44.90 84.0 84.00 53.37 24 144 120 0.00 -1.80 18.33
## V1 3 6 5.17 0.75 5.0 5.17 0.74 4 6 2 -0.17 -1.54 0.31
## V2 4 6 5.17 0.98 5.5 5.17 0.74 4 6 2 -0.25 -2.08 0.40
## V3 5 6 6.17 1.72 6.0 6.17 1.48 4 9 5 0.38 -1.32 0.70
## V4 6 6 5.83 1.47 6.5 5.83 0.74 4 7 3 -0.39 -2.00 0.60
## extraversion 7 6 15.00 0.00 15.0 15.00 0.00 15 15 0 NaN NaN 0.00
## neuroticism 8 6 15.00 0.00 15.0 15.00 0.00 15 15 0 NaN NaN 0.00
## ------------------------------------------------------------------------------------------
## id: 4
## vars n mean sd median trimmed mad min max range skew kurtosis se
## id 1 6 4.00 0.00 4 4.00 0.00 4 4 0 NaN NaN 0.00
## time 2 6 84.00 44.90 84 84.00 53.37 24 144 120 0.00 -1.80 18.33
## V1 3 6 4.67 1.03 5 4.67 0.74 3 6 3 -0.37 -1.37 0.42
## V2 4 6 4.67 1.21 4 4.67 0.00 4 7 3 1.08 -0.64 0.49
## V3 5 6 4.33 0.52 4 4.33 0.00 4 5 1 0.54 -1.96 0.21
## V4 6 6 4.67 1.03 5 4.67 0.74 3 6 3 -0.37 -1.37 0.42
## extraversion 7 6 20.00 0.00 20 20.00 0.00 20 20 0 NaN NaN 0.00
## neuroticism 8 6 10.00 0.00 10 10.00 0.00 10 10 0 NaN NaN 0.00
The statsBy function finds descriptive statistics within groups and also finds the between-group statistics. It was developed for multilevel modeling.
sb <- statsBy(Xwide.traits, group="id",cors=TRUE)
## Warning in cor(x[-gr], use = use, method = method): the standard deviation is zero
## Warning in cor(x[-gr], use = use, method = method): the standard deviation is zero
## Warning in cor(x[-gr], use = use, method = method): the standard deviation is zero
## Warning in cor(x[-gr], use = use, method = method): the standard deviation is zero
## Warning in cor(new.data[, 1:nvar], use = "pairwise", method = method): the standard deviation is zero
## Warning in cor(diffs, use = use, method = method): the standard deviation is zero
## Warning in cov2cor(xvals$rwg): diag(V) had non-positive or NA entries; the non-finite result may be dubious
## Warning in cor(new.data[, 1:(nvar)], new.data[, (nvar + 1):ncol(new.data)], : the standard deviation is zero
## Warning in cor(new.data[, (nvar + 1):ncol(new.data)], diffs, use = "pairwise", : the standard deviation is zero
# note that when there is no variance NAs are reported.
First show all the objects available in the statsBy output.
names(sb)
## [1] "mean" "sd" "n" "F" "ICC1" "ICC2" "ci1" "ci2" "r" "r.ci" "within" "pooled"
## [13] "sd.r" "raw" "rbg" "ci.bg" "pbg" "rwg" "nw" "ci.wg" "pwg" "etabg" "etawg" "nWg"
## [25] "nwg" "nG" "slope" "Call"
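Among these objects, ICC1 and ICC2 give the intraclass correlations for each variable: the proportion of variance that is between persons, and the reliability of the person-level means. A minimal sketch of extracting them from the sb object created above (output not shown):
round(sb$ICC1,2) #proportion of variance that is between persons
round(sb$ICC2,2) #reliability of the person (group) means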
lowerMat(sb$rwg) #top part of Table 6 (r within groups)
## tm.wg V1.wg V2.wg V3.wg V4.wg extr. nrtc.
## time.wg 1.00
## V1.wg 0.24 1.00
## V2.wg -0.27 0.45 1.00
## V3.wg 0.00 -0.35 -0.22 1.00
## V4.wg 0.24 -0.40 -0.31 0.57 1.00
## extraversion.wg NA NA NA NA NA NA
## neuroticism.wg NA NA NA NA NA NA NA
lowerMat(sb$rbg) #lower part of Table 6 (between groups)
## tm.bg V1.bg V2.bg V3.bg V4.bg extr. nrtc.
## time.bg NA
## V1.bg NA 1.00
## V2.bg NA 1.00 1.00
## V3.bg NA -0.21 -0.21 1.00
## V4.bg NA -0.49 -0.49 0.90 1.00
## extraversion.bg NA -0.98 -0.98 0.09 0.44 1.00
## neuroticism.bg NA -0.50 -0.50 0.30 0.14 0.32 1.00
round(sb$within,2) #Table 7 (r within person for each person)
## time-V1 time-V2 time-V3 time-V4 time-extrv time-nrtcs V1-V2 V1-V3 V1-V4 V1-extrv V1-nrtcs V2-V3 V2-V4 V2-extrv
## 1 0.47 -0.16 -0.55 0.07 NA NA 0.59 -0.69 -0.72 NA NA -0.51 -0.73 NA
## 2 0.85 0.17 -0.19 0.05 NA NA 0.50 0.00 0.30 NA NA -0.38 0.00 NA
## 3 -0.36 -0.71 0.65 0.84 NA NA 0.50 0.28 -0.51 NA NA -0.02 -0.39 NA
## 4 0.00 -0.35 0.00 -0.10 NA NA 0.05 -0.50 0.06 NA NA 0.53 0.53 NA
## V2-nrtcs V3-V4 V3-extrv V3-nrtcs V4-extrv V4-nrtcs extrv-nrtcs
## 1 NA 0.54 NA NA NA NA NA
## 2 NA 0.80 NA NA NA NA NA
## 3 NA 0.57 NA NA NA NA NA
## 4 NA 0.62 NA NA NA NA NA
#
Use the mlPlot function, which is just a wrapper for the xyplot function from the lattice package. It first arranges the data into the long format required by the xyplot function. mlPlot shows a separate panel for each subject.
mlPlot(Xwide.traits,grp="id",items =3:6)
mlPlot #show the function
## function (x, grp = "id", Time = "time", items = c(3:5), extra = NULL,
## col = c("blue", "red", "black", "grey"), type = "b", main = "Lattice Plot by subjects over time",
## ...)
## {
## long <- mlArrange(x = x, grp = grp, Time = Time, items = items,
## extra = extra)
## plot1 <- xyplot(values ~ time | id, group = items, data = long,
## type = type, as.table = TRUE, strip = strip.custom(strip.names = TRUE,
## strip.levels = TRUE), col = col, main = main, ...)
## print(plot1)
## invisible(long)
## }
## <bytecode: 0x1110bcc08>
## <environment: namespace:psych>
The long arrangement of the data. The mlArrange function takes temporal data (repeated measures) and organizes it by subject into long format. Consider the Xwide.traits data created above:
headTail(Xwide.traits) #show the original data
## id time V1 V2 V3 V4 extraversion neuroticism
## 1 1 24 7 10 4 3 5 10
## 2 1 48 8 7 6 4 5 10
## 3 1 72 8 8 5 2 5 10
## 4 1 96 5 5 5 6 5 10
## ... ... ... ... ... ... ... ... ...
## 21 4 72 5 7 5 6 20 10
## 22 4 96 3 4 5 5 20 10
## 23 4 120 5 4 4 3 20 10
## 24 4 144 5 4 4 5 20 10
headTail(mlArrange(Xwide.traits,grp="id",items =3:6))
## id time values items
## 1 1 24 7 V1
## 2 1 48 8 V1
## 3 1 72 8 V1
## 4 1 96 5 V1
## ... ... ... ... <NA>
## 93 4 72 6 V4
## 94 4 96 5 V4
## 95 4 120 3 V4
## 96 4 144 5 V4
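Because mlArrange returns this long data frame, the same figure drawn by mlPlot can be produced by hand with lattice. The following is a minimal sketch of what mlPlot does internally (compare with its source shown above); it assumes the lattice package is installed and attached:
library(lattice) #xyplot comes from lattice
long <- mlArrange(Xwide.traits,grp="id",items=3:6) #one row per observation per variable
xyplot(values ~ time | id, groups=items, data=long, type="b", as.table=TRUE)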
In a study of 10 participants diagnosed with generalized anxiety disorder, Aaron Fisher collected 28 items for at least 60 days per participant. In an impressive demonstration of how different people are, Fisher examined the dynamic factor structure of each person using procedures discussed by Molenaar (1985, 2009). The purpose of the study was to encourage the use of personalized care in clinical psychology. Most importantly for our purposes, the data set he analyzed is available at Fisher's website (http://www.dynamicpsychlab.com/data) for easy download and subsequent analysis.
As discussed in the Appendix, the 10 data files available may be merged into one file and we can examine both the reliability of scales made up of subsets of items, as well as the correlational pattern of these scales. It is important to note that the original paper goes far beyond what we report here, and indeed, the analyses that follow are independent of the main thrust of Fisher’s paper.
Of the 28 items, we focus on just eight: four measuring general positive affect (happy, content, relaxed, and positive), and four measuring tension and negative affect (angry, afraid, sad, and lonely). We see that the ratings suggest a clearly reliable separation between individuals for both positive and negative affect. Scoring each subject for these two moods at all time points may be done using scoreItems.
Aaron Fisher at UCB reported the data for 10 subjects whose affect was measured across 100 days. He put these data up on his web page, and I have moved those cases to the 350 course server (the URL is given below). The data are in 10 different folders, one per participant, each stored as a csv file. We can write a little function to read these files and then combine them.
"combine.data" <- function(dir=NULL,names,filename=NULL) {
new <- NULL
n <- length(names)
old.dir <- getwd() #save the current working directory
for (subject in 1:n) { #repeat n times, once for each subject
if(is.null(filename)) {setwd(dir)} else {dir <- filename}
#set the working directory to where the files are
#this is specific to this particular data structure
x <- read.file(f=paste0(dir,"/P",names[subject],"/pre",
names[subject],".csv"))
nx <- nrow(x)
#add id and time to this data frame
temp <- data.frame(id=names[subject],time=1:nx,x)
#combine with prior data.frames to make a longer object
new <- rbind(new,temp)
} #end of the subject loop
setwd(old.dir) #set the working directory back to the original
return(new)} #end the function by returning the data
Now use this function to read the data. We need to specify which cases we want, and where the data are.
names <- c("002","007","009","010","011","013","022","023",
"030","065") #hard coded from his file names
#specify where the data are
filename <-
"http://personality-project.org/courses/350/Fisher_2015_Data"
new <- combine.data(dir=NULL, names=names,filename=filename)
## Data from the .csv file http://personality-project.org/courses/350/Fisher_2015_Data/P002/pre002.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/350/Fisher_2015_Data/P007/pre007.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/350/Fisher_2015_Data/P009/pre009.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/350/Fisher_2015_Data/P010/pre010.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/350/Fisher_2015_Data/P011/pre011.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/350/Fisher_2015_Data/P013/pre013.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/350/Fisher_2015_Data/P022/pre022.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/350/Fisher_2015_Data/P023/pre023.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/350/Fisher_2015_Data/P030/pre030.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/350/Fisher_2015_Data/P065/pre065.csv has been loaded.
dim(new)
## [1] 792 29
colnames(new)
## [1] "id" "time" "happy" "sad" "angry" "content" "afraid" "lonely"
## [9] "relaxed" "tired" "anxious" "positive" "percent" "interfere" "upset" "wcontent"
## [17] "tension" "difficult" "control" "concentrate" "mustens" "fatigue" "irritable" "sleep"
## [25] "restless" "avoid" "prepare" "procrast" "reassur"
Remember to describe the data to see what they look like.
describe(new)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## id* 1 792 5.27 2.90 5.0 5.21 2.97 1 10 9 0.17 -1.25 0.10
## time 2 792 41.34 25.36 40.0 40.15 29.65 1 118 117 0.42 -0.31 0.90
## happy 3 788 156.44 316.53 38.0 70.51 26.69 3 999 996 2.27 3.20 11.28
## sad 4 788 148.20 319.72 29.0 61.04 31.13 0 999 999 2.27 3.19 11.39
## angry 5 788 148.61 319.41 30.0 61.44 28.17 1 999 998 2.27 3.20 11.38
## content 6 788 154.26 317.46 33.0 67.99 26.69 3 999 996 2.27 3.19 11.31
## afraid 7 788 152.49 318.28 36.0 66.33 37.06 0 999 999 2.26 3.18 11.34
## lonely 8 788 151.39 318.66 36.0 65.07 37.06 0 999 999 2.27 3.18 11.35
## relaxed 9 788 152.97 317.78 30.0 66.36 19.27 2 999 997 2.27 3.20 11.32
## tired 10 788 167.59 312.62 55.0 84.41 35.58 1 999 998 2.26 3.18 11.14
## anxious 11 788 170.52 311.44 62.5 87.98 30.39 0 999 999 2.27 3.19 11.09
## positive 12 788 158.27 315.89 41.0 72.74 31.13 3 999 996 2.27 3.20 11.25
## percent 13 788 170.42 315.02 57.0 87.27 32.62 4 999 995 2.23 3.03 11.22
## interfere 14 788 168.30 315.96 52.5 84.92 34.84 0 999 999 2.23 3.03 11.26
## upset 15 788 170.76 313.10 56.0 87.49 32.62 6 999 993 2.25 3.11 11.15
## wcontent 16 788 171.13 314.70 56.5 87.92 30.39 3 999 996 2.23 3.04 11.21
## tension 17 788 170.10 313.37 60.0 86.97 31.13 4 999 995 2.25 3.11 11.16
## difficult 18 788 172.06 314.41 58.0 89.25 31.13 2 999 997 2.23 3.03 11.20
## control 19 788 171.82 314.50 59.0 88.88 31.13 3 999 996 2.23 3.03 11.20
## concentrate 20 788 164.91 318.81 46.0 80.71 29.65 4 999 995 2.22 2.97 11.36
## mustens 21 788 167.25 318.19 53.0 83.88 40.03 0 999 999 2.21 2.95 11.34
## fatigue 22 788 171.17 316.65 52.0 88.30 35.58 4 999 995 2.21 2.95 11.28
## irritable 23 788 166.52 318.18 47.0 82.59 26.69 2 999 997 2.22 2.97 11.33
## sleep 24 788 169.42 317.51 50.0 86.41 40.03 3 999 996 2.21 2.94 11.31
## restless 25 788 171.13 316.51 53.0 88.02 32.62 3 999 996 2.22 2.96 11.28
## avoid 26 788 166.69 318.65 45.0 83.20 37.06 0 999 999 2.21 2.93 11.35
## prepare 27 788 165.72 318.94 44.0 82.09 37.06 0 999 999 2.21 2.94 11.36
## procrast 28 788 168.61 317.98 45.5 85.53 39.29 3 999 996 2.20 2.93 11.33
## reassur 29 788 156.43 321.83 35.0 70.45 22.24 3 999 996 2.23 2.98 11.46
# There are some missing values -- ratings of 999 appear to be missing-value codes, so scrub anything above the 0-100 rating range
fisher <- scrub(new, where=2:29,max=120)
describe(fisher)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## id* 1 792 5.27 2.90 5 5.21 2.97 1 10 9 0.17 -1.25 0.10
## time 2 792 41.34 25.36 40 40.15 29.65 1 118 117 0.42 -0.31 0.90
## happy 3 691 38.16 21.69 33 36.37 20.76 3 97 94 0.67 -0.49 0.82
## sad 4 691 28.77 23.42 21 26.02 22.24 0 100 100 0.83 -0.17 0.89
## angry 5 691 29.23 20.72 25 26.99 22.24 1 100 99 0.89 0.38 0.79
## content 6 691 35.68 23.52 28 33.15 20.76 3 100 97 0.81 -0.45 0.89
## afraid 7 691 33.66 25.77 29 31.01 29.65 0 100 100 0.67 -0.53 0.98
## lonely 8 691 32.40 25.35 28 29.80 31.13 0 100 100 0.65 -0.49 0.96
## relaxed 9 691 34.21 20.89 28 31.85 14.83 2 93 91 0.97 0.11 0.79
## tired 10 691 50.88 25.55 50 51.03 31.13 1 100 99 -0.03 -1.07 0.97
## anxious 11 691 54.22 24.37 57 55.20 26.69 0 100 100 -0.31 -0.84 0.93
## positive 12 691 40.25 22.31 38 38.73 25.20 3 100 97 0.51 -0.71 0.85
## percent 13 689 51.36 23.13 50 50.98 29.65 4 100 96 0.12 -0.96 0.88
## interfere 14 689 48.94 25.18 45 48.21 29.65 0 99 99 0.22 -1.18 0.96
## upset 15 690 53.13 23.36 50 52.87 29.65 6 100 94 0.09 -1.05 0.89
## wcontent 16 689 52.18 22.35 50 51.67 25.20 3 100 97 0.16 -0.83 0.85
## tension 17 690 52.37 23.67 51 52.77 31.13 4 100 96 -0.11 -1.20 0.90
## difficult 18 689 53.24 23.17 51 53.16 28.17 2 100 98 0.00 -0.94 0.88
## control 19 689 52.96 23.19 51 52.73 28.17 3 100 97 0.05 -0.96 0.88
## concentrate 20 688 43.67 21.09 42 43.01 25.20 4 100 96 0.29 -0.86 0.80
## mustens 21 688 46.36 25.44 45 45.70 35.58 0 100 100 0.13 -1.28 0.97
## fatigue 22 688 50.84 24.62 49 50.30 30.39 4 100 96 0.19 -1.08 0.94
## irritable 23 688 45.52 20.91 42 44.64 20.76 2 100 98 0.36 -0.48 0.80
## sleep 24 688 48.84 27.40 42 47.76 31.13 3 100 97 0.38 -1.16 1.04
## restless 25 688 50.80 22.36 49 50.43 26.69 3 100 97 0.13 -0.95 0.85
## avoid 26 688 45.71 28.71 38 43.64 29.65 0 100 100 0.58 -0.96 1.09
## prepare 27 688 44.60 27.71 36 42.93 29.65 0 100 100 0.48 -1.13 1.06
## procrast 28 688 47.91 29.49 39 46.30 32.62 3 100 97 0.44 -1.22 1.12
## reassur 29 688 33.96 17.08 30 32.84 17.79 3 94 91 0.59 -0.12 0.65
table(fisher$id)
##
## 002 007 009 010 011 013 022 023 030 065
## 80 84 118 80 74 66 67 72 73 78
#note that this is character
fisher$id <- as.numeric(fisher$id) #convert to numeric
describe(fisher)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## id 1 792 18.53 17.28 11 14.85 5.93 2 65 63 1.78 2.31 0.61
## time 2 792 41.34 25.36 40 40.15 29.65 1 118 117 0.42 -0.31 0.90
## happy 3 691 38.16 21.69 33 36.37 20.76 3 97 94 0.67 -0.49 0.82
## sad 4 691 28.77 23.42 21 26.02 22.24 0 100 100 0.83 -0.17 0.89
## angry 5 691 29.23 20.72 25 26.99 22.24 1 100 99 0.89 0.38 0.79
## content 6 691 35.68 23.52 28 33.15 20.76 3 100 97 0.81 -0.45 0.89
## afraid 7 691 33.66 25.77 29 31.01 29.65 0 100 100 0.67 -0.53 0.98
## lonely 8 691 32.40 25.35 28 29.80 31.13 0 100 100 0.65 -0.49 0.96
## relaxed 9 691 34.21 20.89 28 31.85 14.83 2 93 91 0.97 0.11 0.79
## tired 10 691 50.88 25.55 50 51.03 31.13 1 100 99 -0.03 -1.07 0.97
## anxious 11 691 54.22 24.37 57 55.20 26.69 0 100 100 -0.31 -0.84 0.93
## positive 12 691 40.25 22.31 38 38.73 25.20 3 100 97 0.51 -0.71 0.85
## percent 13 689 51.36 23.13 50 50.98 29.65 4 100 96 0.12 -0.96 0.88
## interfere 14 689 48.94 25.18 45 48.21 29.65 0 99 99 0.22 -1.18 0.96
## upset 15 690 53.13 23.36 50 52.87 29.65 6 100 94 0.09 -1.05 0.89
## wcontent 16 689 52.18 22.35 50 51.67 25.20 3 100 97 0.16 -0.83 0.85
## tension 17 690 52.37 23.67 51 52.77 31.13 4 100 96 -0.11 -1.20 0.90
## difficult 18 689 53.24 23.17 51 53.16 28.17 2 100 98 0.00 -0.94 0.88
## control 19 689 52.96 23.19 51 52.73 28.17 3 100 97 0.05 -0.96 0.88
## concentrate 20 688 43.67 21.09 42 43.01 25.20 4 100 96 0.29 -0.86 0.80
## mustens 21 688 46.36 25.44 45 45.70 35.58 0 100 100 0.13 -1.28 0.97
## fatigue 22 688 50.84 24.62 49 50.30 30.39 4 100 96 0.19 -1.08 0.94
## irritable 23 688 45.52 20.91 42 44.64 20.76 2 100 98 0.36 -0.48 0.80
## sleep 24 688 48.84 27.40 42 47.76 31.13 3 100 97 0.38 -1.16 1.04
## restless 25 688 50.80 22.36 49 50.43 26.69 3 100 97 0.13 -0.95 0.85
## avoid 26 688 45.71 28.71 38 43.64 29.65 0 100 100 0.58 -0.96 1.09
## prepare 27 688 44.60 27.71 36 42.93 29.65 0 100 100 0.48 -1.13 1.06
## procrast 28 688 47.91 29.49 39 46.30 32.62 3 100 97 0.44 -1.22 1.12
## reassur 29 688 33.96 17.08 30 32.84 17.79 3 94 91 0.59 -0.12 0.65
We particularly want positive and negative affect items.
positive <- cs(happy,content,relaxed, positive)
negative <- cs(angry,afraid, sad, lonely)
pana <- c(positive,negative) #we want to select the items
R <- lowerCor(fisher[pana]) #to show in a correlation matrix
## happy cntnt relxd postv angry afrad sad lonly
## happy 1.00
## content 0.80 1.00
## relaxed 0.67 0.74 1.00
## positive 0.84 0.79 0.70 1.00
## angry -0.19 -0.15 -0.12 -0.15 1.00
## afraid -0.36 -0.37 -0.28 -0.34 0.65 1.00
## sad -0.34 -0.32 -0.23 -0.31 0.67 0.75 1.00
## lonely -0.24 -0.21 -0.12 -0.21 0.64 0.74 0.80 1.00
corPlot(fisher[pana], cex=.7 ) # A graphic representation
# we use the cex command to make the font size fit the html
pana.scores <- scoreItems(keys=list(positive=positive,
negative=negative), fisher, impute="median")
## Number of categories should be increased in order to count frequencies.
summary(pana.scores)
## Call: scoreItems(keys = list(positive = positive, negative = negative),
## items = fisher, impute = "median")
##
## Scale intercorrelations corrected for attenuation
## raw correlations below the diagonal, (unstandardized) alpha on the diagonal
## corrected correlations above the diagonal:
## positive negative
## positive 0.93 -0.33
## negative -0.30 0.91
The data are a mixture of between subjects and within subjects. The statsBy function allows us to examine these two sets of relations. First we need to make up a smaller data frame with just the positive and negative items.
fisher.select <- data.frame(id = fisher$id,time=fisher$time,p=fisher[positive],n=fisher[negative])
describe(fisher.select)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## id 1 792 18.53 17.28 11 14.85 5.93 2 65 63 1.78 2.31 0.61
## time 2 792 41.34 25.36 40 40.15 29.65 1 118 117 0.42 -0.31 0.90
## p.happy 3 691 38.16 21.69 33 36.37 20.76 3 97 94 0.67 -0.49 0.82
## p.content 4 691 35.68 23.52 28 33.15 20.76 3 100 97 0.81 -0.45 0.89
## p.relaxed 5 691 34.21 20.89 28 31.85 14.83 2 93 91 0.97 0.11 0.79
## p.positive 6 691 40.25 22.31 38 38.73 25.20 3 100 97 0.51 -0.71 0.85
## n.angry 7 691 29.23 20.72 25 26.99 22.24 1 100 99 0.89 0.38 0.79
## n.afraid 8 691 33.66 25.77 29 31.01 29.65 0 100 100 0.67 -0.53 0.98
## n.sad 9 691 28.77 23.42 21 26.02 22.24 0 100 100 0.83 -0.17 0.89
## n.lonely 10 691 32.40 25.35 28 29.80 31.13 0 100 100 0.65 -0.49 0.96
lowerCor(fisher.select) #this is a mixture of two kinds of correlations
## id time p.hpp p.cnt p.rlx p.pst n.ngr n.frd n.sad n.lnl
## id 1.00
## time -0.08 1.00
## p.happy -0.49 -0.02 1.00
## p.content -0.49 0.03 0.80 1.00
## p.relaxed -0.41 0.07 0.67 0.74 1.00
## p.positive -0.53 0.02 0.84 0.79 0.70 1.00
## n.angry 0.34 0.00 -0.19 -0.15 -0.12 -0.15 1.00
## n.afraid 0.63 -0.04 -0.36 -0.37 -0.28 -0.34 0.65 1.00
## n.sad 0.54 0.01 -0.34 -0.32 -0.23 -0.31 0.67 0.75 1.00
## n.lonely 0.64 0.04 -0.24 -0.21 -0.12 -0.21 0.64 0.74 0.80 1.00
We can examine the structure within and between subjects by using the statsBy function.
The help file gives a hint of the power of the function:
When examining data at two levels (e.g., the individual and by some set of grouping variables), it is useful to find basic descriptive statistics (means, sds, ns per group, within group correlations) as well as between group statistics (over all descriptive statistics, and overall between group correlations). Of particular use is the ability to decompose a matrix of correlations at the individual level into correlations within group and correlations between groups.
Multilevel data are endemic in psychological research. In multilevel data, observations are taken on subjects who are nested within some higher level grouping variable. The data might be experimental (participants are nested within experimental conditions) or observational (students are nested within classrooms, students are nested within college majors.) To analyze this type of data, one uses random effects models or mixed effect models, or more generally, multilevel models. There are at least two very powerful packages (nlme and multilevel) which allow for complex analysis of hierarchical (multilevel) data structures. statsBy is a much simpler function to give some of the basic descriptive statistics for two level models. It is meant to supplement true multilevel modeling.
In the output, we see the variance associated with group differences as well as the reliability of the mean differences of the groups.
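Before running statsBy, it may help to recall what these two intraclass correlations are. The sketch below uses the standard one-way ANOVA formulas for ICC1 (proportion of variance due to groups) and ICC2 (reliability of the group means). It is illustrative only: statsBy's own computation also adjusts for unequal group sizes, so the values will be close to, but not identical to, those reported below.
#Illustrative sketch: ICC1 and ICC2 for one variable from a one way ANOVA by group
ICC.by.hand <- function(v, grp) {
  ok <- !is.na(v)                                 #drop missing observations
  v <- v[ok]; grp <- factor(grp[ok])
  ms <- summary(aov(v ~ grp))[[1]][, "Mean Sq"]   #mean squares between and within groups
  MSb <- ms[1]; MSw <- ms[2]
  n <- length(v)/nlevels(grp)                     #average group size
  c(ICC1 = (MSb - MSw)/(MSb + (n - 1) * MSw),     #proportion of variance due to groups
    ICC2 = (MSb - MSw)/MSb)                       #reliability of the group means
}
ICC.by.hand(fisher.select$p.happy, fisher.select$id)  #compare with the statsBy output below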
sb.fisher <- statsBy(fisher.select,group ="id")
print(sb.fisher, short=FALSE)
## Statistics within and between groups
## Call: statsBy(data = fisher.select, group = "id")
## Intraclass Correlation 1 (Percentage of variance due to groups)
## id time p.happy p.content p.relaxed p.positive n.angry n.afraid n.sad n.lonely
## 1.00 0.10 0.69 0.76 0.62 0.67 0.50 0.70 0.69 0.75
## Intraclass Correlation 2 (Reliability of group differences)
## id time p.happy p.content p.relaxed p.positive n.angry n.afraid n.sad n.lonely
## 1.00 0.90 0.99 1.00 0.99 0.99 0.99 0.99 0.99 1.00
## eta^2 between groups
## time.bg p.happy.bg p.content.bg p.relaxed.bg p.positive.bg n.angry.bg n.afraid.bg n.sad.bg
## 0.10 0.67 0.74 0.60 0.65 0.49 0.69 0.67
## n.lonely.bg
## 0.74
## Correlation between groups
## tm.bg p.hp. p.cn. p.rl. p.ps. n.ng. n.fr. n.sd. n.ln.
## time.bg 1.00
## p.happy.bg 0.24 1.00
## p.content.bg 0.37 0.93 1.00
## p.relaxed.bg 0.38 0.93 0.98 1.00
## p.positive.bg 0.39 0.98 0.94 0.93 1.00
## n.angry.bg 0.53 -0.21 -0.12 -0.08 -0.11 1.00
## n.afraid.bg 0.09 -0.47 -0.46 -0.41 -0.45 0.84 1.00
## n.sad.bg 0.33 -0.37 -0.33 -0.28 -0.33 0.90 0.91 1.00
## n.lonely.bg 0.33 -0.23 -0.21 -0.12 -0.22 0.85 0.89 0.95 1.00
## Correlation within groups
## tm.wg p.hp. p.cn. p.rl. p.ps. n.ng. n.fr. n.sd. n.ln.
## time.wg 1.00
## p.happy.wg -0.05 1.00
## p.content.wg -0.03 0.48 1.00
## p.relaxed.wg 0.06 0.24 0.27 1.00
## p.positive.wg -0.05 0.55 0.47 0.33 1.00
## n.angry.wg -0.14 -0.15 -0.16 -0.13 -0.14 1.00
## n.afraid.wg -0.13 -0.16 -0.14 -0.07 -0.12 0.40 1.00
## n.sad.wg -0.12 -0.27 -0.21 -0.11 -0.18 0.38 0.40 1.00
## n.lonely.wg -0.07 -0.20 -0.12 -0.05 -0.09 0.35 0.34 0.44 1.00
##
## Many results are not shown directly. To see specific objects select from the following list:
## mean sd n F ICC1 ICC2 ci1 ci2 raw rbg ci.bg pbg rwg nw ci.wg pwg etabg etawg nwg nG Call
The mlPlot function allows us to explore within group (i.e. individual) level patterning across time.
fisher.affect <- data.frame(id=fisher$id,time=fisher$time,pana.scores$scores)
describe(fisher.affect)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## id 1 792 18.53 17.28 11.00 14.85 5.93 2.00 65.00 63.0 1.78 2.31 0.61
## time 2 792 41.34 25.36 40.00 40.15 29.65 1.00 118.00 117.0 0.42 -0.31 0.90
## positive 3 792 36.40 18.76 31.75 34.39 15.57 3.75 93.25 89.5 0.94 0.20 0.67
## negative 4 792 30.34 19.79 25.75 28.41 21.50 1.50 98.50 97.0 0.75 -0.07 0.70
lowerCor(fisher.affect)
## id time postv negtv
## id 1.00
## time -0.08 1.00
## positive -0.49 0.01 1.00
## negative 0.60 -0.01 -0.30 1.00
mlPlot(fisher.affect,id="id", time="time", type="p",items =3:4, col=c("blue","red"),pch= c(16,17))
In a study of 10 participants clinically diagnosed with generalized anxiety disorder, Fisher (2015) collected responses to 28 items for at least 60 days per participant. In an impressive demonstration of how different people are, he examined the dynamic factor structure of each person. The purpose of the study was to encourage the use of personalized care in clinical psychology. Most importantly, for our purposes, the data set he analyzed is available at Fisher's website http://www.dynamicpsychlab.com/data for easy download and subsequent analysis.
As discussed in the Appendix, the 10 data files available may be merged into one file, and we can examine both the reliability of scales made up of subsets of items and the correlational pattern of these scales. It is important to note that the original paper goes far beyond what we report here; indeed, the analyses that follow are independent of the main thrust of Fisher's paper.
Of the 28 items, we focus on just eight: four measuring general positive affect (happy, content, relaxed, and positive), and four measuring tension and negative affect (angry, afraid, sad, and lonely). We see that the ratings suggest a clearly reliable separation between individuals for both positive and negative affect (Table~\(\ref{tab:fisher:reliability}\)). Scoring each subject for these two moods at all time points may be done using the scoreItems function. First, we create a little function to do our work for us.
#First create a small function to get the data
"combine.data" <- function(dir=NULL,names,filename=NULL) {
new <- NULL
n <- length(names)
old.dir <- getwd() #save the current working directory
for (subject in 1:n) { #repeat n times, once for each subject
if(is.null(filename)) {setwd(dir)} else {dir <- filename} #set the working directory to where the files are
x <- read.file(f=paste0(dir,"/P",names[subject],"/pre",names[subject],".csv"))
nx <- nrow(x)
#add id and time to this data frame
temp <- data.frame(id=names[subject],time=1:nx,x)
new <- rbind(new,temp) #combine with prior data.frames to make a longer object
} #end of the subject loop
setwd(old.dir) #set the working directory back to the original
return(new)} #end the function by returning the data
#now use this function to read in data from a set of files (downloaded from Fisher's web page) and
# combine into one data.frame
names <- c("002","007","009","010","011","013","022","023","030","065")
filename="http://personality-project.org/courses/314/Fisher_2015_Data" #specify where the data are
new <- combine.data(dir=NULL, names=names,filename=filename) #use the function we just created for this problem
## Data from the .csv file http://personality-project.org/courses/314/Fisher_2015_Data/P002/pre002.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/314/Fisher_2015_Data/P007/pre007.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/314/Fisher_2015_Data/P009/pre009.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/314/Fisher_2015_Data/P010/pre010.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/314/Fisher_2015_Data/P011/pre011.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/314/Fisher_2015_Data/P013/pre013.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/314/Fisher_2015_Data/P022/pre022.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/314/Fisher_2015_Data/P023/pre023.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/314/Fisher_2015_Data/P030/pre030.csv has been loaded.
## Data from the .csv file http://personality-project.org/courses/314/Fisher_2015_Data/P065/pre065.csv has been loaded.
describe(new) #note how missing values were coded as 999
## vars n mean sd median trimmed mad min max range skew kurtosis se
## id* 1 792 5.27 2.90 5.0 5.21 2.97 1 10 9 0.17 -1.25 0.10
## time 2 792 41.34 25.36 40.0 40.15 29.65 1 118 117 0.42 -0.31 0.90
## happy 3 788 156.44 316.53 38.0 70.51 26.69 3 999 996 2.27 3.20 11.28
## sad 4 788 148.20 319.72 29.0 61.04 31.13 0 999 999 2.27 3.19 11.39
## angry 5 788 148.61 319.41 30.0 61.44 28.17 1 999 998 2.27 3.20 11.38
## content 6 788 154.26 317.46 33.0 67.99 26.69 3 999 996 2.27 3.19 11.31
## afraid 7 788 152.49 318.28 36.0 66.33 37.06 0 999 999 2.26 3.18 11.34
## lonely 8 788 151.39 318.66 36.0 65.07 37.06 0 999 999 2.27 3.18 11.35
## relaxed 9 788 152.97 317.78 30.0 66.36 19.27 2 999 997 2.27 3.20 11.32
## tired 10 788 167.59 312.62 55.0 84.41 35.58 1 999 998 2.26 3.18 11.14
## anxious 11 788 170.52 311.44 62.5 87.98 30.39 0 999 999 2.27 3.19 11.09
## positive 12 788 158.27 315.89 41.0 72.74 31.13 3 999 996 2.27 3.20 11.25
## percent 13 788 170.42 315.02 57.0 87.27 32.62 4 999 995 2.23 3.03 11.22
## interfere 14 788 168.30 315.96 52.5 84.92 34.84 0 999 999 2.23 3.03 11.26
## upset 15 788 170.76 313.10 56.0 87.49 32.62 6 999 993 2.25 3.11 11.15
## wcontent 16 788 171.13 314.70 56.5 87.92 30.39 3 999 996 2.23 3.04 11.21
## tension 17 788 170.10 313.37 60.0 86.97 31.13 4 999 995 2.25 3.11 11.16
## difficult 18 788 172.06 314.41 58.0 89.25 31.13 2 999 997 2.23 3.03 11.20
## control 19 788 171.82 314.50 59.0 88.88 31.13 3 999 996 2.23 3.03 11.20
## concentrate 20 788 164.91 318.81 46.0 80.71 29.65 4 999 995 2.22 2.97 11.36
## mustens 21 788 167.25 318.19 53.0 83.88 40.03 0 999 999 2.21 2.95 11.34
## fatigue 22 788 171.17 316.65 52.0 88.30 35.58 4 999 995 2.21 2.95 11.28
## irritable 23 788 166.52 318.18 47.0 82.59 26.69 2 999 997 2.22 2.97 11.33
## sleep 24 788 169.42 317.51 50.0 86.41 40.03 3 999 996 2.21 2.94 11.31
## restless 25 788 171.13 316.51 53.0 88.02 32.62 3 999 996 2.22 2.96 11.28
## avoid 26 788 166.69 318.65 45.0 83.20 37.06 0 999 999 2.21 2.93 11.35
## prepare 27 788 165.72 318.94 44.0 82.09 37.06 0 999 999 2.21 2.94 11.36
## procrast 28 788 168.61 317.98 45.5 85.53 39.29 3 999 996 2.20 2.93 11.33
## reassur 29 788 156.43 321.83 35.0 70.45 22.24 3 999 996 2.23 2.98 11.46
fisher <- scrub(new,2:29,isvalue=999) #change 999 to NA
dim(fisher) #show the number of rows and columns
## [1] 792 29
describe(fisher) #to see what is there and to check that we recoded everything correctly.
## vars n mean sd median trimmed mad min max range skew kurtosis se
## id* 1 792 5.27 2.90 5 5.21 2.97 1 10 9 0.17 -1.25 0.10
## time 2 792 41.34 25.36 40 40.15 29.65 1 118 117 0.42 -0.31 0.90
## happy 3 691 38.16 21.69 33 36.37 20.76 3 97 94 0.67 -0.49 0.82
## sad 4 691 28.77 23.42 21 26.02 22.24 0 100 100 0.83 -0.17 0.89
## angry 5 691 29.23 20.72 25 26.99 22.24 1 100 99 0.89 0.38 0.79
## content 6 691 35.68 23.52 28 33.15 20.76 3 100 97 0.81 -0.45 0.89
## afraid 7 691 33.66 25.77 29 31.01 29.65 0 100 100 0.67 -0.53 0.98
## lonely 8 691 32.40 25.35 28 29.80 31.13 0 100 100 0.65 -0.49 0.96
## relaxed 9 691 34.21 20.89 28 31.85 14.83 2 93 91 0.97 0.11 0.79
## tired 10 691 50.88 25.55 50 51.03 31.13 1 100 99 -0.03 -1.07 0.97
## anxious 11 691 54.22 24.37 57 55.20 26.69 0 100 100 -0.31 -0.84 0.93
## positive 12 691 40.25 22.31 38 38.73 25.20 3 100 97 0.51 -0.71 0.85
## percent 13 689 51.36 23.13 50 50.98 29.65 4 100 96 0.12 -0.96 0.88
## interfere 14 689 48.94 25.18 45 48.21 29.65 0 99 99 0.22 -1.18 0.96
## upset 15 690 53.13 23.36 50 52.87 29.65 6 100 94 0.09 -1.05 0.89
## wcontent 16 689 52.18 22.35 50 51.67 25.20 3 100 97 0.16 -0.83 0.85
## tension 17 690 52.37 23.67 51 52.77 31.13 4 100 96 -0.11 -1.20 0.90
## difficult 18 689 53.24 23.17 51 53.16 28.17 2 100 98 0.00 -0.94 0.88
## control 19 689 52.96 23.19 51 52.73 28.17 3 100 97 0.05 -0.96 0.88
## concentrate 20 688 43.67 21.09 42 43.01 25.20 4 100 96 0.29 -0.86 0.80
## mustens 21 688 46.36 25.44 45 45.70 35.58 0 100 100 0.13 -1.28 0.97
## fatigue 22 688 50.84 24.62 49 50.30 30.39 4 100 96 0.19 -1.08 0.94
## irritable 23 688 45.52 20.91 42 44.64 20.76 2 100 98 0.36 -0.48 0.80
## sleep 24 688 48.84 27.40 42 47.76 31.13 3 100 97 0.38 -1.16 1.04
## restless 25 688 50.80 22.36 49 50.43 26.69 3 100 97 0.13 -0.95 0.85
## avoid 26 688 45.71 28.71 38 43.64 29.65 0 100 100 0.58 -0.96 1.09
## prepare 27 688 44.60 27.71 36 42.93 29.65 0 100 100 0.48 -1.13 1.06
## procrast 28 688 47.91 29.49 39 46.30 32.62 3 100 97 0.44 -1.22 1.12
## reassur 29 688 33.96 17.08 30 32.84 17.79 3 94 91 0.59 -0.12 0.65
The data set contains 28 different mood words. For the purpose of the demonstration, we want to find the reliability of four positive affect terms (happy, content, relaxed, and positive) as well as the four negative affect terms (Table~\(\ref{tab:fisher:reliability}\)). We then find scores for all subjects at all time periods by using the scoreItems function. We combine these scores with the id and time information from the original file, and then plot them with the mlPlot function. Finally, to examine the within subject correlations we use the statsBy function with the cors=TRUE option (Table~\(\ref{tab:fisher:alpha}\)).
Before we do the multilevel analysis, we can treat the data as a simple data set. Let's find the correlations of the positive and negative affect terms.
positive <- cs(happy,content,relaxed, positive)
negative <- cs(angry,afraid, sad, lonely)
pana <- c(positive,negative)
R <- lowerCor(fisher[pana])
## happy cntnt relxd postv angry afrad sad lonly
## happy 1.00
## content 0.80 1.00
## relaxed 0.67 0.74 1.00
## positive 0.84 0.79 0.70 1.00
## angry -0.19 -0.15 -0.12 -0.15 1.00
## afraid -0.36 -0.37 -0.28 -0.34 0.65 1.00
## sad -0.34 -0.32 -0.23 -0.31 0.67 0.75 1.00
## lonely -0.24 -0.21 -0.12 -0.21 0.64 0.74 0.80 1.00
corPlot(R,numbers=TRUE, cex=.7)
Now let's find the simple reliabilities of these two sets of items, using the scoreItems function. This does not take into account the nested nature of the data.
pana.scores <- scoreItems(keys=list(positive=positive,negative=negative), fisher)
## Number of categories should be increased in order to count frequencies.
pana.scores
## Call: scoreItems(keys = list(positive = positive, negative = negative),
## items = fisher)
##
## (Unstandardized) Alpha:
## positive negative
## alpha 0.93 0.91
##
## Standard errors of unstandardized Alpha:
## positive negative
## ASE 0.019 0.02
##
## Average item correlation:
## positive negative
## average.r 0.76 0.71
##
## Median item correlation:
## positive negative
## 0.76 0.70
##
## Guttman 6* reliability:
## positive negative
## Lambda.6 0.91 0.89
##
## Signal/Noise based upon av.r :
## positive negative
## Signal/Noise 12 9.7
##
## Scale intercorrelations corrected for attenuation
## raw correlations below the diagonal, alpha on the diagonal
## corrected correlations above the diagonal:
## positive negative
## positive 0.93 -0.33
## negative -0.30 0.91
##
## Average adjusted correlations within and between scales (MIMS)
## postv negtv
## positive 0.76
## negative -111.89 0.71
##
## Average adjusted item x scale correlations within and between scales (MIMT)
## postv negtv
## positive 0.90
## negative -0.26 0.88
##
## In order to see the item by scale loadings and frequency counts of the data
## print with the short option = FALSE
#find the multilevel reliabilities
pos.fisher.rel <- mlr(fisher,"id","time",items = c("happy","content",
"relaxed", "positive"), aov=FALSE,lmer=TRUE)
## Loading required namespace: lme4
## boundary (singular) fit: see help('isSingular')
neg.fisher.rel <- mlr(fisher,"id" ,"time",items = c("angry","afraid",
"sad","lonely" ),aov= FALSE,lmer=TRUE)
## boundary (singular) fit: see help('isSingular')
#organize the alpha's for each subject
alpha.df <- data.frame(positive = pos.fisher.rel$alpha[,c(1,3)],
negative = neg.fisher.rel$alpha[,c(1,3)])
select <-c("happy","content", "relaxed", "positive" ,
"angry","afraid","sad","lonely" )
affect.keys <- list(positive = c("happy","content", "relaxed", "positive"),
negative = c("angry","afraid","sad","lonely" ) )
affect.scores <- scoreItems(keys= affect.keys, items = fisher[select], min=0,
max=100)
## Number of categories should be increased in order to count frequencies.
affect.df <-cbind(fisher[1:2], affect.scores$score)
mlPlot(affect.df, grp = "id",time="time", items= 3:4, typ="p", pch=c(20,21))
stats.affect <- statsBy(affect.df,group="id",cors=TRUE)
stats.affect$within
## time-postv time-negtv postv-negtv
## 002 -0.38267572 -0.18262620 -0.36463597
## 007 -0.47970186 0.52654566 -0.60484009
## 009 0.04923507 -0.03452747 0.35141267
## 010 -0.28772283 -0.36429214 -0.27646868
## 011 0.42555939 0.12569681 -0.28286020
## 013 0.02500525 -0.21866366 -0.05130851
## 022 -0.03767051 -0.32317756 -0.23794901
## 023 -0.01260779 -0.22267681 -0.15609861
## 030 0.29436633 0.11297704 -0.43407796
## 065 0.08948664 -0.47978968 -0.74120395
#combine these with the alphas
ar <- autoR(affect.df[2:4],group= affect.df["id"]) #this is lag 1 auto correlation
alpha.df <- cbind(alpha.df,stats.affect$within,ar$autoR[,2:3])
colnames(alpha.df ) #these are too long for nice output
## [1] "positive.Raw.alpha" "positive.av.r" "negative.Raw.alpha" "negative.av.r" "time-postv"
## [6] "time-negtv" "postv-negtv" "positive" "negative"
colnames(alpha.df) <- abbreviate(colnames(alpha.df),10) #abbreviate them
round(alpha.df,2) #show the results
## pstv.Rw.lp positv.v.r ngtv.Rw.lp negatv.v.r time-postv time-negtv postv-ngtv positive negative
## ID002 0.30 0.13 0.57 0.40 -0.38 -0.18 -0.36 0.27 -0.07
## ID007 0.82 0.54 0.53 0.22 -0.48 0.53 -0.60 0.38 0.43
## ID009 0.75 0.43 0.80 0.49 0.05 -0.03 0.35 0.10 0.09
## ID010 0.87 0.63 0.64 0.32 -0.29 -0.36 -0.28 0.40 0.36
## ID011 0.88 0.65 0.54 0.29 0.43 0.13 -0.28 0.46 0.23
## ID013 0.61 0.27 0.73 0.40 0.03 -0.22 -0.05 -0.13 0.21
## ID022 0.84 0.55 0.39 0.14 -0.04 -0.32 -0.24 0.29 0.17
## ID023 0.63 0.29 0.45 0.18 -0.01 -0.22 -0.16 0.38 0.05
## ID030 0.66 0.34 0.49 0.17 0.29 0.11 -0.43 0.15 0.39
## ID065 0.89 0.68 0.83 0.55 0.09 -0.48 -0.74 0.23 0.29
A very powerful tool in learning how to analyze data, and in testing how well various models work, is to create simulated data sets. The sim.multi function allows for creating arbitrarily large data sets with a variety of complex data structures. The basic generator allows one to define a factor model with 1 ... f factors, 1 ... j items per factor, 1 ... i subjects, and 1 ... k time points. It is possible to define factor loadings globally, or individually for each subject. Factors (and their corresponding items) can increase, remain stable, or decrease over time, and can show diurnal variation with different phase relationships. Diurnal variation is simulated as a sinusoidal curve varying over 24 hours with a peak (phase angle) at different times of day.
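To make that last point concrete, the following sketch (not part of sim.multi itself) plots the assumed form of such a diurnal signal: a cosine with a 24 hour period that peaks at the specified phase angle (here, a hypothetical peak at 2 pm).
hours <- seq(0, 24, .5)
phase <- 14                                 #hypothetical peak at 2 pm (hour 14)
diurnal <- cos(2*pi*(hours - phase)/24)     #sinusoid with a 24 hour period
plot(hours, diurnal, type="l", xlab="hour of day", ylab="simulated diurnal signal")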
The default for sim.multi is for four subjects over 6 days on two variables:
sim.multi() #just use the default values
A more complicated example of 16 such subjects is seen in Figure~\(\ref{fig:diurnal}\) and various summary statistics are given in Table~\(\ref{tab:diurnal}\). 16 subjects were simulated with two measures taken 8 times a day for six days. The scores for some subjects decreased, while for others they increased over the 6 days. People differ in the strength (amplitude) and phase of their diurnal rhythms.
We create 16 subjects, with two factors and one variable per factor, and simulate data collected 8 times/day over 6 days (48 observations per subject). We set the random seed to a fixed number for a reproducible example. After generating the data, we apply the cosinor function to estimate the diurnal nature of the signal, statsBy to find the within individual correlations, and autoR to find the auto correlations of the data. We combine the various outputs into one data.frame to display in Table~\(\ref{tab:diurnal}\).
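As a reminder, the lag 1 autocorrelation reported by autoR is just the correlation of a series with itself shifted by one observation. A tiny sketch with a made-up series:
y <- c(3, 5, 4, 6, 7, 6, 8)            #toy within-person series
cor(y[-length(y)], y[-1])              #correlate y at time t with y at time t+1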
set.seed(17)
x <- sim.multi(n.obs=16,nvar=2,nfact=2,ntrials=48,days=6,
sin.i =c(1,0,0,1),cos.i=c(1,.5,0,-.5), sigma.i=c(0,0,0,0),
sigma=1,phi.i=c(0,0,.5,.5),beta.i=c(-1.,0,1,0,-.5,0,.5,0))
co <- cosinor(x$time,x[3:6],code="id")
sb <- statsBy(x,group="id",cors=TRUE)
## Warning in cor(new.data[, 1:nvar], use = "pairwise", method = method): the standard deviation is zero
## Warning in cor(new.data[, 1:(nvar)], new.data[, (nvar + 1):ncol(new.data)], : the standard deviation is zero
aR <-autoR(x,group="id")
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
## Warning in cor(x[1:(n.obs - lag), ], x[(lag + 1):n.obs, ], use = use): the standard deviation is zero
sim.df <- data.frame(co[,c(1,2,4,5)],sb$within[,c(8,9,10)],aR$autoR[,3:4])
round(sim.df,2) #show it
## V1.phase V2.phase V1.fit V2.fit V1.V2 V1.time V2.time V1 V2
## 1 3.37 0.39 0.73 0.64 0.33 -0.64 -0.04 0.78 0.23
## 2 19.21 7.54 0.12 0.84 -0.20 0.83 -0.20 0.70 0.53
## 3 2.61 2.40 0.84 0.82 0.81 -0.44 -0.31 0.63 0.51
## 4 6.92 8.44 0.48 0.85 0.46 0.51 0.05 0.46 0.51
## 5 3.37 21.98 0.79 0.65 0.04 -0.55 0.12 0.71 0.19
## 6 15.07 7.33 0.09 0.83 -0.14 0.82 -0.20 0.66 0.57
## 7 2.91 1.34 0.89 0.84 0.79 -0.41 -0.23 0.70 0.54
## 8 7.47 8.27 0.36 0.84 0.43 0.70 0.11 0.60 0.56
## 9 3.58 0.44 0.73 0.52 0.36 -0.65 -0.12 0.74 0.12
## 10 20.28 7.10 0.11 0.90 -0.07 0.90 -0.09 0.81 0.64
## 11 2.66 1.98 0.88 0.81 0.82 -0.38 -0.29 0.67 0.54
## 12 7.45 8.24 0.49 0.88 0.50 0.55 0.09 0.45 0.62
## 13 3.27 22.74 0.70 0.52 0.07 -0.64 0.06 0.73 0.24
## 14 22.89 7.35 0.15 0.84 -0.13 0.83 -0.21 0.65 0.57
## 15 2.61 1.84 0.87 0.82 0.79 -0.34 -0.30 0.61 0.59
## 16 8.47 7.23 0.44 0.82 0.52 0.49 0.07 0.24 0.52
#to find the pooled data, we do the same analyses,
# but without specifying the group
cos <- cosinor(x$time,x[3:6]) #pooled results
aR <-autoR(x) #pooled results
rs <- cor(x[3:5])
rs <- rs[lower.tri(rs)] #just use the relevant ones
pooled <- c(cos[1:2,1:2],rs,aR$autoR[3:4])
#combine the pooled data with the individual data
sim.df <- rbind(sim.df,pooled) #rbind adds rows of data frames together
#see also cbind to add columns together
round(sim.df,2) #print it more neatly
## V1.phase V2.phase V1.fit V2.fit V1.V2 V1.time V2.time V1 V2
## 1 3.37 0.39 0.73 0.64 0.33 -0.64 -0.04 0.78 0.23
## 2 19.21 7.54 0.12 0.84 -0.20 0.83 -0.20 0.70 0.53
## 3 2.61 2.40 0.84 0.82 0.81 -0.44 -0.31 0.63 0.51
## 4 6.92 8.44 0.48 0.85 0.46 0.51 0.05 0.46 0.51
## 5 3.37 21.98 0.79 0.65 0.04 -0.55 0.12 0.71 0.19
## 6 15.07 7.33 0.09 0.83 -0.14 0.82 -0.20 0.66 0.57
## 7 2.91 1.34 0.89 0.84 0.79 -0.41 -0.23 0.70 0.54
## 8 7.47 8.27 0.36 0.84 0.43 0.70 0.11 0.60 0.56
## 9 3.58 0.44 0.73 0.52 0.36 -0.65 -0.12 0.74 0.12
## 10 20.28 7.10 0.11 0.90 -0.07 0.90 -0.09 0.81 0.64
## 11 2.66 1.98 0.88 0.81 0.82 -0.38 -0.29 0.67 0.54
## 12 7.45 8.24 0.49 0.88 0.50 0.55 0.09 0.45 0.62
## 13 3.27 22.74 0.70 0.52 0.07 -0.64 0.06 0.73 0.24
## 14 22.89 7.35 0.15 0.84 -0.13 0.83 -0.21 0.65 0.57
## 15 2.61 1.84 0.87 0.82 0.79 -0.34 -0.30 0.61 0.59
## 16 8.47 7.23 0.44 0.82 0.52 0.49 0.07 0.24 0.52
## 17 3.37 5.23 0.28 0.37 0.36 -0.03 -0.07 0.89 0.73
df2latex(sim.df) #if you have LaTeX, this makes a table for you.
## % df2latex % sim.df
## \begin{table}[htpb]\caption{df2latex}
## \begin{center}
## \begin{scriptsize}
## \begin{tabular} {l r r r r r r r r r }
## \multicolumn{ 9 }{l}{ A table from the psych package in R } \cr
## \hline Variable & {V1.ph} & {V2.ph} & {V1.ft} & {V2.ft} & {V1.V2} & {V1.tm} & {V2.tm} & {V1} & {V2}\cr
## \hline
## 1 & 3.37 & 0.39 & 0.73 & 0.64 & 0.33 & -0.64 & -0.04 & 0.78 & 0.23 \cr
## 2 & 19.21 & 7.54 & 0.12 & 0.84 & -0.20 & 0.83 & -0.20 & 0.70 & 0.53 \cr
## 3 & 2.61 & 2.40 & 0.84 & 0.82 & 0.81 & -0.44 & -0.31 & 0.63 & 0.51 \cr
## 4 & 6.92 & 8.44 & 0.48 & 0.85 & 0.46 & 0.51 & 0.05 & 0.46 & 0.51 \cr
## 5 & 3.37 & 21.98 & 0.79 & 0.65 & 0.04 & -0.55 & 0.12 & 0.71 & 0.19 \cr
## 6 & 15.07 & 7.33 & 0.09 & 0.83 & -0.14 & 0.82 & -0.20 & 0.66 & 0.57 \cr
## 7 & 2.91 & 1.34 & 0.89 & 0.84 & 0.79 & -0.41 & -0.23 & 0.70 & 0.54 \cr
## 8 & 7.47 & 8.27 & 0.36 & 0.84 & 0.43 & 0.70 & 0.11 & 0.60 & 0.56 \cr
## 9 & 3.58 & 0.44 & 0.73 & 0.52 & 0.36 & -0.65 & -0.12 & 0.74 & 0.12 \cr
## 10 & 20.28 & 7.10 & 0.11 & 0.90 & -0.07 & 0.90 & -0.09 & 0.81 & 0.64 \cr
## 11 & 2.66 & 1.98 & 0.88 & 0.81 & 0.82 & -0.38 & -0.29 & 0.67 & 0.54 \cr
## 12 & 7.45 & 8.24 & 0.49 & 0.88 & 0.50 & 0.55 & 0.09 & 0.45 & 0.62 \cr
## 13 & 3.27 & 22.74 & 0.70 & 0.52 & 0.07 & -0.64 & 0.06 & 0.73 & 0.24 \cr
## 14 & 22.89 & 7.35 & 0.15 & 0.84 & -0.13 & 0.83 & -0.21 & 0.65 & 0.57 \cr
## 15 & 2.61 & 1.84 & 0.87 & 0.82 & 0.79 & -0.34 & -0.30 & 0.61 & 0.59 \cr
## 16 & 8.47 & 7.23 & 0.44 & 0.82 & 0.52 & 0.49 & 0.07 & 0.24 & 0.52 \cr
## 17 & 3.37 & 5.23 & 0.28 & 0.37 & 0.36 & -0.03 & -0.07 & 0.89 & 0.73 \cr
## \hline
## \end{tabular}
## \end{scriptsize}
## \end{center}
## \label{default}
## \end{table}
The R packages nlme and lme4 handle a variety of multilevel modeling procedures and can be used to conduct random coefficient modeling (RCM), which is the formal term for models that vary at more than one level. RCM is done in nlme with the lme function and in lme4 with the lmer function. These functions allow for the specification of fixed effects and random effects within and across multiple levels. Therefore, main effects of traits and states can be modeled using these functions, as can random effects of states across traits (or any higher level variables).
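As a minimal template (a sketch only: y, x, and id are placeholders, and the toy data are simulated just so the calls run, assuming nlme and lme4 are installed), the same random intercept and random slope model looks like this in the two packages. The actual models for the Fisher data appear below.
set.seed(1)
d <- data.frame(id = factor(rep(1:8, each = 25)), x = rnorm(200))
b0 <- rnorm(8, 2, 1)                      #per-person intercepts
b1 <- rnorm(8, .5, .3)                    #per-person slopes
d$y <- b0[d$id] + b1[d$id] * d$x + rnorm(200, sd = .5)
m.lme <- nlme::lme(y ~ x, random = ~ x | id, data = d)    #nlme version
m.lmer <- lme4::lmer(y ~ x + (x | id), data = d)          #lme4 version
coef(m.lmer)$id                           #per-person random coefficients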
To do this analysis we first found the mean positive and negative affect for each subject, and then centered the state data around each individual's overall mean (producing negative.cent). We see the fixed effects (from lme) and the random coefficients for each individual (extracted using the coef function) for the two variables (positive and negative affect) derived from the Fisher data set in Table~\(\ref{tab:ml:effects}\). We see that negative emotional \emph{states} lead to lower positive emotions, but that \emph{trait} negative emotion does not affect state positive affect.
We prepare the Fisher data for random coefficient modeling by computing aggregate means (level 2 data) for each participant, merging the level 2 data with the level 1 state data, group-mean centering the state predictor variable around each participant's mean, and merging the centered data. We then conduct random coefficient models in the nlme package with the lme function and in the lme4 package with the lmer function. (We might need to install these packages first.) Random coefficients for each participant are extracted with the coef function (Table~\(\ref{tab:ml:effects}\)).
#compute aggregate means for each participant
affect.mean <- aggregate(fisher,list(affect.df$id),mean, na.rm = TRUE)
## Warning in mean.default(X[[i]], ...): argument is not numeric or logical: returning NA
## Warning in mean.default(X[[i]], ...): argument is not numeric or logical: returning NA
## Warning in mean.default(X[[i]], ...): argument is not numeric or logical: returning NA
## Warning in mean.default(X[[i]], ...): argument is not numeric or logical: returning NA
## Warning in mean.default(X[[i]], ...): argument is not numeric or logical: returning NA
## Warning in mean.default(X[[i]], ...): argument is not numeric or logical: returning NA
## Warning in mean.default(X[[i]], ...): argument is not numeric or logical: returning NA
## Warning in mean.default(X[[i]], ...): argument is not numeric or logical: returning NA
## Warning in mean.default(X[[i]], ...): argument is not numeric or logical: returning NA
## Warning in mean.default(X[[i]], ...): argument is not numeric or logical: returning NA
affect.mean <- statsBy(affect.df,group="id")$mean
affect.mean.df <- data.frame(group= rownames(affect.mean),affect.mean)
#rename columns to prepare for merge
colnames(affect.mean.df) <- c("id","id.1","time.mean","positive.mean",
"negative.mean")
#merge participant means with individual responses
affect.mean.df <- merge(affect.df,affect.mean.df,by="id")
#group-mean center positive and negative affect
affect.centered <- affect.mean.df[,c(3,4)] - affect.mean.df[,c(7,8)]
#rename columns to prepare for merge
colnames(affect.centered) <- c("positive.cent","negative.cent")
#add centered variables to data frame
affect.mean.centered <- cbind(affect.mean.df,affect.centered)
(We may need to install the nlme package first.)
#using the nlme package
library(nlme)
#this model predicts positive affect from time and negative affect (centered),
#and it allows the slopes relating positive affect to time and negative affect
# to vary across participants
pa.na.time.nlme <- lme(positive ~ time + negative.cent + negative.mean
+ negative.cent:negative.mean,
random= ~time + negative.cent|id,
data=affect.mean.centered,
na.action = na.omit)
summary(pa.na.time.nlme) #shows fixed and random effects
## Linear mixed-effects model fit by REML
## Data: affect.mean.centered
## AIC BIC logLik
## 6005.934 6061.953 -2990.967
##
## Random effects:
## Formula: ~time + negative.cent | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 18.5638708 (Intr) time
## time 0.1276904 -0.728
## negative.cent 0.3737404 -0.510 0.746
## Residual 9.9587116
##
## Fixed effects: positive ~ time + negative.cent + negative.mean + negative.cent:negative.mean
## Value Std.Error DF t-value p-value
## (Intercept) 44.67385 9.662604 779 4.623376 0.0000
## time -0.06154 0.043767 779 -1.406018 0.1601
## negative.cent -0.66095 0.218081 779 -3.030780 0.0025
## negative.mean -0.21649 0.254564 8 -0.850428 0.4198
## negative.cent:negative.mean 0.00886 0.005636 779 1.572081 0.1163
## Correlation:
## (Intr) time ngtv.c ngtv.m
## time -0.433
## negative.cent -0.180 0.379
## negative.mean -0.791 -0.001 0.014
## negative.cent:negative.mean 0.013 0.003 -0.816 -0.017
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.26512039 -0.53855711 -0.02485865 0.49572673 5.11318894
##
## Number of Observations: 792
## Number of Groups: 10
#extract the coefficients for each participant
coef.pa.time.nlme <- coef(pa.na.time.nlme)
round(coef.pa.time.nlme,2)
## (Intercept) time negative.cent negative.mean negative.cent:negative.mean
## 002 68.93 -0.25 -0.97 -0.22 0.01
## 007 79.08 -0.23 -1.32 -0.22 0.01
## 009 48.53 0.03 -0.11 -0.22 0.01
## 010 34.54 -0.13 -0.73 -0.22 0.01
## 011 22.71 0.13 -0.43 -0.22 0.01
## 013 53.14 -0.02 -0.41 -0.22 0.01
## 022 34.83 -0.06 -0.59 -0.22 0.01
## 023 38.58 -0.02 -0.53 -0.22 0.01
## 030 29.06 0.04 -0.60 -0.22 0.01
## 065 37.33 -0.11 -0.92 -0.22 0.01
describe(coef.pa.time.nlme) #describe the coefficients
## vars n mean sd median trimmed mad min max range skew kurtosis se
## (Intercept) 1 10 44.67 17.88 37.96 43.12 14.43 22.71 79.08 56.37 0.68 -0.97 5.65
## time 2 10 -0.06 0.12 -0.04 -0.06 0.12 -0.25 0.13 0.37 -0.19 -1.28 0.04
## negative.cent 3 10 -0.66 0.34 -0.59 -0.65 0.25 -1.32 -0.11 1.22 -0.34 -0.77 0.11
## negative.mean 4 10 -0.22 0.00 -0.22 -0.22 0.00 -0.22 -0.22 0.00 -Inf NaN 0.00
## negative.cent:negative.mean 5 10 0.01 0.00 0.01 0.01 0.00 0.01 0.01 0.00 NaN NaN 0.00
(We may need to install the lme4 package first.)
#using the lme4 package
library(lme4)
## Loading required package: Matrix
##
## Attaching package: 'lme4'
## The following object is masked from 'package:nlme':
##
## lmList
pa.na.time.lme4 <- lmer(positive ~ time + negative.cent + negative.mean +
negative.cent:negative.mean + (time|id) + (negative.cent|id),
data = affect.mean.centered,
na.action = na.omit)
## boundary (singular) fit: see help('isSingular')
summary(pa.na.time.lme4) #the summary function gives the important results
## Linear mixed model fit by REML ['lmerMod']
## Formula: positive ~ time + negative.cent + negative.mean + negative.cent:negative.mean +
## (time | id) + (negative.cent | id)
## Data: affect.mean.centered
##
## REML criterion at convergence: 5986.3
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.1124 -0.5503 -0.0143 0.4820 5.1050
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 123.56682 11.1161
## time 0.01611 0.1269 -1.00
## id.1 (Intercept) 180.11011 13.4205
## negative.cent 0.14952 0.3867 -0.17
## Residual 99.06479 9.9531
## Number of obs: 792, groups: id, 10
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 45.502446 9.667998 4.707
## time -0.057191 0.043601 -1.312
## negative.cent -0.805603 0.274894 -2.931
## negative.mean -0.249813 0.263538 -0.948
## negative.cent:negative.mean 0.013241 0.007861 1.685
##
## Correlation of Fixed Effects:
## (Intr) time ngtv.c ngtv.m
## time -0.361
## negativ.cnt -0.137 0.004
## negative.mn -0.818 0.000 0.127
## ngtv.cnt:n. 0.120 0.002 -0.880 -0.148
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
coef.pa.na.time.lme4 <- coef(pa.na.time.lme4)
coef.pa.na.time.lme4
## $id
## (Intercept) time negative.cent negative.mean negative.cent:negative.mean
## 002 79.533749 -0.25149325 -0.9693771 -0.2498129 0.01324144
## 007 69.131061 -0.19209906 -1.4995217 -0.2498129 0.01324144
## 009 32.285605 0.01827019 -0.2782447 -0.2498129 0.01324144
## 010 58.701608 -0.13255207 -0.7459292 -0.2498129 0.01324144
## 011 9.816763 0.14655612 -0.6094032 -0.2498129 0.01324144
## 013 39.784465 -0.02454457 -0.5852936 -0.2498129 0.01324144
## 022 48.479286 -0.07418768 -0.6381256 -0.2498129 0.01324144
## 023 39.121532 -0.02075956 -0.7200819 -0.2498129 0.01324144
## 030 24.868192 0.06061994 -0.8355917 -0.2498129 0.01324144
## 065 53.302200 -0.10172413 -1.1744653 -0.2498129 0.01324144
##
## attr(,"class")
## [1] "coef.mer"
Modern data collection techniques allow for intensive measurement within subjects. Analyzing this type of data requires working at the within subject as well as the between subject level. Although sometimes the conclusions will be the same at both levels, it is frequently the case that examining within subject data will show much more complex patterns of results than when the data are simply aggregated. This tutorial is a simple introduction to the kind of data analytic strategies that are possible.
Reliability is the question of how much variance in a measure is not error. When examining multi-level data, the question is a bit harder, because we can think about several different reliabilities: within subjects, across subjects, across time, and so on.
Traditional reliability measures decompose the between person test variance (\(\sigma^2_x\)) into reliable variance and unreliable or error variance (\(\sigma^2_e\)).
\[\begin{equation} \rho_{xx} = 1 - \frac{\sigma^2_e}{\sigma^2_x} \end{equation}\]
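For example, with made-up numbers: if the error variance is \(\sigma^2_e = 20\) and the total test variance is \(\sigma^2_x = 100\), then \(\rho_{xx} = 1 - 20/100 = .80\).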
The problem is then how to find \(\sigma^2_e\). Solutions include test retest correlations and estimates based upon the internal structure of the test. Generalizability theory was developed to solve the problem of multiple sources of variance for each score, which is exactly the situation for multi-level data: when subjects have scores on multiple items at multiple time points, we are interested in several different indices of reliability. The technique developed in generalizability theory is to estimate the variance components using standard ANOVA procedures, finding the appropriate Mean Squares for subjects, time, items, etc., and then converting these to variance components based upon the expected values of the MS (Table~\(\ref{tab:anova}\) top part).
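A small sketch of that conversion, using simulated (not Fisher) data for a fully crossed persons x items design with one observation per cell: under the random effects model, \(E(MS_{person}) = \sigma^2_{resid} + n_{items}\sigma^2_{person}\), so \(\sigma^2_{person} = (MS_{person} - MS_{resid})/n_{items}\), and similarly for items.
set.seed(42)                              #illustrative data only
d <- expand.grid(person = factor(1:20), item = factor(1:4))
d$score <- rnorm(20)[d$person] + rnorm(4, sd = .5)[d$item] + rnorm(80, sd = .7)
ms <- summary(aov(score ~ person + item, data = d))[[1]][, "Mean Sq"]
sigma2.person <- (ms[1] - ms[3])/4        #(MS person - MS residual)/n.items
sigma2.item <- (ms[2] - ms[3])/20         #(MS item - MS residual)/n.persons
c(person = sigma2.person, item = sigma2.item, residual = ms[3])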
Taking advantage of the power of R to integrate the output of one function into another, we combine aov, lme, and lmer into one function (multilevel.reliability, or mlr) which can take wide data, transform it into the long format needed for aov etc., do the analyses, and then find the reliabilities based upon these components and the formulae given by Shrout and Lane. Thus the command to find multilevel reliability for a set of variables is just one line of code rather than the complex expressions necessary in SPSS or SAS.
An excellent tutorial on this is found in a chapter by Pat Shrout and Sean Lane. We use their toy example for the analysis.
A convenient way to store and transmit small data sets is to use the dput function, which puts the data into a readable and compact form. We can read it directly back into R.
shrout <- structure(list(Person = c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L,
5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L), Time = c(1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L,
4L, 4L), Item1 = c(2L, 3L, 6L, 3L, 7L, 3L, 5L, 6L, 3L, 8L, 4L,
4L, 7L, 5L, 6L, 1L, 5L, 8L, 8L, 6L), Item2 = c(3L, 4L, 6L, 4L,
8L, 3L, 7L, 7L, 5L, 8L, 2L, 6L, 8L, 6L, 7L, 3L, 9L, 9L, 7L, 8L
), Item3 = c(6L, 4L, 5L, 3L, 7L, 4L, 7L, 8L, 9L, 9L, 5L, 7L,
9L, 7L, 8L, 4L, 7L, 9L, 9L, 6L)), .Names = c("Person", "Time",
"Item1", "Item2", "Item3"), class = "data.frame", row.names = c(NA,
-20L))
headTail(shrout) #first and last 4 rows of the data frame in normal wide format
## Person Time Item1 Item2 Item3
## 1 1 1 2 3 6
## 2 2 1 3 4 4
## 3 3 1 6 6 5
## 4 4 1 3 4 3
## ... ... ... ... ... ...
## 17 2 4 5 9 7
## 18 3 4 8 9 9
## 19 4 4 8 7 9
## 20 5 4 6 8 6
#or just show the entire data set
shrout
## Person Time Item1 Item2 Item3
## 1 1 1 2 3 6
## 2 2 1 3 4 4
## 3 3 1 6 6 5
## 4 4 1 3 4 3
## 5 5 1 7 8 7
## 6 1 2 3 3 4
## 7 2 2 5 7 7
## 8 3 2 6 7 8
## 9 4 2 3 5 9
## 10 5 2 8 8 9
## 11 1 3 4 2 5
## 12 2 3 4 6 7
## 13 3 3 7 8 9
## 14 4 3 5 6 7
## 15 5 3 6 7 8
## 16 1 4 1 3 4
## 17 2 4 5 9 7
## 18 3 4 8 9 9
## 19 4 4 8 7 9
## 20 5 4 6 8 6
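Going the other direction, calling dput on an existing data frame prints exactly this sort of structure() text, which can then be pasted into a script or an email:
dput(shrout)   #prints the structure() expression that recreates the data frame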
We use the multilevel.reliability function (from psych) to analyze the data. Various reliability (or generalizability) coefficients may be found from these variance components. For instance, the reliability of individual differences over k fixed time points and m multiple items is \[\begin{equation}
R_{kF} = \frac{\sigma^2_{id} + \sigma^2_{id \times items}/m}{\sigma^2_{id}
+ \sigma^2_{id \times items}/m + \sigma^2_{error}/(k m)}
\label{eq:Rkf}
\end{equation}\] Equation~\(\ref{eq:Rkf}\) is just one (#6) of the six generalizability coefficients discussed by Shrout and Lane.
mg <- multilevel.reliability(shrout,grp="Person",Time="Time",items=
c("Item1","Item2","Item3"),plot=TRUE)
mg
##
## Multilevel Generalizability analysis
## Call: multilevel.reliability(x = shrout, grp = "Person", Time = "Time",
## items = c("Item1", "Item2", "Item3"), plot = TRUE)
##
## The data had 5 observations taken over 4 time intervals for 3 items.
##
## Alternative estimates of reliability based upon Generalizability theory
##
## RkF = 0.97 Reliability of average of all ratings across all items and times (Fixed time effects)
## R1R = 0.6 Generalizability of a single time point across all items (Random time effects)
## RkR = 0.85 Generalizability of average time points across all items (Random time effects)
## Rc = 0.74 Generalizability of change (fixed time points, fixed items)
## RkRn = 0.85 Generalizability of between person differences averaged over time (time nested within people)
## Rcn = 0.65 Generalizability of within person variations averaged over items (time nested within people)
##
## These reliabilities are derived from the components of variance estimated by ANOVA
## variance Percent
## ID 2.34 0.44
## Time 0.38 0.07
## Items 0.61 0.11
## ID x time 0.92 0.17
## ID x items 0.12 0.02
## time x items 0.05 0.01
## Residual 0.96 0.18
## Total 5.38 1.00
##
## The nested components of variance estimated from lme are:
## variance Percent
## id 2.3 0.45
## id(time) 1.1 0.21
## residual 1.7 0.34
## total 5.1 1.00
##
## To see the ANOVA and alpha by subject, use the short = FALSE option.
## To see the summaries of the ICCs by subject and time, use all=TRUE
## To see specific objects select from the following list:
## ANOVA s.lmer s.lme alpha summary.by.person summary.by.time ICC.by.person ICC.by.time lmer long Call
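As a check on Equation~\(\ref{eq:Rkf}\), we can recompute RkF by hand from the ANOVA variance components shown above (m = 3 items, k = 4 time points):
sigma2.id <- 2.34; sigma2.id.items <- 0.12; sigma2.error <- 0.96  #from the table above
m <- 3; k <- 4
RkF <- (sigma2.id + sigma2.id.items/m)/(sigma2.id + sigma2.id.items/m + sigma2.error/(k*m))
round(RkF, 2)   #about 0.97, matching the RkF reported by multilevel.reliability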
As is true of most R functions, there is a lot more there than meets the eye:
names(mg) #list the various objects included in the mg object
## [1] "n.obs" "n.time" "n.items" "components" "RkF"
## [6] "R1R" "RkR" "Rc" "RkRn" "Rcn"
## [11] "ANOVA" "s.lmer" "s.lme" "alpha" "summary.by.person"
## [16] "summary.by.time" "ICC.by.person" "ICC.by.time" "lmer" "long"
## [21] "Call"
headTail(mg$long) #show the first and last rows of the long object
## id time values items
## 1 1 1 2 Item1
## 2 1 2 3 Item1
## 3 1 3 4 Item1
## 4 1 4 1 Item1
## ... <NA> <NA> ... <NA>
## 57 5 1 7 Item3
## 58 5 2 9 Item3
## 59 5 3 8 Item3
## 60 5 4 6 Item3
One purpose of studying multi-level data is to explore individual differences in changes over time. People can differ in their affect as a function of the situation, their scores can increase or decrease over time, and they can show diurnal variation in their moods or their performance. To find the slope over time, we simply apply a within subject regression, and to examine the phase and fit of diurnal variation, we apply circular statistics with the cosinor function. Later, we discuss how to generate and analyze simulated data with a variety of data structures, particularly growth and decay over time, and diurnal variation.
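As a sketch of the first of these ideas (within subject slopes over time), we can regress positive affect on time separately for each participant in the affect.df data frame created earlier. This is illustrative only; the random coefficient models shown above estimate the same individual slopes within a single model.
slopes <- sapply(split(affect.df, affect.df$id),
   function(d) coef(lm(positive ~ time, data = d))["time"])   #one slope per person
round(slopes, 2)   #individual differences in linear trend over time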