---
title: '350.week 4 part 1: Factor analysis'
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
options(width=100)
```

# Running code and debugging it

First we show some simple functions and step through them using the *debug* and *browser* functions.

# Dimension reduction through multivariate analysis

```{r}
library(psych)
library(psychTools)
```

First, look at the Factor analysis handout. This discusses the problem of iterative fit as well as the basic logic of factor analysis.

### Iterative Fit

The basic issue in iterative fitting is to try an approximation and then adjust the fit to get a better approximation. Consider an iterative solution to the square root. (Clearly not as powerful as just asking for `sqrt`.)

```{r}
iterative.sqrt <- function(X, guess) {
  if (missing(guess)) guess <- 1   # a dumb first guess
  iterate <- function(X, guess) {
    low <- guess
    high <- X/guess
    guess <- (high + low)/2
    guess
  }
  Iter <- matrix(NA, ncol = 3, nrow = 10)
  colnames(Iter) <- c("Guess", "X/Guess", "Error")
  for (i in 1:10) {   # a small loop
    Iter[i, 1] <- guess
    Iter[i, 2] <- guess <- iterate(X, guess)
    Iter[i, 3] <- Iter[i, 1] - Iter[i, 2]
  }
  Iter   # this returns the matrix of successive guesses
}

result <- iterative.sqrt(47, 1)
matplot(result, type = "b")   # plot them using matplot
```

Run the function again with a different value of X (and the default starting guess).

```{r}
result <- iterative.sqrt(7228)
result   # show the numbers
matplot(result, type = "b")
```

# Multidimensional analysis

Many procedures use this concept of iterative fitting. Some, such as principal components, are "closed form" and can just be solved directly. Others, such as "factor analysis", need to find solutions by successive approximation. Factor or component rotation is also an iterative procedure. Principal components and factor analysis are both the result of some basic matrix equations used to approximate a matrix.

## A brief digression about matrices

Matrices have been used to solve simultaneous equations since at least the time of the Babylonians and the early Chinese. They were first introduced into psychology in the early 1930s. They are used in almost all statistical tests. Sometimes the use is hidden from the user, sometimes it is made explicit. As we know, a *matrix* is just a two dimensional representation of our data. All elements of the matrix must be of the same *type*. For data analysis, the elements of a matrix are numbers. Matrices have *rows* and *columns*. Their dimensionality is found by using *dim*.

## Correlations as matrix operations

For a data matrix *X*, if we convert the observations to deviation scores (i.e., subtract the mean from all scores), then find the matrix product of *X* with itself and divide by the number of subjects (minus one), we have a *covariance* matrix. The diagonal of that matrix is a vector of variances.

$$C = X'X/(N-1)$$
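Before turning to a real data set, here is a minimal sketch (a toy check, not part of the handout) showing that this matrix formula reproduces what the built-in `cov` function gives. The little 4 x 2 matrix is made up purely for illustration.

```{r}
# A tiny 4 x 2 data matrix to check the identity C = X'X/(N-1)
toy <- matrix(c(1, 2, 3, 4, 2, 1, 4, 3), ncol = 2)
dim(toy)                             # 4 rows (subjects) by 2 columns (variables)
dev <- scale(toy, scale = FALSE)     # deviation scores (subtract the column means)
C <- t(dev) %*% dev/(nrow(dev) - 1)  # the matrix formula for the covariances
C
cov(toy)                             # the built-in function gives the same matrix
```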
### Show this with a small data set

We use the `attitude` data set.

```{r}
library(psych)
X <- attitude
head(X)   # show the first six lines
dim(X)
x <- scale(X, scale = FALSE)   # find the deviation scores using the scale function
head(x)
z <- scale(X)   # or find the standard scores
head(z)
C <- t(x) %*% x/(nrow(x) - 1)   # matrix multiplication gives the covariances
round(C, 2)
V <- diag(C)   # the variances are the diagonal of the covariance matrix
sdi <- diag(1/sqrt(V))   # divide by the standard deviations
R <- sdi %*% C %*% t(sdi)
round(R, 2)
r <- t(z) %*% z/(nrow(z) - 1)   # or find the correlation matrix from the standard scores
round(r, 2)
#or
r <- lowerCor(X)
```

# Principal Components analysis and Factor Analysis

Conceptually, we are just taking the square root of a correlation matrix:

$$R \approx CC'$$

or

$$R \approx FF' + U^2$$

For any given correlation matrix, R, can we find a C or an F matrix which approximates it?

## A short digression into eigen decomposition (optional)

A square matrix *R* can be decomposed into a matrix of *orthogonal eigenvectors* and a set of scaling *eigenvalues*

$$R = X \lambda X'$$

such that

$$X X' = I$$

We do this using the *eigen* function.

```{r}
ev <- eigen(R)   # find the eigenvalues and eigenvectors
ev
evect <- ev$vectors   # the eigenvectors
evalue <- ev$values   # the eigenvalues
round(evect %*% diag(evalue) %*% t(evect), 2)   # this reproduces R
```

### Principal components are just a simple way of representing eigenvalues and eigenvectors

If
$$R = X \lambda X'$$
then we call the principal components
$$C = X \sqrt{\lambda}$$

```{r}
PC <- evect %*% diag(sqrt(evalue))
round(PC, 2)
round(PC %*% t(PC), 2)   # all the components together reproduce R exactly
```

## Use the `principal` function

```{r}
pc <- principal(X, 7, rotate = "none")
pc
```

## But, this is not very helpful

However, if we take just the first one or two principal components, we get a nice description.

```{r}
pc1 <- principal(X)
pc1
```

# Statistics is model fitting

*Data = Model + Residual*

Our model is one principal component and the data are the correlation matrix. How well does the model fit?

```{r}
residual <- R - pc1$loading %*% t(pc1$loading)
round(residual, 2)
#or
residuals(pc1)   # the residuals function shows the residuals
```

# Factor Analysis fits the correlations/covariances of the off diagonal

Components analysis fits the entire matrix
$$R \approx CC'$$
but factor analysis fits the *common* part of the matrix and estimates the diagonal
$$R \approx FF' + U^2$$

## A very simple example (created using `sim.congeneric`)

First create a matrix using a simulation function.

```{r}
library(psych)
R <- sim.congeneric(load = c(.9, .8, .7, .6))
lowerMat(R)
```

Can we approximate this? (Yes; think about your multiplication tables.)

```{r}
R <- sim.congeneric(load = c(.9, .8, .7, .6))
lowerMat(R)
f1 <- fa(R)   # use the factor analysis function in `psych`
f1            # look at the factor model
f1$model
residuals(f1)   # examine the residuals (Data - Model)
```

# Simulate some more data using `sim.item`

First let's look at the function.

```{r}
sim.item
```

Now run the function with 12 variables.

```{r}
my.data <- sim.item(12)   # use the defaults
describe(my.data)
corPlot(my.data)       # show the structure
f2 <- fa(my.data, 2)   # extract two factors
f2                     # show them
fa.plot(f2)            # another way of showing the structure
```

## Let's try some real data

Use the `bfi` data set.

```{r}
describe(bfi[1:5])        # always describe first
R <- lowerCor(bfi[1:5])   # just the first 5 items from the bfi
f1 <- fa(R)
f1
residuals(f1)
```

## Show the item content as well

```{r}
fa.lookup(f1, dictionary = bfi.dictionary[1:2])
```
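Just as we did for the principal component, we can reproduce the factor model by hand and compare Data with Model. This is a minimal sketch (not part of the original handout); the off-diagonal values should match what `residuals(f1)` reports.

```{r}
# Reproduce the one-factor model for the first five bfi items by hand.
# R is the correlation matrix from lowerCor(bfi[1:5]) and f1 is the one-factor solution.
L <- f1$loadings      # the factor loadings (a 5 x 1 matrix)
model <- L %*% t(L)   # the model implied correlations (the common part only)
round(R - model, 2)   # off-diagonal values match residuals(f1);
                      # the diagonal shows 1 - communality (the uniquenesses)
```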
## But what if we included more items?

```{r}
f1a <- fa(bfi[1:10])
f1a
residuals(f1a)
```

## Extract another factor (the algebra is the same: $R \approx FF' + U^2$)

```{r}
f2 <- fa(bfi[1:10], nfactors = 2)   # ask for two factors
f2
resid(f2)   # these residuals are much smaller
fa.lookup(f2, dictionary = bfi.dictionary[1:2])   # note that we sort the loadings
```

## Let's try a different data set

The msq represents 75 mood and emotion items, but we will take just 12 of them.

```{r}
affect <- c("active", "energetic", "wakeful", "sleepy", "tired", "drowsy",
            "jittery", "fearful", "tense", "placid", "calm", "at.rest")
f2 <- fa(msq[affect], 2)
f2            # but "Alabama need not come first": don't show the items in alphabetical order
fa.sort(f2)   # sorted by the size of the loadings
fa.plot(f2, labels = affect)   # show the results graphically
```

## How many factors are in the data?

Although there is no right answer, there are many ways of finding one. Parallel analysis compares the solution to solutions of random data; `nfactors` tries multiple solutions.

```{r}
fa.parallel(msq[1:100, affect])   # run a parallel analysis
```

Now try the `nfactors` approach.

```{r}
nfactors(msq[affect])
```

# Categorical variables underestimate the correlation

## The tetrachoric and polychoric correlation

Correlations are based upon the assumption of continuous x and y. But if the variables are categorical, or even worse, dichotomous, this will underestimate the "true" correlation. Consider what happens when we dichotomize a bivariate normal:

```{r}
draw.tetra()
```

We can estimate the underlying "latent" correlation using the `tetrachoric` or `polychoric` functions. We can then do the analysis using these correlations.

```{r}
f2p <- fa(msq[affect], 2, cor = "poly")
fa.sort(f2p)   # compare this to what we got before
fa.congruence(f2p, f2)   # factor congruence coefficients
```
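To see the underestimation directly, here is a minimal sketch (not part of the original handout) comparing ordinary Pearson correlations with polychoric estimates for four of the msq items. Because the items are rated on only a few categories, the Pearson values are expected to be somewhat smaller.

```{r}
# Compare Pearson and polychoric correlations for four of the msq affect items.
few <- c("active", "energetic", "sleepy", "tired")
round(cor(msq[few], use = "pairwise"), 2)   # ordinary Pearson correlations
round(polychoric(msq[few])$rho, 2)          # estimated latent (polychoric) correlations
```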