bestScales {psych} R Documentation

## A set of functions for factorial and empirical scale construction

### Description

When constructing scales through rational, factorial, or empirical means, it is useful to examine the content of the items that relate most highly to each other (e.g., the factor loadings of `fa.lookup` of a set of items) , or to some specific set of criteria (e.g., `bestScales`). Given a dictionary of item content, these routines will sort by factor loading or criteria correlations and display the item content.

### Usage

```bestScales(x,criteria,cut=.1,n.item =10,overlap=FALSE,dictionary=NULL,impute="none",
n.iter =1,frac=.9,digits=2)
bestItems(x,criteria=1,cut=.3, abs=TRUE, dictionary=NULL,digits=2)
lookup(x,y,criteria=NULL)
fa.lookup(f,dictionary,digits=2)
item.lookup(f,m, dictionary,cut=.3, digits = 2)
keys.lookup(keys.list,dictionary)
```

### Arguments

 `x` A data matrix or data frame depending upon the function. `y` A data matrix or data frame or a vector `criteria` Which variables (by name or location) should be the empirical target for bestScales and bestItems `f` The object returned from either a factor analysis (fa) or a principal components analysis (principal) `keys.list` A list of scoring keys suitable to use for make.keys `cut` Return all values in abs(x[,c1]) > cut. `abs` if TRUE, sort by absolute value in bestItems `dictionary` a data.frame with rownames corresponding to rownames in the f\$loadings matrix or colnames of the data matrix or correlation matrix, and entries (may be multiple columns) of item content. `m` A data frame of item means `n.item` How many items make up an empirical scale `overlap` Are the correlations with other criteria fair game for bestScales `impute` When finding the best scales, and thus the correlations with the criteria, how should we handle missing data? The default is drop missing items. `n.iter` Replicate the best item function n.iter times, sampling frac of the cases each time, and validating on 1 - frac of the cases. `frac` What fraction of the subjects should be used for the derivation sample. frac is set to one if n.iter = 1 `digits` round to digits

### Details

`bestItems` and `lookup` are simple helper functions to summarize correlation matrices or factor loading matrices. `bestItems` will sort the specified column (criteria) of x on the basis of the (absolute) value of the column. The return as a default is just the rowname of the variable with those absolute values > cut. If there is a dictionary of item content and item names, then include the contents as a two column matrix with rownames corresponding to the item name and then as many fields as desired for item content. (See the example dictionary `bfi.dictionary`).

`lookup` is used by `bestItems` and will find values in c1 of y that match those in x. It returns those rows of y of that match x. Suppose that you have a "dictionary" of the many variables in a study but you want to consider a small subset of them in a data set x. Then, you can find the entries in the dictionary corresponding to x by lookup(rownames(x),y) If the column is not specified, then it will match by rownames(y).

`fa.lookup` is used when examining the output of a factor analysis and one wants the corresponding variable names and contents. The returned object may then be printed in LaTex by using the `df2latex` function with the char option set to TRUE.

Similarly, given a correlation matrix, r, of the x variables, if you want to find the items that most correlate with another item or scale, and then show the contents of that item from the dictionary, bestItems(r,c1=column number or name of x, contents = y)

`bestScales` will find up to n.items that have absolute correlations with a criterion greater than cut. If the overlap option is FALSE (default) the other criteria are not used. This is an example of “dust bowl empiricism" in that there is not latent construct being measured, just those items that most correlate with a set of criteria. The empirically identified items are then formed into scales (ignoring concepts of internal consistency) which are then correlated with the criteria.

Clearly, `bestScales` is capitalizing on chance associations. Thus, we can validate the empirical scales by deriving them on a fraction of the total number of subjects, and cross validating on the remaining subjects. This is done n.iter times to show the variability of these empirically derived scale validities. For very large data sets (e.g., those from SAPA) these scales seem very stable.

`item.lookup` combines the output from a factor analysis `fa` with simple descriptive statistics (a data frame of means) with a dictionary. Items are grouped by factor loadings > cut, and then sorted by item mean. This allows a better understanding of how a scale works, in terms of the meaning of the item endorsements.

### Value

`bestScales` returns the correlation of the empirically constructed scale with each criteria and the items used in the scale. If a dictionary is specified, it also returns a list (value) that shows the item content. Also returns the keys list so that scales can be found using `cluster.cor` or `scoreItems`.

`bestItems` returns a sorted list of factor loadings or correlations with the labels as provided in the dictionary.

`lookup` is a very simple implementation of the match function.

`fa.lookup` takes a factor/cluster analysis object (or just a keys like matrix), sorts it using `fa.sort` and then matches by row.name to the corresponding dictionary entries.

### Note

To create a dictionary, create an object with row names as the item numbers, and the columns as the item content. See the `link{bfi.dictionary}` as an example.

### Note

Although empirical scale construction is appealing, it has the basic problem of capitalizing on chance. Thus, be careful of over interpreting the results unless working with large samples.

William Revelle

### References

Revelle, W. (in preparation) An introduction to psychometric theory with applications in R. Springer. (Available online at http://personality-project.org/r/book).

`fa`, `iclust`,`principal`

### Examples

```bs <- bestScales(bfi,criteria=c("gender","education","age"),dictionary=bfi.dictionary)
bs
f5 <- fa(bfi,5)
m <- colMeans(bfi,na.rm=TRUE)
item.lookup(f5,m,dictionary=bfi.dictionary[2])
fa.lookup(f5,dictionary=bfi.dictionary[2])  #just show the item content, not the source of the items

```

[Package psych version 1.7.8 ]