When describing data, we want to know both the central tendency (mean, median) and the dispersion (range, variance, interquartile range). We can describe the data numerically, or we can show the variation around the central tendency graphically.
library(psych)
library(psychTools)
fn <- "http://personality-project.org/courses/350/datasets/simulation.txt"
my.data <- read.file(fn)
## Data from the .txt file http://personality-project.org/courses/350/datasets/simulation.txt has been loaded.
describe(my.data)
##             vars  n  mean    sd median trimmed   mad min max range  skew kurtosis   se
## Time           1 72 14.28  5.03   19.0   14.34  0.00   9  19    10 -0.11    -2.02 0.59
## Anxiety        2 72  5.24  2.18    5.0    5.24  2.97   0  10    10 -0.04    -0.65 0.26
## Impulsivity    3 72  4.90  3.98    4.5    4.88  5.19   0  10    10  0.02    -1.83 0.47
## sex            4 72  1.50  0.50    1.5    1.50  0.74   1   2     1  0.00    -2.03 0.06
## Arousal        5 72 60.90  8.10   66.0   61.29  5.93  48  70    22 -0.27    -1.67 0.96
## Tension        6 72 56.83  6.29   57.0   57.14  5.93  38  69    31 -0.53     0.42 0.74
## Performance    7 72 72.21 17.41   78.0   73.19 18.53  38  98    60 -0.43    -1.10 2.05
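Most columns of this table can be reproduced in base R. The following is a minimal sketch on a small made-up vector x (hypothetical data, not the simulation file). It assumes the usual definitions: the standard error is sd/sqrt(n), and the trimmed mean drops 10% from each tail, which we believe matches the describe default.

```r
# Hypothetical scores, for illustration only
x <- c(2, 4, 4, 5, 7, 9, 10, 11, 12, 16)
n <- length(x)

# Central tendency
m       <- mean(x)              # arithmetic mean
med     <- median(x)            # middle value
trimmed <- mean(x, trim = 0.1)  # mean after dropping 10% from each tail

# Dispersion
s   <- sd(x)                    # standard deviation
rng <- diff(range(x))           # max - min
iqr <- IQR(x)                   # interquartile range
md  <- mad(x)                   # median absolute deviation (scaled by 1.4826)
se  <- s / sqrt(n)              # standard error of the mean

round(c(mean = m, median = med, trimmed = trimmed, sd = s,
        range = rng, IQR = iqr, mad = md, se = se), 2)
```

The same relation can be checked against the table: for Time, 5.03/sqrt(72) is about 0.59, the value in the se column.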
There are several ways to show this variation graphically. In core R, we can use the boxplot function:
boxplot(my.data) # or
boxplot(my.data,notch=TRUE) #to show median confidence intervals
## Warning in (function (z, notch = FALSE, width = NULL, varwidth = FALSE, : some notches went outside
## hinges ('box'): maybe set notch=FALSE
# show it by a categorical variable
boxplot(Arousal ~ Time,data=my.data,notch=TRUE)
# Or show the bivariate relationships using ‘pairs’
pairs(my.data)
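The notch warning appears when the notch around a median extends beyond a hinge of its box. As a sketch of why this happens: the R documentation for boxplot.stats describes the notch as the median plus or minus 1.58 IQR/sqrt(n), with the IQR taken between the hinges. The vector x below is made up for illustration.

```r
# Hypothetical small sample; a small n makes for wide notches
x <- c(48, 50, 52, 66, 66, 68, 70, 70)
n <- length(x)

five  <- fivenum(x)   # min, lower hinge, median, upper hinge, max
notch <- five[3] + c(-1.58, 1.58) * (five[4] - five[2]) / sqrt(n)

# boxplot.stats() reports the same interval in its $conf element
conf <- boxplot.stats(x)$conf

# TRUE when a notch limit lies outside the box, the case that triggers the warning
outside <- notch[1] < five[2] || notch[2] > five[4]
outside
```

Here the median sits close to the upper hinge, so the upper notch limit falls outside the box. The warning is about the drawing, not about the data.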
Or we can use some of the psych functions. Let's first show the multivariate relationships:
pairs.panels(my.data)
Now show the mean of each variable with its confidence interval:
error.bars(my.data)
error.bars(my.data ~ sex) #formula input
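To see what such an error bar represents, here is a minimal base R sketch of a 95% confidence interval for a mean, computed as the mean plus or minus a t quantile times the standard error. We assume this is close to what error.bars draws by default, but the psych internals are not shown here, and the scores below are hypothetical.

```r
# Hypothetical scores, for illustration only
x  <- c(55, 58, 60, 61, 63, 64, 66, 70)
n  <- length(x)
m  <- mean(x)
se <- sd(x) / sqrt(n)

alpha <- 0.05
half  <- qt(1 - alpha / 2, df = n - 1) * se   # half-width of the interval
ci    <- c(lower = m - half, mean = m, upper = m + half)

# t.test() reports the same interval
tt <- t.test(x, conf.level = 1 - alpha)$conf.int
ci
```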
results <- error.dots(my.data)
results #show the output
## $des
##             vars  n  mean    sd median trimmed   mad min max range  skew kurtosis   se
## Time           1 72 14.28  5.03   19.0   14.34  0.00   9  19    10 -0.11    -2.02 0.59
## Anxiety        2 72  5.24  2.18    5.0    5.24  2.97   0  10    10 -0.04    -0.65 0.26
## Impulsivity    3 72  4.90  3.98    4.5    4.88  5.19   0  10    10  0.02    -1.83 0.47
## sex            4 72  1.50  0.50    1.5    1.50  0.74   1   2     1  0.00    -2.03 0.06
## Arousal        5 72 60.90  8.10   66.0   61.29  5.93  48  70    22 -0.27    -1.67 0.96
## Tension        6 72 56.83  6.29   57.0   57.14  5.93  38  69    31 -0.53     0.42 0.74
## Performance    7 72 72.21 17.41   78.0   73.19 18.53  38  98    60 -0.43    -1.10 2.05
##
## $order
## [1] 4 3 2 1 6 5 7
densityBy(my.data,"Arousal" , grp="sex")
#do it again, but with a legend
densityBy(my.data,"Arousal" , grp="sex",legend=1)
#
#We can also do this using 'formula' mode and show a legend
densityBy(Arousal ~ Time, data=my.data, legend=1)
#or we can do two x variables at once
densityBy(Arousal ~ Time + sex, data=my.data, legend=1) #although the legend is bad
histBy(my.data,"Arousal" , group ="Time") #but this one does
cd <- cohen.d(my.data,"Time")
cd #show them numerically
## Call: cohen.d(x = my.data, group = "Time")
## Cohen d statistic of difference between two means
##             lower effect upper
## Anxiety     -0.44   0.03  0.49
## Impulsivity -0.34   0.12  0.59
## sex         -0.57  -0.11  0.35
## Arousal      5.39   6.58  7.75
## Tension     -0.36   0.10  0.56
## Performance  2.84   3.60  4.35
##
## Multivariate (Mahalanobis) distance between groups
## [1] 7.4
## r equivalent of difference between two means
##     Anxiety Impulsivity         sex     Arousal     Tension Performance
##        0.01        0.06       -0.06        0.96        0.05        0.87
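For readers who want to see where these numbers come from, here is a minimal sketch of the textbook Cohen d: the difference between two group means divided by the pooled standard deviation. The two groups below are hypothetical, and the pooling and the r conversion are the standard formulas, which may differ in detail from what cohen.d computes internally.

```r
# Hypothetical scores in two groups, for illustration only
g1 <- c(48, 50, 52, 54, 56)
g2 <- c(60, 62, 66, 68, 70)
n1 <- length(g1)
n2 <- length(g2)

# Pooled standard deviation, weighting each variance by its degrees of freedom
sp <- sqrt(((n1 - 1) * var(g1) + (n2 - 1) * var(g2)) / (n1 + n2 - 2))

d      <- (mean(g2) - mean(g1)) / sp      # standardized mean difference
r      <- d / sqrt(d^2 + 4)               # common r equivalent, assuming equal group sizes
t_stat <- d * sqrt(n1 * n2 / (n1 + n2))   # the matching pooled-variance t statistic

round(c(d = d, r = r, t = t_stat), 2)
```

Applying the same conversion to the output above, d = 6.58 for Arousal gives 6.58/sqrt(6.58^2 + 4), or about 0.96, and d = 3.60 for Performance gives about 0.87, in agreement with the r equivalents printed.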
error.dots(cd,main="Cohen d statistic for our data")
names(cd) #what are the various objects
##  [1] "cohen.d"     "hedges.g"    "M.dist"      "r"           "t"           "n"
##  [7] "p"           "wt.d"        "descriptive" "se"          "dict"        "order"
## [13] "Call"
cd$t #show the t test values
##     Anxiety Impulsivity         sex     Arousal     Tension Performance
##   0.1105898   0.5130930  -0.4662524  27.4682568   0.4231092  15.0529108
cd$descriptive #show the descriptive statistics
## Statistics within and between groups
## Call: statsBy(data = x, group = group)
## Intraclass Correlation 1 (Percentage of variance due to groups)
##        Time     Anxiety Impulsivity         sex     Arousal     Tension Performance
##        1.00       -0.03       -0.02       -0.02        0.95       -0.02        0.86
## Intraclass Correlation 2 (Reliability of group differences)
##        Time     Anxiety Impulsivity         sex     Arousal     Tension Performance
##        1.00      -80.77       -2.80       -3.60        1.00       -4.59        1.00
## eta^2 between groups
##     Anxiety.bg Impulsivity.bg         sex.bg     Arousal.bg     Tension.bg Performance.bg
##           0.00           0.00           0.00           0.92           0.00           0.76
##
## To see the correlations between and within groups, use the short=FALSE option in your print statement.
## Many results are not shown directly. To see specific objects select from the following list:
## mean sd n F ICC1 ICC2 ci1 ci2 raw rbg pbg rwg nw ci.wg pwg etabg etawg nwg nG Call
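The eta^2 values can be read as the share of total variation that lies between the groups. As a minimal sketch of that definition in base R, using hypothetical data rather than the simulation file:

```r
# Hypothetical data: one outcome, two groups (for illustration only)
y     <- c(48, 50, 52, 54, 56, 60, 62, 66, 68, 70)
group <- rep(c(1, 2), each = 5)

grand    <- mean(y)
grp_mean <- ave(y, group)                 # each score replaced by its group mean

ss_between <- sum((grp_mean - grand)^2)   # variation of group means around the grand mean
ss_total   <- sum((y - grand)^2)          # total variation
eta2       <- ss_between / ss_total

# With exactly two groups, eta^2 equals the squared point-biserial correlation
r2 <- cor(y, group)^2
round(c(eta2 = eta2, r2 = r2), 3)
```

The same link shows up in the output above: the r equivalents of 0.96 for Arousal and 0.87 for Performance square to roughly 0.92 and 0.76, the eta^2 values reported between groups.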