---
title: "350.wk2a"
author: "William Revelle"
date: 4/01/24
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
options(width=100) #This sets the width of the output
```


## Exploring variability

When describing data, we want to know both the central tendency (mean, median) and the dispersion (range, variance, interquartile range).  There are a number of ways of doing this. 
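
As a minimal sketch of these statistics in core R, the chunk below computes each of them for one variable. It uses `mpg` from the built-in `mtcars` data purely for illustration (an assumption; it is not the course data set).

```{r}
# Illustrative only: 'mpg' from the built-in mtcars data, not the course data set
x <- mtcars$mpg

# central tendency
central <- c(mean = mean(x), median = median(x))

# dispersion: width of the range, variance, standard deviation, interquartile range
spread <- c(range = diff(range(x)), variance = var(x), sd = sd(x), IQR = IQR(x))

central
spread
```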

### We can `describe` the data or we can show the variation around the central tendency

### We will use the example data set from before

```{r}
library(psych)
library(psychTools)
fn <-  "http://personality-project.org/courses/350/datasets/simulation.txt"  
my.data <- read.file(fn) 
describe(my.data)
```

## Better yet is to show the data graphically

There are a number of ways of doing this.

In core R, we can use the `boxplot` function:

```{r}
boxplot(my.data)  # or
boxplot(my.data,notch=TRUE) #to show median confidence intervals
##show it by a categorical variable
boxplot(Arousal ~ Time,data=my.data,notch=TRUE)
```
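
To see where the notches come from, here is a small sketch using core R's `boxplot.stats`, which returns the numbers that `boxplot` draws. The notch limits are the median plus or minus 1.58 times the distance between the hinges, divided by the square root of the sample size. The built-in `mtcars` data is used only for illustration.

```{r}
# Illustrative only: built-in mtcars data, not the course data set
x <- mtcars$mpg
bs <- boxplot.stats(x)        # the statistics that boxplot() draws

hinges <- bs$stats[c(2, 4)]   # lower and upper hinges (roughly the quartiles)
notch.by.hand <- bs$stats[3] + c(-1.58, 1.58) * diff(hinges) / sqrt(bs$n)

bs$conf          # notch limits as returned by R
notch.by.hand    # the same limits from the formula
```

When the notches of two boxes do not overlap, that is usually taken as evidence that the two medians differ.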

### Or show the bivariate relationships using `pairs`

```{r}
pairs(my.data)
```

## The psych package has a number of descriptive functions

Let's first show the multivariate relationships using some of the `psych` functions.

```{r}
pairs.panels(my.data)
```
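
The relationships shown graphically can also be summarized numerically as a correlation matrix. This is a minimal sketch with core R's `cor`, using three columns of the built-in `mtcars` data as a stand-in (an assumption; substitute the course data).

```{r}
# Illustrative only: three columns of the built-in mtcars data
x <- mtcars[, c("mpg", "wt", "qsec")]

# Pearson correlations, using all pairwise complete observations
r <- cor(x, use = "pairwise.complete.obs")
round(r, 2)
```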

Now show the mean and confidence interval of each variable

```{r}
error.bars(my.data)
```
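
For comparison, here is a sketch of computing means and 95% confidence intervals by hand in core R. It assumes a t-based interval (mean plus or minus the critical t value times the standard error); whether this matches the `error.bars` defaults exactly should be checked against its help page. The built-in `mtcars` data is used only for illustration.

```{r}
# Illustrative only: three columns of the built-in mtcars data
x <- mtcars[, c("mpg", "qsec", "wt")]

n  <- colSums(!is.na(x))                        # sample size per variable
m  <- colMeans(x, na.rm = TRUE)                 # means
se <- apply(x, 2, sd, na.rm = TRUE) / sqrt(n)   # standard errors of the means

half.width <- qt(0.975, df = n - 1) * se        # assumes a t-based 95% interval
ci <- cbind(lower = m - half.width, mean = m, upper = m + half.width)
ci
```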

### Do these data differ by some grouping variable?

```{r}
error.bars(my.data ~ sex)  #formula input
```


### We can also do a dot chart

```{r}
results <- error.dots(my.data)
results #show the output
```

### or smoothed histograms
```{r}
densityBy(my.data,"Arousal" , grp="sex")
#do it again, but with a legend
densityBy(my.data,"Arousal" , grp="sex",legend=1)
#
#We can also do this using 'formula' mode and show a legend
densityBy(Arousal ~ Time, data=my.data, legend=1)
#or we can do two x variables at once
densityBy(Arousal ~ Time + sex, data=my.data, legend=1) #although the legend is bad
histBy(my.data,"Arousal" , group ="Time")  #but this one does
```
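
A smoothed histogram is a kernel density estimate. As a rough sketch of the same idea in core R, the chunk below estimates one density per group with `density` and overlays them. It uses the built-in `ToothGrowth` data only for illustration; it is not a reimplementation of `densityBy`.

```{r}
# Illustrative only: tooth length by supplement in the built-in ToothGrowth data
dens <- lapply(split(ToothGrowth$len, ToothGrowth$supp), density)

# common axis limits so both curves fit in one plot
xlim <- range(sapply(dens, function(d) range(d$x)))
ylim <- c(0, max(sapply(dens, function(d) max(d$y))))

plot(dens[[1]], xlim = xlim, ylim = ylim,
     main = "Tooth length by supplement", xlab = "len")
lines(dens[[2]], lty = 2)                       # second group, dashed
legend("topright", legend = names(dens), lty = 1:2)
```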

## Other statistics to describe group differences include Cohen's d

```{r}
cd <- cohen.d(my.data,"Time")
cd  #show them numerically
error.dots(cd,main="Cohen d statistic for our data")
```
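
Cohen's d expresses the difference between two group means in standard deviation units. The sketch below computes it by hand in core R, assuming the common pooled-standard-deviation form; the details of what `cohen.d` computes may differ, so treat this as an illustration of the idea rather than of that function. The built-in `ToothGrowth` data is used only as an example.

```{r}
# Illustrative only: tooth length by supplement in the built-in ToothGrowth data
g1 <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
g2 <- ToothGrowth$len[ToothGrowth$supp == "VC"]
n1 <- length(g1)
n2 <- length(g2)

# pooled standard deviation (one common convention; others exist)
sd.pooled <- sqrt(((n1 - 1) * var(g1) + (n2 - 1) * var(g2)) / (n1 + n2 - 2))

d <- (mean(g2) - mean(g1)) / sd.pooled   # difference in sd units
d

# d and the equal-variance t statistic are linked: t = d * sqrt(n1 * n2 / (n1 + n2))
d * sqrt(n1 * n2 / (n1 + n2))
```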

### Functions return more than they show. Examine the output of `cd` (from above)

```{r}
names(cd) #what are the various objects 
cd$t  #show the t test values
cd$descriptive  #show the descriptive statistics
```
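
The same pattern holds for most R functions: printing an object shows a formatted summary, while the object itself is a list with more elements. Here is a small sketch using core R's `t.test` on the built-in `sleep` data (chosen only for illustration).

```{r}
# Illustrative only: a core R t-test on the built-in sleep data
tt <- t.test(extra ~ group, data = sleep)

tt                 # the print method shows a formatted summary
names(tt)          # but the object holds several named elements
tt$statistic       # extract one element by name
str(tt$conf.int)   # str() shows the structure of any piece
```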