---
title: "314.wk2.Rmd"
output:
  html_document: default
  pdf_document: default
---
Part of the lecture notes and assignments for Using R in psychological research at Northwestern University, Fall, 2017.
#314: Exercises for Week 2: Reading and writing data
Before it is possible to use R for analysis, we must first get the data. Data files come in many different flavors. Here we will explore how to read in data from the clipboard, from text and csv files, as well as from SPSS.
##Preliminaries, using RMarkdown to annotate and show your work
We run this in the script window of RStudio so that we can keep our notes.
This way we can embed text (what you are reading) with the actual R commands and the R output. This is a convenient way to remember what you are doing.
Before we do anything, we need to set up RMarkdown with some sensible options.
I show the actual commands issued, which are hidden when we knit.
{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
options(width=100) #This sets the width of the output, 80 seems to be the default and is too narrow
To make these commands run in R, you put three backticks (the backtick shares a key with the tilde on the keyboard) immediately before the {r ...} on the first line
and then close the chunk by adding three more backticks on the line after the last command.
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
options(width=100) #This sets the width of the output, 80 seems to be the default and is too narrow
```
This entire file is saved in the class notes folder so that you can see how the Markdown commands are written.
##Creating an RMarkdown script
Open RStudio.
Create a new file by choosing New File from the File menu and selecting the R Markdown option. You now have an RMarkdown template that you can modify with the commands that you want. Remember, to make some R code run in your template, precede it with three backticks and then {r}, start a new line with some R commands, and eventually close with three more backticks on their own line.
##Reading the data using the ```psych``` package.
Much of this is summarized in the vignette: An introduction to the psych package: Part I: data entry and data description which you may get by finding the vignettes for psych.
For these examples, we first need to activate the psych package.
We will read the data using several different approaches. For each of these approaches, we will save the data in the object `my.data`. You can, of course, call this object anything you want.
```{r}
library(psych) #this assumes we have already installed psych
```
# Just read from the clipboard
If you have a data set that you have read from a web browser, or found in a file that you viewed, you can copy the file to your clipboard (using the appropriate commands for your system) and then read the clipboard into R.
First, we use our browser to read the remote file:
http://personality-project.org/r/datasets/simulation.txt
Select all elements of the file and copy to the clipboard. Then
```{r}
my.data <- read.clipboard() #this takes what is on the clipboard and puts it into the my.data object
```
Now, let's see what we got. We will ask for the dimensions of my.data, show the first and last few lines, and then get some basic descriptive statistics.
```{r}
dim(my.data) #what is the size of the object we read?
headTail(my.data) #show the first and last 4 lines of the object
describe(my.data) #get some descriptive statistics of this object
```
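For comparison, base R has rough counterparts to these inspection commands. This is only a sketch: it uses the built-in `mtcars` data set as a stand-in for `my.data` (an assumption made so the example runs without the clipboard), and `example.data` is a name chosen just for this illustration.

```r
# mtcars ships with R; it stands in for my.data here (an assumption)
example.data <- mtcars

dim(example.data)        # number of rows and columns
head(example.data, 4)    # the first 4 lines
tail(example.data, 4)    # the last 4 lines
summary(example.data)    # base R summary statistics for each column
```

The psych functions `headTail` and `describe` give the same kind of information in a more compact form.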
#Or we can specify the file name and then use the read.file command
Instead of reading from the clipboard, we can specify the local or remote location of the file and read it directly.
```{r}
file.name <- "http://personality-project.org/r/datasets/simulation.txt"
my.data <- read.file(file.name) #goes to the remote location and reads it
```
Once again, we want to see what we got.
```{r}
dim(my.data) #what is the size of the object we read?
headTail(my.data) #show the first and last 4 lines of the object
describe(my.data) #get some descriptive statistics of this object
```
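The introduction also mentions csv files. A minimal base R sketch of that case follows. The data are made up and the file is a temporary one, so the names `toy`, `csv.name`, and `toy.back` are hypothetical and are not part of the course data sets.

```r
# A tiny made-up data set (hypothetical values, for illustration only)
toy <- data.frame(id = 1:3, score = c(10.5, 12.0, 9.25))

# Write it to a temporary csv file so the example leaves nothing behind
csv.name <- tempfile(fileext = ".csv")
write.csv(toy, csv.name, row.names = FALSE)

# Read it back with base R and check what we got
toy.back <- read.csv(csv.name)
dim(toy.back)    # 3 rows and 2 columns
toy.back
```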
#Read a local file using file.choose()
We can find the file on our local hard disk by looking for it with the file.choose command. Unfortunately, I need to comment out this statement because it cannot be run interactively as part of a script. So, I will make up a new object `fn` (file name), which I will set to the path I found when I looked for the file.
```{r}
#next line is suppressed because we can not do it interactively
#so instead, we will define fn as file.name
#fn <-file.choose() # this opens your system to look for the file
fn <- "/Users/WR/Box Sync/pmc_folder/314/datasets/simulation.txt" #from my looking for it
fn # show the name of the file
my.data <- read.file(fn)
dim(my.data) #still the 72 by 7 data file
```
#Combining file.choose and read.file into one command
If I do not specify the name of the file (fn) in my read.file command, R will open a system window to let me find it on my machine. What it is doing is calling the file.choose function for you.
I cannot show this in a script, but you can try it on your machine.
```{r}
#my.data <- read.file()
dim(my.data)
```
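The idea of falling back to file.choose when no name is given can be sketched in base R. The helper `read.or.choose` below is hypothetical (it is not part of psych and is not how read.file is actually written), and it assumes a plain text file with a header line.

```r
# Hypothetical helper (not part of psych): browse for a file only if none is given
read.or.choose <- function(f = NULL, ...) {
  if (is.null(f)) {
    if (!interactive()) stop("No file name given and R is not running interactively")
    f <- file.choose()    # opens the system file browser
  }
  read.table(f, header = TRUE, ...)
}

# Non-interactive demonstration with a temporary file of made-up data
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(x = 1:2, y = c("a", "b")), tmp, row.names = FALSE)
demo.data <- read.or.choose(tmp)
dim(demo.data)    # 2 rows and 2 columns
```

Calling `read.or.choose()` with no argument in an interactive session would open the file browser instead.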
#Reading an SPSS file
SPSS saves the data in a format with the .sav suffix. We can read these data in using read.file. Eli Finkel has shared a small SPSS .sav file.
```{r}
fn <- "http://personality-project.org/r/datasets/finkel.sav"
eli <- read.file(fn) #go and get it and convert to a normal data.frame
dim(eli)
headTail(eli)
colnames(eli)
describe(eli)
```
##Keeping (viewing) the original codes
By default, the read.file function translates complex coding systems into numerical values. Sometimes you want to see the actual encoding of the SPSS file. You can do this by specifying 'use.value.labels=TRUE'.
Compare the next two objects. (Taken from the help pages of an SPSS online training workshop at Central Michigan University.)
```{r}
fn <- "http://personality-project.org/r/datasets/Cars.sav"
data1 <- read.file(fn) #go and get it and convert to a normal data.frame
data2 <- read.file(fn,use.value.labels=TRUE) #don't convert the value labels
headTail(data1) #look at the first and last few lines
headTail(data2) #notice we now have the values as entered
```
The describe function will describe both data sets. It converts the levels information in the second data set into numeric values and then does the description. Note the conversion of the year variable (it was 1 to 13 in the SPSS converted file, but 70 to 82 in the describe converted object).
```{r}
describe(data1)
describe(data2)
```
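The difference between level codes and the original values can be sketched in base R. The `year` vector below is made up for illustration (an assumption, not taken from Cars.sav); it only shows how a labelled variable stored as a factor turns into numbers.

```r
# Made-up values standing in for a labelled SPSS variable (an assumption)
year <- factor(c(70, 71, 82, 70))

levels(year)                      # the labels: "70" "71" "82"
as.numeric(year)                  # the level codes: 1 2 3 1
as.numeric(as.character(year))    # the original values: 70 71 82 70
```

Converting a factor directly to numeric gives the position of each level, not the value printed in the label, which is why the same variable can appear on two different scales.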
#Writing data
Just as there are several input formats, so are there several output formats.
Collections of objects that are to be read in again by R can be saved as .rda files (Rdata files).
A single object can be written as an .rds file.
Files can also be written as text files so that other programs outside of R can read them.
You choose the way you want to write and save the file by specifying the suffix:
.txt or .text becomes a normal text file (that is to say, readable by a word processor)
.rds becomes a file readable by R
.rda can save multiple objects
##Creating a file and writing to it
To create a new file on your disk, use the file.choose function with new=TRUE, or just call write.file(object, f=).
That is to say, call write.file with the object to save and with f= set to where to save it.
```{r}
#fn.txt <- file.choose(new=TRUE) #commented out, but this lets you create a new file name interactively
fn.txt <- "/Users/WR/Box Sync/pmc_folder/314/datasets/cars.txt"
fn.rda <- "/Users/WR/Box Sync/pmc_folder/314/datasets/cars.rda"
fn.rds <- "/Users/WR/Box Sync/pmc_folder/314/datasets/cars.rds"
write.file(data1,f=fn.txt) #save as text file
write.file(data1,f=fn.rds) #save as a file for R to read again
save(data1,data2,file=fn.rda) #use the save command to save several objects
```
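The same distinction between .rds and .rda files can be sketched with base R functions alone. The objects `a` and `b` are made up, and the files are temporary ones, so nothing here touches the course data sets.

```r
# Two small made-up objects (hypothetical, for illustration)
a <- data.frame(x = 1:3)
b <- c(first = 1, second = 2)

rds.name <- tempfile(fileext = ".rds")
rda.name <- tempfile(fileext = ".rda")

saveRDS(a, rds.name)            # one object per .rds file
save(a, b, file = rda.name)     # several objects in one .rda file

a.again <- readRDS(rds.name)    # readRDS returns the object, so you choose its new name
restored <- load(rda.name)      # load restores the objects under their saved names
restored                        # the names of the restored objects: "a" "b"
all.equal(a, a.again)           # TRUE
```

Because load puts objects back under their original names, it can silently overwrite objects of the same name that are already in the workspace.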
#Showing and clearing your workspace
In RStudio, the upper right hand window shows the various objects in your workspace.
We can show all the objects in your workspace by using the ls() function.
```{r}
ls()
```
##Cleaning up the workspace
Let's get rid of unnecessary objects. We will remove the ones we do not want using the rm() function.
```{r}
rm(eli,data1,data2,my.data,file.name)
ls() #list them again
```
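A small self-contained sketch of checking that rm() did what we wanted follows. The objects `tmp.one` and `tmp.two` are made up and are created only so that they can be removed.

```r
# Made-up objects, created only so that we can remove them
tmp.one <- 1
tmp.two <- "two"
exists("tmp.one")         # TRUE

rm(tmp.one, tmp.two)      # remove both by name
exists("tmp.one")         # FALSE

# rm(list = ls())         # would remove everything in the workspace; use with care
```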
Now, read in from the data file named fn.rds
```{r}
fn.rds #show the location
my.data <- read.file(fn.rds)
dim(my.data)
```
#Assignment for Week 2, part 1
After reading through the examples above and reading in each of the demonstration data sets, try to read in some of your own data. Then try to save it, and then read it again.
## Use the file.choose() function
Use the file.choose() function as you explore files on your own machine.