Data visualization exercises

The following code loads the data discussed in the lecture:

File <- "http://sachaepskamp.com/files/OPdata.csv"
Data <- read.csv(File)

Exercise 1: Replicate the following plot:

Exercise 2: Replicate the following plot:

Exercise 3: Replicate the following plot (column blocks correspond to study, yes or no; row blocks to working, yes or no):

Exercise 4: Replicate the following plot:

The following code computes the mean and standard deviation of stress levels per person:

library("dplyr")
DataSummarized <- Data %>% group_by(userID) %>% summarize(
  mean = mean(Stress),
  sd = sd(Stress),
  Gender = Gender[1],
  Neuroticism = Neuroticism[1]
)
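
To see what this pipeline does, here is a minimal sketch on a made-up data frame. The `Toy` data and all of its values are invented for illustration and are not taken from OPdata.csv; only the `group_by` and `summarize` pattern mirrors the code above.

```r
library("dplyr")

# Toy data (invented values): two users with three measurements each
Toy <- data.frame(
  userID = c(1, 1, 1, 2, 2, 2),
  Stress = c(0.2, 0.4, 0.6, 0.5, 0.5, 0.8),
  Gender = c("male", "male", "male", "female", "female", "female")
)

# One row per user: mean and sd are computed over that user's rows only;
# Gender[1] keeps the first value of a variable that is constant within a person
ToySummarized <- Toy %>% group_by(userID) %>% summarize(
  mean = mean(Stress),
  sd = sd(Stress),
  Gender = Gender[1]
)

print(ToySummarized)
```

The result has one row per `userID`, which is why person-level variables such as `Gender` have to be carried along explicitly.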

Exercise 5: Replicate the following plot:

Exercise 6: Further summarize DataSummarized to compute the average stress level for each combination of gender (male, female) and neuroticism (high, low). Try to do this using dplyr:

## Source: local data frame [4 x 3]
## Groups: Gender
## 
##   Gender Neuroticism OverallMean
## 1 female        high   0.5279762
## 2 female         low   0.4119792
## 3   male        high   0.4958333
## 4   male         low   0.3312500
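
The `Groups: Gender` line in this output may look surprising. When a data frame is grouped by several variables, `summarize` by default removes only the last grouping level, so the result stays grouped by the remaining ones. A minimal sketch on invented data follows; the names `g1`, `g2`, and `value` are hypothetical and do not appear in the exercise data.

```r
library("dplyr")

# Invented data with two hypothetical grouping variables
Toy <- data.frame(
  g1 = c("a", "a", "b", "b"),
  g2 = c("x", "y", "x", "y"),
  value = c(1, 2, 3, 4)
)

# Grouped by g1 and g2; summarize() drops only the last level (g2) by default
Result <- Toy %>% group_by(g1, g2) %>% summarize(avg = mean(value))

# Grouping variables that remain after summarizing
print(group_vars(Result))

# ungroup() returns a table with no grouping left
print(group_vars(ungroup(Result)))
```

Calling `ungroup()` before further processing avoids later operations being applied per group by accident.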

Then replicate the following plots (notice the range of the \(y\)-axis):

If you are done, here are some things you can try to do with the plots (use Google!):

Place major tick marks every 0.1 points
Place minor tick marks every 0.05 points
Rotate the \(x\)-axis labels 45 degrees
Change the colors of the lines to green and red
Add error bars to the last plot
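
As a starting point for these extras, here is a hedged sketch that assumes the plots are drawn with ggplot2, which this handout does not state. The data frame `Toy`, its columns (`Group`, `Sex`, `Mean`, `SE`), and all values are invented for illustration. Note that in ggplot2 `minor_breaks` controls the minor grid lines rather than drawing separate tick marks.

```r
library("ggplot2")

# Invented summary data; column names and values are hypothetical
Toy <- data.frame(
  Group = c("low", "high", "low", "high"),
  Sex   = c("female", "female", "male", "male"),
  Mean  = c(0.40, 0.55, 0.35, 0.50),
  SE    = c(0.03, 0.04, 0.03, 0.05)
)

p <- ggplot(Toy, aes(x = Group, y = Mean, colour = Sex, group = Sex)) +
  geom_line() +
  geom_point() +
  # Error bars spanning the mean minus/plus one (made-up) standard error
  geom_errorbar(aes(ymin = Mean - SE, ymax = Mean + SE), width = 0.1) +
  # Major breaks every 0.1, minor breaks (minor grid lines) every 0.05
  scale_y_continuous(
    limits = c(0, 1),
    breaks = seq(0, 1, by = 0.1),
    minor_breaks = seq(0, 1, by = 0.05)
  ) +
  # Manual line colours, assigned to the levels of Sex in order
  scale_colour_manual(values = c("green", "red")) +
  # Rotate the x-axis labels by 45 degrees
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

print(p)
```

Each requested change maps onto one added layer, scale, or theme setting, so the pieces can be tried one at a time on your own plots.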

Sacha Epskamp

05-06-2015