Load Packages

library(multilevel)
library(foreign)
library(psych)
library(nlme)

Using linear models

Load Data

data<-read.spss("./Data/teams.sav", use.value.labels=F, to.data.frame=TRUE,use.missings=TRUE)

Clean

Turning the grouping variable into a factor

names(data)

##  [1] "SubjectID"       "GroupNum"        "TO1"             "TO2"            
##  [5] "TO3"             "TO4"             "IntCoh1"         "IntCoh2"        
##  [9] "IntCoh3"         "IntCoh4"         "IntCoh5"         "TaskCoh1"       
## [13] "TaskCoh2"        "TaskCoh3"        "TaskCoh4"        "TaskCoh5"       
## [17] "TurnOver"        "IntCoh"          "TaskCoh"         "TurnOver_mean"  
## [21] "IntCoh_mean"     "TaskCoh_mean"    "group.size"      "pooled_TurnOver"
## [25] "pooled_IntCoh"   "pooled_TaskCoh"

class(data$GroupNum)

## [1] "numeric"

data$GroupNum<-as.factor(data$GroupNum)
class(data$GroupNum)

## [1] "factor"

ICC1

Can we include higher-order constructs in our model?

model.1<-(lme(IntCoh~1,random=~1|GroupNum,data=data,na.action=na.omit))
VarCorr(model.1)

## GroupNum = pdLogChol(1) 
##             Variance  StdDev   
## (Intercept) 0.1531487 0.3913421
## Residual    0.6607064 0.8128385

ICC1.model.1=.1531487/(.1531487+.6607064)
ICC1.model.1

## [1] 0.1881769

18% of the variance in interpersonal cohesion is due to being on different teams

ICC2

Are the group means reliable/stable?

tells us about the psychometric quality of the group means - are the means for each group relatively stable/consistent?

ICC2.model.1=.1531487/(.1531487+.6607064/4.42) 
ICC2.model.1

## [1] 0.5060596

The reliability is below what we would probably like (i.e., ~.70)

RWG

Can we aggregate the individual scores?

names(data)

##  [1] "SubjectID"       "GroupNum"        "TO1"             "TO2"            
##  [5] "TO3"             "TO4"             "IntCoh1"         "IntCoh2"        
##  [9] "IntCoh3"         "IntCoh4"         "IntCoh5"         "TaskCoh1"       
## [13] "TaskCoh2"        "TaskCoh3"        "TaskCoh4"        "TaskCoh5"       
## [17] "TurnOver"        "IntCoh"          "TaskCoh"         "TurnOver_mean"  
## [21] "IntCoh_mean"     "TaskCoh_mean"    "group.size"      "pooled_TurnOver"
## [25] "pooled_IntCoh"   "pooled_TaskCoh"

#See LeBreton and Senter table re: rwgj code below - number to use after ranvar= depends on scale points and the distribution desired
rwgj.intcoh.un<-rwg.j(data[,c(7:11)],data$GroupNum,ranvar=4.00)

#uniform distribution
rwgj.intcoh.ss<-rwg.j(data[,c(7:11)],data$GroupNum,ranvar=2.90)

#slightly skewed distribution
summary(rwgj.intcoh.un)

##      grpid        rwg.j            gsize     
##  1      : 1   Min.   :0.6402   Min.   :2.00  
##  10     : 1   1st Qu.:0.8757   1st Qu.:4.00  
##  11     : 1   Median :0.9211   Median :5.00  
##  12     : 1   Mean   :0.8948   Mean   :4.42  
##  14     : 1   3rd Qu.:0.9476   3rd Qu.:5.00  
##  15     : 1   Max.   :0.9840   Max.   :5.00  
##  (Other):75

summary(rwgj.intcoh.ss)

##      grpid        rwg.j            gsize     
##  1      : 1   Min.   :0.0000   Min.   :2.00  
##  10     : 1   1st Qu.:0.7888   1st Qu.:4.00  
##  11     : 1   Median :0.8763   Median :5.00  
##  12     : 1   Mean   :0.7944   Mean   :4.42  
##  14     : 1   3rd Qu.:0.9215   3rd Qu.:5.00  
##  15     : 1   Max.   :0.9774   Max.   :5.00  
##  (Other):75

Using the Psych Package

example from Shrout and Fleiss (1979)

sf <- matrix(c(9,    2,   5,    8,
6,    1,   3,    2,
8,    4,   6,    8,
7,    1,   2,    6,
10,   5,   6,    9,
6,   2,   4,    7),ncol=4,byrow=TRUE)
colnames(sf) <- paste("J",1:4,sep="")
rownames(sf) <- paste("S",1:6,sep="")
sf

##    J1 J2 J3 J4
## S1  9  2  5  8
## S2  6  1  3  2
## S3  8  4  6  8
## S4  7  1  2  6
## S5 10  5  6  9
## S6  6  2  4  7

ICC(sf)

## Call: ICC(x = sf)
## 
## Intraclass correlation coefficients 
##                          type  ICC    F df1 df2       p lower bound upper bound
## Single_raters_absolute   ICC1 0.17  1.8   5  18 0.16472      -0.097        0.64
## Single_random_raters     ICC2 0.29 11.0   5  15 0.00013       0.043        0.69
## Single_fixed_raters      ICC3 0.71 11.0   5  15 0.00013       0.412        0.93
## Average_raters_absolute ICC1k 0.44  1.8   5  18 0.16472      -0.545        0.88
## Average_random_raters   ICC2k 0.62 11.0   5  15 0.00013       0.152        0.90
## Average_fixed_raters    ICC3k 0.91 11.0   5  15 0.00013       0.737        0.98
## 
##  Number of subjects = 6     Number of Judges =  4

With ANOVA

data(bhr2000)

hrs.mod<-aov(HRS~as.factor(GRP),data=bhr2000)
summary(hrs.mod)

##                  Df Sum Sq Mean Sq F value Pr(>F)    
## as.factor(GRP)   98   3371   34.40    12.5 <2e-16 ***
## Residuals      5301  14591    2.75                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ICC1(hrs.mod)

## [1] 0.1741008

ICC2(hrs.mod)

## [1] 0.9199889

The ICC(1) value of .17 indicates that 17% of the variance in individual perceptions of work hours can be “explained” by group membership. The ICC(2) value of .92 indicates that groups can be reliably differentiated in terms of average work hours.

Graphing ICCs

graph.ran.mean(bhr2000$HRS, bhr2000$GRP, nreps=1000, limits=c(8,14),bootci=TRUE)

The bar chart represents each groups’ average rating of work hours sorted from highest to lowest, and the line represents a random distribution where 99 pseudo groups (with exact size characteristics of the actual groups) were created 100 times and the sorted values were averaged across the 1000 iterations. The dotted lines represent the upper and lower 95% confidence interval estimates. In short, the line represents the expected distribution if there were no group-level properties associated with these data. The graph suggests fairly evenly distributed group-level properties associated with the data. That is, the ICC(1) value of .17 does not seem to be caused by one or two aberrant groups.

rwg and rwg(j) in the multilevel package

Both the rwg and rwg.j functions are based upon the formulations described in James et al.(1984). Both functions require the user to specify three pieces of information. The first piece of information is the variable of interest (x), the second is the grouping variable (grpid), and third is the estimate of the expected random variance (ranvar). The default estimate of ranvar is 2, which is the expected random variance based upon the rectangular distribution for a 5-point item) calculated using the formula ranvar=(A^2-1)/12 where A represents the number of items (i.e., # of response options associated with the scale anchors.)

rwg

RWG.RELIG<-rwg(bhr2000$RELIG,bhr2000$GRP,ranvar=2)
RWG.RELIG[1:10,] #examine first 10 rows of data

The first column contains the group names (grpid), the second column contains the 99 rwg values – one for each group. The third column contains the group size. To calculate the mean rwg value use the summary command.

summary(RWG.RELIG)

##      grpid         rwg             gsize       
##  1      : 1   Min.   :0.0000   Min.   :  8.00  
##  10     : 1   1st Qu.:0.1046   1st Qu.: 29.50  
##  11     : 1   Median :0.1899   Median : 45.00  
##  12     : 1   Mean   :0.1864   Mean   : 54.55  
##  13     : 1   3rd Qu.:0.2630   3rd Qu.: 72.50  
##  14     : 1   Max.   :0.4328   Max.   :188.00  
##  (Other):93

The summary command informs us that the average rwg value is .186 and the range is from 0 to 0.433. By convention, values at or above 0.70 are considered good agreement, so there appears to be low agreement among individuals with regard to religion. The summary command also provides information about the group sizes.

hist(RWG.RELIG[,2])

Testing a different distribution

To calculate rwg for work hours, the expected random variance (EV) needs to be changed from its default value of 2. Work hours was asked using an 11-point item, so EV based on the rectangular distribution is 10 [(11^2 -1)/12)

RWG.HRS<-rwg(bhr2000$HRS,bhr2000$GRP,ranvar=10.00)
mean(RWG.HRS[,2])

## [1] 0.7353417

rwg(j)

The first argument to rwg.j is a matrix instead of a vector. In the matrix, each column represents one item in the multi-item scale, and each row represents an individual response. For instance, columns 2-12 in bhr2000 represent 11 items comprising a leadership scale. The items were assessed using 5-point response options (Strongly Disagree to Strongly Agree), so the expected random variance is 2.

RWGJ.LEAD<-rwg.j(bhr2000[,2:12],bhr2000$GRP,ranvar=2)
summary(RWGJ.LEAD)

##      grpid        rwg.j            gsize       
##  1      : 1   Min.   :0.7859   Min.   :  8.00  
##  10     : 1   1st Qu.:0.8708   1st Qu.: 29.50  
##  11     : 1   Median :0.8925   Median : 45.00  
##  12     : 1   Mean   :0.8876   Mean   : 54.55  
##  13     : 1   3rd Qu.:0.9088   3rd Qu.: 72.50  
##  14     : 1   Max.   :0.9440   Max.   :188.00  
##  (Other):93

Supplementary

awg index

AWG.LEAD<-awg(bhr2000[,2:12],bhr2000$GRP)
summary(AWG.LEAD)

##      grpid         a.wg            nitems      nraters        avg.grp.var    
##  1      : 1   Min.   :0.2223   Min.   :11   Min.   :  8.00   Min.   :0.2787  
##  10     : 1   1st Qu.:0.3654   1st Qu.:11   1st Qu.: 29.50   1st Qu.:0.4348  
##  11     : 1   Median :0.4193   Median :11   Median : 45.00   Median :0.5166  
##  12     : 1   Mean   :0.4125   Mean   :11   Mean   : 54.55   Mean   :0.5157  
##  13     : 1   3rd Qu.:0.4635   3rd Qu.:11   3rd Qu.: 72.50   3rd Qu.:0.5692  
##  14     : 1   Max.   :0.5781   Max.   :11   Max.   :188.00   Max.   :0.9144  
##  (Other):93

ICCs

Justin M. Jones