library(multilevel)
library(foreign)
library(psych)
library(nlme)
data<-read.spss("./Data/teams.sav", use.value.labels=FALSE, to.data.frame=TRUE, use.missings=TRUE)
Turning the grouping variable into a factor
names(data)
## [1] "SubjectID" "GroupNum" "TO1" "TO2"
## [5] "TO3" "TO4" "IntCoh1" "IntCoh2"
## [9] "IntCoh3" "IntCoh4" "IntCoh5" "TaskCoh1"
## [13] "TaskCoh2" "TaskCoh3" "TaskCoh4" "TaskCoh5"
## [17] "TurnOver" "IntCoh" "TaskCoh" "TurnOver_mean"
## [21] "IntCoh_mean" "TaskCoh_mean" "group.size" "pooled_TurnOver"
## [25] "pooled_IntCoh" "pooled_TaskCoh"
class(data$GroupNum)
## [1] "numeric"
data$GroupNum<-as.factor(data$GroupNum)
class(data$GroupNum)
## [1] "factor"
Can we include higher-order constructs in our model?
model.1<-(lme(IntCoh~1,random=~1|GroupNum,data=data,na.action=na.omit))
VarCorr(model.1)
## GroupNum = pdLogChol(1)
## Variance StdDev
## (Intercept) 0.1531487 0.3913421
## Residual 0.6607064 0.8128385
ICC1.model.1=.1531487/(.1531487+.6607064)
ICC1.model.1
## [1] 0.1881769
About 19% of the variance in interpersonal cohesion (ICC(1) = .188) is attributable to team membership
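The hand calculation of ICC(1) can be wrapped in a small helper so the variance components are typed only once. This is a minimal sketch: `icc1_from_components` is a hypothetical name, and the two inputs are copied from the VarCorr(model.1) output rather than extracted from the model.

```r
# Hypothetical helper: ICC(1) from the variance components of a null model.
# tau00  = between-group (intercept) variance
# sigma2 = within-group (residual) variance
icc1_from_components <- function(tau00, sigma2) {
  tau00 / (tau00 + sigma2)
}

# Values copied from the VarCorr(model.1) output
icc1_value <- icc1_from_components(tau00 = 0.1531487, sigma2 = 0.6607064)
round(icc1_value, 3)
```

With a fitted lme object, the same two numbers can usually be pulled out with `as.numeric(VarCorr(model.1)[, "Variance"])`, which avoids copying them by hand.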
Are the group means reliable/stable?
ICC(2) tells us about the psychometric quality of the group means: are the means for each group relatively stable/consistent?
ICC2.model.1=.1531487/(.1531487+.6607064/4.42)
ICC2.model.1
## [1] 0.5060596
The reliability of the group means (.51) is below what we would probably like (i.e., ~.70)
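The ICC(2) formula used above is the Spearman-Brown step-up of ICC(1) to a mean of k members, so it can also answer how large teams would need to be to reach a target reliability. The sketch below assumes the variance components and average group size (4.42) reported earlier; the function names are hypothetical.

```r
# Values copied from the output above
tau00  <- 0.1531487   # between-group variance
sigma2 <- 0.6607064   # within-group variance
k      <- 4.42        # average group size

# Reliability of a group mean based on k members
icc2_from_components <- function(tau00, sigma2, k) {
  tau00 / (tau00 + sigma2 / k)
}
icc2_value <- icc2_from_components(tau00, sigma2, k)

# Spearman-Brown form: step ICC(1) up to a mean of k members
icc1_value <- tau00 / (tau00 + sigma2)
icc2_sb    <- (k * icc1_value) / (1 + (k - 1) * icc1_value)
all.equal(icc2_value, icc2_sb)   # the two forms agree

# Group size needed for the group mean to reach a target reliability
k_needed <- function(target, tau00, sigma2) {
  (target / (1 - target)) * (sigma2 / tau00)
}
ceiling(k_needed(0.70, tau00, sigma2))
```

Under these assumptions, teams of roughly 10 to 11 members would be needed for the group means to reach .70, which is more than twice the current average team size.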
Can we aggregate the individual scores?
names(data)
## [1] "SubjectID" "GroupNum" "TO1" "TO2"
## [5] "TO3" "TO4" "IntCoh1" "IntCoh2"
## [9] "IntCoh3" "IntCoh4" "IntCoh5" "TaskCoh1"
## [13] "TaskCoh2" "TaskCoh3" "TaskCoh4" "TaskCoh5"
## [17] "TurnOver" "IntCoh" "TaskCoh" "TurnOver_mean"
## [21] "IntCoh_mean" "TaskCoh_mean" "group.size" "pooled_TurnOver"
## [25] "pooled_IntCoh" "pooled_TaskCoh"
# See the LeBreton and Senter table for the rwg.j calls below: the value passed to ranvar depends on the number of scale points and the null distribution assumed
rwgj.intcoh.un<-rwg.j(data[,c(7:11)],data$GroupNum,ranvar=4.00) # uniform null distribution
rwgj.intcoh.ss<-rwg.j(data[,c(7:11)],data$GroupNum,ranvar=2.90) # slightly skewed null distribution
summary(rwgj.intcoh.un)
## grpid rwg.j gsize
## 1 : 1 Min. :0.6402 Min. :2.00
## 10 : 1 1st Qu.:0.8757 1st Qu.:4.00
## 11 : 1 Median :0.9211 Median :5.00
## 12 : 1 Mean :0.8948 Mean :4.42
## 14 : 1 3rd Qu.:0.9476 3rd Qu.:5.00
## 15 : 1 Max. :0.9840 Max. :5.00
## (Other):75
summary(rwgj.intcoh.ss)
## grpid rwg.j gsize
## 1 : 1 Min. :0.0000 Min. :2.00
## 10 : 1 1st Qu.:0.7888 1st Qu.:4.00
## 11 : 1 Median :0.8763 Median :5.00
## 12 : 1 Mean :0.7944 Mean :4.42
## 14 : 1 3rd Qu.:0.9215 3rd Qu.:5.00
## 15 : 1 Max. :0.9774 Max. :5.00
## (Other):75
sf <- matrix(c(9, 2, 5, 8,
6, 1, 3, 2,
8, 4, 6, 8,
7, 1, 2, 6,
10, 5, 6, 9,
6, 2, 4, 7),ncol=4,byrow=TRUE)
colnames(sf) <- paste("J",1:4,sep="")
rownames(sf) <- paste("S",1:6,sep="")
sf
## J1 J2 J3 J4
## S1 9 2 5 8
## S2 6 1 3 2
## S3 8 4 6 8
## S4 7 1 2 6
## S5 10 5 6 9
## S6 6 2 4 7
ICC(sf)
## Call: ICC(x = sf)
##
## Intraclass correlation coefficients
## type ICC F df1 df2 p lower bound upper bound
## Single_raters_absolute ICC1 0.17 1.8 5 18 0.16472 -0.097 0.64
## Single_random_raters ICC2 0.29 11.0 5 15 0.00013 0.043 0.69
## Single_fixed_raters ICC3 0.71 11.0 5 15 0.00013 0.412 0.93
## Average_raters_absolute ICC1k 0.44 1.8 5 18 0.16472 -0.545 0.88
## Average_random_raters ICC2k 0.62 11.0 5 15 0.00013 0.152 0.90
## Average_fixed_raters ICC3k 0.91 11.0 5 15 0.00013 0.737 0.98
##
## Number of subjects = 6 Number of Judges = 4
With ANOVA
data(bhr2000)
hrs.mod<-aov(HRS~as.factor(GRP),data=bhr2000)
summary(hrs.mod)
## Df Sum Sq Mean Sq F value Pr(>F)
## as.factor(GRP) 98 3371 34.40 12.5 <2e-16 ***
## Residuals 5301 14591 2.75
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
ICC1(hrs.mod)
## [1] 0.1741008
ICC2(hrs.mod)
## [1] 0.9199889
The ICC(1) value of .17 indicates that 17% of the variance in individual perceptions of work hours can be “explained” by group membership. The ICC(2) value of .92 indicates that groups can be reliably differentiated in terms of average work hours.
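The two values can be checked by hand from the ANOVA table. The sketch below assumes the usual one-way random-effects formulas, which appear to be what ICC1() and ICC2() compute since they reproduce the reported values to two decimals. The mean squares are the rounded values printed by summary(hrs.mod), so the third decimal may differ slightly.

```r
# Mean squares and degrees of freedom copied (rounded) from summary(hrs.mod)
msb      <- 34.40   # between-group mean square
msw      <- 2.75    # within-group mean square
df_group <- 98
df_resid <- 5301

n_groups <- df_group + 1
k_avg    <- (df_resid + n_groups) / n_groups   # average group size: N / number of groups

icc1_anova <- (msb - msw) / (msb + (k_avg - 1) * msw)
icc2_anova <- (msb - msw) / msb

round(c(ICC1 = icc1_anova, ICC2 = icc2_anova), 2)
```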
graph.ran.mean(bhr2000$HRS, bhr2000$GRP, nreps=1000, limits=c(8,14),bootci=TRUE)
The bar chart represents each group's average rating of work hours, sorted from highest to lowest, and the line represents a random distribution where 99 pseudo groups (with the exact size characteristics of the actual groups) were created 1000 times and the sorted values were averaged across the 1000 iterations. The dotted lines represent the upper and lower 95% confidence interval estimates. In short, the line represents the expected distribution if there were no group-level properties associated with these data. The graph suggests fairly evenly distributed group-level properties associated with the data. That is, the ICC(1) value of .17 does not seem to be caused by one or two aberrant groups.
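The pseudo-group logic behind the line can be sketched in base R. This illustrates the idea only and is not the code used by graph.ran.mean: the data are simulated, and `pseudo_group_means` is a hypothetical helper. Shuffling the scores while keeping the group labels fixed preserves the exact group sizes.

```r
set.seed(1)

# Simulated data: 10 groups of 8 people with no true group effect
x   <- rnorm(80, mean = 10, sd = 2)
grp <- rep(1:10, each = 8)

pseudo_group_means <- function(x, grp, nreps = 200) {
  sims <- replicate(nreps, {
    shuffled <- sample(x)   # break any link between scores and groups
    sort(tapply(shuffled, grp, mean), decreasing = TRUE)
  })
  rowMeans(sims)   # expected k-th largest group mean under random assignment
}

expected <- pseudo_group_means(x, grp)
observed <- sort(tapply(x, grp, mean), decreasing = TRUE)

# Observed sorted means next to what random grouping would produce
round(cbind(observed, expected), 2)
```

When the observed means fan out well beyond the expected ones across most of the range, group membership is related to the scores for many groups rather than for a few outliers.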
Both the rwg and rwg.j functions are based upon the formulations described in James et al. (1984). Both functions require the user to specify three pieces of information: the variable of interest (x), the grouping variable (grpid), and the estimate of the expected random variance (ranvar). The default estimate of ranvar is 2, which is the expected random variance based upon the rectangular distribution for a 5-point item, calculated using the formula ranvar = (A^2 - 1)/12, where A represents the number of response options associated with the scale anchors (not the number of items).
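The rectangular-distribution formula is easy to wrap in a one-line helper. `ev_uniform` is a hypothetical name, not a function in the multilevel package.

```r
# Expected random variance under a rectangular (uniform) null distribution,
# where A is the number of response options on the item
ev_uniform <- function(A) {
  (A^2 - 1) / 12
}

# 5-, 7-, and 11-point items
ev_uniform(c(5, 7, 11))
```

These are the uniform-null values passed to ranvar in this document: 2 for 5-point items, 4 in the uniform rwg.j call for the interpersonal cohesion items, and 10 for the 11-point work hours item.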
RWG.RELIG<-rwg(bhr2000$RELIG,bhr2000$GRP,ranvar=2)
RWG.RELIG[1:10,] #examine first 10 rows of data
The first column contains the group names (grpid), the second column contains the 99 rwg values – one for each group. The third column contains the group size. To calculate the mean rwg value use the summary command.
summary(RWG.RELIG)
## grpid rwg gsize
## 1 : 1 Min. :0.0000 Min. : 8.00
## 10 : 1 1st Qu.:0.1046 1st Qu.: 29.50
## 11 : 1 Median :0.1899 Median : 45.00
## 12 : 1 Mean :0.1864 Mean : 54.55
## 13 : 1 3rd Qu.:0.2630 3rd Qu.: 72.50
## 14 : 1 Max. :0.4328 Max. :188.00
## (Other):93
The summary command informs us that the average rwg value is .186 and the range is from 0 to 0.433. By convention, values at or above 0.70 are considered good agreement, so there appears to be low agreement among individuals with regard to religion. The summary command also provides information about the group sizes.
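To see where these numbers come from, the single-item index can be sketched as one minus the ratio of observed within-group variance to expected random variance. This is a minimal sketch, not the package source: `rwg_sketch` is a hypothetical helper, the ratings are made up, and setting negative values to 0 is an assumption consistent with the minimum of 0 in the summary above.

```r
# Hypothetical helper: single-item rwg for each group
# x      = numeric ratings
# grpid  = group membership
# ranvar = expected random variance (2 for a 5-point item, uniform null)
rwg_sketch <- function(x, grpid, ranvar = 2) {
  obs_var <- tapply(x, grpid, var)   # observed within-group variance
  out     <- 1 - obs_var / ranvar
  pmax(out, 0)                       # set negative values to 0
}

# Made-up ratings: group A agrees closely, group B is split between extremes
ratings <- c(3, 3, 4, 4, 3,   1, 5, 1, 5, 3)
groups  <- rep(c("A", "B"), each = 5)

rwg_result <- rwg_sketch(ratings, groups)
rwg_result
```

Group A has an observed variance of 0.3, giving 1 - 0.3/2 = 0.85. Group B has an observed variance of 4, which exceeds the expected random variance, so its value is set to 0.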
hist(RWG.RELIG[,2])
To calculate rwg for work hours, the expected random variance (EV) needs to be changed from its default value of 2. Work hours was measured using an 11-point item, so EV based on the rectangular distribution is 10 [(11^2 - 1)/12].
RWG.HRS<-rwg(bhr2000$HRS,bhr2000$GRP,ranvar=10.00)
mean(RWG.HRS[,2])
## [1] 0.7353417
The first argument to rwg.j is a matrix instead of a vector. In the matrix, each column represents one item in the multi-item scale, and each row represents an individual response. For instance, columns 2-12 in bhr2000 represent 11 items comprising a leadership scale. The items were assessed using 5-point response options (Strongly Disagree to Strongly Agree), so the expected random variance is 2.
RWGJ.LEAD<-rwg.j(bhr2000[,2:12],bhr2000$GRP,ranvar=2)
summary(RWGJ.LEAD)
## grpid rwg.j gsize
## 1 : 1 Min. :0.7859 Min. : 8.00
## 10 : 1 1st Qu.:0.8708 1st Qu.: 29.50
## 11 : 1 Median :0.8925 Median : 45.00
## 12 : 1 Mean :0.8876 Mean : 54.55
## 13 : 1 3rd Qu.:0.9088 3rd Qu.: 72.50
## 14 : 1 Max. :0.9440 Max. :188.00
## (Other):93
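For a multi-item scale, the James et al. (1984) index combines the number of items J with the ratio of the mean observed item variance to the expected random variance. The sketch below handles a single group and uses made-up responses. `rwgj_sketch` is a hypothetical helper, and returning 0 when the observed variance reaches the expected variance is an assumption, not a documented behavior of rwg.j.

```r
# Hypothetical helper: rwg(j) for one group
# items  = matrix with one row per rater and one column per item
# ranvar = expected random variance (2 for 5-point items, uniform null)
rwgj_sketch <- function(items, ranvar = 2) {
  J        <- ncol(items)
  mean_var <- mean(apply(items, 2, var))   # average observed item variance
  ratio    <- mean_var / ranvar
  if (ratio >= 1) return(0)                # no agreement beyond chance
  (J * (1 - ratio)) / (J * (1 - ratio) + ratio)
}

# Made-up responses: 4 raters (rows) on 3 five-point items (columns)
one_group <- matrix(c(4, 4, 5,
                      4, 5, 5,
                      5, 4, 4,
                      4, 4, 5), ncol = 3, byrow = TRUE)

rwgj_value <- rwgj_sketch(one_group)
rwgj_value
```

Each item here has a variance of 0.25, so the ratio is 0.125 and the index is 2.625 / 2.75, or about 0.95. Because J enters both numerator and denominator, adding items pushes the index toward 1 at any fixed level of item agreement, which is one reason rwg.j values tend to run higher than single-item rwg values.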
awg index
AWG.LEAD<-awg(bhr2000[,2:12],bhr2000$GRP)
summary(AWG.LEAD)
## grpid a.wg nitems nraters avg.grp.var
## 1 : 1 Min. :0.2223 Min. :11 Min. : 8.00 Min. :0.2787
## 10 : 1 1st Qu.:0.3654 1st Qu.:11 1st Qu.: 29.50 1st Qu.:0.4348
## 11 : 1 Median :0.4193 Median :11 Median : 45.00 Median :0.5166
## 12 : 1 Mean :0.4125 Mean :11 Mean : 54.55 Mean :0.5157
## 13 : 1 3rd Qu.:0.4635 3rd Qu.:11 3rd Qu.: 72.50 3rd Qu.:0.5692
## 14 : 1 Max. :0.5781 Max. :11 Max. :188.00 Max. :0.9144
## (Other):93