Centering Decisions: Not Just a Technical Choice

PSY 8XXX: Multilevel Modeling for Organizational Research — Week 5

Author

Instructor Name

Published

March 31, 2026

1 Introduction: Centering as Substantive, Not Just Technical

Most graduate students view centering as a technical preprocessing step: mean-center your variables to improve interpretability and reduce multicollinearity. It’s taught mechanically, almost as an afterthought.

But in multilevel modeling, centering decisions are profoundly substantive. They determine what research questions you can answer, what effects you estimate, and ultimately, what you conclude about your data.

Consider this: You’re studying whether employee autonomy predicts job satisfaction. The question sounds simple, but it harbors ambiguity:

  1. Within-team question: “Are employees with higher autonomy within their team also more satisfied?”
  2. Between-team question: “Do teams with higher average autonomy report higher average satisfaction?”

These are different questions at different levels of analysis. The first is about individual differences; the second is about team-level phenomena. They may have different answers, even opposite signs. Remarkably, your centering choice determines which question you answer.

This tutorial explores how centering works in multilevel models, what each approach reveals (and hides), and how to align your centering strategy with your research questions.

2 Setup and Data

Code
set.seed(1234)
library(tidyverse)
library(lme4)
library(lmerTest)
library(performance)

# Recreate employee dataset
n_teams <- 50
teams <- tibble(
  team_id = 1:n_teams,
  team_size = sample(8:15, n_teams, replace = TRUE),
  team_climate = pmin(pmax(rnorm(n_teams, mean = 5, sd = 0.8), 1), 7)
)

employees <- teams %>%
  slice(rep(1:n_teams, teams$team_size)) %>%
  arrange(team_id) %>%
  group_by(team_id) %>%
  mutate(
    emp_id = row_number(),
    employee_id = paste0("T", team_id, "E", emp_id),
    autonomy = rnorm(n(), mean = 4.5, sd = 1.2),
    autonomy = pmin(pmax(autonomy, 1), 7),
    performance = rnorm(n(), mean = 6, sd = 1.5),
    performance = pmin(pmax(performance, 1), 10),
    tenure = rgamma(n(), shape = 3, rate = 0.8)
  ) %>%
  ungroup()

employees <- employees %>%
  group_by(team_id) %>%
  mutate(
    u0j = rnorm(1, mean = 0, sd = sqrt(0.6)),
    job_satisfaction = 3.0 +
                       0.5 * scale(autonomy)[,1] +
                       0.4 * team_climate +
                       u0j +
                       rnorm(n(), mean = 0, sd = sqrt(0.8)),
    job_satisfaction = pmin(pmax(job_satisfaction, 1), 7)
  ) %>%
  ungroup() %>%
  select(-u0j, -emp_id)

cat("Dataset ready:", nrow(employees), "employees in", n_distinct(employees$team_id), "teams\n")
Dataset ready: 604 employees in 50 teams

Now, let’s create centered versions of autonomy for our analyses:

Code
# Uncentered (raw)
employees_all <- employees %>%
  mutate(autonomy_raw = autonomy)

# Grand-Mean Centering (CGM)
grand_mean_autonomy <- mean(employees$autonomy)
employees_all <- employees_all %>%
  mutate(autonomy_cgm = autonomy - grand_mean_autonomy)

# Group-Mean Centering (CWC) - also called Within-Cluster Centering
employees_all <- employees_all %>%
  group_by(team_id) %>%
  mutate(
    team_mean_autonomy = mean(autonomy),
    autonomy_cwc = autonomy - team_mean_autonomy
  ) %>%
  ungroup()

# Raudenbush formulation: both CWC and group mean
employees_all <- employees_all %>%
  mutate(autonomy_raudenbush_cwc = autonomy_cwc,
         autonomy_raudenbush_between = team_mean_autonomy)

# Display first few rows to see what we created
head(employees_all %>% 
     select(team_id, autonomy, autonomy_raw, autonomy_cgm, autonomy_cwc, team_mean_autonomy), 10)
# A tibble: 10 × 6
   team_id autonomy autonomy_raw autonomy_cgm autonomy_cwc team_mean_autonomy
     <int>    <dbl>        <dbl>        <dbl>        <dbl>              <dbl>
 1       1     4.32         4.32       -0.174       0.107                4.21
 2       1     2.83         2.83       -1.66       -1.38                 4.21
 3       1     3.63         3.63       -0.858      -0.577                4.21
 4       1     4.81         4.81        0.320       0.601                4.21
 5       1     4.12         4.12       -0.370      -0.0889               4.21
 6       1     4.29         4.29       -0.203       0.0782               4.21
 7       1     4.30         4.30       -0.194       0.0876               4.21
 8       1     2.85         2.85       -1.64       -1.36                 4.21
 9       1     4.29         4.29       -0.198       0.0830               4.21
10       1     5.52         5.52        1.03        1.31                 4.21
Code
cat("\n\nGrand mean of autonomy:", round(grand_mean_autonomy, 3), "\n")


Grand mean of autonomy: 4.49 
Code
cat("SD of autonomy:", round(sd(employees$autonomy), 3), "\n")
SD of autonomy: 1.171 

3 Grand-Mean Centering (CGM): Interpretation and Use Cases

Grand-Mean Centering (CGM) subtracts the overall sample mean from each observation:

\[X_{CGM,ij} = X_{ij} - \bar{X}\]

Let’s fit a model with CGM:

Code
model_cgm <- lmer(job_satisfaction ~ autonomy_cgm + team_climate + (1 | team_id),
                  data = employees_all)

summary(model_cgm)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: job_satisfaction ~ autonomy_cgm + team_climate + (1 | team_id)
   Data: employees_all

REML criterion at convergence: 1641.8

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.9357 -0.6816  0.0018  0.6609  2.5577 

Random effects:
 Groups   Name        Variance Std.Dev.
 team_id  (Intercept) 0.6219   0.7886  
 Residual             0.7177   0.8472  
Number of obs: 604, groups:  team_id, 50

Fixed effects:
              Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)    2.92572    0.66388  47.95136   4.407 5.88e-05 ***
autonomy_cgm   0.32857    0.03055 561.05360  10.755  < 2e-16 ***
team_climate   0.40896    0.13564  47.94153   3.015   0.0041 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) atnmy_
autonmy_cgm -0.002       
team_climat -0.984  0.002

Interpretation of Coefficients:

  • autonomy_cgm = 0.329: A one-unit increase in autonomy, expressed as a deviation from the grand mean, is associated with 0.329 higher satisfaction. This coefficient mixes within-team and between-team effects.

  • team_climate = 0.409: A one-unit increase in team climate is associated with 0.409 higher satisfaction.

  • Intercept = 2.926: The predicted satisfaction when autonomy is at its grand mean (about 4.49) and team_climate is 0. Because team_climate is uncentered and never 0 on its 1–7 scale, the intercept is not yet the overall average satisfaction; centering team_climate as well would make it so.

When to use CGM:

  1. Predictors vary mainly within groups: If autonomy is individual-specific and doesn’t systematically vary by team.
  2. Centering at the study population level: You want results to generalize to the population your sample represents.
  3. Simplicity: CGM is straightforward and makes intercepts interpretable as population averages.

What CGM hides:

The autonomy coefficient represents a mixture of within-team and between-team effects. If autonomy varies systematically by team (e.g., some teams cultivate autonomy while others restrict it), this mixture is misleading. You can’t distinguish: “Do autonomous individuals perform better?” from “Do autonomy-promoting teams perform better?”
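This mixture can be made precise. In a random-intercept model, the slope on a raw or CGM predictor is approximately a precision-weighted blend of the two level-specific slopes (a standard result; the weight \(\omega\) depends on the variance components and the cluster sizes):

\[\hat{\gamma}_{CGM} \approx \omega\,\hat{\gamma}_W + (1 - \omega)\,\hat{\gamma}_B, \qquad 0 \le \omega \le 1\]

With many observations per team, \(\omega\) sits near 1, which is why CGM estimates often land close to the within-team effect.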

4 Group-Mean Centering (CWC): Isolating Within-Team Effects

Group-Mean Centering (also called Within-Cluster Centering, CWC) subtracts each team’s mean from each observation:

\[X_{CWC,ij} = X_{ij} - \bar{X}_j\]
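Note that the deviation and the group mean jointly recover the raw score,

\[X_{ij} = \bar{X}_j + X_{CWC,ij},\]

so CWC discards the between-team portion of the predictor entirely; that portion can be re-entered as a separate term, as Section 6 shows.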

Let’s fit the same model with CWC:

Code
model_cwc <- lmer(job_satisfaction ~ autonomy_cwc + team_climate + (1 | team_id),
                  data = employees_all)

summary(model_cwc)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: job_satisfaction ~ autonomy_cwc + team_climate + (1 | team_id)
   Data: employees_all

REML criterion at convergence: 1639.7

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.94119 -0.68039 -0.00787  0.66342  2.56707 

Random effects:
 Groups   Name        Variance Std.Dev.
 team_id  (Intercept) 0.5926   0.7698  
 Residual             0.7176   0.8471  
Number of obs: 604, groups:  team_id, 50

Fixed effects:
              Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)    2.94211    0.64948  48.05742   4.530 3.91e-05 ***
autonomy_cwc   0.33280    0.03067 553.17124  10.853  < 2e-16 ***
team_climate   0.40590    0.13269  48.04720   3.059  0.00363 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) atnmy_
autonmy_cwc  0.000       
team_climat -0.984  0.000

Interpretation of Coefficients:

  • autonomy_cwc = 0.333: A one-unit increase in autonomy within a team (deviating from that team’s mean) is associated with 0.333 higher satisfaction. This is purely a within-team effect.

  • team_climate = 0.406: Essentially unchanged from the CGM model.

  • Intercept = 2.942: The predicted satisfaction for an employee at their team’s mean autonomy (not the grand mean), with team_climate at 0, so the same caveat about the uncentered climate scale applies.

Key insight: With CWC, the autonomy coefficient is now unambiguous about level of analysis. It’s the within-team effect only.

5 The Critical Demonstration: Separating Within and Between Effects

Now here’s the crucial point: within and between effects can differ. Let’s create a scenario where they do:

Code
# Create data where within and between effects differ dramatically
set.seed(9999)
n_teams_demo <- 30
n_per_team <- 15
data_differ <- tibble(
  team_id = rep(1:n_teams_demo, each = n_per_team),
  team_autonomy_mean = rep(rnorm(n_teams_demo, 4.5, 1.2), each = n_per_team),
  autonomy_within = rnorm(n_teams_demo * n_per_team, 0, 1)
) %>%
  mutate(
    autonomy = autonomy_within + team_autonomy_mean,
    autonomy = pmin(pmax(autonomy, 1), 7),

    # Within-team: NEGATIVE effect (more autonomous individuals less satisfied)
    # Between-team: POSITIVE effect (more autonomous teams are more satisfied)
    satisfaction = 4 +
                   (-0.4) * scale(autonomy_within)[,1] +  # Within: negative
                   0.6 * scale(team_autonomy_mean)[,1] +  # Between: positive
                   rnorm(n_teams_demo * n_per_team, 0, 0.5)
  )

# Create centered versions
data_differ <- data_differ %>%
  mutate(
    autonomy_cgm = autonomy - mean(autonomy),
    autonomy_cwc = autonomy - team_autonomy_mean
  ) %>%
  group_by(team_id) %>%
  mutate(team_mean_auto = mean(autonomy)) %>%
  ungroup()

# Fit models with different centering
m_raw <- lm(satisfaction ~ autonomy, data = data_differ)
m_cgm <- lmer(satisfaction ~ autonomy_cgm + (1 | team_id), data = data_differ)
m_cwc <- lmer(satisfaction ~ autonomy_cwc + (1 | team_id), data = data_differ)

# Compare coefficients
coef_comparison <- tibble(
  Model = c("OLS (uncentered)", "MLM (CGM)", "MLM (CWC)"),
  Autonomy_Coefficient = c(
    coef(m_raw)[2],
    fixef(m_cgm)[2],
    fixef(m_cwc)[2]
  )
)

print(coef_comparison)
# A tibble: 3 × 2
  Model            Autonomy_Coefficient
  <chr>                           <dbl>
1 OLS (uncentered)                0.158
2 MLM (CGM)                      -0.344
3 MLM (CWC)                      -0.366
Code
cat("\n\nInterpretation:\n")


Interpretation:
Code
cat("- OLS sees a POSITIVE effect (mixes within and between)\n")
- OLS sees a POSITIVE effect (mixes within and between)
Code
cat("- CGM in the MLM sees a NEGATIVE effect (a blend, pulled toward within)\n")
- CGM in the MLM sees a NEGATIVE effect (a blend, pulled toward within)
Code
cat("- CWC shows the TRUE within-team effect: NEGATIVE!\n")
- CWC shows the TRUE within-team effect: NEGATIVE!
Code
cat("\nWithin a team, more autonomous individuals are LESS satisfied.\n")

Within a team, more autonomous individuals are LESS satisfied.
Code
cat("But teams with higher autonomy ARE more satisfied.\n")
But teams with higher autonomy ARE more satisfied.
Code
cat("These are real, opposite effects that CGM masks!\n")
These are real, opposite effects that CGM masks!

This is the Simpson’s Paradox of multilevel modeling. The direction of a relationship can flip between levels. Any single blended coefficient (OLS, or CGM to a lesser degree) hides one side of the story; only separating the levels, as in the next section, reveals both.

6 The Contextual Effect: Separating Within and Between in One Model

The most elegant approach is the Raudenbush (2009) formulation, which includes both within-cluster centered and group-mean centered versions of the same predictor:

\[Y_{ij} = \gamma_{00} + \gamma_W X_{CWC,ij} + \gamma_B \bar{X}_j + u_{0j} + r_{ij}\]

where \(X_{CWC,ij}\) is the within-cluster centered predictor and \(\bar{X}_j\) is the group mean.

Code
# Fit Raudenbush model on the simulated data with differing effects
model_raudenbush <- lmer(satisfaction ~ autonomy_cwc + team_mean_auto + (1 | team_id),
                         data = data_differ)

summary(model_raudenbush)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: satisfaction ~ autonomy_cwc + team_mean_auto + (1 | team_id)
   Data: data_differ

REML criterion at convergence: 640.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.5629 -0.6638  0.0054  0.6634  2.8572 

Random effects:
 Groups   Name        Variance Std.Dev.
 team_id  (Intercept) 0.007664 0.08754 
 Residual             0.229514 0.47908 
Number of obs: 450, groups:  team_id, 30

Fixed effects:
                Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)      1.51181    0.12552  26.71642   12.04 2.64e-12 ***
autonomy_cwc    -0.38410    0.02404 446.85790  -15.98  < 2e-16 ***
team_mean_auto   0.51551    0.02567  26.75724   20.08  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) atnmy_
autonmy_cwc -0.001       
team_mean_t -0.975  0.029
Code
cat("\n\nInterpretation:\n")


Interpretation:
Code
cat("Within-team effect (autonomy_cwc):", 
    round(fixef(model_raudenbush)[2], 3), "\n")
Within-team effect (autonomy_cwc): -0.384 
Code
cat("Between-team effect (team_mean_auto):", 
    round(fixef(model_raudenbush)[3], 3), "\n")
Between-team effect (team_mean_auto): 0.516 
Code
cat("\nThese are DIFFERENT, revealing heterogeneity in effects across levels!\n")

These are DIFFERENT, revealing heterogeneity in effects across levels!
Code
cat("The contextual effect = between - within =", 
    round(fixef(model_raudenbush)[3] - fixef(model_raudenbush)[2], 3), "\n")
The contextual effect = between - within = 0.9 

This is powerful: By including both predictors, you simultaneously estimate:

  1. \(\gamma_W\): The within-team effect. How much does individual autonomy matter?
  2. \(\gamma_B\): The between-team effect. How much does team-average autonomy matter?
  3. The contextual effect: \(\gamma_B - \gamma_W\). Is there team-level variance beyond individual differences?

If \(\gamma_W\) and \(\gamma_B\) differ, it signals that team-level context (team leadership, team norms, team resources) shapes satisfaction over and above individual autonomy.
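The “contextual” label comes from an algebraically equivalent parameterization. Substituting \(X_{CWC,ij} = X_{ij} - \bar{X}_j\) into the model gives

\[Y_{ij} = \gamma_{00} + \gamma_W X_{ij} + (\gamma_B - \gamma_W)\,\bar{X}_j + u_{0j} + r_{ij},\]

so \(\gamma_B - \gamma_W\) is exactly the coefficient the group mean receives when the raw predictor is entered alongside it.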

7 Connecting to OB Theory: Within vs. Between Questions

Let’s ground this in actual organizational research questions:

Code
# Question 1: Does YOUR autonomy predict YOUR satisfaction?
# Answer: CWC coefficient (within-team effect)
cat("Question 1: Individual Within-Team Effect\n")
Question 1: Individual Within-Team Effect
Code
cat("'Does my autonomy predict my satisfaction?'\n\n")
'Does my autonomy predict my satisfaction?'
Code
model_q1 <- lmer(job_satisfaction ~ autonomy_cwc + (1 | team_id), 
                 data = employees_all)
cat("Answer: autonomy effect =", round(fixef(model_q1)[2], 3), "\n")
Answer: autonomy effect = 0.333 
Code
cat("Interpretation: Holding team constant, employees with higher autonomy\n")
Interpretation: Holding team constant, employees with higher autonomy
Code
cat("report higher satisfaction.\n\n")
report higher satisfaction.
Code
# Question 2: Do high-autonomy teams have higher satisfaction?
# Answer: Between-team effect
cat("\n\nQuestion 2: Team-Level Effect\n")


Question 2: Team-Level Effect
Code
cat("'Do teams that give autonomy have higher average satisfaction?'\n\n")
'Do teams that give autonomy have higher average satisfaction?'
Code
employees_team_level <- employees_all %>%
  group_by(team_id) %>%
  summarize(
    mean_satisfaction = mean(job_satisfaction),
    mean_autonomy = mean(autonomy),
    team_climate = first(team_climate),
    .groups = 'drop'
  )

m_between <- lm(mean_satisfaction ~ mean_autonomy, data = employees_team_level)
cat("Answer: autonomy effect =", round(coef(m_between)[2], 3), "\n")
Answer: autonomy effect = -0.26 
Code
cat("Interpretation: Here the between-team association is slightly NEGATIVE;\n")
Interpretation: Here the between-team association is slightly NEGATIVE;
Code
cat("satisfaction was simulated from within-team autonomy and climate only,\n")
satisfaction was simulated from within-team autonomy and climate only,
Code
cat("so team-mean autonomy carries no real between-team effect.\n\n")
so team-mean autonomy carries no real between-team effect.
Code
# Question 3: Which effect dominates? (Raudenbush approach)
cat("\n\nQuestion 3: Which Level Matters More?\n")


Question 3: Which Level Matters More?
Code
cat("'Is satisfaction driven by individual autonomy or team autonomy culture?'\n\n")
'Is satisfaction driven by individual autonomy or team autonomy culture?'
Code
model_q3 <- lmer(job_satisfaction ~ autonomy_cwc + team_mean_autonomy + 
                 (1 | team_id),
                 data = employees_all)
cat("Within-team effect:", round(fixef(model_q3)[2], 3), "\n")
Within-team effect: 0.333 
Code
cat("Between-team effect:", round(fixef(model_q3)[3], 3), "\n")
Between-team effect: -0.256 
Code
cat("Contextual effect:", round(fixef(model_q3)[3] - fixef(model_q3)[2], 3), "\n")
Contextual effect: -0.589 

Notice how the same variable (autonomy) can be used to answer distinct theoretical questions depending on centering choice. This is not a limitation—it’s the flexibility of multilevel modeling, but it requires careful conceptual thinking.

8 Visualization: Seeing the Levels Separately

Visualize the two separate effects:

Code
# Create predictions showing within and between effects separately

# Get observed data
plot_data <- employees_all %>%
  select(team_id, autonomy, autonomy_cwc, team_mean_autonomy, job_satisfaction)

# Fit model
m_vis <- lmer(job_satisfaction ~ autonomy_cwc + team_mean_autonomy + (1 | team_id),
              data = employees_all)

# Create prediction data
pred_within <- tibble(
  autonomy_cwc = seq(-2, 2, by = 0.2),
  team_mean_autonomy = mean(employees_all$team_mean_autonomy),  # Hold between constant
  team_id = NA
)

pred_between <- tibble(
  autonomy_cwc = 0,  # Hold within constant
  team_mean_autonomy = seq(2, 7, by = 0.2),
  team_id = NA
)

pred_within$pred <- predict(m_vis, newdata = pred_within, re.form = NA)
pred_between$pred <- predict(m_vis, newdata = pred_between, re.form = NA)

# Plot both
p1 <- ggplot(employees_all, aes(x = autonomy_cwc, y = job_satisfaction)) +
  geom_point(alpha = 0.2, size = 0.8) +
  geom_line(data = pred_within, aes(y = pred), color = "blue", size = 1.2) +
  labs(
    title = "WITHIN-TEAM Effect",
    subtitle = "How does individual autonomy predict satisfaction (holding team constant)?",
    x = "Autonomy (within-cluster centered)",
    y = "Job Satisfaction"
  ) +
  theme_minimal()

p2 <- ggplot(employees_all, aes(x = team_mean_autonomy, y = job_satisfaction)) +
  geom_point(aes(color = factor(team_id)), alpha = 0.3, size = 0.8, show.legend = FALSE) +
  geom_line(data = pred_between, aes(y = pred), color = "red", size = 1.2) +
  labs(
    title = "BETWEEN-TEAM Effect",
    subtitle = "How does team average autonomy predict satisfaction?",
    x = "Team Mean Autonomy",
    y = "Job Satisfaction"
  ) +
  theme_minimal()

gridExtra::grid.arrange(p1, p2, ncol = 2)

The left panel shows within-team variation (points scatter around the blue line within each team). The right panel shows between-team variation (team averages align with the red line). By separating these, we see both sources of variation clearly.

9 A Decision Framework: Which Centering Approach?

Here’s a practical decision tree:

Code
cat("=== CENTERING DECISION FRAMEWORK ===\n\n")
=== CENTERING DECISION FRAMEWORK ===
Code
cat("1. Is your predictor INDIVIDUAL-LEVEL (varies within teams)?\n")
1. Is your predictor INDIVIDUAL-LEVEL (varies within teams)?
Code
cat("   Examples: autonomy perception, job stress, age\n")
   Examples: autonomy perception, job stress, age
Code
cat("   → Use CWC or Raudenbush formulation\n\n")
   → Use CWC or Raudenbush formulation
Code
cat("2. Is your predictor TEAM-LEVEL (constant within teams)?\n")
2. Is your predictor TEAM-LEVEL (constant within teams)?
Code
cat("   Examples: team size, team budget, team location\n")
   Examples: team size, team budget, team location
Code
cat("   → Use RAW (uncentered) or CGM\n")
   → Use RAW (uncentered) or CGM
Code
cat("   → Note: CWC is undefined for them; CGM is optional (it only shifts the intercept)\n\n")
   → Note: CWC is undefined for them; CGM is optional (it only shifts the intercept)
Code
cat("3. Do within and between effects of the SAME predictor\n")
3. Do within and between effects of the SAME predictor
Code
cat("   likely differ theoretically?\n")
   likely differ theoretically?
Code
cat("   Examples: autonomy, cohesion, communication frequency\n")
   Examples: autonomy, cohesion, communication frequency
Code
cat("   → Use Raudenbush formulation to separate them\n\n")
   → Use Raudenbush formulation to separate them
Code
cat("4. Is this a cross-level interaction (Level 1 predictor ×\n")
4. Is this a cross-level interaction (Level 1 predictor ×
Code
cat("   Level 2 predictor modifying the relationship)?\n")
   Level 2 predictor modifying the relationship)?
Code
cat("   → Use CWC for Level 1 predictor (improves interpretation)\n")
   → Use CWC for Level 1 predictor (improves interpretation)
Code
cat("   → Use CGM or raw for Level 2 predictor\n\n")
   → Use CGM or raw for Level 2 predictor
Code
cat("5. Do you want the intercept to mean something specific?\n")
5. Do you want the intercept to mean something specific?
Code
cat("   - Grand-Mean autonomy?   → Use CGM\n")
   - Grand-Mean autonomy?   → Use CGM
Code
cat("   - Team-specific baseline? → Use CWC\n")
   - Team-specific baseline? → Use CWC
Code
cat("   - Absolute zero?         → Use RAW (rare)\n\n")
   - Absolute zero?         → Use RAW (rare)
Code
cat("=== RECOMMENDED STANDARD APPROACH ===\n")
=== RECOMMENDED STANDARD APPROACH ===
Code
cat("For most organizational research:\n")
For most organizational research:
Code
cat("1. Use CWC for individual-level predictors\n")
1. Use CWC for individual-level predictors
Code
cat("2. Keep team-level predictors uncentered (or use CGM if mixing levels)\n")
2. Keep team-level predictors uncentered (or use CGM if mixing levels)
Code
cat("3. Include both CWC and team mean if effects might differ\n")
3. Include both CWC and team mean if effects might differ
Code
cat("4. Interpret with explicit attention to level of analysis\n")
4. Interpret with explicit attention to level of analysis

10 Common Mistakes and How to Avoid Them

Code
cat("MISTAKE 1: Forgetting to center, then interpreting coefficients\n")
MISTAKE 1: Forgetting to center, then interpreting coefficients
Code
cat("-----------\n")
-----------
Code
m_no_center <- lmer(job_satisfaction ~ autonomy + (1 | team_id), 
                    data = employees_all)
cat("Intercept =", round(fixef(m_no_center)[1], 3), 
    "(predicted satisfaction when autonomy = 0, not meaningful)\n")
Intercept = 3.419 (predicted satisfaction when autonomy = 0, not meaningful)
Code
cat("Autonomy effect:", round(fixef(m_no_center)[2], 3), 
    "(meaningful, but intercept is garbage)\n")
Autonomy effect: 0.329 (meaningful, but intercept is garbage)
Code
cat("FIX: Always center or be explicit about what intercept means.\n\n")
FIX: Always center or be explicit about what intercept means.
Code
cat("MISTAKE 2: Using CGM when you should use CWC\n")
MISTAKE 2: Using CGM when you should use CWC
Code
cat("-----------\n")
-----------
Code
cat("You compare within-team coefficient from CGM model\n")
You compare within-team coefficient from CGM model
Code
cat("to published studies using CWC.\n")
to published studies using CWC.
Code
cat("Coefficients appear different, but it's centering, not real difference!\n")
Coefficients appear different, but it's centering, not real difference!
Code
cat("FIX: Match centering to your research question and prior work.\n\n")
FIX: Match centering to your research question and prior work.
Code
cat("MISTAKE 3: Including group mean without CWC\n")
MISTAKE 3: Including group mean without CWC
Code
cat("-----------\n")
-----------
Code
m_bad <- lmer(job_satisfaction ~ autonomy + team_mean_autonomy + (1 | team_id),
              data = employees_all)
cat("This parameterization is easy to misread:\n")
This parameterization is easy to misread:
Code
cat("with raw autonomy in the model, team_mean_autonomy estimates the\n")
with raw autonomy in the model, team_mean_autonomy estimates the
Code
cat("CONTEXTUAL effect (between minus within), not the between effect.\n")
CONTEXTUAL effect (between minus within), not the between effect.
Code
print(summary(m_bad)$coefficients)
                     Estimate Std. Error        df   t value     Pr(>|t|)
(Intercept)         6.0485884 1.68990464  48.47611  3.579248 7.948425e-04
autonomy            0.3328022 0.03066716 553.06760 10.852074 5.255693e-25
team_mean_autonomy -0.5888045 0.37619086  49.14898 -1.565175 1.239585e-01
Code
cat("FIX: Use CWC so the coefficients are the within and between effects directly.\n\n")
FIX: Use CWC so the coefficients are the within and between effects directly.
Code
cat("MISTAKE 4: Over-interpreting small between effects\n")
MISTAKE 4: Over-interpreting small between effects
Code
cat("-----------\n")
-----------
Code
cat("You have N=50 teams. Between-team estimates have low power.\n")
You have N=50 teams. Between-team estimates have low power.
Code
cat("Wide CIs on team-level coefficients are normal!\n")
Wide CIs on team-level coefficients are normal!
Code
cat("FIX: Be transparent about power, consider Bayesian priors.\n")
FIX: Be transparent about power, consider Bayesian priors.

11 Extended Example: A Complete Centering Analysis

Let’s put it all together with a complete example:

Code
cat("=== COMPLETE CENTERING ANALYSIS ===\n\n")
=== COMPLETE CENTERING ANALYSIS ===
Code
# Step 1: Raw (uncentered)
cat("MODEL 1: Uncentered\n")
MODEL 1: Uncentered
Code
m1 <- lmer(job_satisfaction ~ autonomy + team_climate + (1 | team_id), 
           data = employees_all)
cat("Autonomy coef:", round(fixef(m1)[2], 3), "\n")
Autonomy coef: 0.329 
Code
cat("Intercept:", round(fixef(m1)[1], 3), 
    "(when autonomy=0: not meaningful)\n\n")
Intercept: 1.451 (when autonomy=0: not meaningful)
Code
# Step 2: CGM
cat("MODEL 2: Grand-Mean Centered\n")
MODEL 2: Grand-Mean Centered
Code
m2 <- lmer(job_satisfaction ~ autonomy_cgm + team_climate + (1 | team_id),
           data = employees_all)
cat("Autonomy coef:", round(fixef(m2)[2], 3), 
    "(SAME as Model 1)\n")
Autonomy coef: 0.329 (SAME as Model 1)
Code
cat("Intercept:", round(fixef(m2)[1], 3), 
    "(when autonomy=grand mean: interpretable)\n\n")
Intercept: 2.926 (when autonomy=grand mean: interpretable)
Code
# Step 3: CWC
cat("MODEL 3: Within-Cluster Centered\n")
MODEL 3: Within-Cluster Centered
Code
m3 <- lmer(job_satisfaction ~ autonomy_cwc + team_climate + (1 | team_id),
           data = employees_all)
cat("Autonomy coef:", round(fixef(m3)[2], 3), 
    "(pure within-team effect; close to Models 1-2 here)\n")
Autonomy coef: 0.333 (pure within-team effect; close to Models 1-2 here)
Code
cat("Intercept:", round(fixef(m3)[1], 3), 
    "(when autonomy=team mean: interpretable)\n\n")
Intercept: 2.942 (when autonomy=team mean: interpretable)
Code
# Step 4: Raudenbush (both versions)
cat("MODEL 4: Raudenbush Formulation (separates within & between)\n")
MODEL 4: Raudenbush Formulation (separates within & between)
Code
m4 <- lmer(job_satisfaction ~ autonomy_cwc + team_mean_autonomy + team_climate + 
           (1 | team_id),
           data = employees_all)
cat("Within-team autonomy coef:", round(fixef(m4)[2], 3), "\n")
Within-team autonomy coef: 0.333 
Code
cat("Between-team autonomy coef:", round(fixef(m4)[3], 3), "\n")
Between-team autonomy coef: -0.23 
Code
cat("Difference (contextual effect):", 
    round(fixef(m4)[3] - fixef(m4)[2], 3), "\n\n")
Difference (contextual effect): -0.563 
Code
# Summary table
cat("=== SUMMARY: Do Centering Choices Change Conclusions? ===\n")
=== SUMMARY: Do Centering Choices Change Conclusions? ===
Code
summary_table <- tibble(
  Model = c("Uncentered", "CGM", "CWC", "Raudenbush within", "Raudenbush between"),
  Autonomy_Coefficient = c(
    round(fixef(m1)[2], 3),
    round(fixef(m2)[2], 3),
    round(fixef(m3)[2], 3),
    round(fixef(m4)[2], 3),
    round(fixef(m4)[3], 3)
  ),
  Interpretation = c(
    "Mixture of within/between",
    "Mixture of within/between",
    "Pure within-team effect",
    "Pure within-team effect",
    "Pure between-team effect"
  )
)
print(summary_table)
# A tibble: 5 × 3
  Model              Autonomy_Coefficient Interpretation           
  <chr>                             <dbl> <chr>                    
1 Uncentered                        0.329 Mixture of within/between
2 CGM                               0.329 Mixture of within/between
3 CWC                               0.333 Pure within-team effect  
4 Raudenbush within                 0.333 Pure within-team effect  
5 Raudenbush between               -0.23  Pure between-team effect 

Key insight: The autonomy slope is nearly identical across Models 1-3 in these data (because autonomy varies mostly within teams), but what it estimates differs: Models 1-2 give a blend of within and between effects, while Model 3 gives the pure within-team effect. Only the Raudenbush formulation estimates both separately.

12 Try It Yourself Exercises

12.1 Exercise 1: Replicate Simpson’s Paradox

Modify the data_differ scenario to create more extreme opposites: strong negative within effect, strong positive between effect. Fit models with CWC and raw, showing how centering choice determines which effect you see.

12.2 Exercise 2: Cross-Level Interaction

Fit a model with autonomy_cwc predicting satisfaction, team_climate predicting autonomy effects, and include the interaction: autonomy_cwc * team_climate. How does team climate moderate the autonomy effect? Visualize predictions at high and low climate.

12.3 Exercise 3: Build a Centering Decision Tree

For your own research project (or a proposed study), create a table:

  • Column 1: Your key predictors
  • Column 2: Are they individual, team, or organizational level?
  • Column 3: Do you expect within/between effects to differ?
  • Column 4: What centering approach fits your questions?

12.4 Exercise 4: Interpret Coefficients Precisely

Fit a full model with CWC autonomy, team_mean_autonomy, team_climate, and a cross-level interaction. Write out the interpretation of every coefficient in a way that would be clear to a non-methodologist manager: “When team climate is high, a one-unit increase in individual autonomy predicts…”

12.5 Exercise 5: Power Simulation

Simulate what happens to power for detecting between-team effects as the number of teams varies (10, 25, 50, 100). How many teams do you need to reliably detect a between-team effect? Create a plot of power vs. number of teams.
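A hedged starting point for Exercise 5 (the function name, cluster size, and effect sizes below are illustrative assumptions, not values from this chapter’s data): one simulation cell generates a dataset with a known between-team effect, fits the Raudenbush model, and records whether that effect is detected.

Code
# Sketch: one power-simulation cell for a between-team effect.
# All parameter values below are illustrative assumptions.
sim_once <- function(n_teams, n_per = 12, gamma_b = 0.3) {
  team_mean_x <- rnorm(n_teams)              # team means of the predictor
  u0 <- rnorm(n_teams, sd = 0.5)             # random intercepts
  d <- tibble(
    team_id = rep(seq_len(n_teams), each = n_per),
    x_mean  = rep(team_mean_x, each = n_per),
    x_cwc   = rnorm(n_teams * n_per)         # within-team deviations
  ) %>%
    mutate(y = gamma_b * x_mean + u0[team_id] + rnorm(n()))
  m <- lmer(y ~ x_cwc + x_mean + (1 | team_id), data = d)
  summary(m)$coefficients["x_mean", "Pr(>|t|)"] < 0.05   # detected?
}

# Power at a given number of teams = proportion of significant replications:
# mean(replicate(200, sim_once(n_teams = 25)))

Repeat this over n_teams values of 10, 25, 50, and 100, then plot power against the number of teams.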


Key Takeaways:

  • Centering is not just a technical step; it’s a substantive choice that determines what research questions you answer.

  • Grand-Mean Centering (CGM) makes coefficients represent population-level relationships but mixes within and between effects.

  • Group-Mean Centering (CWC) isolates within-team effects and makes the intercept represent team-specific baselines.

  • Simpson’s Paradox can occur: within-team and between-team relationships can have opposite signs. CGM obscures this; CWC reveals it.

  • The Raudenbush Formulation includes both CWC and group-mean predictors, separating within and between effects in a single model.

  • Always align your centering strategy with your research questions: ask “Is this a within-team or between-team question?” and choose centering accordingly.

  • For reproducibility and clear communication, always report your centering approach explicitly in your methods.