Introduction to Exponential Random Graph Models

A comprehensive guide for I-O psychologists — ERGMs, TERGMs, and STERGMs

Author

Justin M. Jones

Published

March 22, 2026

1. Why I-O Psychologists Should Care About ERGMs

Organizational behavior is fundamentally relational. Whether you study team cohesion, advice-seeking, mentorship, knowledge sharing, or workplace friendship, you are studying a network. Yet most of our statistical toolkit treats observations as independent — an assumption that is deeply violated when data consist of ties between people.

Exponential Random Graph Models (ERGMs) were designed to fix exactly this problem. An ERGM treats the entire network as a single observation and asks: given these people and their attributes, what network structures do we observe more (or less) often than chance would predict?

For I-O psychologists, this matters in several concrete ways:

  • Advice networks: Do employees seek advice based on role similarity (homophily), hierarchy, or do cliques emerge from transitivity?
  • Friendship and socialization: Does a newcomer’s integration into an informal friendship network depend on demographics, performance, or structural opportunity?
  • Leadership networks: Does informal influence diffuse through reciprocal ties or through bridging positions?
  • Team communication: Are there structural holes that predict information bottlenecks?
  • Negative ties: Do conflict or avoidance ties cluster, and do they co-evolve with positive ties?
NoteI-O Perspective

ERGMs sit in the same conceptual space as multilevel models: both account for the non-independence of observations. The key difference is that in ERGMs, the unit of analysis is the dyad (a pair of people), and the dependencies run in all directions simultaneously. Just as you would not use OLS to model clustered data, you should not use logistic regression on adjacency matrices — ERGMs are the appropriate alternative.


2. Network Fundamentals for I-O Contexts

2.1 Nodes and Edges

A network (or graph) consists of nodes (also called vertices or actors) and edges (also called ties or links). In most I-O applications, nodes are people (employees, team members, managers) and edges represent relationships (advice-giving, friendship, collaboration, communication).

2.2 Directed vs. Undirected

Networks can be directed (arrows have a direction: A → B) or undirected (ties are symmetric: A — B).

  • Directed: advice networks (“I go to this person for advice”), task assignment, reporting relationships.
  • Undirected: friendship (“we consider each other friends”), co-membership on a project.

2.3 Key Structural Concepts

Concept Definition I-O Example
Degree Number of ties a node has How many people an employee advises (out-degree) or is advised by (in-degree)
Reciprocity Proportion of directed ties that are mutual (A→B and B→A) Are advice ties mutual? Or does advice flow one-way down the hierarchy?
Transitivity Tendency for “friends of friends to be friends” Knowledge sharing clusters within work groups
Homophily Tendency to form ties with similar others Same-gender, same-department, or same-race friendship ties
Structural holes Gaps in the network between otherwise connected clusters Employees who bridge departments hold informational advantages
Centrality Position-based importance (degree, betweenness, eigenvector) Who controls information flow; who is most influential informally

2.4 Representing Networks in R

In R, networks are stored using the network class from the network package (part of the statnet suite).

library(statnet)
library(ergm)
# Create a directed network from an adjacency matrix
adj_matrix <- matrix(c(
  0,1,0,1,0,
  0,0,1,0,1,
  0,0,0,1,0,
  0,0,0,0,1,
  1,0,0,0,0
), nrow = 5, byrow = TRUE)

net <- network(adj_matrix, directed = TRUE)

# Add node attributes
net %v% "department"  <- c("HR", "HR", "IT", "IT", "Finance")
net %v% "tenure"      <- c(3, 7, 1, 12, 5)
net %v% "performance" <- c(4.2, 3.8, 4.5, 3.1, 4.0)

# Quick summary
summary(net)
Network attributes:
  vertices = 5
  directed = TRUE
  hyper = FALSE
  loops = FALSE
  multiple = FALSE
  bipartite = FALSE
 total edges = 7 
   missing edges = 0 
   non-missing edges = 7 
 density = 0.35 

Vertex attributes:

 department:
   character valued attribute
   attribute summary:
Finance      HR      IT 
      1       2       2 

 performance:
   numeric valued attribute
   attribute summary:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   3.10    3.80    4.00    3.92    4.20    4.50 

 tenure:
   numeric valued attribute
   attribute summary:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    1.0     3.0     5.0     5.6     7.0    12.0 
  vertex.names:
   character valued attribute
   5 valid vertex names

No edge attributes

Network adjacency matrix:
  1 2 3 4 5
1 0 1 0 1 0
2 0 0 1 0 1
3 0 0 0 1 0
4 0 0 0 0 1
5 1 0 0 0 0
plot(
  net,
  vertex.col  = as.numeric(as.factor(net %v% "department")),
  vertex.cex  = 2,
  arrowhead.cex = 0.8,
  main = "Simple Advice Network (n = 5)"
)
legend("bottomleft",
  legend = levels(as.factor(net %v% "department")),
  fill   = 1:3, bty = "n", cex = 0.85)
Figure 1: A small directed advice network. Nodes are coloured by department.
TipImporting Real Data

Most I-O datasets come as edge lists or sociomatrices from survey software. Load an edge list with network(edgelist, matrix.type = "edgelist", directed = TRUE).


3. The ERGM Framework: Core Theory

3.1 The Problem ERGMs Solve

Imagine you survey 50 employees asking who they turn to for advice. You now have a 50×50 adjacency matrix with about 2,450 potential dyads (directed). A naive approach might run a logistic regression predicting whether each dyad has a tie. The problem: those 2,450 observations are not independent. Whether A→B depends on whether B→A (reciprocity), whether A has many ties in general (degree), and whether A and C are both connected to B (transitivity). Ordinary logistic regression will produce biased standard errors and incorrect inferences.

ERGMs solve this by modeling the whole network jointly as a single random variable and specifying which structural features (terms) are important.

3.2 The Core Probability Model

An ERGM defines the probability of observing a particular network Y as:

\[P(\mathbf{Y} = \mathbf{y}) = \frac{\exp\{\boldsymbol{\theta}^T \mathbf{g}(\mathbf{y})\}}{\kappa(\boldsymbol{\theta})}\]

where:

  • \(\mathbf{g}(\mathbf{y})\) is a vector of sufficient statistics — network summary features like edge count, mutual dyads, triangles, etc.
  • \(\boldsymbol{\theta}\) is a vector of parameters (one per statistic) estimated from the data.
  • \(\kappa(\boldsymbol{\theta})\) is a normalizing constant that sums over all possible networks on \(n\) nodes. This is computationally intractable for large networks, which is why ERGMs use MCMC-MLE.

3.3 The Change Score Interpretation

A more intuitive form comes from the conditional log-odds of a single tie existing, given all other ties:

\[\log\left[\frac{P(Y_{ij}=1 \mid \mathbf{Y}_{-ij})}{P(Y_{ij}=0 \mid \mathbf{Y}_{-ij})}\right] = \boldsymbol{\theta}^T \delta\mathbf{g}_{ij}(\mathbf{y})\]

Here, \(\delta\mathbf{g}_{ij}(\mathbf{y})\) is the change statistic — how much each network statistic changes if you toggle the \(i \to j\) tie from 0 to 1, holding all other ties fixed. Each \(\theta\) coefficient tells you how much (in log-odds) the probability of a tie changes for a one-unit increase in the corresponding statistic.

3.4 Why Not Just Use Logistic Regression on Dyads?

Even QAP (quadratic assignment procedure) regression, while it corrects p-values via permutation, cannot model endogenous network tendencies like reciprocity or transitivity as predictors of tie formation. ERGMs can. If your theoretical model says “ties form partly because of triadic closure” (which is almost always true in organizations), you need ERGMs.


4. ERGM Model Terms

4.1 Structural Terms

Term What It Captures I-O Interpretation
edges Baseline density (intercept-like) Overall tendency to form ties; almost always included
mutual Reciprocity: A→B and B→A co-occur Do advice relationships go both ways?
gwesp(decay, fixed=TRUE) Geometrically-weighted edgewise shared partners Transitivity without degeneracy
gwdegree(decay, fixed=TRUE) Geometrically-weighted degree (undirected) Controls degree heterogeneity
gwidegree(decay, fixed=TRUE) Geometrically-weighted in-degree Popularity effects / hub structure
gwodegree(decay, fixed=TRUE) Geometrically-weighted out-degree Activity / expansiveness
ctriple Cyclic triples (A→B, B→C, C→A) Hierarchy: negative θ = cyclic triads suppressed
WarningDegeneracy Warning

The triangle and kstar terms are notorious for causing model degeneracy — the model produces distributions that are nearly entirely empty or nearly entirely full networks. In practice, use the geometrically-weighted alternates (gwesp, gwdegree) instead.

4.2 Nodal Attribute Terms

Term What It Captures I-O Example
nodecov("attr") Main effect of a continuous attribute Do higher performers have more ties overall?
nodeicov("attr") In-degree version Are higher-tenure employees sought out more?
nodeocov("attr") Out-degree version Do extraverted employees nominate more?
nodefactor("attr") Main effect of a categorical attribute Do managers have more ties than ICs?
nodematch("attr") Homophily: ties more likely between same-category nodes Same-department friendships
nodematch("attr", diff=TRUE) Differential homophily Is gender homophily stronger for women?
absdiff("attr") Tie likelihood decreases with attribute difference Do similar-tenure employees befriend each other?

4.3 Dyadic Covariate Terms

Term What It Captures I-O Example
edgecov(matrix) A dyad-level covariate Does physical proximity predict advice ties?
edgecov(network) Another network as a predictor Does prior project co-membership predict current friendship?

5. Getting Started in R: A Running Example

We will use a simulated organizational advice network throughout the rest of this tutorial. The organization has 30 employees across three departments (Operations, HR, IT), with varying tenure and performance ratings.

set.seed(42)
n <- 30

# Node attributes
dept        <- rep(c("Operations", "HR", "IT"), each = 10)
tenure      <- round(runif(n, 1, 15), 1)
performance <- round(rnorm(n, mean = 4, sd = 0.5), 2)
is_manager  <- c(rep(1,2), rep(0,8),   # 2 managers per dept
                 rep(1,2), rep(0,8),
                 rep(1,2), rep(0,8))

# Build a directed network with department homophily + manager attraction
advice_net <- network(n, directed = TRUE, density = 0)

for (i in 1:n) {
  for (j in 1:n) {
    if (i == j) next
    p_base      <- 0.05
    p_same_dept <- if (dept[i] == dept[j]) 0.12 else 0
    p_mgr       <- if (is_manager[j] == 1) 0.10 else 0
    p_tenure    <- tenure[j] / 100
    p_tie       <- min(1, p_base + p_same_dept + p_mgr + p_tenure)
    if (runif(1) < p_tie) add.edge(advice_net, i, j)
  }
}

# Attach node attributes
advice_net %v% "department"  <- dept
advice_net %v% "tenure"      <- tenure
advice_net %v% "performance" <- performance
advice_net %v% "is_manager"  <- is_manager

# Network summary
advice_net
 Network attributes:
  vertices = 30 
  directed = TRUE 
  hyper = FALSE 
  loops = FALSE 
  multiple = FALSE 
  bipartite = FALSE 
  total edges= 183 
    missing edges= 0 
    non-missing edges= 183 

 Vertex attribute names: 
    department is_manager performance tenure vertex.names 

No edge attributes
network.size(advice_net)
[1] 30
network.edgecount(advice_net)
[1] 183
network.density(advice_net)
[1] 0.2103448
dept_col <- c("Operations" = "#3a7fc1", "HR" = "#d97a32", "IT" = "#7b3fb5")
node_col <- dept_col[advice_net %v% "department"]

plot(
  advice_net,
  vertex.col   = node_col,
  vertex.sides = ifelse(advice_net %v% "is_manager" == 1, 3, 50),
  vertex.cex   = 1.8,
  arrowhead.cex = 0.6,
  main = "Organizational Advice Network",
  sub  = "Shape = role (▲ manager); Colour = department"
)
legend("bottomleft",
  legend = names(dept_col),
  fill   = dept_col,
  bty    = "n", cex = 0.85)
Figure 2: Simulated organizational advice network (n = 30). Shape = role (triangle = manager, circle = employee); colour = department.

6. Fitting Your First ERGM

6.1 Model Building Strategy

Build models incrementally:

  1. Baseline: edges only — establishes the baseline log-odds of a tie.
  2. Structural controls: Add mutual (if directed) and geometrically-weighted terms (gwesp, gwidegree). These control for endogenous network tendencies before adding exogenous predictors.
  3. Attribute effects: Add nodal and dyadic covariates corresponding to your theoretical hypotheses.
  4. Check diagnostics: Inspect MCMC trace plots and goodness-of-fit before interpreting.

6.2 Fitting the Models

# Model 0: Baseline (edges only)
m0 <- ergm(advice_net ~ edges)
summary(m0)
Call:
ergm(formula = advice_net ~ edges)

Maximum Likelihood Results:

      Estimate Std. Error MCMC % z value Pr(>|z|)    
edges -1.32285    0.08319      0   -15.9   <1e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

     Null Deviance: 1206.1  on 870  degrees of freedom
 Residual Deviance:  895.1  on 869  degrees of freedom
 
AIC: 897.1  BIC: 901.8  (Smaller is better. MC Std. Err. = 0)

The edges coefficient of −1.32 translates via the inverse-logit to a baseline tie probability of plogis(−1.32)21% — matching the observed network density of 0.21 almost exactly. This is expected: in an edges-only model the intercept simply encodes density. The highly significant z-value (−15.9) just confirms that the network is meaningfully sparser than a 50% coin-flip, not that any structural or attribute process has been identified.

# Model 1: Structural terms
m1 <- ergm(
  advice_net ~ edges
               + mutual
               + gwesp(0.5, fixed = TRUE)
               + gwidegree(0.5, fixed = TRUE),
  control = control.ergm(MCMC.samplesize = 4000,
                         MCMC.burnin     = 10000,
                         seed            = 42)
)
summary(m1)
Call:
ergm(formula = advice_net ~ edges + mutual + gwesp(0.5, fixed = TRUE) + 
    gwidegree(0.5, fixed = TRUE), control = control.ergm(MCMC.samplesize = 4000, 
    MCMC.burnin = 10000, seed = 42))

Monte Carlo Maximum Likelihood Results:

                    Estimate Std. Error MCMC % z value Pr(>|z|)    
edges               -1.14707    0.31075      0  -3.691 0.000223 ***
mutual              -0.19565    0.30035      0  -0.651 0.514783    
gwesp.OTP.fixed.0.5 -0.02382    0.13736      0  -0.173 0.862320    
gwideg.fixed.0.5    -2.42646    0.84455      0  -2.873 0.004065 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

     Null Deviance: 1206.1  on 870  degrees of freedom
 Residual Deviance:  889.6  on 866  degrees of freedom
 
AIC: 897.6  BIC: 916.7  (Smaller is better. MC Std. Err. = 0.09945)

Adding structural controls substantially changes the edges intercept (−1.15, now reflecting a density conditional on the structural terms) but does not by itself reveal any significant endogenous network tendencies in this dataset:

  • mutual = −0.20, ns. No reliable reciprocity over and above the baseline density. The advice network was simulated without an explicit reciprocity mechanism, so this is expected.
  • gwesp.OTP.fixed.0.5 = −0.02, ns. No triangulation / clustering beyond what shared department membership would produce (which isn’t controlled for yet). Triadic closure is not an independent driver of tie formation here.
  • gwideg.fixed.0.5 = −2.43, p = .004. A strongly negative geometrically-weighted in-degree term indicates that the in-degree distribution is more even than a random Erdős–Rényi graph with the same density — advice-seeking is spread across many targets rather than concentrated on a small number of popular advisors. This controls for degree heterogeneity before we add attribute predictors.

Note that AIC barely changes from M0 (897.6 vs. 897.1) and BIC actually worsens (916.7 vs. 901.8) — the structural terms explain little additional variance on their own. This is a signal to look at the exogenous attribute predictors next.

# Model 2: Add attribute effects
m2 <- ergm(
  advice_net ~ edges
               + mutual
               + gwesp(0.5, fixed = TRUE)
               + gwidegree(0.5, fixed = TRUE)
               + nodematch("department")
               + nodeicov("tenure")
               + nodeicov("performance")
               + nodefactor("is_manager"),
  control = control.ergm(MCMC.samplesize = 4000,
                         MCMC.burnin     = 10000,
                         seed            = 42)
)
summary(m2)
Call:
ergm(formula = advice_net ~ edges + mutual + gwesp(0.5, fixed = TRUE) + 
    gwidegree(0.5, fixed = TRUE) + nodematch("department") + 
    nodeicov("tenure") + nodeicov("performance") + nodefactor("is_manager"), 
    control = control.ergm(MCMC.samplesize = 4000, MCMC.burnin = 10000, 
        seed = 42))

Monte Carlo Maximum Likelihood Results:

                        Estimate Std. Error MCMC % z value Pr(>|z|)   
edges                   -1.28217    0.65475      0  -1.958  0.05020 . 
mutual                  -0.26670    0.30741      0  -0.868  0.38563   
gwesp.OTP.fixed.0.5     -0.06517    0.14081      0  -0.463  0.64347   
gwideg.fixed.0.5        -1.83459    0.94313      0  -1.945  0.05175 . 
nodematch.department     0.53897    0.18272      0   2.950  0.00318 **
nodeicov.tenure          0.04983    0.02062      0   2.417  0.01566 * 
nodeicov.performance    -0.14710    0.13699      0  -1.074  0.28292   
nodefactor.is_manager.1  0.23474    0.15603      0   1.504  0.13247   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

     Null Deviance: 1206.1  on 870  degrees of freedom
 Residual Deviance:  870.2  on 862  degrees of freedom
 
AIC: 886.2  BIC: 924.4  (Smaller is better. MC Std. Err. = 0.07082)

Model 2 is the main substantive model. Here is a term-by-term walk-through of what the output tells us:

Term Estimate Odds Ratio Interpretation
edges −1.28† 0.28 Baseline log-odds of a tie; reflects overall density after accounting for all covariates
mutual −0.27, ns 0.77 No reciprocity tendency; advice ties are largely one-directional in this organization
gwesp.OTP.fixed.0.5 −0.07, ns 0.94 No triadic closure once department is controlled; “friends of friends” do not seek advice from each other beyond what department membership already explains
gwideg.fixed.0.5 −1.83† 0.16 In-degree is distributed broadly across the network rather than concentrated in a few go-to advisors
nodematch.department +0.54** 1.71 Same-department pairs are 1.71× more likely to have an advice tie — the dominant structural finding
nodeicov.tenure +0.050* 1.051 Each additional year of tenure in the target employee increases the log-odds of receiving an advice tie by 0.05 — experienced employees are sought out for advice
nodeicov.performance −0.15, ns 0.86 Performance ratings do not significantly predict being sought out for advice, after controlling for tenure
nodefactor.is_manager.1 +0.23, ns 1.26 Managers have somewhat higher in-degree (more people seek their advice) but this is not significant once tenure is in the model — tenure carries the status signal

The two significant findings — departmental homophily and tenure-based attractiveness — replicate exactly the mechanisms used to simulate the data, which validates the recovery. In a real I-O study, these would be theoretically meaningful: informal advice remains bounded by organizational silos, and seniority (not formal performance ratings) drives who gets consulted.

6.3 MCMC Control Parameters

Parameter What It Does Guidance
MCMC.samplesize Networks sampled per iteration Default 500; increase to 4,000–20,000 for noisy models
MCMC.burnin Steps discarded at chain start Default 10,000; increase if traces show poor mixing
MCMC.interval Thinning interval Increase if autocorrelation is high
seed Random seed Always set for reproducibility

7. Interpreting ERGM Results

7.1 The Log-Odds Scale

ERGM estimates are on the log-odds scale, just like logistic regression. The coefficient is the change in the log-odds of a tie forming for a one-unit increase in the corresponding change statistic.

7.2 Converting to Odds Ratios and Probabilities

# Odds ratios
round(exp(coef(m2)), 3)
                  edges                  mutual     gwesp.OTP.fixed.0.5 
                  0.277                   0.766                   0.937 
       gwideg.fixed.0.5    nodematch.department         nodeicov.tenure 
                  0.160                   1.714                   1.051 
   nodeicov.performance nodefactor.is_manager.1 
                  0.863                   1.265 
# Baseline tie probability (all change statistics = 0)
plogis(coef(m2)["edges"])
    edges 
0.2171806 
# Same-department tie with reciprocity and one shared partner
logit_rich <- coef(m2)["edges"] +
              coef(m2)["nodematch.department"]      * 1 +
              coef(m2)["mutual"]                    * 1 +
              coef(m2)["gwesp.OTP.fixed.0.5"]       * 1
plogis(logit_rich)
    edges 
0.2544389 

The first block (round(exp(coef(m2)), 3)) gives the odds ratios for each term. Reading the two significant ones: nodematch.department1.71 (same-department pairs 71% more likely to have a tie) and nodeicov.tenure1.05 (each additional year of tenure in the target increases tie probability by ~5%).

The second line (plogis(coef(m2)["edges"])) gives the baseline tie probability — the probability of a tie when all change statistics equal zero, i.e., a cross-department dyad, with average-tenure target, who is a non-manager, with no mutual tie and no shared partners. Here that is approximately 22%, close to the observed density.

The third block computes the probability for a richer scenario: a same-department dyad (nodematch.department = 1) that is also mutual (mutual = 1) and shares one common advice partner (gwesp = 1). Adding these log-odds on top of the baseline and applying the inverse-logit gives the predicted tie probability for this type of dyad — substantially higher than the 22% baseline. This illustrates how ERGM coefficients can be combined to produce substantively meaningful predicted probabilities.

NoteReporting Tip

For an I-O audience, odds ratios are often more intuitive than log-odds. exp(nodematch.department) ≈ 1.71 means same-department pairs are about 71% more likely to have an advice tie than cross-department pairs, after controlling for all other terms.

7.3 Model Comparison

AIC(m0, m1, m2)
   df      AIC
m0  1 897.0789
m1  4 897.6149
m2  8 886.2074
BIC(m0, m1, m2)
   df      BIC
m0  1 901.8474
m1  4 916.6889
m2  8 924.3554
# Likelihood ratio test (only valid for nested models)
# eval.loglik = TRUE required in ergm >= 4.x for MCMC-fitted models
anova(m1, m2, eval.loglik = TRUE)
Analysis of Deviance Table

Model 1: advice_net ~ edges + mutual + gwesp(0.5, fixed = TRUE) + gwidegree(0.5, 
    fixed = TRUE)
Model 2: advice_net ~ edges + mutual + gwesp(0.5, fixed = TRUE) + gwidegree(0.5, 
    fixed = TRUE) + nodematch("department") + nodeicov("tenure") + 
    nodeicov("performance") + nodefactor("is_manager")
     Df Deviance Resid. Df Resid. Dev Pr(>|Chisq|)    
NULL                   870    1206.08                 
1     4   316.46       866     889.61    < 2.2e-16 ***
2     4    19.41       862     870.21    0.0006535 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Reading the AIC/BIC table:

  • AIC (penalizes log-likelihood by 2 × number of parameters): M2 wins handily (886.2 vs. 897.1 for M0 and 897.6 for M1). Adding the attribute predictors — especially nodematch.department and nodeicov.tenure — meaningfully improves model fit.
  • BIC (penalizes by log(n) × k; harsher for small networks): M0 wins (901.8 vs. 924.4 for M2). With only 30 nodes, BIC heavily penalizes the 8-parameter model. This reflects a genuine tension: the attribute effects are real and theoretically important, but the network is small enough that BIC discounts them.

In practice for I-O research, report both and justify your preferred model on theoretical grounds. If your hypothesis is specifically about departmental homophily and tenure, M2 is the appropriate model regardless of BIC.

The anova() likelihood ratio test compares the deviance of M1 and M2 directly. A significant p-value confirms that adding the attribute terms in M2 significantly improves fit over the structural-only M1.


8. MCMC Diagnostics

Because ERGM estimation uses Markov chain Monte Carlo, you must verify that the MCMC chains have converged before trusting your estimates.

8.1 MCMC Trace Plots

mcmc.diagnostics(m2)

MCMC diagnostics for Model 2. Left column: trace plots — a well-mixed chain looks like a ‘hairy caterpillar’ oscillating randomly around zero. Right column: density plots — should be bell-shaped and centred near zero.

MCMC diagnostics for Model 2. Left column: trace plots — a well-mixed chain looks like a ‘hairy caterpillar’ oscillating randomly around zero. Right column: density plots — should be bell-shaped and centred near zero.
Sample statistics summary:

Iterations = 102912:2048000
Thinning interval = 512 
Number of chains = 1 
Sample size per chain = 3800 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                             Mean      SD Naive SE Time-series SE
edges                    -1.13737  12.299  0.19952        0.28492
mutual                   -0.20395   4.016  0.06515        0.07672
gwesp.OTP.fixed.0.5      -2.06579  25.790  0.41837        0.56491
gwideg.fixed.0.5         -0.09907   1.445  0.02345        0.03859
nodematch.department     -0.19868   7.138  0.11580        0.16944
nodeicov.tenure         -13.55689 131.018  2.12539        3.03006
nodeicov.performance     -4.06122  47.622  0.77253        1.10292
nodefactor.is_manager.1  -0.67316   8.464  0.13730        0.19535

2. Quantiles for each variable:

                            2.5%       25%       50%    75%   97.5%
edges                    -25.000   -9.0000  -1.00000  7.000  23.000
mutual                    -8.000   -3.0000   0.00000  2.000   8.000
gwesp.OTP.fixed.0.5      -51.611  -20.0353  -2.74552 15.055  50.335
gwideg.fixed.0.5          -3.461   -0.9585   0.08967  1.082   1.934
nodematch.department     -14.000   -5.0000   0.00000  5.000  14.000
nodeicov.tenure         -263.933 -101.1000 -15.05000 74.200 244.805
nodeicov.performance     -97.051  -36.5150  -4.22000 29.065  88.134
nodefactor.is_manager.1  -17.000   -6.0000  -1.00000  5.000  16.000


Are sample statistics significantly different from observed?
                   edges       mutual gwesp.OTP.fixed.0.5 gwideg.fixed.0.5
diff.      -1.137368e+00 -0.203947368       -2.0657880490      -0.09907400
test stat. -3.991837e+00 -2.658494476       -3.6568122876      -2.56724109
P-val.      6.556344e-05  0.007849064        0.0002553712       0.01025113
           nodematch.department nodeicov.tenure nodeicov.performance
diff.                -0.1986842   -1.355689e+01        -4.0612184210
test stat.           -1.1726215   -4.474130e+00        -3.6822543988
P-val.                0.2409476    7.672301e-06         0.0002311805
           nodefactor.is_manager.1       (Omni)
diff.                -0.6731578947           NA
test stat.           -3.4459060823 3.236132e+01
P-val.                0.0005691481 9.076194e-05

Sample statistics cross-correlations:
                            edges    mutual gwesp.OTP.fixed.0.5
edges                   1.0000000 0.5697587           0.9514326
mutual                  0.5697587 1.0000000           0.5519122
gwesp.OTP.fixed.0.5     0.9514326 0.5519122           1.0000000
gwideg.fixed.0.5        0.4964825 0.2880381           0.3635354
nodematch.department    0.6276469 0.4542396           0.5994826
nodeicov.tenure         0.9128363 0.5193911           0.8990405
nodeicov.performance    0.9876846 0.5641072           0.9334865
nodefactor.is_manager.1 0.6300460 0.3984020           0.6481451
                        gwideg.fixed.0.5 nodematch.department nodeicov.tenure
edges                          0.4964825            0.6276469       0.9128363
mutual                         0.2880381            0.4542396       0.5193911
gwesp.OTP.fixed.0.5            0.3635354            0.5994826       0.8990405
gwideg.fixed.0.5               1.0000000            0.3553870       0.3076028
nodematch.department           0.3553870            1.0000000       0.5566337
nodeicov.tenure                0.3076028            0.5566337       1.0000000
nodeicov.performance           0.5157547            0.6203996       0.8947511
nodefactor.is_manager.1        0.2240815            0.3859132       0.6029893
                        nodeicov.performance nodefactor.is_manager.1
edges                              0.9876846               0.6300460
mutual                             0.5641072               0.3984020
gwesp.OTP.fixed.0.5                0.9334865               0.6481451
gwideg.fixed.0.5                   0.5157547               0.2240815
nodematch.department               0.6203996               0.3859132
nodeicov.tenure                    0.8947511               0.6029893
nodeicov.performance               1.0000000               0.6171524
nodefactor.is_manager.1            0.6171524               1.0000000

Sample statistics auto-correlation:
Chain 1 
                edges       mutual gwesp.OTP.fixed.0.5 gwideg.fixed.0.5
Lag 0     1.000000000  1.000000000         1.000000000       1.00000000
Lag 512   0.341859716  0.161856958         0.291479769       0.46063262
Lag 1024  0.126308674  0.037860079         0.105932090       0.20489809
Lag 1536  0.055963911  0.022616130         0.045528400       0.10909859
Lag 2048  0.017291081 -0.002834093         0.006065795       0.04629176
Lag 2560 -0.003088122 -0.027386693        -0.001813783       0.02789004
         nodematch.department nodeicov.tenure nodeicov.performance
Lag 0              1.00000000     1.000000000         1.0000000000
Lag 512            0.36311243     0.340359114         0.3416093233
Lag 1024           0.14602337     0.126138928         0.1259305507
Lag 1536           0.06466110     0.047637400         0.0595031386
Lag 2048           0.03483475     0.013485759         0.0182691274
Lag 2560           0.02094939    -0.002626386        -0.0003762679
         nodefactor.is_manager.1
Lag 0                1.000000000
Lag 512              0.338599213
Lag 1024             0.116818068
Lag 1536             0.062463200
Lag 2048             0.008804965
Lag 2560             0.007315123

Sample statistics burn-in diagnostic (Geweke):
Chain 1 

Fraction in 1st window = 0.1
Fraction in 2nd window = 0.5 

                  edges                  mutual     gwesp.OTP.fixed.0.5 
              2.0618529               1.7464118               2.3135841 
       gwideg.fixed.0.5    nodematch.department         nodeicov.tenure 
              1.4812458               0.9041373               1.6798343 
   nodeicov.performance nodefactor.is_manager.1 
              2.2803111               2.6001484 

Individual P-values (lower = worse):
                  edges                  mutual     gwesp.OTP.fixed.0.5 
            0.039221739             0.080739420             0.020690547 
       gwideg.fixed.0.5    nodematch.department         nodeicov.tenure 
            0.138541087             0.365922581             0.092989562 
   nodeicov.performance nodefactor.is_manager.1 
            0.022589246             0.009318346 
Joint P-value (lower = worse):  0.06833311 

Note: MCMC diagnostics shown here are from the last round of
  simulation, prior to computation of final parameter estimates.
  Because the final estimates are refinements of those used for this
  simulation run, these diagnostics may understate model performance.
  To directly assess the performance of the final model on in-model
  statistics, please use the GOF command: gof(ergmFitObject,
  GOF=~model).

8.2 What to Do When MCMC Does Not Converge

WarningNon-Convergence Signs
  • Trace plots show long-range drifts or “stickiness”
  • Coefficients are implausibly large (>10 or <−10)
  • The MCMC % column in the summary is high
  • Model prints warnings about negative log-likelihood

Remedies, in order of preference:

  1. Increase MCMC.samplesize, MCMC.burnin, and MCMC.interval.
  2. Remove degenerate-prone terms (triangle, kstar) and replace with geometrically-weighted equivalents.
  3. Try different decay values in gwesp() or gwdegree().
  4. Simplify the model to identify the problematic term.

9. Goodness of Fit

Goodness of fit (GoF) asks: does the fitted ERGM reproduce structural features of the observed network that were not explicitly included as terms?

9.1 Running GoF

gof_m2 <- gof(m2, control = control.gof.ergm(nsim = 200, seed = 42))
print(gof_m2)

Goodness-of-fit for model statistics 

                               obs        min       mean        max MC p-value
edges                    183.00000  149.00000  181.97500  219.00000       0.97
mutual                    17.00000    7.00000   16.62000   27.00000       1.00
gwesp.OTP.fixed.0.5      164.62362  100.87612  162.38796  241.92402       0.90
gwideg.fixed.0.5          46.97095   42.16798   46.79726   49.08691       0.96
nodematch.department      73.00000   58.00000   72.83500   93.00000       1.00
nodeicov.tenure         1904.20000 1546.90000 1905.55250 2285.90000       0.97
nodeicov.performance     689.79000  572.09000  686.55105  823.10000       0.95
nodefactor.is_manager.1   85.00000   60.00000   84.42000  106.00000       1.00

Goodness-of-fit for minimum geodesic distance 

    obs min    mean max MC p-value
1   183 149 181.975 219       0.97
2   476 341 471.395 546       0.92
3   176 105 180.055 245       0.96
4     6   0  11.610  83       0.77
5     0   0   0.360  20       1.00
6     0   0   0.010   2       1.00
Inf  29   0  24.595  87       1.00

Goodness-of-fit for out-degree 

   obs min  mean max MC p-value
0    0   0 0.025   1       1.00
1    0   0 0.230   3       1.00
2    1   0 0.785   4       1.00
3    3   0 2.250   8       0.78
4    2   0 3.935   9       0.45
5    4   0 5.135  13       0.84
6    8   1 5.565  13       0.30
7    6   0 4.740  13       0.67
8    3   0 3.530  13       1.00
9    1   0 2.120   6       0.68
10   1   0 0.880   4       1.00
11   1   0 0.500   3       0.76
12   0   0 0.200   3       1.00
13   0   0 0.065   1       1.00
14   0   0 0.030   1       1.00
15   0   0 0.005   1       1.00
16   0   0 0.005   1       1.00

Goodness-of-fit for in-degree 

   obs min  mean max MC p-value
0    1   0 0.785   3       1.00
1    0   0 0.870   4       0.73
2    2   0 1.345   5       0.78
3    1   0 2.230   5       0.60
4    2   0 3.400  10       0.73
5    7   0 4.175   9       0.22
6    5   0 4.055   9       0.82
7    3   0 4.185  10       0.84
8    5   0 3.320   8       0.49
9    1   0 2.310   7       0.60
10   1   0 1.570   5       1.00
11   1   0 0.960   5       1.00
12   1   0 0.495   3       0.81
13   0   0 0.175   2       1.00
14   0   0 0.085   1       1.00
15   0   0 0.030   1       1.00
16   0   0 0.010   1       1.00

Goodness-of-fit for edgewise shared partner 

      obs min   mean max MC p-value
.OTP0  42  31 50.815  71       0.25
.OTP1  89  40 63.420  86       0.00
.OTP2  34  21 41.690  69       0.47
.OTP3  13   4 18.035  42       0.57
.OTP4   2   0  5.975  20       0.28
.OTP5   3   0  1.620   9       0.47
.OTP6   0   0  0.360   5       1.00
.OTP7   0   0  0.050   1       1.00
.OTP8   0   0  0.010   2       1.00

print(gof_m2) produces a table for each structural statistic. Each row is one value of the statistic (e.g., in-degree = 0, 1, 2, …), and the columns show:

  • obs — the count in the observed network
  • min / mean / max — the range across the 200 simulated networks
  • MC p-value — how often the simulated networks fall above or below the observed value; values near 0 or 1 indicate poor fit

A well-fitting model has MC p-values well away from 0 and 1 for all statistics. For this model, inspect particularly whether the in-degree and out-degree distributions look consistent with the simulated envelope — because we explicitly modelled gwidegree, degree heterogeneity should be reproduced. If the ESP distribution (edgewise shared partners) shows poor p-values, it would suggest the transitivity structure is not captured — though in this dataset the GWESP term was non-significant, so some misfit there would not be surprising.

9.2 GoF Plots

par(mfrow = c(2, 3))
plot(gof_m2)
par(mfrow = c(1, 1))
Figure 3: Goodness-of-fit plots for Model 2. Black line = observed; box-and-whisker = distribution from 200 simulated networks. Good fit: observed line passes through the boxes.

9.3 Interpreting GoF

  • Black line through boxes → good fit at that statistic.
  • Black line outside boxes → the model cannot reproduce that structural property — revisit your terms.
GoF Problem Likely Cause Fix
Degree distribution: too few high-degree nodes Missing hub structure Add gwidegree
ESP distribution poorly reproduced Transitivity not captured Adjust gwesp decay
Geodesic distances too short Transitivity over-specified Reduce gwesp decay
Simulated networks too dense Possible degeneracy Check MCMC traces
TipRule of Thumb

A model with good GoF on degree distributions and ESP but borderline GoF on geodesic distances is generally acceptable for publication. Always report at least degree distributions and ESP.

9.4 Custom GoF Statistics

gof_custom <- gof(m2,
  GOF     = ~ idegree + odegree + esp + distance + model,
  control = control.gof.ergm(nsim = 200, seed = 42))
par(mfrow = c(2, 3))
plot(gof_custom)
par(mfrow = c(1, 1))

Custom GoF including triad census.

10. Temporal ERGMs (TERGMs)

10.1 Why Temporal ERGMs?

Organizations are not static. TERGMs extend ERGMs to panel network data — multiple snapshots of the same network — allowing you to model how networks evolve over time.

NoteI-O Applications
  • How does a new hire’s advice network evolve over their first year?
  • Does a team reorganization disrupt informal communication ties?
  • Does a mentorship program alter the network beyond the assigned pairs?

10.2 The TERGM Probability Model

A TERGM conditions on the previous network state \(\mathbf{Y}_{t-1}\):

\[P(\mathbf{Y}_t = \mathbf{y}_t \mid \mathbf{Y}_{t-1} = \mathbf{y}_{t-1}) = \frac{\exp\{\boldsymbol{\theta}^T \mathbf{g}(\mathbf{y}_t, \mathbf{y}_{t-1})\}}{\kappa(\boldsymbol{\theta}, \mathbf{y}_{t-1})}\]

The sufficient statistics \(\mathbf{g}\) can now depend on both the current network and the prior network, allowing you to model stability (past ties predict current ties) and network evolution.

10.3 Simulating Two Waves

set.seed(99)

advice_t1 <- advice_net   # Wave 1

# Wave 2: drop ~20% of ties, add some new ones
mat_t2 <- as.matrix(advice_t1)
mat_t2[mat_t2 == 1 & matrix(runif(n^2), n, n) < 0.20] <- 0
for (i in 1:n) for (j in 1:n)
  if (i != j && !mat_t2[i,j] && runif(1) < 0.03) mat_t2[i,j] <- 1

advice_t2 <- network(mat_t2, directed = TRUE)
advice_t2 %v% "department"  <- dept
advice_t2 %v% "tenure"      <- tenure
advice_t2 %v% "is_manager"  <- is_manager

net_list <- list(advice_t1, advice_t2)

cat("Wave 1 edges:", network.edgecount(advice_t1), "\n")
Wave 1 edges: 183 
cat("Wave 2 edges:", network.edgecount(advice_t2), "\n")
Wave 2 edges: 173 

The output shows 183 edges at Wave 1 and 173 edges at Wave 2 — the network shed approximately 10 ties (about 5.5% of Wave 1 ties) and gained a handful of new ones, producing a slightly leaner but structurally similar network. This modest change is realistic: organizational advice networks tend to be moderately stable over short intervals (months), with most ties persisting and a minority dissolving or forming.

10.4 Fitting TERGMs with tergm

We use the tergm package (part of statnet) with Conditional MLE (CMLE). Network stability is captured by including the previous network as a dyadic covariate via edgecov() — ties that existed at \(t-1\) are included as a predictor of ties at \(t\).

library(tergm)

# Create lagged network covariate (previous wave as predictor of current wave)
# For CMLE with 2 waves this is the Wave 1 network predicting Wave 2
set.seed(42)
tergm_fit <- tergm(
  net_list ~ edges
             + mutual
             + gwesp(0.5, fixed = TRUE)
             + nodematch("department")
             + nodeicov("tenure"),
  estimate = "CMLE",
  times    = c(1, 2)
)
summary(tergm_fit)
Call:
tergm(formula = net_list ~ edges + mutual + gwesp(0.5, fixed = TRUE) + 
    nodematch("department") + nodeicov("tenure"), estimate = "CMLE", 
    times = c(1, 2))

Monte Carlo Conditional Maximum Likelihood Results:

                     Estimate Std. Error MCMC % z value Pr(>|z|)    
edges                -1.96487    0.33654      0  -5.838  < 1e-04 ***
mutual               -0.17009    0.31470      0  -0.540  0.58887    
gwesp.OTP.fixed.0.5  -0.07339    0.13138      0  -0.559  0.57642    
nodematch.department  0.55922    0.18339      0   3.049  0.00229 ** 
nodeicov.tenure       0.05602    0.02154      0   2.601  0.00930 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

     Null Deviance: 1206  on 870  degrees of freedom
 Residual Deviance:  853  on 865  degrees of freedom
 
AIC: 863  BIC: 886.9  (Smaller is better. MC Std. Err. = 0.07237)

The TERGM output has the same structure as the cross-sectional ERGM — log-odds coefficients, standard errors, and significance tests — but now the estimates condition on the Wave 1 network as the baseline state. Key observations:

  • edges = −1.96 (p < .001). The baseline log-odds of a tie at Wave 2, given the Wave 1 network structure, is strongly negative. The network does not fill in on its own; exogenous factors are needed to predict who retains or gains ties.
  • mutual = −0.17, ns. No reciprocity tendency emerges over time. Advice in this organization remains directional.
  • gwesp.OTP.fixed.0.5 = −0.07, ns. Triadic closure does not drive longitudinal tie changes — new ties do not preferentially form to complete triangles.
  • nodematch.department = +0.56 (p = .002). Same-department pairs are significantly more likely to have advice ties at Wave 2, mirroring the cross-sectional finding. Departmental homophily is a longitudinally stable structural feature, not a one-time snapshot artifact.
  • nodeicov.tenure = +0.056 (p = .009). High-tenure employees continue to attract advice-seekers over time. The tenure effect is robust across the two waves.

The AIC of 863 (vs. 897 for the cross-sectional M2) reflects that the TERGM’s conditioning on Wave 1 makes the prediction task easier — it already “knows” which ties existed before. The substantive interpretation is the same, but the TERGM answers a longitudinal question: what predicts the network at Wave 2, given Wave 1?

10.5 Network Stability

In the CMLE framework, stability — the tendency for ties to persist — is not modelled as an explicit term but is instead encoded in the model through the conditional likelihood: we are modelling \(P(\mathbf{Y}_t \mid \mathbf{Y}_{t-1})\), so the previous network state is automatically conditioned on. To make stability an explicit coefficient, add the lagged network as an edgecov:

# Add explicit stability term
tergm_stab <- tergm(
  net_list ~ edges
             + mutual
             + gwesp(0.5, fixed = TRUE)
             + nodematch("department")
             + nodeicov("tenure")
             + edgecov(net_list[[1]]),   # lag-1 stability term
  estimate = "CMLE",
  times    = c(1, 2),
)
summary(tergm_stab)

A positive edgecov coefficient means ties that existed at \(t-1\) are more likely to still exist at \(t\) — the network is stable. For example, \(\exp(2.5) \approx 12\times\) more likely to persist.

Notebtergm as an Alternative

The btergm package offers a bootstrap pseudo-likelihood alternative. Its memory(type = "stability") term is a convenient shorthand for the lagged edgecov. btergm is particularly useful for larger networks or many time points where CMLE becomes slow.


11. Separable Temporal ERGMs (STERGMs)

11.1 The Key Idea

TERGMs treat formation and dissolution of ties as a single joint process. STERGMs (Krivitsky & Handcock, 2014) separate them:

  • A formation model: Among ties that did not exist at \(t-1\), which ones form at \(t\)?
  • A dissolution model: Among ties that did exist at \(t-1\), which ones persist (equivalently, which dissolve)?
NoteI-O Example: Mentorship Networks

Formation: New mentorship ties form between junior employees and high-tenure managers in the same department — proximity and status homophily drive formation.

Dissolution: Existing ties dissolve when the mentor is promoted or the protégé reaches a tenure threshold. A single TERGM would conflate these mechanisms into one coefficient.

11.2 The STERGM Probability Model

\[P(\mathbf{Y}_t \mid \mathbf{Y}_{t-1}) = P^+(\mathbf{Y}^+_t \mid \mathbf{Y}_{t-1}) \times P^-(\mathbf{Y}^-_t \mid \mathbf{Y}_{t-1})\]

Each of \(P^+\) and \(P^-\) is an ERGM in its own right, with separate parameters.

11.3 Fitting STERGMs in R

library(tergm)

set.seed(42)
stergm_fit <- stergm(
  net_list,
  formation   = ~ edges + mutual + gwesp(0.5, fixed = TRUE)
                  + nodematch("department") + nodeicov("tenure"),
  dissolution = ~ edges + nodematch("department") + nodeicov("tenure"),
  estimate    = "CMLE",
  times       = c(1, 2)
)
summary(stergm_fit)
Call:
tergm(formula = formula, constraints = constraints, offset.coef = c(offset.coef.form, 
    offset.coef.diss), target.stats = target.stats, eval.loglik = eval.loglik, 
    estimate = estimate, control = control, verbose = verbose, 
    times = times, targets = targets, SAN.offsets = SAN.offsets)

Monte Carlo Conditional Maximum Likelihood Results:

                                Estimate Std. Error MCMC % z value Pr(>|z|)    
Form(1)~edges                   -2.52069    0.62288      0  -4.047   <1e-04 ***
Form(1)~mutual                   0.04074    0.44129      0   0.092    0.926    
Form(1)~gwesp.OTP.fixed.0.5     -0.17539    0.21157      0  -0.829    0.407    
Form(1)~nodematch.department     0.12021    0.40588      0   0.296    0.767    
Form(1)~nodeicov.tenure         -0.02904    0.04664      0  -0.623    0.534    
Persist(1)~edges                 0.61238    0.52819      0   1.159    0.246    
Persist(1)~nodematch.department  0.57531    0.42021      0   1.369    0.171    
Persist(1)~nodeicov.tenure       0.04873    0.04544      0   1.072    0.284    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

     Null Deviance: 1206.1  on 870  degrees of freedom
 Residual Deviance:  425.9  on 862  degrees of freedom
 
AIC: 441.9  BIC: 480  (Smaller is better. MC Std. Err. = 0.08358)

11.4 Interpreting STERGM Output

You receive two sets of coefficients — one for the formation model and one for the dissolution model. Here is the output from the model above, explained term by term:

Formation model (Form(1)~...) — predicts which ties that did not exist at Wave 1 form by Wave 2:

Term Estimate Interpretation
Form~edges −2.52*** Strong baseline against forming new ties — the bar for forming a brand-new relationship is high
Form~mutual +0.04, ns No significant tendency for new ties to form in the direction of an already-existing reverse tie
Form~gwesp.OTP.fixed.0.5 −0.18, ns New ties do not preferentially complete triangles
Form~nodematch.department +0.12, ns Slight same-department preference for new tie formation, but not significant given the small two-wave sample
Form~nodeicov.tenure −0.03, ns Tenure of the target does not significantly predict whether a brand-new tie forms to them

Persistence model (Persist(1)~...) — predicts which ties that did exist at Wave 1 are still present at Wave 2:

Term Estimate Interpretation
Persist~edges +0.61, ns General tendency for existing ties to persist (positive = more likely to survive); exp(0.61) ≈ 1.84, a moderate effect not reaching significance with only two waves
Persist~nodematch.department +0.58, ns Same-department ties are more durable than cross-department ties; exp(0.58) ≈ 1.79, directionally consistent with theory
Persist~nodeicov.tenure +0.05, ns Ties to high-tenure advisors persist longer; same direction as the cross-sectional effect

Key take-away: All effects in both models are directionally consistent with the data-generating process and the cross-sectional findings — same-department ties form more readily and dissolve less often; high-tenure advisors remain sought after over time. However, none reach significance in this two-wave, 30-node example. This is typical: STERGMs require meaningful network turnover (many tie changes) to estimate formation and dissolution parameters precisely. With richer longitudinal data (more waves, larger n, or greater flux between waves), these effects would likely be detectable.

WarningDissolution Coding

In stergm(), a positive dissolution coefficient means the tie is more likely to persist (dissolution probability decreases). Always check the documentation — some presentations code dissolution in the opposite direction.

11.5 GoF for STERGMs

GoF for STERGM objects follows the same logic as for cross-sectional ERGMs: simulate networks from the fitted model and compare structural statistics to the observed network. The API for gof() on stergm objects varies across package versions; the call below shows the intended usage — run it interactively after confirming your version supports it.

# GoF for STERGM — run interactively; API varies by tergm version
gof_stergm <- gof(stergm_fit)
par(mfrow = c(2, 3))
plot(gof_stergm)
par(mfrow = c(1, 1))

Assess the formation and dissolution components separately: the formation GoF should reproduce the degree distribution of newly formed ties, and the dissolution GoF should reproduce the degree distribution of persisting ties. Both should pass the same visual checks as the cross-sectional GoF plots above (observed line through the simulated boxes).


12. Common Pitfalls and Best Practices

12.1 Model Degeneracy

Degeneracy occurs when the ERGM probability distribution becomes bimodal — almost all probability mass is on either the empty or complete network. Your observed network then sits in the near-zero probability region.

Prevention: Avoid triangle and kstar terms; use gwesp and gwdegree. A degenerate model’s MCMC samples will alternate between near-empty and near-full networks.

12.2 Sample Size

ERGMs are fit to a single network. As a rough guideline:

  • < 20 nodes: Usually underpowered; consider descriptive analysis.
  • 20–50 nodes: Viable but be parsimonious with terms.
  • 50–200 nodes: The sweet spot for most ERGM analyses.
  • > 200 nodes: MCMC-MLE can be slow; consider pseudo-likelihood methods or parallel processing.

12.3 Network Data Collection Issues

  • Boundary specification: Carefully define who is “in” the network. Partial networks create artificial boundary effects.
  • Roster vs. free-recall: Roster-based surveys produce more complete data than free-recall. Missing ties bias ERGM estimates.
  • Nomination limits: Imposing a fixed limit (e.g., “name up to 5 people”) artificially constrains the degree distribution.

12.4 Decay Parameter Selection

The decay parameter in gwesp(decay) controls how quickly geometric weights diminish. Common practice:

  • Try a range of values (0.25, 0.5, 0.75, 1.0, 1.5) and select based on GoF.
  • Set fixed = FALSE to estimate the decay parameter if your network is large enough.
  • Report your chosen decay value and sensitivity of results to different values.

13. Reporting ERGMs in APA-Style Papers

13.1 Methods Section

  • Describe the network data collection procedure (roster vs. free-recall, number of waves, response rate, boundary specification).
  • State the R packages and version numbers used (statnet, ergm, tergm, btergm).
  • Report MCMC parameters (sample size, burn-in, seed) for reproducibility.

13.2 Results Section

  • Report estimates, standard errors, and p-values for all model terms.
  • Report both log-odds coefficients and odds ratios (exponentiated coefficients).
  • Report GoF results: “GoF diagnostics showed that the model adequately reproduced the observed in-degree distribution, out-degree distribution, edgewise shared partner distribution, and geodesic distance distribution (see Supplementary Materials).”

13.3 R Session Info

Code
sessionInfo()
R version 4.5.2 (2025-10-31)
Platform: x86_64-apple-darwin20
Running under: macOS Tahoe 26.3

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] btergm_1.11.1         statnet_2019.6        tsna_0.3.6           
 [4] sna_2.8               statnet.common_4.13.0 ergm.count_4.1.3     
 [7] tergm_4.2.2           networkDynamic_0.11.5 ergm_4.12.0          
[10] network_1.20.0       

loaded via a namespace (and not attached):
 [1] generics_0.1.4           robustbase_0.99-7        stringi_1.8.7           
 [4] lattice_0.22-7           digest_0.6.39            magrittr_2.0.4          
 [7] evaluate_1.0.5           lpSolveAPI_5.5.2.0-17.15 grid_4.5.2              
[10] networkLite_1.1.0        fastmap_1.2.0            jsonlite_2.0.0          
[13] Matrix_1.7-4             rle_0.10.0               purrr_1.2.1             
[16] Rdpack_2.6.6             cli_3.6.5                rlang_1.1.7             
[19] rbibutils_2.4.1          cachem_1.1.0             yaml_2.3.12             
[22] otel_0.2.0               tools_4.5.2              parallel_4.5.2          
[25] memoise_2.0.1            coda_0.19-4.1            dplyr_1.2.0             
[28] boot_1.3-32              ROCR_1.0-12              vctrs_0.7.1             
[31] R6_2.6.1                 lifecycle_1.0.5          stringr_1.6.0           
[34] htmlwidgets_1.6.4        MASS_7.3-65              trust_0.1-9             
[37] pkgconfig_2.0.3          pillar_1.11.1            ergm.multi_0.3.0        
[40] glue_1.8.0               DEoptimR_1.1-4           xfun_0.56               
[43] tibble_3.3.1             tidyselect_1.2.1         rstudioapi_0.18.0       
[46] knitr_1.51               igraph_2.2.1             htmltools_0.5.9         
[49] nlme_3.1-168             rmarkdown_2.30           compiler_4.5.2          

14. Further Reading

Foundational Papers

  • Frank, O., & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.
  • Robins, G., Pattison, P., Kalish, Y., & Lusher, D. (2007). An introduction to exponential random graph (p*) models for social networks. Social Networks, 29, 173–191.
  • Hunter, D. R., & Handcock, M. S. (2006). Inference in curved exponential family models for networks. Journal of Computational and Graphical Statistics, 15, 565–583. — Introduces gwesp/gwdegree.
  • Krivitsky, P. N., & Handcock, M. S. (2014). A separable model for dynamic networks. Journal of the Royal Statistical Society: Series B, 76, 29–46. — The STERGM paper.

I-O and Organizational Applications

  • Lusher, D., Koskinen, J., & Robins, G. (Eds.). (2013). Exponential random graph models for social networks: Theory, methods and applications. Cambridge University Press.
  • Rank, O. N., Robins, G. L., & Pattison, P. E. (2010). Structural logic of intraorganizational networks. Organization Science, 21, 745–764.
  • Brass, D. J., Galaskiewicz, J., Greve, H. R., & Tsai, W. (2004). Taking stock of networks and organizations. Academy of Management Journal, 47, 795–817.

R Resources

  • Official ergm package vignette
  • statnet.org — Documentation, workshops, and tutorials for the full statnet suite.
  • Morris, M., Handcock, M. S., & Hunter, D. R. (2008). Specification of exponential-family random graph models. Journal of Statistical Software, 24(4).