1. What’s the difference between a multiple-membership model and a conventional multilevel model?
In a conventional hierarchical multilevel model, each lower-level unit belongs to exactly one higher-level unit (e.g., students nested in schools). In a multiple-membership model (MMMM), lower-level units can belong to multiple higher-level units simultaneously; equivalently, a single higher-level outcome can be influenced by multiple lower-level units.
For example:
- Conventional MLM: Students (level 1) nested in schools (level 2) — each student attends exactly one school.
- MMMM: Coalition governments (level 2) composed of multiple political parties (level 1) — each party can participate in multiple governments over time, and each government comprises multiple parties.
The MMMM accounts for this complex membership structure by weighting the contributions of each lower-level unit, allowing researchers to model how multiple units jointly shape higher-level outcomes.
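In rough notation (a sketch in our notation, not the package's), a Gaussian MMMM for a higher-level outcome can be written as

$$
y_j = \mathbf{x}_j^{\top}\boldsymbol{\beta} + \sum_{i \in m(j)} w_{ji}\, u_i + e_j,
$$

where $m(j)$ is the set of lower-level units belonging to group $j$, $u_i$ is the random effect of member $i$, and the weights $w_{ji}$ (typically summing to 1 within each group) govern how member effects combine.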
2. What’s the difference between a conventional MMMM and the extended MMMM implemented in bml?
The conventional MMMM (as implemented in MLwiN, brms, or other software) uses fixed, pre-specified weights to aggregate lower-level effects. For instance, you might use equal weights (1/n) or weights based on time spent in each context.
The extended MMMM in bml allows you to:
- Parameterize the weight function: Instead of fixing weights, you can specify a functional form for the weights (e.g., w ~ 1/n^exp(b*X)) and estimate the parameters that determine how lower-level units are aggregated.
- Test alternative aggregation mechanisms: Compare different weighting schemes (equal weights, proportional weights, functions of covariates) to determine which best fits the data.
- Endogenize weight matrices: Rather than imposing spatial or network weights externally, let the data determine connection strengths as functions of covariates.
This flexibility enables researchers to explicitly model the micro-to-macro link — how lower-level characteristics aggregate to produce higher-level outcomes.
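As an illustrative sketch (reusing the mm()/fn() syntax shown in FAQs 5–6; all variable names here are placeholders), an extended MMMM with an estimable weight parameter could be specified as:

```r
# Sketch of an extended MMMM: aggregation weights depend on a member-level
# covariate x through the estimable parameter b (names are placeholders)
m <- bml(
  y ~ 1 + x1 +
    mm(
      id = id(pid, gid),                      # member and group identifiers
      vars = vars(X),                         # member-level predictors
      fn = fn(w ~ 1/n^exp(b * x), c = TRUE),  # b is estimated from the data
      RE = TRUE
    ),
  family = "Gaussian",
  data = dat
)
```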
3. When should I use bml instead of other multilevel modeling packages?
Use bml when:
- You have multiple-membership structures: Higher-level outcomes depend on multiple lower-level units (e.g., coalitions composed of parties, teams composed of individuals, neighborhoods influenced by surrounding areas).
- You need flexible weight functions: Your theory suggests weights should depend on covariates, group size, or other features, rather than being fixed in advance.
- You’re studying aggregation processes and micro-to-macro relationships: Your research question focuses on how lower-level units jointly shape higher-level outcomes, rather than how higher-level contexts shape lower-level outcomes. Instead of assuming a fixed aggregation rule (e.g., a simple average), you want to test and estimate how lower-level effects combine.
For standard hierarchical models without multiple membership, packages like lme4, brms, or MCMCglmm will be more efficient. For conventional MMMMs with fixed weights, brms or MLwiN are excellent alternatives.
4. What outcome types and distributions does bml support?
bml supports a variety of outcome distributions commonly used in social science research:
- Continuous outcomes: Gaussian (normal) regression
- Binary outcomes: Logit (logistic) regression
- Survival/duration outcomes:
  - Cox proportional hazards model
  - Weibull accelerated failure time (AFT) model
You can also specify hierarchical random effects (hm() blocks) in addition to multiple-membership structures, allowing for hierarchically nested and cross-classified designs.
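For instance, a sketch of a survival model combining both structures (reusing the coalgov examples from FAQ 6; the hm() specification is illustrative):

```r
# Sketch: Weibull survival outcome with a multiple-membership block for
# parties (pid) in governments (gid) plus a country-level hm() block (cid)
m <- bml(
  Surv(dur_wkb, event_wkb) ~ 1 + majority +
    mm(id = id(pid, gid), fn = fn(w ~ 1/n), RE = TRUE) +
    hm(id = id(cid), vars = vars(gdp), type = "FE"),
  family = "Weibull",
  data = coalgov
)
```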
5. How do I specify the weight function, and what are the c and ar parameters?
The weight function is specified in the fn() container within an mm() block:
```r
mm(
  id = id(member_id, group_id),
  vars = vars(X),
  fn = fn(w ~ 1/n, c = TRUE),
  RE = TRUE,
  ar = FALSE
)
```

Weight function components:

- w ~ ...: Specifies the functional form for the weights, either fixed (e.g., w ~ 1/n) or parameterized (e.g., w ~ ilogit(b0 + b1 * x)).
- c parameter: Controls weight normalization.
  - c = TRUE (default): Weights are normalized to sum to 1 within each group (row-standardization).
  - c = FALSE: Weights are not normalized (useful when aggregating sums rather than averages).
- ar parameter: Controls autoregressive random effects (see the sketch below).
  - ar = FALSE (default): Standard independent random effects for each member.
  - ar = TRUE: Member-level random effects evolve as a random walk across repeated group participations, capturing dynamics where a member’s unobserved heterogeneity changes over time.

Important: Only one mm() block can have RE = TRUE in a given model.
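For example, a minimal sketch of an mm() block with autoregressive member effects (identifiers follow the examples in FAQ 6):

```r
# Sketch: member-level random effects follow a random walk across a
# member's successive group participations
mm(
  id = id(pid, gid),
  fn = fn(w ~ 1/n, c = TRUE),
  RE = TRUE,
  ar = TRUE
)
```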
6. How do I fix parameters to known values?
There are two ways to fix parameters in bml, depending on where in the model they appear.
Main equation, hm(), and mm() blocks: Using fix()
Use the fix() helper to hold a coefficient at a specified value rather than estimating it. It works directly in the main equation formula and inside vars() in hm() and mm() blocks.
Main equation:
```r
# Fix the coefficient of 'exposure' to 1 (i.e., use it as an offset)
m <- bml(
  y ~ 1 + fix(exposure, 1) + x1 + x2,
  family = "Gaussian",
  data = dat
)
```

This is equivalent to adding exposure as a predictor but constraining its coefficient to 1 instead of estimating it, which is useful for offsets or when prior theory dictates a specific coefficient value.
In hm() blocks:

```r
m <- bml(
  y ~ 1 + x1 +
    mm(id = id(pid, gid), fn = fn(w ~ 1/n), RE = TRUE) +
    hm(id = id(cid), vars = vars(fix(investiture, 0.5) + gdp), type = "FE"),
  family = "Gaussian",
  data = coalgov
)
```

In mm() blocks:
```r
m <- bml(
  Surv(dur_wkb, event_wkb) ~ 1 + majority +
    mm(
      id = id(pid, gid),
      vars = vars(fix(cohesion, 1) + finance),
      fn = fn(w ~ 1/n, c = TRUE),
      RE = TRUE
    ),
  family = "Weibull",
  data = coalgov
)
```

Here, the coefficient of cohesion is fixed at 1 while finance is freely estimated.
Weight function fn(): Omitting parameters
In the weight function specified via fn(), you fix weights by simply not including any free parameters (b0, b1, …) in the formula. When no parameters appear, the weight function is treated as a known, fixed transformation.
```r
# Fixed equal weights (no parameters to estimate)
fn(w ~ 1/n, c = TRUE)

# Fixed weights proportional to seat share (no parameters to estimate)
fn(w ~ pseat, c = TRUE)
```

Compare this with weight functions that include free parameters:
```r
# Estimable: b is estimated from data
fn(w ~ 1/n^exp(b * x), c = TRUE)

# Estimable: b0 is estimated from data
fn(w ~ 1 / (1 + (n - 1) * exp(-b0)), c = FALSE)
```

The key distinction: if the fn() formula contains b, b0, b1, etc., these are estimated. If the formula contains only data variables and constants (like n, pseat, 1), the weights are fixed.
7. I get “Error in node w.1[…]: Invalid parent values” — what does this mean?
This error occurs when JAGS cannot evaluate the weight function for a particular observation because the computed weight is numerically invalid (e.g., NaN, Inf, or a value outside the domain of a downstream function). It most commonly arises with parameterized weight functions during the initialization phase.
Why it happens: Weight function parameters (b0, b1, …) are given vague priors by default (dnorm(0, 0.0001) in JAGS precision, which corresponds to SD = 100). When JAGS initializes the MCMC chains, it may draw extreme starting values (e.g., b0 = 80). For weight functions involving nonlinear transformations like ilogit() or exp(), extreme parameter values can cause numerical issues downstream — even if the weight function itself is mathematically well-defined, the resulting weights may produce overflow in the likelihood (e.g., exp(-mu * shape) in the Weibull model).
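To see this failure mode concretely, here is a small plain-R illustration (outside bml/JAGS; the ilogit() definition mirrors the JAGS inverse logit) of how an extreme draw such as b0 = 80 saturates the weights and overflows downstream terms:

```r
# Plain-R illustration of the numerical failure mode
ilogit <- function(z) 1 / (1 + exp(-z))  # JAGS-style inverse logit

b0 <- 80      # extreme draw from a vague dnorm(0, 0.0001) prior (SD = 100)
ilogit(b0)    # 1: the weight saturates completely
exp(b0 * 10)  # Inf: overflow in a downstream likelihood term
```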
Built-in safeguard: bml initializes all weight parameters at 0 by default. This ensures numerically stable starting values (e.g., ilogit(0) = 0.5). However, if the error persists or occurs during sampling (not just initialization), consider the following steps.
How to fix it:

1. Use more informative priors. Narrow the prior spread for weight parameters so that the sampler stays in a numerically stable region:

   ```r
   m <- bml(
     ...,  # model specification as before
     priors = list(
       "b.w.1[1] ~ dnorm(0, 1)",  # SD = 1 instead of 100
       "b.w.1[2] ~ dnorm(0, 1)"
     )
   )
   ```

   This is especially important for parameters inside ilogit(), exp(), or other functions that saturate or explode at extreme inputs.

2. Ensure the weight function is bounded. Unbounded weight functions can produce extreme values that destabilize the likelihood. Common strategies:
   - Use ilogit() to bound weights between 0 and 1: fn(w ~ ilogit(b0 + b1 * x), c = TRUE)
   - Use exp() carefully — it grows rapidly, so pair it with constraints like c = TRUE (normalization) or standardize its argument: fn(w ~ exp(b1 * x), c = TRUE) where x is standardized

3. Standardize covariates in the weight function. If a weight variable has a large range (e.g., income in thousands), the product b1 * x can easily overflow. Standardize such variables before including them in fn(); see the snippet after this list.

4. Supply explicit initial values. If the default initialization at 0 doesn’t work for your model, provide custom starting values.

5. Re-run the model. Since the error can be seed-dependent (different chains draw different initial values), simply re-running may resolve it. However, this indicates a fragile parameterization — consider steps 1–3 for a robust solution.
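As a companion to step 3, a minimal sketch of standardizing a weight covariate before fitting (dat and x are placeholders for your data frame and weight variable):

```r
# Standardize the weight covariate to mean 0, SD 1 before passing it to fn()
dat$x <- as.numeric(scale(dat$x))
```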