Introduction
The bml package implements Bayesian
Multiple-Membership Multilevel Models with Parameterizable Weight
Functions. These models are designed for situations where
higher-level outcomes depend on the combined influence of multiple
lower-level units. Instead of aggregating lower-level data to higher
levels (risking aggregation bias) or disaggregating outcomes to lower
levels (artificially inflating sample size), bml explicitly
models how lower-level units jointly shape higher-level outcomes.
Installation
Install the stable version from CRAN:
install.packages("bml")You’ll also need JAGS installed on your system. See the installation vignette for details.
Basic Example
Let’s start with a simple example using coalition government data. Each coalition (higher-level unit) is composed of multiple political parties (lower-level units), and we want to model how party characteristics influence coalition outcomes.
library(bml)
data(coalgov)
# Examine the data structure
head(coalgov[, c("gid", "pid", "pname", "n", "finance", "dur_wkb", "event_wkb")])Understanding the Data Structure
The coalgov dataset is in long format
where each row represents a party’s participation in a coalition:
-
gid: Government (coalition) identifier -
pid: Party identifier -
n: Number of parties in the coalition -
finance: Party’s financial dependence on member contributions (standardized) -
dur_wkb: Coalition duration in days -
event_wkb: Early termination indicator (1 = terminated early, 0 = censored)
Model 1: Basic Multiple-Membership Model
Our first model examines coalition duration as a function of party-level characteristics, aggregated using equal weights:
mod1 <- bml(
Surv(dur_wkb, event_wkb) ~ 1 + majority +
mm(
id = id(pid, gid),
vars = vars(finance),
fn = fn(w ~ 1/n, c = TRUE),
RE = TRUE
),
family = "Weibull",
data = coalgov,
seed = 1
)
summary(mod1)Breaking down the formula:
-
Surv(dur_wkb, event_wkb): Survival outcome (duration and event indicator) -
majority: Government-level covariate (binary indicator) -
mm(): Multiple-membership block containing:-
id = id(pid, gid): Party ID (member) and government ID (group) -
vars = vars(finance): Party-level covariate to aggregate -
fn = fn(w ~ 1/n, c = TRUE): Weight function (equal weights, constrained to sum to 1) -
RE = TRUE: Include party-specific random effects
-
Model 2: Parameterizing the Weight Function
A key feature of bml is the ability to model aggregation
weights as a function of covariates. Instead of assuming all parties
have equal influence, we can test whether seat share affects a party’s
weight in the coalition:
mod2 <- bml(
Surv(dur_wkb, event_wkb) ~ 1 +
majority +
mm(
id = id(pid, gid),
vars = vars(finance),
fn = fn(w ~ 1 / (1 + (n - 1) * exp(-(b0 + b1 * pseat))), c = TRUE),
RE = TRUE
),
family = "Weibull",
priors = c("b.w ~ dnorm(0,1)"), # weakly informative prior on the weight parameters
data = coalgov,
seed = 1,
monitor = TRUE
)
summary(mod2)New elements:
- The weight function now includes
pseat(party seat share) with parameterb1 -
priors: Specify prior for the weight function parameter -
monitor = TRUE: Save MCMC chains for diagnostics
Interpretation:
- If \(b1 \approx 0\): Seat share doesn’t affect weights (equal influence)
- If \(b1 > 0\): Larger parties have more weight
- If \(b1 < 0\): Smaller parties have more weight
Next Steps
- See the Model documentation for mathematical details
- Explore Examples for spatial, network, and aggregation applications
- Check the FAQ for common questions
- Read the accompanying paper: Rosche (2026)
Key Concepts
- Multiple-membership structure: Higher-level units (coalitions) contain multiple lower-level units (parties), and lower-level units can appear in multiple higher-level units.
-
Weight functions: Control how lower-level
characteristics are aggregated. Can be:
- Fixed (e.g., equal weights
w ~ 1/n) - Endogenous (estimated from data, e.g.,
w ~ 1/n^exp(b1*x))
- Fixed (e.g., equal weights
-
Constraint parameter
c:-
c = TRUE: Weights sum to 1 within each group -
c = FALSE: No constraint on weight sums
-
-
Random effects: Account for unobserved
heterogeneity at the member level (
RE = TRUEinmm()) or nesting level (type = "RE"inhm())