
ineqx: Descriptive and causal variance decompositions

Inequality is usually summarized in one number — a Gini, a variance, a 90/10 ratio. But that number hides how much of the inequality lives between groups (men vs women, college vs non-college, …) and how much lives within them, and it hides how a treatment, a policy, or any other binary predictor moves either piece. The ineqx package addresses both.

For a single cross-section it implements the classic Western & Bloome (2009) within/between decomposition. For repeated cross-sections it tracks how each piece changes over time. And — the part that’s new in Rosche (2026) — it extends the decomposition to a causal potential-outcomes setting: it splits the effect of a binary treatment on inequality into

  • a between-group effect, further decomposed into a heterogeneity term ($\text{Var}_\pi(\beta)$: does the treatment effect vary across groups?) and a covariance term ($2\,\text{Cov}_\pi(\mu(0),\beta)$: do the larger gains accrue to already-advantaged groups?), and
  • a within-group effect, similarly split into the average effect on group dispersion ($\text{het}_W$) and the sorting of variance effects across groups ($\text{cov}_W$).
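The two between-group terms above follow from the ordinary variance-of-a-sum identity. The lines below are a sketch under stated assumptions, not the package's formal definition: $\mu_g(0)$ is taken to be the untreated mean of group $g$, $\beta_g$ the treatment effect on that mean (so the treated mean is $\mu_g(0) + \beta_g$), and $\pi_g$ the group share used as the weight and held fixed across the two potential outcomes. The symbol $\Delta_B$ is introduced here only for illustration.

```latex
\begin{align*}
\Delta_B
  &= \text{Var}_\pi\bigl(\mu(0) + \beta\bigr) - \text{Var}_\pi\bigl(\mu(0)\bigr) \\
  &= \text{Var}_\pi\bigl(\mu(0)\bigr) + \text{Var}_\pi(\beta)
     + 2\,\text{Cov}_\pi\bigl(\mu(0), \beta\bigr) - \text{Var}_\pi\bigl(\mu(0)\bigr) \\
  &= \underbrace{\text{Var}_\pi(\beta)}_{\text{heterogeneity}}
     + \underbrace{2\,\text{Cov}_\pi\bigl(\mu(0), \beta\bigr)}_{\text{covariance}}
\end{align*}
```

The heterogeneity term is never negative, so under these assumptions the between-group effect can reduce inequality only when the covariance term is negative, that is, when larger gains go to groups with lower baseline means. See the Model structure page for the package's exact definitions, including the within-group terms.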

For repeated cross-sections, the change in this treatment effect further decomposes into three channels — behavioral (the treatment effects themselves changed), compositional (who gets treated changed), and pre-treatment (the baseline distribution changed) — combining a variance decomposition with a Kitagawa–Blinder–Oaxaca-style attribution.
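A Kitagawa–Blinder–Oaxaca-style attribution can be illustrated by swapping one ingredient at a time between two periods. The sketch below is not the package's algorithm. It assumes that only the between-group effect on the variance is of interest, that "compositional" can be proxied by the group shares `pi_g`, and that the swaps happen in a fixed order (effects, then shares, then baseline means). The functions `wcov` and `effect_between` and all numbers are hypothetical.

```r
# Share-weighted covariance of a and b with weights p (p sums to 1)
wcov <- function(a, b, p) sum(p * (a - sum(p * a)) * (b - sum(p * b)))

# Between-group treatment effect on the variance:
# Var_pi(beta) + 2 * Cov_pi(mu0, beta)
effect_between <- function(beta, pi_g, mu0) {
  wcov(beta, beta, pi_g) + 2 * wcov(mu0, beta, pi_g)
}

# Hypothetical inputs for three groups in two periods
t0 <- list(beta = c(-0.10, -0.20, -0.30),
           pi_g = c(0.5, 0.3, 0.2),
           mu0  = c(6.0, 6.3, 6.7))
t1 <- list(beta = c(-0.05, -0.20, -0.40),
           pi_g = c(0.3, 0.4, 0.3),
           mu0  = c(6.1, 6.5, 7.0))

# Sequential counterfactuals: change one ingredient at a time
e00  <- effect_between(t0$beta, t0$pi_g, t0$mu0)  # everything at t0
e_b  <- effect_between(t1$beta, t0$pi_g, t0$mu0)  # effects moved to t1
e_bc <- effect_between(t1$beta, t1$pi_g, t0$mu0)  # shares moved to t1
e11  <- effect_between(t1$beta, t1$pi_g, t1$mu0)  # everything at t1

behavioral    <- e_b  - e00
compositional <- e_bc - e_b
pretreatment  <- e11  - e_bc

# The three channels add up to the total change by construction
stopifnot(isTRUE(all.equal(behavioral + compositional + pretreatment,
                           e11 - e00)))
```

Sequential swaps like this depend on the order in which the ingredients are changed, so the individual channel sizes here should be read as one possible attribution rather than a unique one.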

With the ineqx package you can analyze:

  • how inequality splits at a single point in time into within- and between-group components,
  • how that split has changed over time, attributing the change to shifting means, dispersions, or group sizes,
  • how a treatment moves inequality — beyond just the mean — through the four sub-components above, and
  • how that effect has evolved over time, separating behavioral, compositional, and pre-treatment channels.

The package supports both the variance ($V$) and the squared coefficient of variation ($CV^2$) as the underlying measure. It also accepts ystat = "VL" for the descriptive case, with caveats (see the FAQ).
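To make the within/between split concrete, here is a minimal base-R sketch of the textbook weighted variance decomposition on simulated data. It does not call ineqx and is not necessarily how the package computes its components. The helpers `wmean` and `wvar`, the toy groups, and the choice to obtain the $CV^2$ components by dividing each variance component by the squared grand mean are all assumptions made for illustration.

```r
# Simulated toy data (hypothetical; not cps_sample)
set.seed(1)
n <- 1000
g <- sample(c("low", "mid", "high"), n, replace = TRUE)
meanlog <- unname(c(low = 6, mid = 6.3, high = 6.7)[g])
y <- rlnorm(n, meanlog = meanlog, sdlog = 0.5)
w <- runif(n, 0.5, 1.5)  # survey-style weights

# Weighted mean and weighted variance (weights normalized, no n - 1 correction)
wmean <- function(x, w) sum(w * x) / sum(w)
wvar  <- function(x, w) sum(w * (x - wmean(x, w))^2) / sum(w)

# Group shares, means, and variances
groups <- split(data.frame(y = y, w = w), g)
pi_g <- sapply(groups, function(d) sum(d$w)) / sum(w)
mu_g <- sapply(groups, function(d) wmean(d$y, d$w))
s2_g <- sapply(groups, function(d) wvar(d$y, d$w))

mu      <- sum(pi_g * mu_g)              # grand mean
within  <- sum(pi_g * s2_g)              # share-weighted group variances
between <- sum(pi_g * (mu_g - mu)^2)     # variance of group means

# Within + between reproduces the total weighted variance
stopifnot(isTRUE(all.equal(within + between, wvar(y, w))))

# CV^2 components: scale each piece by the squared grand mean
cv2_within  <- within  / mu^2
cv2_between <- between / mu^2
```

Because both $CV^2$ components share the denominator `mu^2`, a change in the grand mean moves both of them even when the group variances and the gaps between group means are unchanged, which is one reason the choice between $V$ and $CV^2$ matters.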

This is what ineqx() looks like:

library(ineqx)
data(cps_sample)

# Descriptive: how does CV² of women's earnings split into within/between SES?
ineqx("earnweekf", ystat = "CV2", group = "SES",
      time = "year", ref = 1982, weights = "earnwtf",
      data = cps_sample)

# Causal: how does motherhood reshape that inequality?
ineqx("earnweekf", ystat = "CV2",
      treat = "mother", post = "byear",
      group = "edu", time = "year10", ref = 1980,
      formula_mu    = ~ mother * byear * factor(edu) * factor(year10) + age,
      formula_sigma = ~ mother * byear * factor(edu) * factor(year10),
      weights = "earnwtf", se = "delta",
      data = subset(cps_sample, age >= 18 & age <= 49 & earnweekf > 0))

See the Model structure page for the math, the Examples page for both calls walked through end-to-end, and the FAQ for common questions.

Developers

I welcome contributions to the package! Feel free to submit changes for review or contact me if you have any questions.

Issues or feature requests

If you would like to log an issue or submit a feature request, please open one on GitHub Issues.

Changelog

See NEWS.md for the package changelog.

More information at benrosche.com/projects/ineqx.