
1 Descriptive variance decomposition

Take the vector $Y_t$ to be individual incomes at time $t$, and the vector $G_t$ to be the group to which the individuals belong, where $G_t=j$ is a categorical variable with $j=1,\dots,J$ categories that represent mutually exclusive and exhaustive groups. The variance $V$ in income at time $t$ can then be expressed as the sum of the variance within and between groups:

$$\begin{equation} \begin{split} V_t(Y_t) &= E(V(Y_t |G_t))+V(E(Y_t |G_t )) \\[10pt] &= W_t + B_t \\[10pt] &= \sum_j\pi_{jt}\sigma_{jt}^2+\sum_j\pi_{jt}\big(\mu_{jt}-\sum_j\pi_{jt}\mu_{jt}\big)^2 \end{split} \tag{1} \end{equation}$$

where $\pi_{jt}$ is the proportion of individuals in group $j$ at time $t$, $\mu_{jt}$ is the mean income in group $j$ at time $t$, and $\sigma_{jt}^2$ is the variance around this mean in group $j$ at time $t$.
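As a quick numerical check of equation (1), the following sketch splits the variance of a small, made-up income sample into its within- and between-group parts. The data and the helper names (`mean`, `pvar`, `decompose`) are illustrative assumptions, not part of the original text; population variances (dividing by $n$) are used so that the identity holds exactly.

```python
import math

# Hypothetical toy data: (group label, income) pairs for one time point.
data = [("a", 10.0), ("a", 14.0), ("a", 12.0),
        ("b", 20.0), ("b", 30.0),
        ("c", 5.0), ("c", 7.0), ("c", 9.0), ("c", 11.0), ("c", 8.0)]

def mean(xs):
    return sum(xs) / len(xs)

def pvar(xs):
    # Population variance (divide by n), matching V(.) in equation (1).
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def decompose(data):
    # Collect incomes by group.
    groups = {}
    for g, y in data:
        groups.setdefault(g, []).append(y)
    n = len(data)
    pi = {g: len(ys) / n for g, ys in groups.items()}      # group shares
    mu = {g: mean(ys) for g, ys in groups.items()}         # group means
    sigma2 = {g: pvar(ys) for g, ys in groups.items()}     # group variances
    grand_mean = sum(pi[g] * mu[g] for g in groups)
    within = sum(pi[g] * sigma2[g] for g in groups)
    between = sum(pi[g] * (mu[g] - grand_mean) ** 2 for g in groups)
    return within, between

W, B = decompose(data)
total = pvar([y for _, y in data])
print(W, B, total)  # W + B should match the total variance
```

With sample variances (dividing by $n-1$) the identity would hold only approximately, which is why the sketch uses the population form.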

With repeated cross-sectional or panel data, the change in variance from $t_0$ (baseline) to $t$ (any timepoint post baseline) can then be decomposed into the sum of a within-group effect ($\delta_W$), a between-group effect ($\delta_B$), and a compositional effect ($\delta_C$). That is,

$$\begin{equation} V_t-V_{t_0} = \delta_W^t + \delta_B^t + \delta_C^t \\[10pt] \text{where} \\[10pt] \begin{split} \delta_W^t &= \sum_j \pi_{jt_0} \big( \sigma_{jt}^2 - \sigma_{jt_0}^2 \big) \\[10pt] \delta_B^t &= \sum_j \pi_{jt_0} \big( (\mu_{jt} - \sum_j\pi_{jt}\mu_{jt})^2 - (\mu_{jt_0} - \sum_j\pi_{jt_0}\mu_{jt_0})^2 \big) \\[10pt] \delta_C^t &= \sum_j \big( \pi_{jt}-\pi_{jt_0} \big) \big( (\mu_{jt} - \sum_j\pi_{jt}\mu_{jt})^2 + \sigma_{jt}^2 \big) \end{split} \tag{2} \end{equation}$$

The between-group effect ($\delta_B^t$) captures the change in total variance induced by changes in the mean of each group. The within-group effect ($\delta_W^t$) captures the change in total variance induced by changes in the variance around the mean of each group. Finally, the compositional effect ($\delta_C^t$) captures the change in total variance induced by changes in the relative size of each group. The superscript $t$ on the $\delta\text{s}$ indicates that the change over time is considered.
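The three effects in equation (2) add up exactly to the change in total variance, which can be checked numerically. The sketch below works from made-up group summaries (shares, means, variances) at $t_0$ and $t$; all numbers and function names are hypothetical and serve only to illustrate the bookkeeping.

```python
import math

# Hypothetical group summaries at baseline (t0) and at a later time (t).
pi_t0 = [0.5, 0.3, 0.2]
mu_t0 = [10.0, 20.0, 40.0]
var_t0 = [4.0, 9.0, 25.0]
pi_t = [0.4, 0.3, 0.3]
mu_t = [11.0, 24.0, 45.0]
var_t = [5.0, 9.0, 36.0]

def sq_dev(pi, mu):
    # Squared deviation of each group mean from the share-weighted grand mean.
    grand = sum(p * m for p, m in zip(pi, mu))
    return [(m - grand) ** 2 for m in mu]

def total_variance(pi, mu, var):
    # W + B from equation (1).
    within = sum(p * v for p, v in zip(pi, var))
    between = sum(p * d for p, d in zip(pi, sq_dev(pi, mu)))
    return within + between

dev_t0 = sq_dev(pi_t0, mu_t0)
dev_t = sq_dev(pi_t, mu_t)

# Equation (2): within, between, and compositional effects.
delta_W = sum(p0 * (v - v0) for p0, v, v0 in zip(pi_t0, var_t, var_t0))
delta_B = sum(p0 * (d - d0) for p0, d, d0 in zip(pi_t0, dev_t, dev_t0))
delta_C = sum((p - p0) * (d + v)
              for p, p0, d, v in zip(pi_t, pi_t0, dev_t, var_t))

change = total_variance(pi_t, mu_t, var_t) - total_variance(pi_t0, mu_t0, var_t0)
print(delta_W, delta_B, delta_C, change)
```

Note that $\delta_W^t$ and $\delta_B^t$ hold the group shares at their baseline values, while $\delta_C^t$ weights the change in shares by the time-$t$ quantities; this particular pairing is what makes the sum exact.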

2 Causal variance decomposition

I first entertain the treatment effect framework at a single point in time. Let $D\in\{0,1\}$ be a binary treatment, $Y(D)$ be the potential outcome of the same outcome vector as before, and $\tau=Y(1)-Y(0)$ be the intra-individual causal effect of treatment on the outcome.

The group-specific treatment effect on the mean of the treated ($ATT$) then equals the expected value of the differences in the potential outcomes in each group:

$$\begin{equation} ATT_j=E[Y(1)-Y(0)|G=j,D=1] \tag{3} \end{equation}$$

Further, the group-specific treatment effect on the variance of the treated ($VTT$) equals the difference in the variance in the potential outcomes in each group:

$$\begin{equation} VTT_j=V[Y(1)|G=j,D=1]-V[Y(0)|G=j,D=1] \tag{4} \end{equation}$$
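To make equations (3) and (4) concrete, the sketch below computes $ATT_j$ and $VTT_j$ on a tiny simulated data set in which both potential outcomes are known for every unit. This is an assumption made purely for illustration: in real data only one potential outcome is observed per unit, so these quantities have to be estimated. The data and the function name `group_effects` are hypothetical.

```python
import math

# Hypothetical simulated units: (group, D, Y(0), Y(1)).
# Both potential outcomes are listed only because the data are simulated.
units = [
    ("a", 1, 10.0, 12.0),
    ("a", 1, 14.0, 18.0),
    ("a", 0, 11.0, 11.0),
    ("b", 1, 20.0, 19.0),
    ("b", 1, 30.0, 33.0),
    ("b", 0, 25.0, 27.0),
]

def pvar(xs):
    # Population variance (divide by n).
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def group_effects(units):
    # Keep only treated units (D = 1), collected by group.
    treated = {}
    for g, d, y0, y1 in units:
        if d == 1:
            treated.setdefault(g, []).append((y0, y1))
    att, vtt = {}, {}
    for g, pairs in treated.items():
        y0s = [y0 for y0, _ in pairs]
        y1s = [y1 for _, y1 in pairs]
        # Equation (3): mean difference in potential outcomes among the treated.
        att[g] = sum(y1 - y0 for y0, y1 in pairs) / len(pairs)
        # Equation (4): difference in variances of the potential outcomes.
        vtt[g] = pvar(y1s) - pvar(y0s)
    return att, vtt

att, vtt = group_effects(units)
print(att)  # {'a': 3.0, 'b': 1.0}
print(vtt)  # {'a': 5.0, 'b': 24.0}
```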

2a Decomposing the effect of treatment

The focus here is on the $ATT$ rather than the average treatment effect ($ATE$) because we are interested in the aggregate consequences of treatment, which depend on the actual distribution of treatment across groups, $E[D|G]$, and therefore on the $ATT$. Note that we assume that the treatment effect on the variance is fully described by its effect on the group-specific means and variances. Given these definitions, the effect of treatment on the variance can be decomposed into a within-group and a between-group component, where the variance in the $ATT$ equals the between-group component and the expected value of the $VTT$ equals the within-group component:

$$\begin{equation} V[Y(1)|D]-V[Y(0)|D] = \delta_B^D + \delta_W^D \\[10pt] \text{where} \\[10pt] \begin{split} \delta_B^D &= \sum_j \pi_j \big( (\mu_j+\beta_j- \sum_j\pi_j(\mu_j+\beta_j))^2 - (\mu_j - \sum_j\pi_j\mu_j)^2 \big) \\[10pt] \delta_W^D &= \sum_j \pi_j \big( (\sigma_j+\lambda_j )^2 - \sum_j\pi_j\sigma_j^2 \big) \end{split} \tag{5} \end{equation}$$

Note that the interpretation of some quantities changes as compared to equation (2). In equation (5), $\pi_j$ is the proportion of individuals in each group receiving treatment (i.e., $E[D|G]$), $\mu_j$ is the pre-treatment mean in group $j$, $\sigma_j$ is the pre-treatment standard deviation in group $j$, $\beta_j$ is the causal effect of treatment on $\mu_j$, and $\lambda_j$ is the causal effect of treatment on $\sigma_j$. The superscript $D$ on the $\delta\text{s}$ indicates that the change caused by treatment (at a single point in time) is considered. The between-group effect $\delta_B^D$ captures the change in total variance induced by the effect of treatment on the mean of each group. The within-group effect $\delta_W^D$ captures the change in total variance induced by the effect of treatment on the variance of each group.
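A minimal numerical sketch of equation (5) follows. It assumes, for illustration only, that the weights $\pi_j$ are normalised to sum to one; under that assumption the within-group term as written in equation (5) coincides with the simpler difference $\sum_j\pi_j\big((\sigma_j+\lambda_j)^2-\sigma_j^2\big)$. The helper functions `B` and `W` and all input values are hypothetical.

```python
import math

def B(pi, m):
    # Between-group component: weighted squared deviations from the grand mean.
    grand = sum(p * x for p, x in zip(pi, m))
    return sum(p * (x - grand) ** 2 for p, x in zip(pi, m))

def W(pi, s):
    # Within-group component from group standard deviations s.
    return sum(p * x ** 2 for p, x in zip(pi, s))

def add(a, b):
    return [x + y for x, y in zip(a, b)]

# Hypothetical inputs; pi is ASSUMED to sum to one.
pi = [0.5, 0.3, 0.2]
mu = [10.0, 20.0, 40.0]     # pre-treatment group means
sd = [2.0, 3.0, 5.0]        # pre-treatment group standard deviations
beta = [1.0, 2.0, -1.0]     # treatment effects on the means
lam = [0.5, 0.0, -0.5]      # treatment effects on the standard deviations

delta_B = B(pi, add(mu, beta)) - B(pi, mu)
delta_W = W(pi, add(sd, lam)) - W(pi, sd)

# Within-group term written literally as in equation (5).
delta_W_literal = sum(p * ((s + l) ** 2 - W(pi, sd))
                      for p, s, l in zip(pi, sd, lam))

print(delta_B, delta_W, delta_W_literal)
```

If the weights do not sum to one, the two versions of the within-group term differ, so the normalisation should be checked before reusing this sketch.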

2b Decomposing the change in the effect of treatment

The treatment effect on the variance depends on three things: the treatment effects on the group-specific means and variances, the distribution of treatment across groups, and the level of pre-treatment inequality. Therefore, with repeated cross-sectional or panel data, the change in total variance from $t_0$ (baseline) to $t$ (any timepoint post baseline) due to change in the effect of treatment can be decomposed into the sum of a between-group effect ($\delta_B^{D,t}$), a within-group effect ($\delta_W^{D,t}$), a compositional effect ($\delta_C^{D,t}$), and a pre-treatment effect ($\delta_P^{D,t}$). In the expressions below, $B(\pi,\mu)=\sum_j\pi_j\big(\mu_j-\sum_j\pi_j\mu_j\big)^2$ and $W(\pi,\sigma)=\sum_j\pi_j\sigma_j^2$ denote the between- and within-group components evaluated at the given shares, means, and standard deviations:

$$\begin{equation} (V[Y_t(1)|D_t]-V[Y_t(0)|D_t]) - (V[Y_{t_0}(1)|D_{t_0}]-V[Y_{t_0}(0)|D_{t_0}]) = \delta_B^{D,t} + \delta_W^{D,t} + \delta_C^{D,t} + \delta_P^{D,t} \\[10pt] \text{where} \end{equation}$$

$$\begin{equation} \begin{split}
\delta_B^{D,t} &= B(\pi_{t_0},\mu_{t_0}+\beta_t) - B(\pi_{t_0},\mu_{t_0}+\beta_{t_0}) \\
&= \sum_j\pi_{j,t_0} \left( \Big(\mu_{j,t_0} + \beta_{j,t} - \sum_j\pi_{j,t_0}(\mu_{j,t_0}+\beta_{j,t}) \Big)^2 - \Big(\mu_{j,t_0} + \beta_{j,t_0} - \sum_j\pi_{j,t_0}(\mu_{j,t_0}+\beta_{j,t_0} ) \Big)^2 \right) \\[10pt]
\delta_W^{D,t} &= W(\pi_{t_0},\sigma_{t_0}+\lambda_t) - W(\pi_{t_0},\sigma_{t_0}+\lambda_{t_0}) \\
&= \sum_j\pi_{j,t_0} \left( \Big(\sigma_{j,t_0}+\lambda_{j,t}\Big)^2 - \Big(\sigma_{j,t_0}+\lambda_{j,t_0}\Big)^2 \right) \\[10pt]
\delta_C^{D,t} &= \Big( B(\pi_t,\mu_{t_0}+\beta_t) - B(\pi_{t_0},\mu_{t_0}+\beta_t ) \Big) - \Big( B(\pi_t,\mu_t ) - B(\pi_{t_0},\mu_t ) \Big) + \Big( W(\pi_t,\sigma_{t_0}+\lambda_t ) - W(\pi_{t_0},\sigma_{t_0}+\lambda_t ) \Big) - \Big( W(\pi_t,\sigma_t ) - W(\pi_{t_0},\sigma_t ) \Big) \\
&\approx \sum_j(\pi_{j,t}-\pi_{j,t_0} ) \left( \Big(\mu_{j,t_0}+\beta_{j,t}-\sum_j\pi_{j,t}(\mu_{j,t_0}+\beta_{j,t})\Big)^2 - \Big(\mu_{j,t}-\sum_j\pi_{j,t}\mu_{j,t}\Big)^2 + \Big(\sigma_{j,t_0}+\lambda_{j,t}\Big)^2-\sigma_{j,t}^2 \right) \\[10pt]
\delta_P^{D,t} &= B(\pi_t,\mu_t+\beta_t) - B(\pi_t,\mu_{t_0}+\beta_t) + W(\pi_t,\sigma_t+\lambda_t) - W(\pi_t,\sigma_{t_0}+\lambda_t) - \Big( B(\pi_{t_0},\mu_t) - B(\pi_{t_0},\mu_{t_0}) + W(\pi_{t_0},\sigma_t) - W(\pi_{t_0},\sigma_{t_0}) \Big) \\
&= \sum_j\pi_{j,t} \left( \Big(\mu_{j,t}+\beta_{j,t}-\sum_j\pi_{j,t}(\mu_{j,t}+\beta_{j,t}) \Big)^2 - \Big(\mu_{j,t_0}+\beta_{j,t}-\sum_j\pi_{j,t}(\mu_{j,t_0}+\beta_{j,t})\Big)^2 + \Big(\sigma_{j,t}+\lambda_{j,t} \Big)^2 - \Big(\sigma_{j,t_0}+\lambda_{j,t}\Big)^2 \right) - \sum_j\pi_{j,t_0} \left( \Big(\mu_{j,t}-\sum_j\pi_{j,t_0}\mu_{j,t}\Big)^2 - \Big(\mu_{j,t_0}-\sum_j\pi_{j,t_0}\mu_{j,t_0}\Big)^2 + \sigma_{j,t}^2-\sigma_{j,t_0}^2 \right)
\end{split} \tag{6} \end{equation}$$
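When the four components of equation (6) are written in their exact $B(\cdot)$ and $W(\cdot)$ form (rather than the $\approx$ form of $\delta_C^{D,t}$), the intermediate terms cancel and the sum reproduces the left-hand side. The sketch below checks this on made-up inputs. It assumes that the post-treatment variance is $B(\pi,\mu+\beta)+W(\pi,\sigma+\lambda)$, in line with the assumption stated in section 2a, and that the shares sum to one; the function names and all numbers are hypothetical.

```python
import math

def B(pi, m):
    # Between-group component: weighted squared deviations from the grand mean.
    grand = sum(p * x for p, x in zip(pi, m))
    return sum(p * (x - grand) ** 2 for p, x in zip(pi, m))

def W(pi, s):
    # Within-group component from group standard deviations s.
    return sum(p * x ** 2 for p, x in zip(pi, s))

def add(a, b):
    return [x + y for x, y in zip(a, b)]

# Hypothetical inputs at baseline (suffix 0) and at time t (suffix t).
pi0, pit = [0.5, 0.3, 0.2], [0.4, 0.3, 0.3]
mu0, mut = [10.0, 20.0, 40.0], [11.0, 24.0, 45.0]
sd0, sdt = [2.0, 3.0, 5.0], [2.5, 3.0, 6.0]
b0, bt = [1.0, 2.0, -1.0], [1.5, 1.0, 0.5]     # effects on means
l0, lt = [0.5, 0.0, -0.5], [0.2, 0.4, -1.0]    # effects on standard deviations

d_B = B(pi0, add(mu0, bt)) - B(pi0, add(mu0, b0))
d_W = W(pi0, add(sd0, lt)) - W(pi0, add(sd0, l0))
d_C = ((B(pit, add(mu0, bt)) - B(pi0, add(mu0, bt)))
       - (B(pit, mut) - B(pi0, mut))
       + (W(pit, add(sd0, lt)) - W(pi0, add(sd0, lt)))
       - (W(pit, sdt) - W(pi0, sdt)))
d_P = (B(pit, add(mut, bt)) - B(pit, add(mu0, bt))
       + W(pit, add(sdt, lt)) - W(pit, add(sd0, lt))
       - (B(pi0, mut) - B(pi0, mu0) + W(pi0, sdt) - W(pi0, sd0)))

def effect(pi, mu, sd, beta, lam):
    # Treatment effect on the variance: V[Y(1)|D] - V[Y(0)|D].
    return (B(pi, add(mu, beta)) + W(pi, add(sd, lam))) - (B(pi, mu) + W(pi, sd))

lhs = effect(pit, mut, sdt, bt, lt) - effect(pi0, mu0, sd0, b0, l0)
print(d_B, d_W, d_C, d_P, lhs)
```

Replacing `d_C` by the approximate expression in equation (6) would generally leave a small residual, which is why the check uses the exact form.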

2c Decomposing the change in the post-treatment variance due to treatment

Rather than decomposing the change in the effect of treatment on the variance, one can also decompose the change in post-treatment variance induced by the change in the effect of treatment, i.e., $V[Y_t(1)|D_t] - V[Y_{t_0}(1)|D_{t_0}]$:

$$\begin{equation} V[Y_t(1)|D_t] - V[Y_{t_0}(1)|D_{t_0}] = \delta_B^{D,t} + \delta_W^{D,t} + \delta_C^{D,t} + \delta_P^{D,t} \\[10pt] \text{where} \end{equation}$$

$$\begin{equation} \begin{split}
\delta_B^{D,t} &=\sum_j\pi_{jt_0} \left( \Big(\mu_{jt_0}+\beta_{jt}-\sum_j\pi_{jt_0}(\mu_{jt_0}+\beta_{jt})\Big)^2 - \Big(\mu_{jt_0}+\beta_{jt_0}-\sum_j\pi_{jt_0}(\mu_{jt_0}+\beta_{jt_0})\Big)^2 \right) \\[10pt]
\delta_W^{D,t} &= \sum_j\pi_{jt_0}\left( \Big(\sigma_{jt_0}+\lambda_{jt}\Big)^2 - \Big(\sigma_{jt_0}+\lambda_{jt_0}\Big)^2 \right) = \sum_j\pi_{jt_0} \Big( \lambda_{jt}^2-\lambda_{jt_0}^2 + 2\sigma_{jt_0}(\lambda_{jt}-\lambda_{jt_0}) \Big) \\[10pt]
\delta_C^{D,t} &= \sum_j\pi_{jt}\left( \Big(\mu_{jt_0}+\beta_{jt}-\sum_j\pi_{jt}(\mu_{jt_0}+\beta_{jt})\Big)^2 + \Big(\sigma_{jt_0}+\lambda_{jt}\Big)^2 \right) - \sum_j\pi_{jt_0}\left( \Big(\mu_{jt_0}+\beta_{jt}-\sum_j\pi_{jt_0} (\mu_{jt_0}+\beta_{jt})\Big)^2 + \Big(\sigma_{jt_0}+\lambda_{jt}\Big)^2 \right) \\
&\approx \sum_j(\pi_{jt}-\pi_{jt_0}) \left( \Big(\mu_{jt}+\beta_{jt}-\sum_j\pi_{jt}(\mu_{jt}+\beta_{jt})\Big)^2 + \Big(\sigma_{jt_0}+\lambda_{jt} \Big)^2 \right) \\[10pt]
\delta_P^{D,t} &=\sum_j\pi_{jt} \left( \Big(\mu_{jt}+\beta_{jt}-\sum_j\pi_{jt}(\mu_{jt}+\beta_{jt})\Big)^2 - \Big(\mu_{jt_0}+\beta_{jt}-\sum_j\pi_{jt}(\mu_{jt_0}+\beta_{jt})\Big)^2 + \Big(\sigma_{jt}+\lambda_{jt}\Big)^2 - \Big(\sigma_{jt_0}+\lambda_{jt}\Big)^2 \right)
\end{split} \tag{7} \end{equation}$$
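The same kind of numerical check can be sketched for equation (7), using the exact (first) expression for $\delta_C^{D,t}$. As before, the code assumes that the post-treatment variance equals the between-group component at $\mu+\beta$ plus the within-group component at $\sigma+\lambda$, and that the shares sum to one; helper names and inputs are hypothetical.

```python
import math

def B(pi, m):
    # Between-group component: weighted squared deviations from the grand mean.
    grand = sum(p * x for p, x in zip(pi, m))
    return sum(p * (x - grand) ** 2 for p, x in zip(pi, m))

def W(pi, s):
    # Within-group component from group standard deviations s.
    return sum(p * x ** 2 for p, x in zip(pi, s))

def add(a, b):
    return [x + y for x, y in zip(a, b)]

# Hypothetical inputs at baseline (suffix 0) and at time t (suffix t).
pi0, pit = [0.5, 0.3, 0.2], [0.4, 0.3, 0.3]
mu0, mut = [10.0, 20.0, 40.0], [11.0, 24.0, 45.0]
sd0, sdt = [2.0, 3.0, 5.0], [2.5, 3.0, 6.0]
b0, bt = [1.0, 2.0, -1.0], [1.5, 1.0, 0.5]     # effects on means
l0, lt = [0.5, 0.0, -0.5], [0.2, 0.4, -1.0]    # effects on standard deviations

d_B = B(pi0, add(mu0, bt)) - B(pi0, add(mu0, b0))
d_W = W(pi0, add(sd0, lt)) - W(pi0, add(sd0, l0))
# Expanded form of the within-group effect given in equation (7).
d_W_expanded = sum(p * (a ** 2 - c ** 2 + 2 * s * (a - c))
                   for p, a, c, s in zip(pi0, lt, l0, sd0))
d_C = ((B(pit, add(mu0, bt)) + W(pit, add(sd0, lt)))
       - (B(pi0, add(mu0, bt)) + W(pi0, add(sd0, lt))))
d_P = (B(pit, add(mut, bt)) - B(pit, add(mu0, bt))
       + W(pit, add(sdt, lt)) - W(pit, add(sd0, lt)))

def post_variance(pi, mu, sd, beta, lam):
    # V[Y(1)|D] under the stated assumption.
    return B(pi, add(mu, beta)) + W(pi, add(sd, lam))

lhs = post_variance(pit, mut, sdt, bt, lt) - post_variance(pi0, mu0, sd0, b0, l0)
print(d_B, d_W, d_C, d_P, lhs)
```

Unlike equation (6), no pre-treatment baseline terms are subtracted here, so $\delta_C^{D,t}$ and $\delta_P^{D,t}$ each consist of a single difference.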