The exchange between me and Gordo about social mobility in the US, with and without adjusting for the lower upward mobility of black people, led me to try to formulate a mathematical model that describes the situation. In particular, I’m interested in the following setup:
1. Each member of a population P has two variables, x and y (for parents’ and children’s incomes), and each variable is uniformly distributed across P; the uniform distribution makes this a question of percentiles, and avoids mistaking changes in the level of inequality for higher social mobility.
2. P is subdivided into two subpopulations, P1 and P2, which have different means.
3. To simplify things, let’s assume P1 and P2’s distributions don’t change; that is, P1 has the same distribution of x as it does of y. For example, P1 might consist of all people with x-values in the lowest 10 percentiles, in which case its distribution of y-values will consist of the bottom 10 percentiles, too. In our concrete example, it means there’s no racial intermixing and no change in racial inequality.
If P1 consists of the bottom 12.5% of the population in both x- and y-values, but within each group there’s perfect income mobility (that is, a child’s position within the group is independent of the parent’s), then the regression coefficient between x and y is 0.31.
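The bottom-share case can be checked numerically. The sketch below is one reading of the setup, not necessarily the calculation behind the figure above: it assumes percentiles run from 0 to 1, that P1 is the bottom share p, and that "perfect mobility" means x and y are independent within each group. The function names are made up for illustration. Under those assumptions the pooled covariance comes entirely from the gap between group means, and the slope works out to 3p(1 - p). For p = 0.125 that is 21/64, or about 0.33, a little above the 0.31 quoted above; the gap may come from rounding or from a slightly different setup in the original calculation.

```python
import random


def pooled_slope_exact(p):
    """Slope of y on x when the bottom share p and the top share 1 - p
    are each internally shuffled (x, y independent within a group)."""
    # Group means are p/2 and (1 + p)/2, so they differ by 1/2.
    between_cov = p * (1 - p) * 0.25
    var_x = 1.0 / 12.0  # variance of a uniform percentile on [0, 1]
    return between_cov / var_x  # simplifies to 3 * p * (1 - p)


def pooled_slope_simulated(p, n=200_000, seed=0):
    """Monte Carlo estimate of the same slope (assumed setup, see above)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        # Pick a group, then draw parent and child percentiles
        # independently from that group's range.
        lo, hi = (0.0, p) if rng.random() < p else (p, 1.0)
        xs.append(rng.uniform(lo, hi))
        ys.append(rng.uniform(lo, hi))
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    var = sum((x - mx) ** 2 for x in xs) / n
    return cov / var


if __name__ == "__main__":
    print(pooled_slope_exact(0.125))
    print(pooled_slope_simulated(0.125))
```

Because x and y have the same variance here, the slope and the correlation coefficient coincide.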
Obviously, the real world is more complicated than that, so I’m trying to extend my model in various ways. For example, I’m going to investigate whether the exact distribution of x and y within each subpopulation matters, or just the correlation. It’s easier when there’s perfect mobility in each subgroup than when there isn’t.
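One observation that may help here: a regression slope depends only on covariances, so in the bottom-share setup the pooled coefficient should depend on each group’s within-group correlation and not on the finer shape of the joint distribution. The sketch below assumes P1 is the bottom share p, with within-group correlations r1 and r2; the function names and the "copy or reshuffle" mechanism are illustrative assumptions, chosen only because they keep the within-group distributions uniform.

```python
import random


def slope_with_within_corr(p, r1, r2):
    """Pooled slope when P1 is the bottom share p and the within-group
    correlations of x and y are r1 (for P1) and r2 (for P2)."""
    # Between-group term plus each group's within-group covariance,
    # all divided by the overall variance 1/12.
    return 3 * p * (1 - p) + r1 * p ** 3 + r2 * (1 - p) ** 3


def simulate_slope(p, r1, r2, n=200_000, seed=1):
    """Monte Carlo check using one hypothetical mechanism: with
    probability r the child copies the parent's percentile, otherwise
    the child's percentile is redrawn within the group. This gives
    within-group correlation r (for 0 <= r <= 1)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        if rng.random() < p:
            lo, hi, r = 0.0, p, r1
        else:
            lo, hi, r = p, 1.0, r2
        x = rng.uniform(lo, hi)
        y = x if rng.random() < r else rng.uniform(lo, hi)
        xs.append(x)
        ys.append(y)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n
    var = sum((a - mx) ** 2 for a in xs) / n
    return cov / var


if __name__ == "__main__":
    print(slope_with_within_corr(0.125, 0.4, 0.4))
    print(simulate_slope(0.125, 0.4, 0.4))
```

With r1 = r2 = 0 the formula reduces to the perfect-mobility case, and with r1 = r2 = 1 it gives a slope of exactly 1, as it should when nobody moves at all.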
And that’s the easy part. Trying distributions other than P1-hogs-the-bottom-percentiles is harder; finding distributions that reflect different means is easy, but finding ones for which the calculations don’t become monstrous isn’t. And, of course, in the real world populations tend to mix over time (the US is getting less racist, and its level of racial inequality is decreasing), so rule #3 isn’t absolute.