Despite a growing tendency in public health research to augment data sets that have inadequate economic status information by appending aggregate census data, little empirical attention has been paid to estimating the validity of this approach. This study uses micro-level data from the Panel Study of Income Dynamics linked to Summary Tape File data from U.S. Censuses to estimate the reliability of using census-based proxies relative to five-year averages of micro-level income. In an illustrative example of specific relevance to infant mortality research, the authors gauge the adequacy of these proxies for estimating the direct effects of economic status and for controlling for confounding between economic status and maternal age or race. They estimate the magnitude of the biases involved in using the census-based strategy and the sensitivity of these estimates to using different pieces of census information, different levels of geographic aggregation, and different census years. The authors also estimate the magnitude of the bias involved in using single-year micro-level income measures as proxies for more permanent economic status.
Results indicate that aggregate measures can be useful proxies for micro-level variables, but also emphasize their weaknesses. Using aggregate-level variables underestimates the effect of economic status by as much as 50 percent and eliminates an even smaller fraction (10-30 percent) of the confounding between economic status and other covariates. Single-year individual income measures are no better than aggregate proxies for estimating main effects, underlining their limitations, but are more adequate for controlling for confounding. In general, the magnitude of the biases involved when using aggregate proxies depends on both the correlation between aggregate and micro-level variables and the extent to which the explanatory variables of primary interest vary within geographic areas.