Multiple Imputation in Two-stage Cluster Samples Using the Weighted Finite Population Bayesian Bootstrap

Archived Abstract of Former PSC Researcher

Zhou, H., Michael R. Elliott, and Trivellore Raghunathan. 2016. "Multiple Imputation in Two-stage Cluster Samples Using the Weighted Finite Population Bayesian Bootstrap." Journal of Survey Statistics and Methodology, 4(2): 139-170.

Multistage sampling is often employed in survey samples for cost and convenience. However, accounting for clustering features when generating datasets for multiple imputation is a nontrivial task, particularly when, as is often the case, cluster sampling is accompanied by unequal probabilities of selection, necessitating case weights. Thus, multiple imputation often ignores complex sample designs and assumes simple random sampling when generating imputations, even though failing to account for complex sample design features is known to yield biased estimates and confidence intervals that have incorrect nominal coverage. In this article, we extend a recently developed, weighted, finite-population Bayesian bootstrap procedure to generate synthetic populations conditional on complex sample design data that can be treated as simple random samples at the imputation stage, obviating the need to directly model design features for imputation. We develop two forms of this method: one where the probabilities of selection are known at the first and second stages of the design, and the other, more common in public use files, where only the final weight based on the product of the two probabilities is known. We show that this method has advantages in terms of bias, mean square error, and coverage properties over methods where sample designs are ignored, with little loss in efficiency, even when compared with correct fully parametric models. An application is made using the National Automotive Sampling System Crashworthiness Data System, a multistage, unequal probability sample of U.S. passenger vehicle crashes, which suffers from a substantial amount of missing data in "Delta-V," a key crash severity measure.

10.1093/jssam/smv031

Browse | Search | Next

PSC In The News

RSS Feed icon

Shaefer comments on the Cares Act impact in negating hardship during COVID-19 pandemic

Heller comments on lasting safety benefit of youth employment programs

More News

Highlights

Dean Yang's Combatting COVID-19 in Mozambique study releases Round 1 summary report

Help Establish Standard Data Collection Protocols for COVID-19 Research

More Highlights


Connect with PSC follow PSC on Twitter Like PSC on Facebook