Profiling Missing Data in Electronic Health Records for Diabetes Care Research
Current guidelines for diabetes care recommend individualized treatment plans for complex patients, since tight control of glycosylated hemoglobin (A1c) may not be appropriate for them. However, little evidence exists to support such patient-centered decisions. Electronic health records (EHRs) are an important source of clinical evidence for improving diabetes care, but they suffer from usability deficiencies. In particular, lab measures and vital signs have intermittent missing values, and the irregular visit patterns may be informative about the patients' underlying medication status. Patient characteristics are also incomplete due to linkage error. We aim to impute the missing values in EHRs and improve data quality to strengthen the evidence base for diabetes guidelines. The proposed work is motivated by ongoing clinical research examining the role of patient complexity in the relationship between tight A1c control and the risk of adverse events, using a pre-existing EHR dataset of 9101 patients with diabetes cared for by UW Health during 2003-2013. We propose Bayesian latent profile models under multiple imputation to account for the potentially non-ignorable visiting process and to facilitate modeling a large number of EHR variables of mixed types, and we will develop scalable computation algorithms. We will evaluate the theoretical and empirical properties of the proposal and compare it with existing alternatives for missing data analysis and comparative effectiveness research. Specifically, first we build latent profiles by jointly modeling A1c values, patient characteristics and health outcomes. Second, we generalize the latent profiles using multiple pattern indices and combine the trajectories of multiple lab measures and vital signs with intermittent missing values, while also accounting for incomplete patient sociodemographics. Third, we release open-source computation software and disseminate new clinical findings to the healthcare delivery system.
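As a point of orientation for the multiple imputation framework mentioned above, the sketch below shows only the generic pooling step that is commonly applied after analyzing each imputed dataset (Rubin's rules). It is not the proposed Bayesian latent profile model, and it does not address the non-ignorable visiting process. The function name `pool_rubin` and the example numbers are hypothetical and chosen purely for illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation results with Rubin's rules (illustrative sketch).

    estimates: point estimates of one quantity, one per imputed dataset
    variances: the corresponding within-imputation variance estimates
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)            # average of the point estimates
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical numbers, not results from the EHR dataset described above.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The total variance adds a between-imputation term to the average within-imputation variance, which is how the uncertainty due to the missing values is carried into the final inference.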
The results of this investigation will advance statistical methodology for missing data in longitudinal studies, increase the compatibility of available patient medical records and strengthen the evidence base to support existing diabetes guidelines.
National Institute of Diabetes and Digestive and Kidney Diseases
Funding Period: 1/12/2018 to 7/31/2019