

Addressing Disclosure Risk of Contextualized Microdata in Survey Design

a PSC Research Project

Investigators: Michael R. Elliott, Kristine M. Witkowski, Daniel G. Brown, Trivellore Raghunathan

This project seeks to increase the availability of detailed research data about a person's neighborhood and individual characteristics, behaviors, and health outcomes. Such information is crucial for research on critical national issues, such as health disparities. However, a delicate balance must be struck between providing easy access to these data and protecting the anonymity of study participants. Responding to the rising demand for contextualized microdata, large national surveys typically collect meticulous information about their subjects' personal and geographic attributes. When data are prepared for public-use files, however, much of this important detail is either suppressed or coarsened to protect the anonymity of respondents. These limitations reduce opportunities for important scientific research and impose costly burdens on producers and distributors, who must implement restrictive data use agreements. Little is known about how the ability to protect a respondent's identity (i.e., disclosure risk) is affected by releasing microdata files that contain the contextual attributes of counties, tracts, block groups, and 1/2-mile geographic areas surrounding each subject. Considering factors that are determined at the outset of a study, it is not known how the disclosure risk of contextualized microdata is affected by varying levels of sensitive information, or by different sampling designs and analytical purposes. Turning to factors that are usually addressed after data collection, when research files are prepared for dissemination, it is not known to what extent disclosure risk and the scientific value of data are affected by the selection of different variables for release or by the application of various statistical techniques to limit disclosure. With a priori knowledge of these determinants, data producers will be able to anticipate how many and which respondents are at risk of disclosure, and adapt their data collection methods to protect them.
Such adjustments will preserve and enhance the utility of the data for broad dissemination. Also, factors that affect data collection efficiencies can then be measured, allowing for the estimation of survey costs associated with modifying sampling designs to meet disclosure goals. Hence this project seeks to incorporate disclosure risk into the conceptual and empirical frameworks used in the evaluation of survey designs. In so doing, we first develop and validate models that predict the composition of survey data under different sampling designs. Next, we develop measures and methods for assessing disclosure risk, analytical utility, and disclosure-related survey costs that are best suited for evaluating sampling and database designs. Lastly, we conduct simulations to gather estimates of risk, utility, and cost for studies with a wide range of sampling and database design characteristics.

PUBLIC HEALTH RELEVANCE: Our project will increase the value and availability of scientific data by developing ways to assess, at the earliest stages of research, the risks of disclosing confidential information about study subjects. Detailed data about people's neighborhoods, characteristics, behaviors, and health are essential for informing policy and advancing science. But a balance must be struck between providing easy access to such data and protecting confidential information. By evaluating such disclosure risks in the design phase of research, we will enhance investments in data collection and increase the value and availability of data on detailed subpopulations and their environments.
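
As an illustration only, the question of how disclosure risk changes with the variables selected for release can be sketched with a toy calculation. The example below is not the project's method: it uses sample uniqueness (the share of records whose released attributes match no other record) as a simple stand-in for disclosure risk, and the records, field names, and the function share_unique are all hypothetical.

```python
from collections import Counter

# Hypothetical toy records: (age group, sex, census tract, county).
FIELDS = ("age_group", "sex", "tract", "county")
RECORDS = [
    ("30-39", "F", "tract_A", "county_1"),
    ("30-39", "F", "tract_B", "county_1"),
    ("30-39", "M", "tract_A", "county_1"),
    ("40-49", "F", "tract_C", "county_2"),
    ("40-49", "F", "tract_D", "county_2"),
    ("40-49", "M", "tract_C", "county_2"),
    ("40-49", "M", "tract_C", "county_2"),
]


def share_unique(records, released):
    """Fraction of records whose combination of released fields is unique in the sample."""
    idx = [FIELDS.index(name) for name in released]
    keys = [tuple(rec[i] for i in idx) for rec in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(keys)


# Release with fine geography (tract) versus coarsened geography (county).
fine = share_unique(RECORDS, ("age_group", "sex", "tract"))
coarse = share_unique(RECORDS, ("age_group", "sex", "county"))

print(f"share unique with tract:  {fine:.2f}")
print(f"share unique with county: {coarse:.2f}")
```

In this toy sample, coarsening geography from tract to county lowers the share of unique records, which mirrors the trade-off described above: less geographic detail means lower risk but also less analytical value.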

Funding Period: 01/16/2012 to 11/30/2017
