Sunghee Lee

Improving Reproducibility of Respondent Driven Sampling through Adaptive Design

Research Project Description
Sunghee Lee, Juliette Kathryn Roddy, James Wagner, Erin Bonar, Michael R. Elliott

Respondent driven sampling (RDS) is a recruitment method for hard-to-sample populations that are rare in number and/or elusive due to highly stigmatized or illicit behaviors. For these groups, traditional probability sampling becomes infeasible: locating eligible persons requires prohibitively high screening costs, and even when eligible persons are located, their desire to remain hidden results in false negatives. Based on the premise that people with similar traits form social networks, RDS exploits these existing networks for recruitment and has been applied in numerous studies. For example, the National HIV Behavioral Surveillance by CDC uses RDS for its people who inject drugs (PWID) component.

Unlike traditional sampling, where researchers sample and recruit participants, RDS asks participants to recruit other eligible persons from their social networks. The use of organic social networks for sampling is an innovative feature of RDS. This, however, comes with one major challenge. In order for RDS to "work", participants need to cooperate with recruitment requests. This cooperation issue has profound implications for both the inferences and the design of RDS. First, RDS inferences rest on the assumptions that recruitment follows a memoryless Markov chain (e.g., a chain's overall characteristics are not dependent on its seed's characteristics) and reaches equilibrium. This requires the recruitment chains formed by individual seeds to be sufficiently long. Noncooperation results in short chains, leaving this assumption unmet. Existing RDS estimators are largely blind to this reality and, hence, limited in producing generalizable knowledge. Second, due to noncooperation, RDS fieldwork may not progress as expected. While examples of this are plentiful, they are reported anecdotally and rarely make it into the literature. Hence, RDS data collection progress is extremely difficult to predict at the design stage, and when the progress deviates from expectations, researchers are left to make unplanned design changes (e.g., increasing the amount of incentives) on the spur of the moment in hopes of making RDS "work." This approach is not replicable, the science is suspect, and the missteps are repeated.
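The dependence of short chains on their seeds can be illustrated with a minimal sketch. It assumes a hypothetical population of two groups, "A" and "B", and made-up recruitment probabilities; none of the names or numbers below come from this project or from any RDS study. The sketch computes the expected group composition of each recruitment wave under a memoryless Markov chain and compares chains started by a seed in group A with chains started by a seed in group B.

```python
# Hypothetical two-group recruitment model. The transition probabilities are
# illustrative assumptions, not estimates from any RDS study.
TRANSITION = {
    "A": {"A": 0.8, "B": 0.2},  # a recruiter in group A recruits from A 80% of the time
    "B": {"A": 0.3, "B": 0.7},  # a recruiter in group B recruits from B 70% of the time
}


def composition_after(seed_group, waves):
    """Expected group composition of wave number `waves` for a chain started by one seed."""
    dist = {group: 0.0 for group in TRANSITION}
    dist[seed_group] = 1.0  # wave 0 is the seed itself
    for _ in range(waves):
        # One memoryless step: the next wave depends only on the current wave.
        dist = {
            group: sum(dist[recruiter] * TRANSITION[recruiter][group]
                       for recruiter in TRANSITION)
            for group in TRANSITION
        }
    return dist


def seed_gap(waves):
    """Absolute difference in the expected share of group A between A-seeded and B-seeded chains."""
    return abs(composition_after("A", waves)["A"] - composition_after("B", waves)["A"])


if __name__ == "__main__":
    for waves in (1, 2, 5, 10):
        print(f"waves={waves:2d}  seed gap in share of A = {seed_gap(waves):.4f}")
```

Under these assumed probabilities, the gap between the two seeds halves with every wave, so chains that stop after one or two waves still largely reflect their seed, while longer chains approach the same composition whichever seed started them. Real recruitment involves many groups, unequal network sizes, and noncooperation, none of which this sketch models.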

This study attempts to improve the operational and statistical reproducibility of RDS by proposing adaptive-RDS (A-RDS) as a design framework and by providing practical tools on which researchers can rely for successful implementation of RDS, where success is measured through recruitment cooperation.

National Institute on Aging

Funding Period: 2/15/2019 to 12/31/2023
