Jason Owen-Smith

Creating a Data Quality Control Framework for Producing New Personnel-Based S&E Indicators

Research Project Description
Jason Owen-Smith, Jinseok Kim

We will develop an Automated and Stratified Entity Disambiguation (ASED) framework to resolve name ambiguity in large bibliographic data. First, we increase disambiguation accuracy by combining stratified segmentation of entity instances with supervised machine learning trained on automatically labeled data. Second, we demonstrate the value of disambiguated data at scale by examining the involvement of U.S. science & engineering (S&E) researchers in international collaboration and citation networks, using the entire Web of Science corpus. We propose counterfactual analyses and impact simulations that compare model validity and research findings across versions of the same data disambiguated with different methods. The proposed approach to disambiguating names and estimating the impact of ambiguity will contribute to sociology and management research, which must work from ambiguous data to understand what makes scientists and nations innovative and productive, and to computer & information science, by improving entity disambiguation and unstructured record linkage. The tools will be shared for reuse and improvement by scholars and integrated into a data and code platform open to the research community, supporting rigorous knowledge discovery from promising but messy data on S&E.
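As a rough illustration of the two ideas named above, the sketch below groups author-name instances into strata and then labels some within-stratum pairs automatically. It is not the ASED implementation: the blocking key (surname plus first initial), the use of a shared email address as the labeling rule, and all record fields and function names are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def block_key(name):
    """Hypothetical stratum key: lowercase surname plus first initial.

    Assumes names are formatted as "Surname, Given".
    """
    surname, _, given = name.partition(",")
    return (surname.strip().lower(), given.strip()[:1].lower())


def stratify(records):
    """Group name instances so that only records in the same stratum are compared."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[block_key(rec["name"])].append(rec)
    return dict(blocks)


def auto_label_pairs(block):
    """Assumed labeling rule: two instances sharing a non-empty email are a match (label 1).

    Pairs without that evidence are left unlabeled rather than marked as non-matches.
    """
    labeled = []
    for a, b in combinations(block, 2):
        if a.get("email") and a.get("email") == b.get("email"):
            labeled.append((a["id"], b["id"], 1))
    return labeled


# Toy records, invented for illustration only.
records = [
    {"id": 1, "name": "Lee, Min", "email": "mlee@example.org"},
    {"id": 2, "name": "Lee, M.", "email": "mlee@example.org"},
    {"id": 3, "name": "Lee, Mary", "email": "mary@example.org"},
    {"id": 4, "name": "Park, Ana", "email": ""},
]

blocks = stratify(records)
training_pairs = auto_label_pairs(blocks[("lee", "m")])
```

Pairs labeled this way could serve as training data for a supervised classifier that scores the remaining, unlabeled pairs within each stratum. The choice of classifier and features is left open here.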

National Science Foundation

Funding Period: 9/1/2019 to 8/31/2021
