Archive for the 'Methodology' Category


Imagining a Census Survey Without a Mandate

This is an update to a May 17th post on challenges to the American Community Survey’s mandatory response status via a House bill [H.R. 931] introduced by Ted Poe (R, TX):

Imagining a Census Survey Without a Mandate
Carl Bialik | Wall Street Journal (Blog post)
June 5, 2013
This piece mentions two former ISR researchers: Leslie Kish, for his role in the move away from a decennial census to the ACS, and Bob Groves, on the currency of the ACS data. However, it mostly focuses on the statistical issues that a voluntary ACS would introduce.

Census Gets Questions on Mandatory Queries
Carl Bialik | Wall Street Journal
March 30, 2012
Old article, but the issues are the same.

The Census’s 21st-Century Challenges
Carl Bialik | Wall Street Journal (Blog)
July 30, 2010
This piece talks about Canada’s foray into a voluntary census, which we’ve also covered. A good source for quotes about response bias.

Lessons from North of the Border

Why a Voluntary ACS Could Wipe Some States off of the Map
Terri Ann Lowenthal | The Census Project Blog
May 17, 2013

This is a great recap of the disaster Canada has on its hands with its voluntary National Household Survey. It is relevant for the US because Congressional Republicans want to allow people to ‘just say no’ to all or part of the American Community Survey. She also reminds readers of the history of the marriage question in the US Census, including the possible deletion of the “times married” question.

The PSC-Info blog has several links to recent ACS/Census funding news:

ACS to drop “Number of Times Married” question

“it’s an Alice in Wonderland moment” or “GOP Census Bill would Eliminate America’s Economic Indicators”

The Census Reform Act of 2013

The ACS Faces More Battles

SENATE: The Census Bureau has already written the reports; read them.

Nerd Alert: Dictionary of Numbers

For those of you who try to incorporate quantitative reasoning in your teaching, here’s a nice resource:

Dictionary of numbers: putting numbers in human terms
This is a Google Chrome extension that tries to make sense of numbers you encounter on the web by giving you a description of that number in human terms. Because “8 million people” means nothing, but “population of New York City” means everything.

And, here’s a blog post about it from the nerd-friendly xkcd site – a webcomic of romance, sarcasm, math, and language:

Dictionary of Numbers
May 15, 2013
Opening paragraph:

I don’t like large numbers without context. Phrases like “they called for a $21 billion budget cut” or “the probe will travel 60 billion miles” or “a 150,000-ton ship ran aground” don’t mean very much to me on their own. Is that a large ship? Does 60 billion miles take you outside the Solar System? How much is $21 billion compared to the overall budget?

Measuring Marriage & Divorce among Same-Sex Couples

For Gays, Breaking Up Is Hard to Do – or Measure
Carl Bialik | Wall Street Journal [print column]
May 3, 2013
This article touches on both the personal and the aggregate. The personal stories are of couples who cannot get a divorce because they live in states that do not recognize same-sex marriages. At the aggregate level, states have not modified divorce forms to collect data on same-sex couples.

Same-Sex Divorce Stats Lag
Carl Bialik | Wall Street Journal [blog]
May 3, 2013
This version provides links to sources of marriage and divorce statistics. European countries do collect data on these events, but so far do not have enough dissolutions to calculate robust rates. An NIH-funded study is following a cohort of couples who were married in Vermont.

Decennial Census Data on Same Sex Couples
Census Bureau
May 2013
The Census Bureau has a website with links to technical papers, data, etc. on same-sex couples from 1990 onward, as measured by the agency.

Census Bureau: Flaws in Same-Sex Couple Data
D’Vera Cohn | Pew: Social and Demographic Trends
September 27, 2011
The Census Bureau announced today that more than one-in-four same-sex couples counted in the 2010 Census was likely an opposite-sex couple, and identified a confusing questionnaire as a likely culprit. The bureau released a new set of “preferred” same-sex counts, including its first tally ever of same-sex spouses counted in the census.

How Accurate Are Counts of Same-Sex Couples?
D’Vera Cohn | Pew: Social and Demographic Trends
August 25, 2011
This is a nice brief on the obstacles to accuracy in measuring same-sex couples in census data. And, it illustrates the efforts that the Census Bureau makes in measuring concepts in an era of rapid social change.

Canada’s “NSF” Problem

House Republicans are trying to implement serious changes to the evaluation and funding of NSF science [here and here].

Canada is perhaps a bit further down this road. Here’s the latest on the decision to fund research that has industry applications rather than basic science.

When science goes silent
Jonathan Gatehouse | Maclean’s
May 3, 2013
This article touches on the shift in funding from basic science to applied science, but it is more in line with an earlier post on the muzzling of environmental scientists.

National Research Council move shifts feds’ science role
Canadian Press | CBC News
May 7, 2013
‘Job-neutral’ restructuring to make agency streamlined, efficient and functional, president says

The Harper government is telling the National Research Council to focus more on practical, commercial science and less on fundamental science that may not have obvious business applications.

The government says the council traditionally was a supporter of business, but has wandered from that in recent years — and will now get back to working on practical applications for industries.

Some folks disagree with this shift:

In a statement, the executive director of the Canadian Association of University Teachers said the government is “killing the goose that laid the golden egg.”

“By transforming the NRC into a “business-driven, industry-relevant” organization, you are denying its ability to support basic research,” said Jim Turk.

“At the same time, you are cutting support to basic research in the universities.”

And is this part of the Tory ‘war on science’? [more coverage on this]

NDP science critic Kennedy Stewart called the shift in direction for the NRC “short-sighted” and said it could actually hurt economic growth in the long run, because it scales back the kind of fundamental research that can lead to scientific breakthroughs.

Research Council to focus on commercially viable projects, rather than science for science’s sake
Jessica Hume | Sun News
May 7, 2013
Two quotes say it all:

The government of Canada believes there is a place for curiosity-driven, fundamental scientific research, but the National Research Council is not that place.

“Scientific discovery is not valuable unless it has commercial value,” John McDougall, president of the NRC, said in announcing the shift in the NRC’s research focus away from discovery science solely to research the government deems “commercially viable”.

Nature: Replication, replication, replication

This special issue of Nature compiles articles on replication that appeared across several earlier issues. They highlight the importance of replication and open data for science. However, some of the examples might apply more to medicine or biology than to population science. Lest readers think that this issue doesn’t apply to demographers, here’s a tweet from Justin Wolfers advertising a piece in Bloomberg Business on the importance of replication for the field of economics. His motivation is the recent dust-up over an error in a famous paper by Reinhart and Rogoff [See PSC-Info], but the discussion is much broader than that example.

tweet

[Link to Stevenson/Wolfers Replication article]

INTRODUCTION TO SPECIAL NATURE ISSUE
No research paper can ever be considered to be the final word, and the replication and corroboration of research results is key to the scientific process. In studying complex entities, especially animals and human beings, the complexity of the system and of the techniques can all too easily lead to results that seem robust in the lab, and valid to editors and referees of journals, but which do not stand the test of further studies. Nature has published a series of articles about the worrying extent to which research results have been found wanting in this respect. The editors of Nature and the Nature life sciences research journals have also taken substantive steps to put our own houses in order, in improving the transparency and robustness of what we publish. Journals, research laboratories and institutions and funders all have an interest in tackling issues of irreproducibility. We hope that the articles contained in this collection will help.

Reducing our irreproducibility
[Editorial]
(April 25, 2013)

Further confirmation needed
A new mechanism for independently replicating research findings is one of several changes required to improve the quality of the biomedical literature.
Nature Biotechnology 30, 806
[Editorial]
(September 10, 2012)

Error Prone
Biologists must realize the pitfalls of work on massive amounts of data.
Nature 487, 406
[Editorial]
(July 26, 2012)

Must Try Harder
Too many sloppy mistakes are creeping into scientific papers. Lab heads must look more rigorously at the data — and at themselves.
Nature 483, 509
[Editorial]
(March 29, 2012)

NEWS AND ANALYSIS

Independent labs to verify high-profile papers
Monya Baker
Nature News
(August 14, 2012)

Power Failure: Why small sample size undermines the reliability of neuroscience
Katherine S. Button, John P. A. Ioannidis et al.
Nature Reviews Neuroscience 14, 365-376
(April 15, 2013)

Replication studies: Bad copy
Ed Yong
Nature 485, 298-300
(May 17, 2012)

Reliability of ‘new drug target’ claims called into question
Asher Mullard
Nature Reviews Drug Discovery 10, 643-644
(September 2011)

COMMENT

If a job is worth doing, it is worth doing twice
Jonathan F. Russell
Nature 496, 7
(April 4, 2013)

Methods: Face up to false positives
Daniel MacArthur
Nature 487, 427-429
(July 26, 2012)

Drug development: Raise standards for preclinical cancer research
C. Glenn Begley & Lee M. Ellis
Nature 483, 531-533
(March 29, 2012)

Believe it or not: how much can we rely on published data on potential drug targets?
Florian Prinz, Thomas Schlange & Khusru Asadullah
Nature Reviews Drug Discovery 10, 712
(September 2011)

Tackling the widespread and critical impact of batch effects in high-throughput data
Jeffrey T. Leek, Robert B. Scharpf et al.
Nature Reviews Genetics 11, 733-739
(October 2010)

PERSPECTIVES AND REVIEWS

Research methods: know when your numbers are significant
David L. Vaux
Nature 492, 180-181
(December 13, 2012)

A call for transparent reporting to optimize the predictive value of preclinical research
Story C. Landis, Susan G. Amara et al.
Nature 490, 187-191
(October 11, 2012)

Next-generation sequencing data interpretation: enhancing reproducibility and accessibility
Anton Nekrutenko & James Taylor
Nature Reviews Genetics 13, 667-672
(September 2012)

The case for open computer programs
Darrel C. Ince, Leslie Hatton & John Graham-Cumming
Nature 482, 485-488
(February 23, 2012)

Reuse of public genome-wide gene expression data
Johan Rung & Alvis Brazma
Nature Reviews Genetics 14, 89-99
(February 2013)

Research from The Data Privacy Lab

Respondent re-identification is a big worry for data projects that want to share their data. Some recent cases illustrate that it can occur, and is occurring, with genetic data. But sometimes the case is overstated. Here is an illustration with a case that hit the press with great fanfare.

First, the fun stuff: see whether you are unique. The following link has you type in your gender, exact date of birth, and 5-digit ZIP code. The latter two do not meet HIPAA guidelines:

Next are several links. The first few are the coverage of re-identification in the press (Forbes, The Scientist, & xxxx), followed by the researcher’s version of the story (Sweeney). Next is a rebuttal, which reminds readers that administrative matches, e.g., to voter registration records, are not as ubiquitous as some claim. There is also a link to an article by Barth-Jones in which he discusses the famous case of the re-identification of Governor William Weld, which led to many of the HIPAA rules.

Harvard Professor Re-Identifies Anonymous Volunteers In DNA Study
Adam Tanner | Forbes
April 24, 2013

Participants in Personal Genome Project Identified by Privacy Experts
MIT Technology Review
May 1, 2013

“Anonymous” Genomes Identified
Dan Cossins | The Scientist
May 3, 2013

Identifying Participants in the Personal Genome Project by Name
Latanya Sweeney, Akua Abu, Julia Winn | Data Privacy Lab

Reporting Fail: The Reidentification of Personal Genome Project Participants
Jane Yakowitz Bambauer | Info/Law [Harvard Law Blogs]
May 1, 2013

The ‘Re-Identification’ of Governor William Weld’s Medical Information: A Critical Re-Examination of Health Data Identification Risks and Privacy Protections, Then and Now
Daniel C. Barth-Jones | Social Science Research Network (SSRN)
June 4, 2012
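
The uniqueness check described above (gender, date of birth, ZIP code) can be illustrated with a small sketch. It is not the Data Privacy Lab’s tool or method; the field names and the sample records are hypothetical, and it simply counts how many records in a file are the only ones with their combination of the three values.

```python
from collections import Counter

# Hypothetical field names; any quasi-identifiers could be substituted.
KEYS = ("gender", "birth_date", "zip5")


def share_unique(records, keys=KEYS):
    """Return the fraction of records whose combination of `keys` is unique."""
    if not records:
        return 0.0
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Made-up records for illustration only.
sample = [
    {"gender": "F", "birth_date": "1970-01-02", "zip5": "48104"},
    {"gender": "F", "birth_date": "1970-01-02", "zip5": "48104"},
    {"gender": "M", "birth_date": "1985-06-30", "zip5": "48104"},
    {"gender": "F", "birth_date": "1991-11-15", "zip5": "48109"},
]
print(share_unique(sample))
```

A high share of unique combinations in the data file is only half of the re-identification story. As the rebuttal linked above stresses, an attacker also needs a named outside file, such as voter registration records, that covers the same people.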

Special Issue on Survey Non-response

Introduction: New Challenges to Social Measurement
Douglas S. Massey and Roger Tourangeau
Abstract | PDF

Facing the Nonresponse Challenge
Frauke Kreuter
Abstract | PDF

Explaining Rising Nonresponse Rates in Cross-Sectional Surveys
J. Michael Brick and Douglas Williams
Abstract | PDF

Response Rates in National Panel Surveys
Robert F. Schoeni, Frank Stafford, Katherine A. McGonagle, and Patricia Andreski
Abstract | PDF

Consequences of Survey Nonresponse
Andy Peytchev
Abstract | PDF

The Use and Effects of Incentives in Surveys
Eleanor Singer and Cong Ye
Abstract | PDF

Paradata for Nonresponse Adjustment
Kristen Olson
Abstract | PDF

Can Administrative Records Be Used to Reduce Nonresponse Bias?
John L. Czajka
Abstract | PDF

An Assessment of the Multi-level Integrated Database Approach
Tom W. Smith and Jibum Kim
Abstract | PDF

Where Do We Go from Here? Nonresponse and Social Measurement
Douglas S. Massey and Roger Tourangeau
Abstract | PDF

The Twitter paper from PAA’s “social media” session

Using Twitter for Demographic and Social Science Research:
Tools for Data Collection

T. McCormick, H. Lee, N. Cesare and A. Shojaie | CSSS/University of Washington
April 8, 2013
This is a proof-of-concept paper. The researchers searched through tweets for phrases that indicated an intention to “not vote” in the 2012 election. They then used Amazon’s Mechanical Turk to code characteristics of their sample (age, gender, race) from the users’ profile pictures.
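
As a rough illustration of the phrase-search step, here is a minimal sketch of flagging tweets that state an intention not to vote. The pattern and the example tweets are hypothetical and are not the authors’ actual search terms or data; a real collection would also need to handle retweets, sarcasm, and quoted text.

```python
import re

# Hypothetical pattern for "not voting" statements; not the paper's search terms.
NOT_VOTING = re.compile(
    r"\b(?:not (?:going to |gonna )?vot(?:e|ing)"
    r"|won['’]?t (?:be )?vot(?:e|ing))\b",
    re.IGNORECASE,
)


def flag_nonvoters(tweets):
    """Return the tweets whose text matches the 'not voting' pattern."""
    return [t for t in tweets if NOT_VOTING.search(t)]


# Made-up tweets for illustration only.
examples = [
    "I'm not voting this year",
    "Go vote, everyone!",
    "Honestly I won't be voting in November",
    "Not a registered voter yet but excited",
]
print(flag_nonvoters(examples))
```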

Folks interested in other examples of “wild data” like Google searches, Twitter, etc. should check these posts:

Wild Data: Expanding Social Science Research
Big Data: Google Flu
Using Wild Data to Estimate International Migration

The Unauthorized Immigrant Population: Two Technical Exercises

This blog entry has two nice technical pieces. The first describes how Pew Hispanic (and others) estimate the undocumented population in the US. The second is a life-table exercise, which shows how many of the undocumented population will die waiting for citizenship, assuming a 13-year wait time.

Unauthorized Immigrants: How Pew Research Counts Them and What We Know About Them
Interview with Jeff Passel | Pew Hispanic
April 17, 2013
In this interview, Passel describes how he estimates the undocumented population in the US – including the other characteristics of this population, e.g., occupation, current residence, family composition, etc. using data from the Current Population Survey.

As a note, most of the reports Pew Hispanic writes on the undocumented population have an appendix that provides a more technical description of the methodology. See page 25 of the following report for an example: Cohn, D’Vera and Jeff Passel. 2011. “Unauthorized Immigrant Population: National and State Trends, 2010.” Pew Hispanic: February 1, 2011.

The life-table exercise is from Philip Cohen’s Family Inequality blog.

How many people should die waiting for citizenship? 319,462?
Philip Cohen | Family Inequality Blog
April 24, 2013

This is a life-table exercise, taking the current age distribution of the undocumented population in the US and applying a life-table for Hispanics to the numbers. He describes his assumptions and invites folks to re-calibrate the numbers.

Note that Cohen takes a dig at Reinhart and Rogoff [previous PSC Infoblog entry] by making his spreadsheet available. And, he notes “If you don’t like the way Excel does the maths, by all means, fix it in R.”
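
For readers who would rather not use Excel or R, the logic of the life-table exercise can be sketched in a few lines of Python. This is not Cohen’s spreadsheet: the age distribution and the mortality schedule below are made-up placeholders, and the only idea carried over is that each age group is aged forward one year at a time over the wait period while deaths are accumulated.

```python
def expected_deaths(population_by_age, death_prob, wait_years=13):
    """Expected deaths over `wait_years`, aging each cohort one year at a time.

    population_by_age: dict mapping starting age -> number of people
    death_prob: function giving the one-year probability of dying at an age
    """
    total = 0.0
    for age, count in population_by_age.items():
        alive = float(count)
        for k in range(wait_years):
            deaths = alive * death_prob(age + k)
            total += deaths
            alive -= deaths
    return total


def toy_death_prob(age):
    # Hypothetical mortality schedule, NOT a real life table for any group.
    return min(1.0, 0.0005 * 1.09 ** max(age - 20, 0))


# Hypothetical age distribution, NOT the Pew or Cohen figures.
toy_population = {25: 3_000_000, 35: 4_000_000, 45: 2_500_000, 55: 1_000_000}
print(round(expected_deaths(toy_population, toy_death_prob)))
```

Swapping in a published life table and the actual age distribution is the recalibration Cohen invites; the result is sensitive to both inputs and to the assumed wait time.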