Author Archive for lisan

IRS Migration Data Report Tool

IRS map

This is a nice tool for getting net migration reports based on IRS tax return data. Note that because these data are based on tax returns, one can also tell whether, on average, a state is losing/gaining wealthier residents. One can generate reports for counties by state or for states. The former is really tedious because one has to generate the county reports one by one.

Tool Link
Counties | States

And here’s the link to raw data for those who find widgets tedious. Note that the site has nice explanations for the methodology, including changes over time in how these files are created: SOI Tax Stats – Migration Data
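For readers working from the raw files rather than the widget, here is a minimal Python sketch of the net-migration arithmetic: inflow minus outflow, plus average income per return in each direction, which is what hints at whether a state is gaining or losing wealthier residents. The `Flow` record, its field names, and all the numbers are invented for illustration; the actual SOI files have their own layout, which the methodology notes on the site describe.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    """One origin-to-destination migration flow (hypothetical layout)."""
    origin: str
    dest: str
    returns: int   # number of tax returns that moved
    agi: float     # their total adjusted gross income, in $1,000s


def net_migration(flows, state):
    """Summarize in- and out-migration for one state."""
    inflow = [f for f in flows if f.dest == state and f.origin != state]
    outflow = [f for f in flows if f.origin == state and f.dest != state]

    in_returns = sum(f.returns for f in inflow)
    out_returns = sum(f.returns for f in outflow)
    in_agi = sum(f.agi for f in inflow)
    out_agi = sum(f.agi for f in outflow)

    return {
        "net_returns": in_returns - out_returns,
        "net_agi": in_agi - out_agi,
        # Average AGI per return; None when there were no movers.
        "avg_agi_in": in_agi / in_returns if in_returns else None,
        "avg_agi_out": out_agi / out_returns if out_returns else None,
    }


# Invented numbers for illustration only, not IRS data.
EXAMPLE_FLOWS = [
    Flow("MI", "FL", returns=100, agi=8000.0),
    Flow("FL", "MI", returns=60, agi=3000.0),
    Flow("OH", "MI", returns=40, agi=2000.0),
]

if __name__ == "__main__":
    print(net_migration(EXAMPLE_FLOWS, "MI"))
```

In this made-up example the state's net count of returns is zero, yet the households leaving average more income per return than those arriving, so a head count alone would miss the shift.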

And, do you want to know how to make something like the map above? Here’s a link from Flowing Data on how to make a similar map based on five years of county-to-county IRS data:
Article | How To Guide

Creating Residential Histories

This is a report from the NCI/SEER web portal on a way to create residential histories of respondents/decedents for epidemiological research. The report (below) details how three commercial vendors were able to match the residential history of a small sample of federal government employees. Also available are the algorithms and software to reconcile conflicting addresses. Interested folks might want to browse other tools/papers on the NCI Geographical Information Systems and Science for Cancer Control website: https://gis.cancer.gov/index.html

NCI/SEER Residential History Project
David Stinchcomb and Allison Roeser | Westat
May 2016
[pdf]

SAS residential history generation programs [3 programs]
[Summary] [Link to programs]

Are secondary data users research parasites?

Even though NIH and NSF both have data sharing requirements, there is clearly some resistance to it. The best example is an editorial from the New England Journal of Medicine. Secondary data users are characterized as “research parasites.”

A rebuttal comes from a Science editorial with the title #IAmAResearchParasite.

Data Sharing
Dan L. Longo and Jeffrey Drazen | N Engl J Med
January 21, 2016

#IAmAResearchParasite
Marcia McNutt | Science
March 4, 2016

Current reproductive trends via pregnancy rates

The drop in birth rates from 2007 through 2013 has been well documented. However, it is also important to examine total rates of pregnancy and other pregnancy outcomes (abortion and fetal loss) to provide a comprehensive picture of current reproductive trends. This NCHS Health E-Stat uses data from 2010 to update a previous NCHS report on pregnancy rates. Data on pregnancy outcomes by age and race and Hispanic origin are presented.

2010 Pregnancy Rates Among U.S. Women
Sally C. Curtin, Joyce Abma [NCHS] and Kathryn Kost [Guttmacher Institute]
December 2015
html | pdf

Graph of Race x Age specific pregnancy rates over time

Demographer vs Demographer

Monday’s Supreme Court case centered on data. The plaintiffs in Evenwel v. Abbott argue that representation in Texas legislative districts ought to be based on voters rather than the total population. Currently, most states use total population for redistricting purposes, and this comes from the decennial census. The decennial census does not have a citizenship question, but the replacement for the census long form, the American Community Survey (ACS), does.

The former directors of the Census Bureau filed an amicus brief against the idea of using the eligible voter population (e.g., citizens 18+ years of age). A group of applied demographers also filed an amicus brief, noting that this was quite possible using the ACS. Note that Sonia Sotomayor does not think the ACS is adequate, but that is because she misunderstands the data:

tweet

As is typical with cases involving data and social science research, there are lots of supplementary links:

The Washington Post [10 or so opinions from the Opinion | In Theory section]
‘One Person One Vote’: A Primer
Washington Post | Opinion : In Theory
October 2015
[10 or so opinions and comments]

Argument preview: How to measure “one person, one vote”
Lyle Denniston | SCOTUSblog
December 1, 2015

The Threat to Representation for Children and Non-Citizens: An Analysis of the Potential Impact of Evenwel v. Abbott on Redistricting
Andrew Beveridge | Social Explorer
December 2, 2015

Supreme Court is skeptical of challenge to Texas district lines
Maria Recio | Sacramento Bee
December 8, 2015
This is the source of the Sotomayor quote

“. . . Dueling Affirmative Action Empiricism” [this is actually from Fisher v. Texas, but is included here as evidence of the Supreme Court using social science research.]

Big changes coming to IRBs . . . . still time to comment

OHRP has released its notice of proposed rulemaking, which makes significant changes to the Common Rule.

Federal Register: Federal Policy for the Protection of Human Subjects

[Comment Link]
Comments are accepted up until 12/07/2015 at 11:59 PM EST

If you need to get up to speed, the National Academy of Sciences published a book in 2014 on the first release of changes to the Common Rule. It is available online, as a PDF, or as a printed book.

Will this be an election issue in the US?

The Canadian election campaign period is much shorter than that of the US. The Canadian election will take place on October 19, 2015, and the campaigning started on August 2nd of this year.

Another difference with the US is the types of issues that candidates are discussing – specifically science policy and the long-form census. Will these be issues in the US? Doubtful, but let’s watch the debates and see.

Below is recent coverage in the Canadian press about the long-form census and science policy being issues, at least among the NDP and Liberals:

Reviving the Census Debate
Donovan Vincent | The Star
September 12, 2015
The Liberals and the NDP have said they want to bring back the long-form census the Conservatives killed in 2010. Could it become an election issue?

Researchers try to make science a federal election issue
Julie Ireton | CBC News
September 3, 2015

Here is a running list of organizations that were against/in favor of the Harper government’s cancellation of the mandatory long-form census.

Here is previous coverage in this blog about Canada’s war on science and follies with their census.

Police calls and blurry neighborhood boundaries

Here’s a great piece using a mix of administrative data (complaint calls to the police), on-line forums, spatial data, and traditional census data to see what happens in the transition zones across neighborhoods. The first link is to the easy-to-read version as reported in CityLab; the second is the original piece, with more details about the methodology.

When Racial Boundaries Are Blurry, Neighbors Take Complaints Straight to 311
Laura Bliss | CityLab
August 25, 2015
In NYC, calls about noise and blocked driveways are most frequent in zones between racially homogenous neighborhoods.

Contested Boundaries: Explaining Where Ethno-Racial Diversity Provokes Neighborhood Conflict
Joscha Legewie and Merlin Schaeffer | Presented at the American Sociological Meetings
August 21, 2015

Apple ResearchKit: New Frontiers in Data Collection & Informed Consent

The Apple ResearchKit allows researchers to develop an iPhone app, which interested respondents can download from the App Store. The respondent goes through an online consent form and then responds to questions, completes tasks (walking), etc. Some of the diagnostic tools are based on previously developed apps from Apple HealthKit.

As of now, apps have been developed for collecting data for research projects on asthma; cardiovascular disease; diabetes; Parkinson’s; mind, body, and wellness after breast cancer; and, for a population-based study, the LGBTQ population.

Here is a description of the informed consent process for these iPhone apps:
Participant-Centered Consent Toolkit

Listed below are a few press releases associated with the Pride Study, the population-based study of the gay population. Following those posts are some more general critiques of this way of gathering data. The post from The Verge is probably the most critical, raising issues of “on the internet no one knows you are a dog” and gaming the consent process (lying about eligibility for the study). On the plus side, participants are going to be easier to sign up, and the pool won’t be limited to those who live close to research hospitals. Here is an excerpt from Business Insider on the reaction to the app launch for the Stanford heart study:

It’s really incredible … in the first 24 hours of research kit we’ve had 11,000 people sign up for a study in cardiovascular disease through Stanford University’s app. And, to put that in perspective – Stanford has told us that it would have taken normally 50 medical centers an entire year to sign up that many participants. So, this is – research kit is an absolute game changer.

The participant pool is limited to iPhone users (there is no Android version of these apps), although some studies, such as the Pride Study, will have a web interface.

Launch of the Pride Study
UCSF Researchers Launch Landmark Study of LGBTQ Community Health
Jyoti Madhusoodanan | UCSF Press Release
June 25, 2015

A big LGBT health study is coming to the iPhone
Stephanie M. Lee | BuzzFeed
June 25, 2015

How The iPhone Is Powering A Massive LGBT Health Study
Kif Leswing | International Business Times
June 25, 2015

Critiques of the Apple ResearchKit
Apple’s new ResearchKit: ‘Ethics quagmire’ or medical research aid?
Arielle Duhaime-Ross | The Verge
March 10, 2015

In-Depth: Apple ResearchKit concerns, potential, analysis
mobilehealthnews
March 9, 2015

What’s the Matter with Polling?

What is the Matter with Polling?
Cliff Zukin | New York Times
June 20, 2015

This article focuses on political polling – and predictions from political polls, but much of the content is relevant to other sorts of telephone-based opinion surveys, many of which are used by social scientists: Survey of Consumers, Pew, Gallup, etc.

The article focuses on (a) the move from landlines to cellphones; (b) the growing non-response rate; (c) costs; and (d) sample metrics, e.g., representativeness.

The decline in landline phones makes telephone surveys more expensive since cell phones cannot be reached through automatic dialers. The landline phone vs cellphone distribution comes from the National Health Interview Survey. Here’s a recent summary of the data. The article summarizes this as “About 10 years ago. . . . about 6 percent of the public used only cellphones. The N.H.I.S. estimate for the first half of 2014 found that this had grown to 43 percent, with another 17 percent “mostly” using cellphones. In other words, a landline-only sample conducted for the 2014 elections would miss about three-fifths of the American public, almost three times as many as it would have missed in 2008.”
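The “about three-fifths” figure is simply the two N.H.I.S. shares quoted above added together. Here is a small Python check; the function name is made up for this post, and it follows the article in treating the “mostly” cellphone group as out of reach of a landline-only sample.

```python
from fractions import Fraction


def landline_only_miss_share(cell_only_pct, cell_mostly_pct):
    """Share of the public a landline-only sample would miss.

    Assumes, as the article does, that both cell-only and cell-mostly
    people are effectively out of reach of a landline-only sample.
    """
    return Fraction(cell_only_pct + cell_mostly_pct, 100)


# Percentages quoted in the article for the first half of 2014.
missed = landline_only_miss_share(43, 17)

if __name__ == "__main__":
    print(missed)  # prints 3/5
```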

The other issue for polling is the growing non-response rate.

When I first started doing telephone surveys in New Jersey in the late 1970s, we considered an 80 percent response rate acceptable, and even then we worried if the 20 percent we missed were different in attitudes and behaviors than the 80 percent we got. Enter answering machines and other technologies. By 1997, Pew’s response rate was 36 percent, and the decline has accelerated. By 2014 the response rate had fallen to 8 percent.
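To see what those response rates mean for fieldwork, here is a rough Python sketch of how many numbers would have to be dialed to reach a fixed number of completed interviews. It assumes every dialed number is a working, eligible household, which real samples never achieve, so the counts are a lower bound. The function name and the 1,000-interview target are arbitrary choices for illustration.

```python
def numbers_needed(target_completes, response_rate_pct):
    """Minimum numbers to dial to expect `target_completes` interviews.

    `response_rate_pct` is a whole-number percentage (e.g. 36 for 36%).
    Integer ceiling division avoids floating-point rounding surprises.
    """
    if not 0 < response_rate_pct <= 100:
        raise ValueError("response_rate_pct must be in (0, 100]")
    return (target_completes * 100 + response_rate_pct - 1) // response_rate_pct


if __name__ == "__main__":
    # Response rates cited in the article: late 1970s, 1997, 2014.
    for year, rate in [("late 1970s", 80), ("1997", 36), ("2014", 8)]:
        print(year, numbers_needed(1000, rate))
```

Under these simplified assumptions, going from an 80 percent to an 8 percent response rate multiplies the dialing effort tenfold.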

Non-response makes surveys more expensive: there are more numbers to call to find a respondent, and many of them must be dialed by hand if it is a cellphone universe. Most important is the representativeness of the sample that the survey ends up with. So far, surveys based on probability samples seem to still be representative, at least based on comparing sample characteristics to gold-standard benchmarks like the American Community Survey (ACS). Participation in the ACS is mandatory, although for the last several years, Republicans in the House have tried to remove this requirement. Canada did away with the mandatory requirement for its census, with disastrous results. The following is a compilation of posts related to the mandatory response requirement in the US and Canada: [Older Posts]