Archive for the 'Methodology' Category

Apple ResearchKit: New Frontiers in Data Collection & Informed Consent

Apple ResearchKit allows researchers to develop an iPhone app, which interested respondents can download from the App Store. The respondent goes through an online consent form and then responds to questions, performs tasks (e.g., a timed walk), and so on. Some of the diagnostic tools are based on previously developed apps from Apple HealthKit.

As of now, apps have been developed to collect data for research projects on asthma, cardiovascular disease, diabetes, Parkinson’s disease, and mind, body, and wellness after breast cancer, as well as for a population-based study of the LGBTQ population.

Here is a description of the informed consent process for these iPhone apps:
Participant-Centered Consent Toolkit

Listed below are a few press releases associated with the Pride Study – the population-based study of the LGBTQ population. Following those posts are some more general critiques of this way of gathering data. The post from The Verge is probably the most critical, raising issues of “on the internet, no one knows you are a dog” and of gaming the consent process (lying about eligibility for the study). On the plus side, participants will be easier to sign up, and the pool won’t be limited to those who live close to research hospitals. Here is an excerpt from Business Insider on the reaction to the app launch for the Stanford Heart study:

It’s really incredible … in the first 24 hours of research kit we’ve had 11,000 people sign up for a study in cardiovascular disease through Stanford University’s app. And, to put that in perspective – Stanford has told us that it would have taken normally 50 medical centers an entire year to sign up that many participants. So, this is – research kit is an absolute game changer.

The participant pool is limited to iPhone users (there is no Android version of these apps), although some studies will also have a web interface (e.g., the Pride Study).

Launch of the Pride Study
UCSF Researchers Launch Landmark Study of LGBTQ Community Health
Jyoti Madhusoodanan | UCSF Press Release
June 25, 2015

A big LGBT health study is coming to the iPhone
Stephanie M. Lee | BuzzFeed
June 25, 2015

How The iPhone Is Powering A Massive LGBT Health Study
Kif Leswing | International Business Times
June 25, 2015

Critiques of the Apple ResearchKit
Apple’s new ResearchKit: ‘Ethics quagmire’ or medical research aid?
Arielle Duhaime-Ross | The Verge
March 10, 2015

In-Depth: Apple ResearchKit concerns, potential, analysis
MobiHealthNews
March 9, 2015

What’s the Matter with Polling?

What’s the Matter with Polling?
Cliff Zukin | New York Times
June 20, 2015

This article focuses on political polling – and predictions from political polls – but much of the content is relevant to other sorts of telephone-based opinion surveys, many of which are used by social scientists: the Survey of Consumers, Pew, Gallup, etc.

The article focuses on (a) the move from landlines to cellphones; (b) the growing non-response rate; (c) costs; and (d) sample metrics, e.g., representativeness.

The decline in landline phones makes telephone surveys more expensive, since cell phones cannot be reached through automatic dialers. The landline-vs-cellphone distribution comes from the National Health Interview Survey (N.H.I.S.); here’s a recent summary of the data. The article summarizes it this way: “About 10 years ago . . . about 6 percent of the public used only cellphones. The N.H.I.S. estimate for the first half of 2014 found that this had grown to 43 percent, with another 17 percent ‘mostly’ using cellphones. In other words, a landline-only sample conducted for the 2014 elections would miss about three-fifths of the American public, almost three times as many as it would have missed in 2008.”
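The “three-fifths” figure follows directly from the two N.H.I.S. shares. Here is a quick arithmetic check; the 2008 figure is backed out of the quote, not taken from the N.H.I.S. report:

```python
# Quick check of the article's arithmetic, using the N.H.I.S. shares
# quoted above (first half of 2014).
cell_only = 0.43
mostly_cell = 0.17

missed = cell_only + mostly_cell  # a landline-only sample misses both groups
print(f"missed in 2014: {missed:.0%}")  # 60% -- about three-fifths

# "Almost three times as many as in 2008" implies a miss of roughly 20% then.
print(f"implied 2008 miss: {missed / 3:.0%}")
```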

The other issue for polling is the growing non-response rate.

When I first started doing telephone surveys in New Jersey in the late 1970s, we considered an 80 percent response rate acceptable, and even then we worried if the 20 percent we missed were different in attitudes and behaviors than the 80 percent we got. Enter answering machines and other technologies. By 1997, Pew’s response rate was 36 percent, and the decline has accelerated. By 2014 the response rate had fallen to 8 percent.

Non-response makes surveys more expensive – more numbers must be dialed to find a respondent, and many of them dialed by hand if it is a cellphone universe. And, most important, there is the question of the representativeness of the sample the survey ends up with. So far, surveys based on probability samples still seem to be representative, at least when sample characteristics are compared to gold-standard benchmarks like the American Community Survey (ACS). Participation in the ACS is mandatory, although for the last several years Republicans in the House have tried to remove this requirement. Canada did away with the mandatory requirement for its census, with disastrous results. The following is a compilation of posts related to the mandatory response requirement in the US and Canada: [Older Posts]
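To see how the falling response rate feeds into cost, here is a back-of-the-envelope sketch using the rates quoted above (the years are approximate). The crude model here, numbers dialed = completes / response rate, ignores non-working numbers, eligibility screening, and callbacks, so real effort is higher still:

```python
# Dialing effort implied by the response rates quoted in the article.
target_completes = 1000

for year, rate in [(1979, 0.80), (1997, 0.36), (2014, 0.08)]:
    dials = target_completes / rate
    print(f"{year}: rate {rate:.0%} -> about {dials:,.0f} numbers dialed")
# 1979: 1,250; 1997: ~2,778; 2014: 12,500 -- a tenfold increase over 1979.
```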

Measuring Race . . . Again

The following is a collection of news stories on how the Census Bureau is planning to collect data on race. It is misleading to say that the Census Bureau will not collect data on race. Instead of asking separately about Hispanic origin and race, the Census Bureau is likely to ask about “categories” that describe the person.

And, a new category might be “Middle Eastern or North African.”

The Census Bureau collects data on all sorts of topics, but the Office of Management and Budget (OMB) makes the final call on how concepts like race are measured across the federal statistical system. Links to the Census Bureau’s submission to OMB and a report based on internal research follow a nice summary by Pew.

Census considers new approach to asking about race – by not using the term at all
D’Vera Cohn | Pew Research Center
June 18, 2015

2010 Census Race and Hispanic Origin Alternative Questionnaire Experiment
from the 2010 Census Program for Evaluations and Experiments
Feb 28, 2013

National Content Test
Submission for OMB Review | Federal Register
May 22, 2015

Backcasting Native Hawaiian Population

The Pew Research Center Fact Tank examines findings by David Swanson, who uses 1910 and 1920 Census data to estimate the population of Hawaii in 1778, the year Capt. James Cook arrived.

In this case, Swanson took a detailed look at the 1910 and 1920 U.S. Census’s Native Hawaiian counts, tracking the survival rate of each five-year age group from one census to the next. For example, he looked at how many children who were newborns to age 4 in 1910 were counted as 10- to 14-year-olds in 1920, then did the same for each successive age group. For each group, he created a “reverse cohort change ratio,” which he used to go back in time and estimate the size of each age group for each decade until he got to 1770.
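The mechanics of the method are simple enough to sketch. Below is a minimal illustration with made-up counts; Swanson’s actual work involves more adjustments, and the key assumption in the reverse step is that each cohort’s change ratio holds constant across earlier decades:

```python
# Reverse cohort-change-ratio backcasting, with made-up counts for a few
# 5-year age groups (not real census data).
pop_1910 = {0: 1200, 5: 1100, 10: 1000, 15: 950}
pop_1920 = {0: 1150, 5: 1050, 10: 980, 15: 900}

# Forward ratio: the cohort aged a..a+4 in 1910 is aged a+10..a+14 in 1920.
ccr = {a: pop_1920[a + 10] / pop_1910[a] for a in (0, 5)}

# Reverse step: those aged a+10..a+14 in 1910 were aged a..a+4 in 1900,
# so divide by the ratio to walk the age structure back one decade.
pop_1900 = {a: pop_1910[a + 10] / ccr[a] for a in (0, 5)}
print({a: round(n) for a, n in pop_1900.items()})

# Repeating the reverse step decade by decade carries the estimates back
# toward the 18th century.
```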

The article also reports on the growth of the Native Hawaiian population since the 1980s.

App vs. Web for Surveys

The Pew Research Center has been experimenting with mobile apps for “signal-contingent experience sampling” to gather data about how Americans use their smartphones. They have just released a report examining the possibilities of this method:

This report utilizes a form of survey known as “signal-contingent experience sampling” to gather data about how Americans use their smartphones on a day-to-day basis. Respondents were asked to complete two surveys per day for one week (using either a mobile app they had installed on their phone or by completing a web survey) and describe how they had used their phone in the hour prior to taking the survey. This report examines whether this type of intensive data collection is possible with a probability-based panel and to understand the differences in participation and responses when using a smartphone app as opposed to a web browser for this type of study.
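As a concrete illustration of what “signal-contingent” means in practice, here is a hypothetical prompt schedule: two randomly timed signals per day within fixed waking-hour windows. The windows and randomization here are my assumptions; Pew’s report describes its own scheme:

```python
import random
from datetime import date, datetime, time, timedelta

def prompt_schedule(start: date, days: int = 7, seed: int = 1):
    """Two random prompts per day: one morning window, one evening window."""
    rng = random.Random(seed)
    windows = [(time(9, 0), time(13, 0)), (time(17, 0), time(21, 0))]
    prompts = []
    for d in range(days):
        day = start + timedelta(days=d)
        for lo, hi in windows:
            minutes = (hi.hour - lo.hour) * 60          # window length
            offset = timedelta(minutes=rng.randrange(minutes))
            prompts.append(datetime.combine(day, lo) + offset)
    return prompts

# Each signal would trigger the short survey about phone use in the prior hour.
for p in prompt_schedule(date(2015, 6, 1)):
    print(p.strftime("%a %Y-%m-%d %H:%M"))
```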

PAA President Ruggles wants you to write a letter

This is an excellent summary of the consequences of the demise of the 3-year ACS tabular products. Please follow through and contact the relevant government officials:

ACS 3-Year Summary Products: Please take action to save the ACS 3-year data products
Steve Ruggles | PAA President and Director of the Minnesota Population Center
March 4, 2015

Another take on gentrification

Gentrification in America Report
Mike Maciag | Governing
February 2015
This resource is city-specific and provides both counts and maps of gentrified census tracts for the 50 largest cities. To be eligible for gentrification, a census tract’s median household income and median home value both had to fall in the bottom 40th percentile of all tracts within its metro area at the beginning of the decade. Tracts were then counted as gentrified if, for both measures, their increases ranked in the top third of all tracts in the metro area.

Methodology
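The two screens translate directly into code. Here is a minimal sketch against a toy tract table; the column names are invented, and Governing’s full methodology (population thresholds, inflation adjustment) is linked above:

```python
import pandas as pd

# Toy tract table; values are illustrative, not real census data.
tracts = pd.DataFrame({
    "metro": ["A"] * 4,
    "income_2000": [28000, 31000, 52000, 30000],
    "home_value_2000": [90000, 95000, 210000, 88000],
    "income_change": [12000, 2000, 5000, 15000],
    "home_value_change": [80000, 5000, 10000, 95000],
})

g = tracts.groupby("metro")
# Eligible: both starting measures in the bottom 40th percentile of the metro.
eligible = (
    (tracts["income_2000"] <= g["income_2000"].transform(lambda s: s.quantile(0.4)))
    & (tracts["home_value_2000"] <= g["home_value_2000"].transform(lambda s: s.quantile(0.4)))
)
# Gentrified: eligible, with both increases in the top third of the metro.
gentrified = eligible & (
    (tracts["income_change"] >= g["income_change"].transform(lambda s: s.quantile(2 / 3)))
    & (tracts["home_value_change"] >= g["home_value_change"].transform(lambda s: s.quantile(2 / 3)))
)
print(tracts.assign(gentrified=gentrified))
```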

And more broadly, Governing has a special series on gentrification:

The G-Word: A Special Series on Gentrification
The titles in this series are:
Do Cities Need Kids?
The Neighborhood Has Gentrified, But Where’s the Grocery Store?
Just Green Enough
Gentrification’s Not So Black and White After All
The Downsides of a Neighborhood ‘Turnaround’
Some Cities Are Spurring the End of Sprawl
Keeping Cities from Becoming “Child-Free Zones”
From Vacant to Vibrant: Cincinnati’s Urban Transformation
Can Cities Change the Face of Biking?

The Hedonometer Index

This is an index of happiness created from tweets. The index provides a daily score, which can be toggled to exclude weekends, Mondays, etc.

Hedonometer Index

This is an excellent resource because the creators of this happiness index describe the calculation of the index and the words used in it, provide an API, link to articles based on the index, etc. It is valuable even if you do not care about happiness, as it provides a template for many other uses of data from Twitter.

Instructions [Documentation of the index via video or written guide – click on the links]
Words [Words used in index, ranks, etc.]
Blog [The Computational Story Lab. . . mostly related to happiness]
Press [press coverage]
Papers [refereed papers by research team]
Talks [maybe you need a clip for a lecture]
API [lots of examples]
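For a sense of the calculation the site documents: the score is essentially a frequency-weighted average of crowd-rated word happiness values, with a “lens” that drops words in the neutral middle of the scale. A minimal sketch, with a toy word list standing in for the roughly 10,000-word labMT set:

```python
# Hedonometer-style score: average the happiness ratings of the scored
# words in a text, ignoring words inside the neutral "lens" band.
# The tiny dictionary below is illustrative; real ratings run 1-9.
happs = {"love": 8.4, "happy": 8.3, "the": 4.98, "no": 3.48, "terrible": 1.9}

def happiness(text: str, lens=(4.0, 6.0)) -> float:
    words = text.lower().split()
    scored = [happs[w] for w in words
              if w in happs and not (lens[0] <= happs[w] <= lens[1])]
    return sum(scored) / len(scored) if scored else float("nan")

print(happiness("no terrible news today love the happy crowd"))  # 5.52
```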

Move over Index of Consumer Sentiment/Expectations?

I ran across this in the Wall Street Journal (slide 58 of 93):

Can happiness from tweets reduce drawdowns from selling VIX?

Selling VIX futures has been profitable historically. However, the strategy can be subject to drawdowns, when there is risk aversion . . . . Using the Hedonometer index as an input, we have created a Happiness Sentiment Index (HSI), which can be used to proxy market risk sentiment. . . .

HSI index
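The slide does not spell out how the HSI gates the trade, so the following is purely a toy construction of the general idea: compute a rolling z-score of daily happiness and step out of the short-VIX position when sentiment drops sharply. Everything here, including the simulated data, is my own assumption, not the bank’s methodology:

```python
import numpy as np
import pandas as pd

# Simulated daily happiness scores and short-VIX P&L (illustrative only).
rng = np.random.default_rng(0)
dates = pd.date_range("2014-01-01", periods=500, freq="B")
happiness = pd.Series(6.0 + 0.05 * rng.standard_normal(500), index=dates)
short_vix_pnl = pd.Series(0.001 + 0.02 * rng.standard_normal(500), index=dates)

# Rolling z-score of sentiment; stand aside when it falls sharply.
z = (happiness - happiness.rolling(60).mean()) / happiness.rolling(60).std()
gated_pnl = short_vix_pnl.where(z > -1.0, 0.0)

print(pd.DataFrame({"always": short_vix_pnl.cumsum(),
                    "gated": gated_pnl.cumsum()}).tail())
```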

See the post above for more on the Hedonometer Index.

Data Demise: ACS 3-year product

The Census Bureau has released its last 3-year ACS product with the 2011-2013 release. This is a cost-cutting move, although the Census Bureau might argue that it never meant for there to be a 3-year product in the first place.

The Census Bureau is not cutting back on data collection – it is eliminating the tabular release of the 3-year data (geographic areas of 20,000+). The 1-year data cover geographies of 65,000+ and the 5-year data have no population limits; both will continue to be released.

The microdata products share the same release types: 1-year, 3-year, and 5-year. These all share the same geographic limit (PUMAs), but the 3-year and 5-year products are not just concatenations of the 1-year files: they have been re-weighted, and income-denominated items are inflated to the last year (e.g., 2013). [See explanatory note from IPUMS.]
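For users of the microdata, the inflation adjustment looks roughly like this. The ADJINC factor with six implied decimals follows the ACS PUMS convention, but the factor values below are illustrative, so check the data dictionary for your release:

```python
import pandas as pd

# Inflating reported incomes from each survey year to final-year dollars,
# ACS PUMS style: multiply by ADJINC and shift six implied decimal places.
pums = pd.DataFrame({
    "survey_year": [2011, 2012, 2013],
    "PINCP": [50000, 50000, 50000],          # reported personal income
    "ADJINC": [1042852, 1024887, 1007549],   # illustrative factors
})

pums["income_2013_dollars"] = pums["PINCP"] * pums["ADJINC"] / 1_000_000
print(pums)
```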

The ACS 3-year Demographic Estimates are History
Brendan Buff | APDU Blog post
Feb 3, 2015

Census Bureau Statement on American Community Survey 3-Year Statistical Product
Stanford University Libraries | Ron Nakao’s Blog