Monthly Archive for January, 2013

Data Privacy: Some Articles on Failed Anonymization

January 28th is Data Privacy Day, so this post provides a few articles and news reports on failed anonymization, as well as a guide to HIPAA audits concerning publicly identifiable health information.

HIPAA Audit Tips – Know What De-Identification of PHI Really Means
January 28, 2013

The ‘Re-Identification’ of Governor William Weld’s Medical Information: A Critical Re-Examination of Health Data Identification Risks and Privacy Protections, Then and Now
Daniel C. Barth-Jones | Social Science Research Network
June 4, 2012

Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization
Paul Ohm | UCLA Law Review
August 2009

Identifying Personal Genomes by Surname Inference
Melissa Gymrek, Amy L. McGuire, David Golan, Eran Halperin and Yaniv Erlich | Science
January 18, 2013

The Complexities of Genomic Identifiability
Laura L. Rodriguez, Lisa D. Brooks, Judith H. Greenberg and Eric D. Green | Science
January 18, 2013

And some articles that explain the issues to the educated public. The first article has a reference to UM football. See if you can find it:

Emperor’s New Short Tandem Repeats
John Wilbanks | del-fi.org (personal website)
January 17, 2013

Scientists Demonstrate how Hackers can unlock your Genetic Secrets
Alan Boyle, Science Editor | NBC News
January 17, 2013

Big Data Reveals Job Change

Task Specialization in U.S. Cities from 1880-2000
Guy Michaels, Ferdinand Rauch, Stephen J. Redding | NBER Working Paper 18715
January 2013

In this study, economists Guy Michaels, Ferdinand Rauch, and Stephen J. Redding analyze the verbs used to describe jobs in the U.S. Dictionary of Occupational Titles over a 120-year period. They do this by geographic area, correlating their findings with the spread of telephone service and transportation networks. They discover “a systematic reallocation of employment over time towards interactive occupations, which involve tasks described by verbs that appear in thesaurus categories concerned with thought, communication and inter-social activity.”

Tip from @TrendCop via Twitter

An Exercise in Inefficiency: Sequestration/Threat of Sequestration

This is a collection of recent news articles on sequestration. For the most part, the articles concern NIH funding, although a few discuss all federal agencies. An earlier post links to agency-by-agency cuts as well as UM-specific information.

Sequestration cuts no longer the ‘bad policy’ bogeyman for Congress
Jeremy Herb | The Hill
January 29, 2013
This news source focuses on Congress and the Federal government. It concludes “With the sequestration deadline a little over four weeks away, there appears to be little momentum in Congress or the White House to stop the cuts.”

Threats of automatic cuts costly to federal agencies
Lisa Rein | Washington Post
January 27, 2013

Paul Ryan Insists Republicans are ready to let the Sequester Happen
Suzy Khimm | Wonk Blog (Washington Post)
January 27, 2013

Ryan: No Sequestration had Romney and I Won
Pema Levy | Talking Points Memo
January 27, 2013

Sequestration means mass furloughs in April
Stephen Losey | Federal Times
January 25, 2013

NIH Director Francis Collins: Medical Research at Risk
Paige Winfield Cunningham | Politico
January 16, 2013

Sequestering Science
Michael D. Purugganan | Huffington Post
January 16, 2013

Call for Papers: Epidemiologic Reviews

Epidemiologic Reviews is a sister publication of American Journal of Epidemiology and publishes critical reviews on specific themes once a year. The theme in 2014 will be Women’s Health and manuscript submissions are being solicited.

More information can be found here.

For Americans Under 50, Stark Findings on Health

By: Sabrina Tavernise
Source: New York Times

From article:

Younger Americans die earlier and live in poorer health than their counterparts in other developed countries, with far higher rates of death from guns, car accidents and drug addiction, according to a new analysis of health and longevity in the United States.

Researchers have known for some time that the United States fares poorly in comparison with other rich countries, a trend established in the 1980s. But most studies have focused on older ages, when the majority of people die.

This article is based on U.S. Health in International Perspective: Shorter Lives, Poorer Health from the Institute of Medicine and the National Research Council. The pre-publication edition is available to read online for free here.

An interactive graph comparing the United States and 16 “peer” countries is here and the project website is here.

Big Data: Google Flu

Google Flu Trends uses aggregated Google search data to estimate current flu activity in near real time; the estimates can be compared with counts of confirmed cases from CDC epidemiologists. Bob Groves mentioned this site in his 50th Anniversary talk at PSC as an example of “wild” data, which can be merged with or compared to data collected via traditional methods. [See this post for more examples of wild data in social science research.]

Google Flu Trends | United States
This site allows one to see trends over time for the US, e.g., how this year compares to the pattern in previous years. One can also get reports for specific states or metro areas. Note that this is the methodology used in an Economics dissertation on the underperformance of Obama in selected states.

Google Flu Trends | World
This shows the same results, but based on the entire world. The first thing that is clear is the striking difference in flu trends between the Northern and Southern hemispheres (totally expected). One can select a specific country, including the United States, and examine the flu trends for that country.

Google Dengue Trends | World
This shows the results of an aggregated search for dengue fever. The first thing that is apparent is that dengue fever is not a search that folks in the US or England make, or at least not often enough to register.

250,000 Social Media Users in U.S. Said They Got the Flu
Chris Taylor | Mashable
January 16, 2013
If you mentioned you had the flu on either Twitter or Facebook, your post got analyzed by Crimson Hexagon, a firm that does sentiment analysis.

Below are several articles that describe the methodology and usefulness of these big data techniques for disease surveillance purposes:

Detecting influenza epidemics using search engine query data
Ginsberg, Jeremy, et al. | Nature
February 2009

Using Web Search Query Data to Monitor Dengue Epidemics: A New Model for Neglected Tropical Disease Surveillance
Chan, Emily, et al. | PLoS ONE
May 2011
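
For readers who want a feel for the idea behind these papers, here is a minimal sketch, not a reproduction of either paper's method. It assumes one fits a straight line between the log-odds of the share of searches that are flu-related and the log-odds of the share of doctor visits for flu-like illness, then uses that line to estimate flu activity from new search data. The weekly numbers and the function names (fit_line, estimate_ili) are hypothetical and for illustration only.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: maps a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical weekly data (made up for this sketch):
# share of all searches that are flu-related, and
# share of doctor visits that are for flu-like illness.
query_share = [0.002, 0.003, 0.005, 0.008, 0.006]
ili_share = [0.010, 0.014, 0.022, 0.035, 0.027]

intercept, slope = fit_line(
    [logit(q) for q in query_share],
    [logit(p) for p in ili_share],
)


def estimate_ili(new_query_share):
    """Estimate the flu-like-illness visit share from a new search share."""
    return inv_logit(intercept + slope * logit(new_query_share))


print(round(estimate_ili(0.004), 4))
```

The appeal of this kind of model is speed: search shares are available almost immediately, while surveillance counts arrive with a lag.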

The following articles are nice examples to use in a class, illustrating the concept, without going into the details of the above articles:

Unless You Live in Takoma Park, Beverly Hills, or Reno, You’re Probably Going to Get the Flu
Henry Grabar | The Atlantic Cities
January 10, 2013

The Year’s Flu Season is the Worst in a Long Time, Google GIF Edition
Alexis Madrigal | The Atlantic
January 9, 2013

How to calculate life expectancy & why it matters

Social Security: It’s Worse Than You Think
Gary King and Samir Soneji | New York Times
January 5, 2013

This Opinion piece in the Sunday Times is a summary of a Demography article where the authors argue that the Social Security Administration underestimates how long Americans live. That means the trust fund will run out two years earlier than the government has predicted.

Here’s the link to the original article in Demography:

Statistical Security for Social Security
Samir Soneji and Gary King | Demography
August 2012
[html] | [pdf]

This article has quite a few references to former PSC post-doc, John Wilmoth, who has also written on mortality projections and life expectancy.

Finally, for one more set of mortality estimates and methodology, see the Census Bureau’s National Population Projections web page. They have always projected higher life expectancy than the Social Security Administration, but their methodology is not the same as these authors’:

2012 National Population Projections
Census Bureau
December 12, 2012
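
As a companion to the "how to calculate" part of this post's title, here is a minimal sketch of a textbook period life table. It is not the method used by Soneji and King, the Social Security Administration, or the Census Bureau, all of whom forecast future mortality rather than simply summarizing current rates. The sketch assumes deaths fall, on average, at the midpoint of each age interval and that the last age group is open-ended; the function name and the input rates are hypothetical.

```python
def life_expectancy_at_birth(death_rates, interval=1.0):
    """Period life expectancy at birth from age-specific death rates.

    death_rates: deaths per person-year for each age group, youngest first.
                 The final entry is treated as an open-ended age group.
    interval:    width of each closed age group in years.

    Assumption: deaths occur, on average, halfway through each interval.
    """
    if not death_rates or death_rates[-1] <= 0:
        raise ValueError("the final (open-ended) death rate must be positive")

    survivors = 1.0      # share of the birth cohort still alive
    person_years = 0.0   # total years lived by the cohort, per person born

    for rate in death_rates[:-1]:
        # Convert the death rate into a probability of dying in the interval.
        prob_dying = interval * rate / (1.0 + 0.5 * interval * rate)
        prob_dying = min(prob_dying, 1.0)
        deaths = survivors * prob_dying
        # Survivors live the whole interval; those who die live half of it.
        person_years += interval * (survivors - deaths) + 0.5 * interval * deaths
        survivors -= deaths

    # Open-ended last group: remaining years lived = survivors / death rate.
    person_years += survivors / death_rates[-1]
    return person_years


# Hypothetical example: a constant death rate of 0.02 at every age.
print(life_expectancy_at_birth([0.02] * 100))
```

With a constant death rate the answer is simply one divided by the rate, which makes a handy sanity check. The policy disagreement described above comes from the inputs: lower projected death rates at older ages mean more person-years lived, and therefore more years of benefits to pay.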