For Americans Under 50, Stark Findings on Health

By: Sabrina Tavernise
Source: New York Times

From article:

Younger Americans die earlier and live in poorer health than their counterparts in other developed countries, with far higher rates of death from guns, car accidents and drug addiction, according to a new analysis of health and longevity in the United States.

Researchers have known for some time that the United States fares poorly in comparison with other rich countries, a trend established in the 1980s. But most studies have focused on older ages, when the majority of people die.

This article is based on U.S. Health in International Perspective: Shorter Lives, Poorer Health from the Institute of Medicine and the National Research Council. The pre-publication edition is available to read online for free here.

An interactive graph comparing the United States and 16 “peer” countries is here and the project website is here.

Big Data: Google Flu

Google Flu Trends uses aggregated Google search data to estimate current flu activity in near real time, and its estimates can be compared with confirmed-case results from CDC epidemiologists. Bob Groves mentioned this site in his 50th Anniversary talk at PSC as an example of “wild” data, which can be merged with or compared to data collected via traditional methods. [See this post for more examples of wild data in social science research.]
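One simple way to compare a search-based estimate with traditional surveillance data is to correlate the two weekly series. The following is only a minimal sketch of that idea, not the method Google or the CDC uses. The function name `pearson_r` and the weekly values are made up for illustration and are not real flu data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly values, for illustration only (NOT real data).
search_estimate = [1.2, 1.5, 2.1, 3.4, 4.0, 3.1]
surveillance = [1.0, 1.4, 2.3, 3.0, 4.2, 3.3]

r = pearson_r(search_estimate, surveillance)
print(f"correlation between the two series: {r:.3f}")
```

A high correlation only says the two series move together; it does not show that the search-based series measures flu cases accurately.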

Google Flu Trends | United States
This site shows trends over time for the US, e.g., how this year compares to the pattern in previous years. One can also get reports for specific states or metro areas. Note that Google search data was also the basis of an Economics dissertation on the underperformance of Obama in selected states.

Google Flu Trends | World
This shows the same results, but for the entire world. The first thing that is clear is the striking (and totally expected) difference in flu trends between the Northern and Southern hemispheres. One can select a specific country, including the United States, and examine the flu trends for that country.

Google Dengue Trends | World
This shows the results of an aggregated search for dengue fever. The first thing that is apparent is that folks in the US or England do not search for dengue fever, or at least not often enough to register.

250,000 Social Media Users in U.S. Said They Got the Flu
Chris Taylor | Mashable
January 16, 2013
If you mentioned you had the flu on either Twitter or Facebook, your post got analyzed by Crimson Hexagon, a firm that does sentiment analysis.

Below are several articles that describe the methodology and usefulness of these big data techniques for disease surveillance purposes:

Detecting influenza epidemics using search engine query data
Ginsberg, Jeremy, et al. | Nature
February 2009

Using Web Search Query Data to Monitor Dengue Epidemics: A New Model for Neglected Tropical Disease Surveillance
Chan, Emily, et al. | PLoS ONE
May 2011

The following articles are nice examples to use in a class. They illustrate the concept without going into the details of the above articles:

Unless You Live in Takoma Park, Beverly Hills, or Reno, You’re Probably Going to Get the Flu
Henry Grabar | The Atlantic Cities
January 10, 2013

The Year’s Flu Season is the Worst in a Long Time, Google GIF Edition
Alexis Madrigal | The Atlantic
January 9, 2013

How to calculate life expectancy & why it matters

Social Security: It’s Worse Than You Think
Gary King and Samir Soneji | New York Times
January 5, 2013

This opinion piece in the Sunday Times summarizes a Demography article in which the authors argue that the Social Security Administration underestimates how long Americans live. That means the trust fund will run out two years earlier than the government has predicted.

Here’s the link to the original article in Demography:

Statistical Security for Social Security
Samir Soneji and Gary King | Demography
August 2012
[html] | [pdf]

This article has quite a few references to former PSC post-doc John Wilmoth, who has also written on mortality projections and life expectancy.

Finally, for one more set of mortality estimates and methodology, see the Census Bureau’s National Population Projections web page. The Census Bureau has always projected higher life expectancy than the Social Security Administration, but its methodology is not the same as these authors’:

2012 National Population Projections
Census Bureau
December 12, 2012
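As background for the title of this post, life expectancy at birth is conventionally computed from a life table. The following is only a minimal sketch under simplifying assumptions, not the method used by Soneji and King, the Social Security Administration, or the Census Bureau. It assumes equal-width age intervals, deaths spread evenly within each interval, and a final interval in which everyone remaining dies. The function name and the death probabilities are hypothetical.

```python
def life_expectancy_at_birth(death_probs, width=1.0):
    """Life expectancy at birth from a simplified life table.

    death_probs[i] is the probability that someone alive at the start of
    age interval i dies during it. The last value must be 1.0 so that the
    table is closed. Assumes deaths occur, on average, mid-interval.
    """
    if not death_probs or death_probs[-1] != 1.0:
        raise ValueError("the final interval must have death probability 1.0")
    survivors = 1.0      # proportion of the birth cohort still alive
    person_years = 0.0   # total years lived per person born
    for q in death_probs:
        if not 0.0 <= q <= 1.0:
            raise ValueError("death probabilities must lie in [0, 1]")
        next_survivors = survivors * (1.0 - q)
        # Survivors live the whole interval; those who die live half of it.
        person_years += width * (survivors + next_survivors) / 2.0
        survivors = next_survivors
    return person_years


# Hypothetical probabilities for four 25-year intervals (NOT real data).
example_probs = [0.02, 0.05, 0.30, 1.0]
e0 = life_expectancy_at_birth(example_probs, width=25.0)
print(f"illustrative life expectancy at birth: {e0:.1f} years")
```

Lower assumed death probabilities at any age raise the result, which is why different mortality assumptions lead to different projections.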

Is the NIH Funding Model Efficient?

Money and Science: To He that Hath
The Economist
December 8th, 2012
This analysis is based on the publication and funding records of the authors of the most highly cited biomedical papers and concludes that NIH may not support the best researchers. The link to the original article, in Nature, is provided below.

Research grants: Conform and be funded
Joshua M. Nicholson & John P.A. Ioannidis | Nature
December 5, 2012
Tag line: Too many US authors of the most innovative and influential papers in the life sciences do not receive NIH funding.

And, perhaps related to the above, NIH is considering anonymity for grant applicants:

NIH Considers Anonymity for Grant Applicants
Paul Basken | The Chronicle of Higher Education
December 10, 2012

And, be careful about unattributed text. Some federal agencies are using software to detect unattributed copying in research proposals. See below.

Plagiarism in Grant Proposals
Karen M. Markin | The Chronicle of Higher Education
December 10, 2012

Why Don’t Parents Name Their Daughters Mary Anymore?

Why Don’t Parents Name Their Daughters Mary Anymore?
Philip Cohen | The Atlantic
December 12, 2012

This article is by Philip Cohen, a professor at the University of Maryland. The Atlantic has picked up his blog, Family Inequality, where he posts short but scholarly snippets.

This piece illustrates the decline in the name Mary via the Social Security Administration’s names database. He posits that this is due to a rise in the cultural value of individuality. Accordingly, people value names that are not common, perhaps even unique. One repercussion is that there were only 21,695 baby girls named Sophia (the most popular name in 2011), whereas back in 1961 there were 47,655 girls named Mary.

Health at a Glance: Europe 2012

Source: OECD, Directorate for Employment, Labour and Social Affairs

From publication website:

This second edition of Health at a Glance: Europe presents a set of key indicators of health status, determinants of health, health care resources and activities, quality of care, health expenditure and financing in 35 European countries, including the 27 European Union member states, 5 candidate countries and 3 EFTA countries.

The selection of indicators is based largely on the European Community Health Indicators (ECHI) shortlist, a set of indicators that has been developed to guide the reporting of health statistics in the European Union. It is complemented by additional indicators on health expenditure and quality of care, building on the OECD expertise in these areas.

Each indicator is presented in a user-friendly format, consisting of charts illustrating variations across countries and over time, a brief descriptive analysis highlighting the major findings conveyed by the data, and a methodological box on the definition of the indicator and any limitations in data comparability.

Full text (PDF)

Using Wild Data to Estimate International Migration

A previous post described several studies based on non-survey data, which inform demographic events. The following is another very creative example:

You are where you e-mail: using e-mail data to estimate international migration rates
Emilio Zagheni and Ingmar Weber | Max Planck Institute & Yahoo! Research
Proceedings of the 3rd Annual ACM Web Science Conference [Pages 348-351]
June 22-24, 2012

Wild Data: Expanding Social Science Resources

Most researchers use survey data, but more and more researchers are using “wild” data, which is defined as data not produced for research purposes. In fact, several PSC researchers are part of an NSF/Census project, which explores the usefulness of “wild” data ranging from administrative data (Social Security death index, Social Security earnings data) to data harvested from the web.

Below are several examples of informative posts based on web-based data:

The New Secessionists: Plotting whitehouse.gov secession petitions
Neal Caren | Big Data blog
November 14, 2012

This post shows the origin of each of the signers of the wave of secession petitions on the whitehouse.gov website via a county-based map. It also includes an explanation of how this was done. Many of the posts on Caren’s Big Data blog are excellent tutorials for the fundamentals of quantitative text analysis for social scientists.

It is also useful to refer to the history of secession petitions in the US, provided here:

10 facts about Secession
Kevin Robillard | Politico
November 14, 2012

The second example of an application of wild data comes from a post about ‘mapping racist tweets’, based on content on Twitter immediately after Obama was re-elected to his second term:

Mapping Racist Tweets in Response to President Obama’s Re-election
floatingsheep.org
November 8, 2012

Note that a Harvard Ph.D. student used Google search data to study the underperformance of Obama in 2008, which he attributed to racial animus.

The Effects of Racial Animus on a Black Presidential Candidate: Using Google Search Data to Find What Surveys Miss
Seth Stephens-Davidowitz | Harvard
June 9, 2012

The popular press version of this is here:

Can Google Predict the Impact of Racism on a Presidential Election?
Garance Franke-Ruta | The Atlantic
June 11, 2012

And finally, the Google NGram project has useful data for researchers. Here’s an article from the Economist where the “data is” vs “data are” question is examined. More pertinent to researchers might be the evolution of “Negro man” to “colored man” to “African-American man” in common usage.

Data or datum?
K.N.C. | The Economist
July 13, 2012

And, here’s the link to the Google Ngram Viewer. Of course, you’ll want to have access to the data. Here’s the link to the raw data, which is available for download.

Re-visiting the Hispanic identity question

The measurement of race in the federal statistical system was last changed just before the 2000 Census. That change allowed respondents to identify with more than one race. While this might have improved the collection of data on multi-race individuals, it did not solve the issue of racial identity among Hispanics or other groups whose identity is not listed as a race, such as Middle Easterners. Back to Hispanics: many choose “other” as their race even when the Hispanic origin question comes before the race question. Clearly, the white, African American, etc. choices are not resonating with this population.

So, the Census Bureau is proposing a change in how race and Hispanic origin are collected. OMB will have the final say on this.

Changing the Way U.S. Hispanics are Counted
Carl Haub | Population Reference Bureau
November 2012

A previous PSC Info blog entry covered the Census Bureau press conference on this:

Census Bureau: Race/Hispanic Origin Experimental Questions
Lisa Neidert | PSC Info Blog
August 8, 2012

A summary of the press conference findings can be found here:
Census Bureau Considers Changing its Race/Hispanic Questions
D’Vera Cohn | Pew Social & Demographic Trends
August 7, 2012

Getting to the Root of Aging

Getting to the Root of Aging by Annette Baudisch and James W. Vaupel
from a recent issue of Science

As people live longer, the question arises of how malleable aging is and whether it can be slowed or postponed. The classic evolutionary theories of aging (1–4) provide the theoretical framework that has guided aging research for 60 years. Are the theories consistent with recent evidence?