Archive for the 'Data' Category

Wild Data: Expanding Social Science Resources

Most researchers use survey data, but more and more researchers are using “wild” data, which is defined as data not produced for research purposes. In fact, several PSC researchers are part of an NSF/Census project, which explores the usefulness of “wild” data ranging from administrative data (Social Security death index, Social Security earnings data) to data harvested from the web.

Below are several examples of informative posts based on web-based data:

The New Secessionists: Plotting whitehouse.gov secession petitions
Neal Caren | Big Data blog
November 14, 2012

This post shows the origin of each of the signers of the wave of secession petitions on the whitehouse.gov website via a county-based map. It also includes an explanation of how this was done. Many of the posts on Caren’s Big Data blog are excellent tutorials for the fundamentals of quantitative text analysis for social scientists.

It is also useful to refer to the history of secession petitions in the US, provided here:

10 facts about Secession
Kevin Robillard | Politico
November 14, 2012

The second example of an application of wild data comes from a post about ‘mapping racist tweets’, based on content posted on Twitter immediately after Obama was re-elected to his second term:

Mapping Racist Tweets in Response to President Obama’s Re-election
floatingsheep.org
November 8, 2012

Note that a Harvard Ph.D. student used Google search data to study Obama’s underperformance in 2008, which he attributed to racial animus.

The Effects of Racial Animus on a Black Presidential Candidate: Using Google Search Data to Find What Surveys Miss
Seth Stephens-Davidowitz | Harvard
June 9, 2012

The popular press version of this is here:

Can Google Predict the Impact of Racism on a Presidential Election?
Garance Franke-Ruta | The Atlantic
June 11, 2012

And finally, the Google NGram project has useful data for researchers. Here’s an article from the Economist where the “data is” vs “data are” question is examined. More pertinent to researchers might be the evolution of “Negro man” to “colored man” to “African-American man” in common usage.

Data or datum?
K.N.C. | The Economist
July 13, 2012

And here’s the link to the Google Ngram Viewer. Of course, you’ll want access to the data itself: the raw data is available for download.

Re-visiting the Hispanic identity question

The measurement of race in the federal statistical system was last changed just before the 2000 Census. That change allowed respondents to identify with more than one race. While this might have improved the collection of data on multi-race individuals, it did not solve the issue of racial identity among Hispanics or other groups whose identity is not listed as a race, such as Middle Easterners. Back to Hispanics: many choose “other” as their race even when the Hispanic origin question comes before the race question. Clearly, the white, African American, etc. choices are not resonating with this population.

So, the Census Bureau is proposing a change in how race and Hispanic origin are collected. OMB will have the final say on this.

Changing the Way U.S. Hispanics are Counted
Carl Haub | Population Reference Bureau
November 2012

A previous PSC Info blog entry covered the Census Bureau press conference on this:

Census Bureau: Race/Hispanic Origin Experimental Questions
Lisa Neidert | PSC Info Blog
August 8, 2012

A summary of the press conference findings can be found here:
Census Bureau Considers Changing its Race/Hispanic Questions
D’Vera Cohn | Pew Social & Demographic Trends
August 7, 2012

Data Citation Index from Thomson Reuters

In October 2012, Thomson Reuters will release the Data Citation Index on the Web of Knowledge platform. See a video introduction here.

According to Thomson Reuters, researchers can:

  • Maximize your research efforts with access to the most influential repositories, data sets and studies from a single destination
  • Speed the time to discovery by building upon previous, quality digital research
  • Understand data in context through summary information connected to the work it informed
  • Track the use and importance of research data across multiple disciplines
  • Get a complete view of scholarly research output
  • Support proper attribution to data research through standard citation format.

Warning: An end to the Social Security Death Master File?

    Research is Hampered by New Limits on Death Records
    Kevin Sack | The New York Times
    October 8, 2012

    A shift by the Social Security Administration to limit access to its death records amid concerns about identity theft is beginning to hamper a broad swath of research, including assessments of hospital safety and financial industry efforts to spot consumer fraud.

    A quote by the senior project manager of the Nurses’ Health Study is apt:

    the new policy ha(s) “thrown us back to the pre-Internet era where you’d start looking in the phone book for someone with a similar name and sending out a bunch of letters.”

    Demography 101: Do not ignore age structure

    In a campaign speech, Romney announced that the unemployment rate was really 11 percent. He was driven to come up with that number because the official unemployment rate fell below 8 percent with the October jobs report. But he made an error: he ignored the changing age structure, in particular the leading edge of the baby boomers, who have retired.

    This is a good example for a quantitative reasoning class. A fuller explanation of the issue follows in the post by Mulligan.

    Fact-Check: An 11 Percent Unemployment Rate?
    Catherine Rampell | Economix Blog, The New York Times
    October 5, 2012

    The Baby Boom and Economic Recovery
    Casey Mulligan | Economix Blog, The New York Times
    October 10, 2012
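
The age-structure point above can be illustrated with a small calculation. The sketch below is not the method used in either Economix post. It assumes a population with just two age groups, and every number in it is made up for illustration (none of it is BLS data). The function names are hypothetical as well. It compares the official unemployment rate with two counterfactuals: one that applies an earlier overall participation rate to today's population, and one that keeps the age-specific participation rates fixed but uses today's older age mix.

```python
# Illustrative sketch only: every number below is hypothetical, not BLS data.

def participation_rate(age_shares, rates_by_age):
    """Overall participation: age-specific rates weighted by population shares."""
    return sum(age_shares[group] * rates_by_age[group] for group in age_shares)


def unemployment_rate(labor_force, employed):
    """Share of the labor force that is not employed."""
    return (labor_force - employed) / labor_force


# Hypothetical age-specific participation rates, assumed constant over time.
rates_by_age = {"16-54": 0.80, "55+": 0.40}

# Hypothetical population shares: the population is older now than at baseline.
baseline_shares = {"16-54": 0.70, "55+": 0.30}
current_shares = {"16-54": 0.65, "55+": 0.35}

population = 1000   # hypothetical working-age population today
employed = 600      # hypothetical number employed today
labor_force = 650   # hypothetical actual labor force today

# Official-style rate: uses the labor force as actually observed.
official = unemployment_rate(labor_force, employed)

# Naive counterfactual: baseline overall participation, ignoring aging.
naive_lf = population * participation_rate(baseline_shares, rates_by_age)
naive = unemployment_rate(naive_lf, employed)

# Age-adjusted counterfactual: same age-specific rates, today's age mix.
adjusted_lf = population * participation_rate(current_shares, rates_by_age)
adjusted = unemployment_rate(adjusted_lf, employed)

for label, rate in [("official", official),
                    ("naive counterfactual", naive),
                    ("age-adjusted counterfactual", adjusted)]:
    print(f"{label}: {rate:.1%}")
```

With these made-up inputs the three rates come out to roughly 7.7, 11.8, and 9.1 percent. Part of the gap between the official figure and the naive counterfactual is simply an older population, which is the adjustment the naive figure leaves out.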

    The Politicization of Data

    On the first Friday of this month (October 5, 2012), the BLS released its job figures, as it does every month. This report had unemployment dropping from 8.1 percent to 7.8 percent. Immediately, there was an outcry that BLS had somehow cooked the figures to help the Obama campaign. The most notable example was a tweet by Jack Welch, former CEO of General Electric.


    Below are a few examples of that view or commentary on it as well as some more thoughtful posts on the noisy nature of the data.

    Enabling the jobs report conspiracy theory
    Brendan Nyhan | The Swing States Project Blog, Columbia Journalism Review
    October 8, 2012

    The jobs truther movement
    Patrick Reis | Politico.com
    October 5, 2012

    Steep drop in unemployment rate spawns conspiracy
    Scott Mayerowitz and Christopher Rugaber | AP
    October 5, 2012
    Great opening paragraph:

    Sasquatch might as well have traipsed across the White House lawn Friday with a lost Warren Commission file on his way to the studio where NASA staged the moon landing.

    Conservatives Jobs Conspiracy is Nuts
    Robert Schlesinger | Newsweek
    October 5, 2012

    Don’t Trust this Number in This Jobs Report (or Any Jobs Report)
    Derek Thompson | The Atlantic
    October 5, 2012

    How Bureau of Labor Statistics Tames Volatile Raw Data for Jobs Reports
    Catherine Rampell | Economix Blog, New York Times
    October 5, 2012

    Never Mind: Continuing Resolution negates riders to bills

    The House (September 13) and the Senate (September 22) passed the FY 2013 Continuing Resolution (CR) that will fund the agencies and programs of the Federal government until March 27, 2013.

    This means that all the riders to bills that abolished programs, some of which had a serious impact on social scientists, have been negated. These include riders eliminating funding for the American Community Survey (ACS), the political science program at NSF, and economics research at NIH.

    The CR includes an across-the-board increase of 0.6 percent above FY 2012.

    For further details, here’s the full 158-page text.

    Text drawn from an APDU Data Update, September 27, 2012

    Household Change in the United States

    Household Change in the United States
    Linda Jacobsen, Mark Mather, and Genevieve Dupuis | Population Bulletin
    September 2012

    [Synopsis] [Full Report] [Data Finder]

    This Population Reference Bureau report describes changes in household structure in the United States from 1940 to 2010. It covers various living arrangements (married couples, single-head families, living alone, cohabiting couples, etc.), with some discussion of these arrangements by age, race, and education.

    Across the Pond: To have a census or not

    The United Kingdom has examined whether or not to keep a traditional census. The report is cautious, saying that “social science could suffer if the census was discontinued without serious consideration as to how this data would be replaced.” The report looked at administrative data and existing surveys, any of which might be a useful replacement for the census for a local area. However, the country also needs to have a snapshot of the entire nation.

    The results of the inquiry are below.

    Short summary:
    Decision to scrap Census could hit UK social science according to MPs

    The Census and social science
    Volume 1: Third Report of Session 2012-13

    The Census and social science
    Volume II: Third Report of Session 2012-13

    The Census and social science
    Oral evidence (uncorrected)
    Written evidence

    Oh Canada! Look before you leap

    Canada is in a fix with its 2016 mid-decade Census on the horizon. It made a major change to its census operations with the 2011 Census, choosing to make the National Household Survey (long-form) voluntary. This was done at the behest of the Conservative government, not on the advice of the statisticians at StatsCan [See here, here and here for links to this issue].

    Final Report on 2016 Census Options: Proposed Content Determination Framework and Methodology Options
    [Executive Summary] [Full report]

    Stick with old-school census, says StatsCan
    Jennifer Ditchburn | iPolitics
    August 30, 2012
    Reaction from the press is pretty damning, showing that Canada still doesn’t know the full impact of the decision to go with a voluntary census. Here’s a comment on whether or not there is response bias in the National Household Survey:

    “Are we totally off, slightly off, right on? That would be difficult to determine,” said Marc Hamel, manager of the census program at Statistics Canada.

    Harper government’s assault on reason, scientists, ‘Orwellian’ and ‘alarming,’ warns pollster
    Alice Funke | The Hill Times
    September 10, 2012

    This article really belongs in a collection of “Death of Evidence” articles. But it aligns nicely with the article above, since the first ‘assault on reason’ occurred when the government cancelled the mandatory long-form census.