Archive for the 'Data' Category

Risk factor for a stroke? Living in the stroke-belt as a teen

This paper draws on a cohort study that most demographers are probably not familiar with, the Reasons for Geographic and Racial Differences in Stroke (REGARDS) study. It is a relatively large study that includes residential histories of panel participants. If you are interested in finding out more about these data, here’s a link to the researcher portal on the project website.

Maybe this should be replicated and extended with the PSID, which covers a longer time period. Stroke mortality patterns have also shifted over time; see Casper ML, Wing S, Anda RF, Knowles M, Pollard RA (May 1995). “The shifting stroke belt. Changes in the geographic pattern of stroke mortality in the United States, 1962 to 1988.” Stroke 26 (5): 755–60. PMID 7740562.

Teenage Years in the Stroke Belt
Nicholas Bakalar | The New York Times
April 29, 2013

Effect of duration and age at exposure to the Stroke Belt on incident stroke in adulthood
Virginia Howard, et al. | Neurology
April 29, 2013
Abstract | pdf

Special Issue on Survey Non-response

Introduction: New Challenges to Social Measurement
Douglas S. Massey and Roger Tourangeau
Abstract | PDF

Facing the Nonresponse Challenge
Frauke Kreuter
Abstract | PDF

Explaining Rising Nonresponse Rates in Cross-Sectional Surveys
J. Michael Brick and Douglas Williams
Abstract | PDF

Response Rates in National Panel Surveys
Robert F. Schoeni, Frank Stafford, Katherine A. McGonagle, and Patricia Andreski
Abstract | PDF

Consequences of Survey Nonresponse
Andy Peytchev
Abstract | PDF

The Use and Effects of Incentives in Surveys
Eleanor Singer and Cong Ye
Abstract | PDF

Paradata for Nonresponse Adjustment
Kristen Olson
Abstract | PDF

Can Administrative Records Be Used to Reduce Nonresponse Bias?
John L. Czajka
Abstract | PDF

An Assessment of the Multi-level Integrated Database Approach
Tom W. Smith and Jibum Kim
Abstract | PDF

Where Do We Go from Here? Nonresponse and Social Measurement
Douglas S. Massey and Roger Tourangeau
Abstract | PDF

The Twitter paper from PAA’s “social media” session

Using Twitter for Demographic and Social Science Research:
Tools for Data Collection

T. McCormick, H. Lee, N. Cesare and A. Shojaie | CSSS/University of Washington
April 8, 2013
This is a proof-of-concept paper. The researchers searched through tweets for phrases that indicated an intention to “not vote” in the 2012 election. They then used Amazon’s Mechanical Turk to code the age, gender, and race of the people in their sample from their profile pictures.
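
The phrase-search step can be sketched in a few lines of Python. This is only an illustration of the general idea: the patterns, the function names, and the assumption that each tweet is a dict with a "text" key are all hypothetical, and the paper's actual search terms and tooling are not described in this post.

```python
import re

# Hypothetical phrases signalling an intention not to vote. These are
# illustrative guesses, not the search terms used in the paper.
NOT_VOTING_PATTERNS = [
    r"\bnot (?:going to |gonna )?vot(?:e|ing)\b",
    r"\b(?:won't|wont|will not) (?:be )?vot(?:e|ing)\b",
]
NOT_VOTING_RE = re.compile("|".join(NOT_VOTING_PATTERNS), re.IGNORECASE)


def mentions_not_voting(text):
    """Return True if the text matches any of the illustrative phrases."""
    return NOT_VOTING_RE.search(text) is not None


def filter_tweets(tweets):
    """Keep tweets whose text matches; each tweet is assumed to be a dict with a 'text' key."""
    return [t for t in tweets if mentions_not_voting(t["text"])]


sample = [
    {"text": "I'm not voting this year"},
    {"text": "Go vote today!"},
    {"text": "I won't vote in 2012"},
]
matches = filter_tweets(sample)
```

Simple patterns like these will both miss and over-match real tweets (sarcasm, quoted text), so any coding of the people behind the tweets still needs a human step.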

Folks interested in other examples of “wild data” like Google searches, Twitter, etc. should check these posts:

Wild Data: Expanding Social Science Research
Big Data: Google Flu
Using Wild Data to Estimate International Migration

The Unauthorized Immigrant Population: Two Technical Exercises

This blog entry has two nice technical pieces. The first describes how Pew Hispanic (and others) estimate the undocumented population in the US. The second is a life-table exercise that shows how many of the undocumented population will die waiting for citizenship, assuming a 13-year wait time.

Unauthorized Immigrants: How Pew Research Counts Them and What We Know About Them
Interview with Jeff Passel | Pew Hispanic
April 17, 2013
In this interview, Passel describes how he estimates the undocumented population in the US – including the other characteristics of this population, e.g., occupation, current residence, family composition, etc. using data from the Current Population Survey.

As a note, most of the reports Pew Hispanic writes on the undocumented population have an appendix that provides a more technical description of the methodology. See page 25 of the following report for an example: Cohn, D’Vera and Jeff Passel. 2011. “Unauthorized Immigrant Population: National and State Trends, 2010.” Pew Hispanic: February 1, 2011.

The life-table exercise is from Philip Cohen’s Family Inequality blog.

How many people should die waiting for citizenship? 319,462?
Philip Cohen | Family Inequality Blog
April 24, 2013

This is a life-table exercise, taking the current age distribution of the undocumented population in the US and applying a life-table for Hispanics to the numbers. He describes his assumptions and invites folks to re-calibrate the numbers.

Note that Cohen takes a dig at Reinhart and Rogoff [previous PSC Infoblog entry] by making his spreadsheet available. And, he notes “If you don’t like the way Excel does the maths, by all means, fix it in R.”
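
For readers who would rather script it than use a spreadsheet, the core life-table step can be sketched as follows. Every number here is made up for illustration: the survivorship values are not the Hispanic life table Cohen used, the population counts are not Pew's estimates, and the names `LX`, `POPULATION`, and `expected_deaths` are hypothetical. Only the 13-year wait comes from the post.

```python
# Hypothetical survivorship column l(x): survivors to exact age x out of
# 100,000 births. Made-up round numbers, NOT a published life table.
LX = {20: 99000, 33: 98000, 40: 97000, 53: 94000, 60: 90000, 73: 75000}

# Hypothetical population counts by current age (not Pew's estimates).
POPULATION = {20: 1000, 40: 2000, 60: 500}

WAIT_YEARS = 13  # the wait time assumed in the post


def expected_deaths(population, lx, wait_years):
    """Expected deaths during the wait, keyed by starting age.

    For people currently aged x, the chance of surviving the wait is
    l(x + wait_years) / l(x), so expected deaths are the population
    count times one minus that ratio.
    """
    deaths = {}
    for age, count in population.items():
        survival = lx[age + wait_years] / lx[age]
        deaths[age] = count * (1 - survival)
    return deaths


deaths_by_age = expected_deaths(POPULATION, LX, WAIT_YEARS)
total_deaths = sum(deaths_by_age.values())
```

Swapping in a real life table and a real age distribution is the re-calibration Cohen invites; the arithmetic stays the same.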

Living Apart Together: Data & Research

Living Apart Together: Uncoupling Intimacy and Co-Residence
S. Duncan, M. Phillips, S. Roseneil, J. Carter & M. Stoilova | NatCen Social Research Policy Brief
Winter 2013
Major conclusions from the research are (a) some “singles” are in LAT relationships; (b) living alone doesn’t always mean being alone; and (c) intimacy doesn’t always imply co-residence.

Note, a similar policy brief for the Canadian LAT population is in an earlier PSC-Info blog entry.

The Census Reform Act of 2013

This proposed legislation [H.R. 1638] is really radical. It would eliminate every survey conducted by the Census Bureau, as well as the Economic Census, the Census of Governments, the Census of Agriculture, and the non-existent mid-decade census. Furthermore, it would limit the decennial census to a population count.

In short:

(a) Notwithstanding any other provision of law–
(1) the Secretary may not conduct any survey, sampling, or other questionnaire, and may only conduct a decennial census of population as authorized under section 141; and
(2) any form used by the Secretary in such a decennial census may only collect information necessary for the tabulation of total population by States
(b) Repeal of Survey, Questionnaire, or Sampling Authority- Sections 182, 193, and 195 of title 13, United States Code, are repealed.

The Census Project Blog discusses this in more detail:
What We Don’t Know Can’t Hurt Us (Right?)
Terri Ann Lowenthal | The Census Project Blog
April 23, 2013

If Congress only wants a head-count census, will they fund a ‘mandatory population register?’ This is something New Zealand is considering:

National Census Could be Scrapped
National News | TVNZ
April 23, 2013

Microsoft Excel: The Ruiner of Global Economies?

This is a series of articles on the news that a well-cited and influential paper by Carmen Reinhart and Ken Rogoff had an Excel error in it, which led to an overstating of the association between debt and growth. There are other more fundamental problems with the paper – see comments by economists below.

From a training viewpoint, it is relevant to note that this was discovered by a graduate student, working on a class assignment: find a famous study and replicate it.

This entry has four sections: (a) the student; (b) comments by other economists; (c) replication & programming; and (d) coverage from the press.

The Story of the Student
Meet the 28-Year-Old Grad Student Who Just Shook the Global Austerity Movement
Kevin Roose | New York Magazine
April 18, 2013

How a student took on eminent economists on debt issue – and won
Edward Krudy | Reuters
April 18, 2013

‘They Said at First That They Hadn’t Made a Spreadsheet Error, When They Had’
Peter Monaghan | Chronicle of Higher Education
April 24, 2013
My favorite Q & A from this interview with Thomas Herndon is:
Q. This is more than a spreadsheet error, then?

A. Yes. The Excel error wasn’t the biggest error. It just got everyone talking about this. It was an emperor-has-no-clothes moment.

Comments/Analysis by Economists
Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff
Thomas Herndon, Michael Ash and Robert Pollin | Political Economy Research Institute
April 15, 2013
An easier-to-read PDF of the paper is also available, but the link above includes the data, code, etc.

Researchers Finally Replicated Reinhart-Rogoff, and There Are Serious Problems
Michael Konczal | Next New Deal (blog of the Roosevelt Institute)
April 16, 2013

Reinhart and Rogoff are wrong about austerity
Robert Pollin and Michael Ash | Financial Times
April 17, 2013

Reinhart/Rogoff and Growth in a Time Before Debt
Arindrajit Dube | Next New Deal (blog of the Roosevelt Institute)
April 17, 2013

Reinhart, Rogoff, and How the Macroeconomic Sausage Is Made
Justin Fox | Harvard Business Review
April 17, 2013

The Excel Depression
Paul Krugman | New York Times
April 19, 2013

Replication & Programming
The Mysterious Powers of Microsoft Excel
Colm O’Regan | BBC News Magazine
April 20, 2013

What the Reinhart & Rogoff Debacle Really Shows: Verifying Empirical Results Needs to be Routine
Victoria Stodden | The Monkey Cage Blog
April 19, 2013

What Reinhart-Rogoff Means for the Replication Debate
Political Science Replication Blog
April 19, 2013

Microsoft Excel: The ruiner of global economies?
Peter Bright | Ars Technica
April 16, 2013
This piece describes the Excel error, but also discusses other issues with the paper, including the interesting tidbit that the original Reinhart-Rogoff paper was published in the American Economic Review proceedings issue (May), which is not peer reviewed.

Two clever economists have looked to see if researchers pad their resumes by hiding their AER proceedings publications. The University of Michigan economics department was included in their sample.

Research: Bad math rampant in family budgets and Harvard studies
Jeremy Olshan | Wall Street Journal (Market Watch blog)
April 17, 2013
88% of spreadsheets have errors

On the accuracy of statistical procedures in Microsoft Excel 2007
B.D. McCullough and David A. Heiser | Computational Statistics and Data Analysis
March 2008
These authors criticize the use of Excel for statistical analysis because of its failures in statistical distributions, random number generation, and the NIST StRD (Statistical Reference Datasets). I suspect most users of Excel are using the simpler tools (summation, product, etc.), but on occasion faculty have used Excel as a rudimentary statistical analysis tool.

What We Know about Spreadsheet Errors
Raymond Panko | Journal of End User Computing
May 2008

Come to Jesus Slides: Use Script-Based Analysis, not Excel
Matt Frost | Charlottesville, Virginia
The author recommends R, or more specifically RStudio, but his point applies to any script-based statistical package.
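
As a small illustration of that point, here is a sketch in Python rather than R. The data are invented placeholders, not the Reinhart-Rogoff figures, and `GROWTH` and `average_growth` are hypothetical names. The idea is that a script names the rows that go into an average, so an omission is visible in the code and a missing row raises an error instead of silently shifting the result.

```python
from statistics import mean

# Hypothetical growth rates (percent) by country; illustrative values only.
GROWTH = {"A": 2.0, "B": 3.0, "C": -1.0, "D": 4.0, "E": 2.0}


def average_growth(growth, countries):
    """Average growth over an explicit list of countries.

    Raises KeyError if any requested country has no data, so gaps fail
    loudly rather than being skipped.
    """
    missing = [c for c in countries if c not in growth]
    if missing:
        raise KeyError(f"no data for: {missing}")
    return mean(growth[c] for c in countries)


full = average_growth(GROWTH, ["A", "B", "C", "D", "E"])
# A spreadsheet range that stops a few rows short changes the answer
# without any warning; here the shorter list is plain to see in the code.
truncated = average_growth(GROWTH, ["A", "B", "C"])
```

Because the selection lives in the script, anyone replicating the analysis can read it, rerun it, and add checks, which is much harder with a cell range buried in a workbook.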

The Press
Too many to link to for the moment, but here’s a sampling:
[Search Link]

Essay: Linking, Exploring and Understanding Population Health Data

This is a nice data essay by former PSC trainee Michael Bader. He discusses multiple sources of data that one might use to understand population health. I especially like his point about the need to archive neighborhood conditions – after all neighborhoods change. But he also touches on the range of data available for analysis from focus groups to big data.

Linking, Exploring and Understanding Population Health Data
Michael Bader | Human Capital Blog (RWJ)
June 25, 2012

The opening paragraph deserves a highlight, but read the entire entry. It is worth it:

Data are the sustenance of population health research, and like the food that sustains us, it comes in many forms, shapes and sizes. Also like food, it’s best appreciated in combination. A single data source in the absence of context is unfulfilling; but combining datasets that are rich with information and contours — now that’s a meal!

The ACS Faces More Battles

This entry draws on “The Option of Ignorance: Gutting the ACS Puts Democracy at Risk” from The Census Project Blog. http://bit.ly/YGHp86

The funding for the American Community Survey (ACS) will be covered by the 2013 Continuing Resolution, H.R. 933. However, two bills have been introduced, one in the House (H.R. 1078) and one in the Senate (S. 530), to make the ACS voluntary.

The House bill provides a Constitutional Statement of Authority (e.g., the Fourth Amendment). Note that one of the co-sponsors of this bill is Tim Walberg from the 7th Congressional District, i.e., just west of Ann Arbor.

We have multiple links in this blog on the shortsighted reasoning of this proposition. And, the Census Bureau has researched the issue. A voluntary ACS will be more expensive and will produce less reliable data.

The links are highlighted below:

SENATE: The Census Bureau has already written the reports; read them.
Oh Canada! Look Before you Leap
More on the Idea of a Voluntary ACS
Small Government Folks and the Federal Statistical System

SENATE: The Census Bureau has already written the reports; read them.

New Congress, Old Attacks on the Census
By Jason Jordan | APA Director of Policy and Government Affairs
March 15, 2013

The House and Senate are working in earnest now to pass a new Continuing Resolution to provide funding for the rest of the fiscal year and avoid a potential government shutdown. Fortunately, neither the House nor the Senate versions of the extension include language on ACS. However, the Senate version does ask the Census Bureau to submit a report on ACS, including an analysis of the costs and benefits of a voluntary ACS.

The Census Bureau has evaluated a voluntary American Community Survey (ACS). This was done at the behest of Congress back in 2003. New reports were posted on the Census Bureau website in 2011. The Senate (and House) need to read them.

Comparison of the American Community Survey Voluntary versus Mandatory Estimates
Alfredo Navarro, Karen King, and Michael Starsinic | Census Bureau
September 2011

Quality Measures Associated with a Voluntary American Community Survey
Deborah H. Griffin and David Raglin | Census Bureau
August 2011

Cost and Workload Implications of a Voluntary American Community Survey
Deborah Griffin | Census Bureau
June 2011