Archive for the 'Methodology' Category

Visualizing Births and Deaths in Real-Time

Data visualizations are becoming more and more popular and sometimes they include demographic concepts. The following are two simulations of births and deaths – one for the US and the other for the world.

Click on the images to start the simulations. To read more about how these were made, see the references below:

us_map world_map

Watch This Anxiety-Provoking Simulation of U.S. Births and Deaths
John Metcalfe | The Atlantic Cities
December 11, 2012

This Map Shows Where in the World People Are Dying and Being Born
John Metcalfe | The Atlantic Cities
October 14, 2013

World Births/Deaths Simulation – Adding World Cities
Brad Lyon | Nowhere Near Ithaca Blog
October 9, 2013

Reproducibility Initiative: It’s not just for cancer

The following are links to related efforts in Open Science. The first is about funding for a “Reproducibility Initiative” to validate 50 landmark cancer studies. Frankly, this can/should apply to population research as well. Included are links from The Economist and Nature about the importance of replication.

In general, there is a move towards “Open Science” across all disciplines. In fact, a different initiative, “The Reproducibility Project” is an effort to identify the predictors of reproducibility among published studies in psychology – a field that contributes far too much to the “Retraction Watch” website.

Reproducibility Initiative
Science Exchange News
October 16, 2013

Initiative gets $1.3 million to verify findings of 50 high-profile cancer papers
Richard Van Noorden | Nature News Blog
October 16, 2013

Unreliable research: Trouble at the lab
The Economist
October 19, 2013
Scientists like to think of science as self-correcting. To an alarming degree, it is not.

The governments of the OECD, a club of mostly rich countries, spent $59 billion on biomedical research in 2012, nearly double the figure in 2000. One of the justifications for this is that basic-science results provided by governments form the basis for private drug-development work. If companies cannot rely on academic research, that reasoning breaks down. When an official at America’s National Institutes of Health (NIH) reckons, despairingly, that researchers would find it hard to reproduce at least three-quarters of all published biomedical findings, the public part of the process seems to have failed.

If a job is worth doing, it is worth doing twice
Jonathan Russell | Nature
April 3, 2013

Reproducibility Project
Large-scale open collaboration to estimate the reproducibility of a sample of studies in psychology

Retraction Watch
Tracking retractions as a window into the scientific process

Center for Open Science
A non-profit organization, which provides infrastructure tools for open science.

Time for a Legal Prohibition on Data Re-identification?

This is a very thorough blog post on respondent re-identification issues. The author takes to task the re-identification rainmakers, who have made careers out of exposing re-identification risks – often overstating the risks. He calls for a well-designed legal prohibition on data re-identification.

In fact, many of the restricted data contracts PSC users operate under have an “inadvertent discovery” clause. Here’s the language from LAFANS, which prohibits broadcasting the “find” to others.

Ethical Concerns, Conduct and Public Policy for Re-Identification and De-identification Practice: Part 3 (Re-Identification Symposium)
Daniel Barth-Jones | Columbia University
October 2, 2013

Counting Prisoners

The New York Times had another editorial on this issue:
Prison-Based Gerrymandering
Editorial Board | New York Times
September 26, 2013

A search on its site shows that this has been a common editorial/story topic
[Counting Prisoners Editorials/Stories]

The PSC Infoblog has had a previous post on this topic as well, which included the Census Bureau’s response to the issue. The Census Bureau released group quarters data in time for redistricting.

Another excellent source on this topic is the National Academy of Sciences book, which is available in the PSC library:
Once, Only Once and in the Right Place: Residence Rules in the Decennial Census
Daniel Cork and Paul Voss, Editors | The National Academies
2006

Two updates from The Census Project

Sorry, Come Back Later (Make an Educated Guess in the Meantime)
Terri Ann Lowenthal | The Census Project Blog
October 7, 2013
This piece reflects on the government shutdown, noting that Congress has done what the House couldn’t do: shut down the federal statistical infrastructure.

Thanks to the government shutdown, the Census Bureau’s work has come to a grinding halt. No harassing phone calls to unwitting, over-burdened citizens. No pesky, door-knocking surveyors invading the privacy of hard-working Americans who just want to live a quiet, government-free life (as soon as someone fills that pothole down the street). Even the duty-bound who want to cooperate (however grudgingly) from the comfort of their own computers are out of luck; online survey response is closed for business.

Losing Sleep (While Counting Sheep)
Terri Ann Lowenthal | The Census Project Blog
October 1, 2013
This is a timely piece on the first day of a government shutdown. The Census Bureau needs funding from Congress so that it can do the necessary research for a smarter 2020 Census. The review of the fiscal situation and Congressional gridlock is not pretty.

budget uncertainty is causing significant concerns for the 2020 census program as we enter that period during which it is crucial to conduct tests so that we can begin applying new technologies and methods … We have already delayed planned research and testing activities to later years … We cannot further delay critical research that will help us make critical design decisions for those systems. [John Thompson]

. . . the Census Bureau needs money to figure all of this out in time. The bureau can execute a fundamentally redesigned 2020 census for the 2010 census price tag (plus inflation), Director Thompson says. Invest now, save later – that’s the bottom line.

And, a nice closing line: “Did I mention that the next census starts in less than six years? The Census Bureau can do a lot of things, but it cannot stop the clock. I bet Director Thompson is having a few sleepless nights, too.”

I might note that according to this link, 87% of Commerce employees are furloughed. Obviously, Census2020 planning is not happening today. I wonder if it shuts down the data collection operations for the ACS, CPS, etc.?

census.gov is #shutdown but you can read about Census 2010 research

The Census Bureau website is down with the government shutdown:

Census shutdown message

But, you can read all about some research based on the 2010 Census. Here is a sampling:

Misclassifying New York’s Hidden Units as Vacant in 2010: Lessons Gleaned for the 2020 Census
Joe Salvo and Peter Lobo | Population Research and Policy Review
August 6, 2013
This is a great article if you are interested in the details of the history of the Census Bureau’s master address file; how it gets created, corrected, updated, etc.

This piece traces the puzzling number of vacancies in two areas of New York City during the 2010 Census, which resulted in a lower census count than New York City had expected. It is a nice piece of detective work. As a reminder, Peter Lobo was a PSC trainee who I always quote as saying “I worked with Ren Farley at Michigan, and the time I spent there were some of the best years of my life.”

Quality and the 2010 Census
Howard Hogan et al. | Population Research and Policy Review
April 5, 2013
This is a nice summary of ways to evaluate census quality – 2 of the 5 authors are PSC trainees (Howard Hogan and Victoria Velkoff).

There is a companion press conference on the Census Bureau website, which will be linked to when the #shutdown is over.

The rest of the articles in this special issue devoted to the 2010 Census are here:

Population Research and Policy Review
Volume 32, Issue 5, October 2013
Special issue on New Findings from the 2010 Census
Guest Editor: William P. O’Hare

The Census as a Luxury

The following is a collection of news articles, editorials, and reports on the likely possibility that the UK will scrap its census, replacing it with a survey and a sweep of commercial data sources and administrative records.

In for the count: Arguments for scrapping UK census do not add up
Editorial | Financial Times
September 2, 2013

Quotes:

The census is Whitehall’s window on British society. If you are not counted, you do not count.

Yet the government believes that the census is a luxury that Britain can no longer afford. When it was last conducted in 2011 it required a 35,000-strong army of researchers and cost £480m. This is cheap by international standards – the US census costs more than three times as much per head – and a drop in the £7tn-odd ocean of public spending that will, over the course of a decade, be influenced by the results. Census data can save ministers from costly mistakes. Abolishing it may prove penny-wise but pound-foolish.

And, the Canadian problem:

At stake is not only the accuracy of the census itself, but also that of countless sample-based surveys that it is used to calibrate.

And, a nice, pithy conclusion:

The carpenter’s rule is to measure twice and cut once. The government should reflect on this advice before ditching an indispensable yardstick of social change.

Ending the national census would make us blind to our society
Danny Dorling | The Guardian
September 2, 2013

This is a shorter version of a paper published in Radical Statistics. This paper has useful references.

The 2011 Census: What surprises are emerging . . . cancellation is stupid
Danny Dorling | Radical Statistics

The next two articles are from the Financial Times, which is not open access. You may be able to read the articles by answering a survey question (on cloud computing, smart phone usage, etc.).

Researchers in UK count cost of plan to scrap census
Kate Allen | Financial Times
September 1, 2013
This piece emphasizes that neither a survey nor currently available data (administrative/proprietary commercial) would provide the geographical detail or the content scope that a census would.

Loss of census seen as threat to UK historical insight
Kate Allen | Financial Times
September 1, 2013
This piece emphasizes what a loss of a census would mean for historical research, including genealogy.

Big data and the death of polls

The rumors of the death of polls might be greatly exaggerated. Recent coverage of a Twitter-based study ignores the weak effects in the original paper. ["For instance, being an incumbent predicts almost a 50,000 vote contribution to the Republican margin in their statistical model, whereas receiving 100 percent (all!) of tweet-mentions gets you only 155 votes"]. Yet one of the paper’s authors even goes so far as to say “In the future, you will not need a polling organization to understand how your elected representative will fare at the ballot box. Instead, all you will need is an app on your phone.”

How Twitter can predict an election
Fabio Rojas | Opinions, Washington Post
August 11, 2013

Original Paper
More Tweets, More Votes: Social Media as a Quantitative Indicator of Political Behavior
J. DiGrazia, K. McKelvey, J. Bollen and F. Rojas | SSRN
February 21, 2013

Counter-views
How Twitter can Predict Elections: A Rebuttal
Rob Santos | Washington Post
August 16, 2013

Can Twitter Predict Elections? Not so Fast
Mark Blumenthal & Ariel Edwards-Levy | Huffington Post
August 16, 2013

Let’s Calm Down about Twitter Being Able to Predict Elections, Guys
Jason Linkins | Huffington Post
August 14, 2013

Popular Press
How Twitter can help predict an election – in one eye-catching study
Sean Sullivan | Washington Post
August 14, 2013
Want to figure out who is going to win a congressional race? Find out which candidate received the lion’s share of tweets in the lead-up to Election Day.

[BUT]

Some high-profile misses are also illustrative of the challenge of using tweets to reliably project elections. Anthony Weiner’s nearly 250,000 mentions on twitter (according to topsy.com) are unlikely to revive his downward spiral in the New York mayoral race – current front-runner Bill de Blasio has received barely 10,000 mentions in the same period. And while then-Rep. Ron Paul (R-Tex.) received wide recognition on Twitter during the 2012 Republican presidential primaries, he failed to win a single contest.

A New Study Says Twitter Can Predict US Elections
Robinson Meyer | The Atlantic
August 13, 2013

[Unrelated Visualization of Tweets before the 2010 Election]

The Perils of Administrative Censuses

Some who are against a mandatory census argue that the government already has this information and is wasting money re-collecting data. Of course, not all information on individuals is tied to their residence and the census needs to know the location of the population for reapportionment purposes. Others who are against the census are also against big government so probably are not in favor of administrative records as a data collection device.

The following is the record of the German administrative census as compared to a population count. Some of the sources are US-based research from the Census Bureau, which is looking to use administrative records to supplement its address-based census.

Germany Counts Heads and Finds 1.5 Million Fewer Residents Than It Expected
Press Release | Statistisches Bundesamt [German Federal Statistical Office]
May 21, 2013

Lessons from the German Census
D’Vera Cohn | Fact Tank: Pew Research Center
June 20, 2013

When the results of the 2011 German census were announced recently, they included an embarrassing error – at least in the demographics world. It showed the German population was 1.5 million people short of what the government had expected. The news dealt a blow to Germany’s reputation for efficient record-keeping, and it’s also relevant to how the next U.S. Census is conducted.

2010 Census Administrative Records Use for Coverage Problems Evaluation Report
Dave Sheppard et al. | Census Bureau
March 18, 2013

2020 Census: Local Administrative Records and Their Use in the Challenge Program and Decennial
GAO-13-269
February 21, 2013
Highlights | Full Report

And Now for Something a Little Different. . .
Bob Groves | Director’s Blog: Census Bureau
June 27, 2012

Toward a Vision: Official Statistics and Big Data
C. Capps and T. Wright | AmStatNews
August 1, 2013
This piece even references Herman Hollerith:

The Census has a long history of innovation. Herman Hollerith invented the punch card for the 1890 Census; the first civilian computer was used for the 1950 Census. The first official sample survey was used by the Census Bureau to measure unemployment in 1937. Some of the basic technology for GIS was developed in the Dual Independent Map Encoding/Graphic Base Files efforts for the 1970 Census and TIGER for the 1990 Census.

Each of these innovations was done to reduce escalating cost and to preserve official statistical integrity. For these same reasons, the Census Bureau will continue to explore the possibility of using the explosion of Big Data to reduce cost, reduce reporting burden, and increase the effectiveness of national statistical estimation.

These benefits will accrue only if the Census Bureau can continue to preserve individual and corporate confidentiality, working to earn and preserve the public’s trust.

Cautionary Tale about Big Data Sampling

Is the Sample Good Enough? Comparing Data from Twitter’s Streaming API with Twitter’s Firehose
F. Morstatter, J. Pfeffer, H. Liu, K. Carley | arXiv.org
June 2013
[Abstract] [Paper]

These authors compare metrics based on the data one gets from Twitter’s free API vs. the full universe (Firehose) and samples drawn from the Firehose. As an added bonus, there is an excellent supply of references in this emerging field of big data/real-time data.
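
To give a feel for the kind of sample-versus-universe comparison described above, here is a minimal Python sketch. It is not the authors' method: the data are synthetic, the "firehose" is a made-up list of hashtags, and the metric (overlap of the top-k most frequent hashtags) is a simplification chosen for illustration. The names top_k, top_k_overlap, full, and sample are all hypothetical.

```python
from collections import Counter
import random


def top_k(items, k):
    """Return the set of the k most frequent items."""
    return {item for item, _ in Counter(items).most_common(k)}


def top_k_overlap(full, sample, k):
    """Share of the full data's top-k items that are also in the sample's top-k."""
    return len(top_k(full, k) & top_k(sample, k)) / k


# Synthetic stand-in for the full stream (an assumption, not real Twitter data):
# hashtag i appears 1000 // (i + 1) times, so a few tags dominate.
full = [f"tag{i}" for i in range(50) for _ in range(1000 // (i + 1))]

# Draw roughly a 1% simple random sample with a fixed seed for repeatability.
rng = random.Random(0)
sample = rng.sample(full, len(full) // 100)

overlap = top_k_overlap(full, sample, 10)
print(f"Top-10 hashtag overlap between sample and full data: {overlap:.0%}")
```

With a small sample, the most common hashtags are usually recovered while the less common ones in the top-k may be missed or reordered, which is one reason to ask whether a sample is "good enough" for a given metric.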