Archive for the 'Data' Category

Do You Still Trust the Census Bureau?

The New York Post had a typically provocative headline, “Census ‘faked’ 2012 election jobs report,” two nights ago. This is a serious charge and, even more, it gives fodder to those who do not trust or support the federal data infrastructure in the first place. The following is the banner above the comments section for the New York Post article – and this sentiment probably represents the early coverage of this story.

trust census bureau logo

The following is the coverage of this story in chronological order (as much as possible). Note that there are some references to Jack Welch, who famously tweeted his disbelief of this particular jobs report back in 2012. [See previous coverage.]

Census ‘faked’ 2012 election jobs report
John Crudele | New York Post
November 18, 2013

If these claims by ‘reliable sources’ are proven true, the Obama administration will be dealing with another huge scandal
Becket Adams | The Blaze (founded by Glenn Beck)
November 18, 2013

Census Bureau Statement on Collection of Survey Data
November 19, 2013

Here Are Some Issues With That Report About How The Unemployment Rate Was Faked Before The 2012 Election
Joe Weisenthal | Business Insider
November 19, 2013

Was Jack Welch right? Jobs numbers under fire
Jeff Cox | CNBC
November 19, 2013

Did the Census Bureau Really Fake the Jobs Report?
Jordan Weissmann | The Atlantic
November 19, 2013

Five questions about the New York Post’s unemployment story
Erik Wemple | Washington Post
November 19, 2013

Census Sees No ‘Systemic Manipulation’ of U.S. Jobs Data
Michelle Jamrisko | Bloomberg News
November 19, 2013

House panel to investigate unemployment data
Annalyn Kurtz | CNN Money
November 19, 2013

House probes Census over ‘fake’ results
John Crudele | New York Post
November 19, 2013

Rep. Issa gets involved in alleged Census data fabrication, demands documents: ‘These allegations are shocking’
Becket Adams | The Blaze
November 19, 2013

Monthly jobs numbers from Census Bureau may have been manipulated since ‘10 – report
RT USA
November 19, 2013

Republican House leaders to look into report on faked jobs data
Reuters News Service
November 20, 2013

Political Questions About the Jobs Report
Nelson Schwartz | New York Times
November 20, 2013

Census Bureau: No systematic manipulation of jobs data
Paul Davidson | USA Today
November 20, 2013

Count your blessings; you could live in Canada

The following are articles, mostly from the Canadian press, about (a) the quality of data in the National Household Survey (NHS); and (b) the politicization of funding for basic science research. Much of the poor quality of the NHS data has to do with design changes made at the behest of the prime minister’s office, rather than on the advice of the statistical experts at Statistics Canada.

[Criticism of the National Household Survey]
To restore faith in Statscan, free the Chief Statistician
Munir Sheikh | The Globe and Mail
October 24, 2013
This op-ed is written by the former Chief Statistician who resigned amid the changes in the design of the National Household Survey. He could not agree with the statements coming from the Prime Minister that a voluntary survey can be a substitute for a mandatory survey. Here’s his resignation letter with the famous “It can not” sentence:

And that’s all he wrote. . . Munir Sheikh resigns as Chief Statistician
Kady O’Malley | CBC
July 21, 2010
[Resignation letter]

Canada’s voluntary census is worthless. Here’s why
D. Hulchanski, R. Murdie, A. Walks, and L. Bourne | Globe and Mail
October 4, 2013
Data from the NHS show that Canada’s income inequality has dropped. But, this may have more to do with the flawed NHS than reality. The authors compare tax receipt data to NHS data to illustrate the problem.

Canadian income data ‘is garbage’ without census, experts say
Tavia Grant | The Globe and Mail
October 4, 2013

[Politicization of Science Funding]
Blinded to science: The plight of basic research in Canada
Josh D. Neufeld | iPolitics Insight
October 21, 2013
This piece is a good summary of the move by the Canadian government towards funding applied research instead of basic research. This statement summarizes the issue:

Basic research is the seed corn of the economy, generating the applications and economic benefits of tomorrow … Trouble is, it’s very difficult to predict which basic research programs and projects will lead to the innovations of tomorrow.

Others from the series of posts on science policy in Canada can be found here:

Series of Posts on Science Policy in Canada
to be published in iPolitics

Quantitative Text Analysis: Michael vs Jacob

Most of the data demographers use are numeric and are easily handled via statistical packages. Text data via Google NGrams or names from the Social Security Names Database are more commonly analyzed using Python.

CSCAR and ARC are sponsoring free Python training Friday, November 8th. Space is limited.

In case you miss the workshop, here’s a link to some Big Data Tutorials by Neal Caren at the University of North Carolina, Chapel Hill.

And, to get back to the title of this blog entry, below are three data visualizations on names. The first two are the most common name by state & gender from 1960 to 2010.

Click on the images to activate the gifs.

us_map_gnames us_map_bnames

Notice that for the girls, Lisa dominated the US in 1965, which means I was born 10+ years too early to have that name. And for the boys, watch the epic battle for Michael vs Jacob. Also note that Jose is the dominant male name in Texas in 1996. Arizona also has two Hispanic names (Jose and Angel) in the recent past.
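For readers who want to try this kind of tally themselves, here is a minimal Python sketch of the "most common name by state and gender" computation behind maps like these. It assumes each record has the layout state,sex,year,name,count, which should be checked against the actual Social Security file before use. The sample rows and their counts are invented for illustration only, and `top_names` is a hypothetical helper, not the method used by the authors of the GIFs.

```python
import csv
import io

# Invented sample rows in the ASSUMED layout: state,sex,year,name,count.
# The counts are made up for illustration; they are not real SSA figures.
SAMPLE = """\
TX,M,1996,Jose,2000
TX,M,1996,Michael,1900
TX,F,1996,Emily,1500
TX,F,1996,Ashley,1600
MI,M,1996,Jacob,1200
MI,M,1996,Michael,1100
"""


def top_names(lines):
    """Return {(state, sex, year): (name, count)} for the most common name."""
    best = {}
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        state, sex, year, name, count = row
        key = (state, sex, int(year))
        count = int(count)
        # Keep the name with the largest count seen so far for this key.
        if key not in best or count > best[key][1]:
            best[key] = (name, count)
    return best


result = top_names(io.StringIO(SAMPLE))
for key in sorted(result):
    print(key, result[key])
```

To run this on a real file, pass an open file handle in place of the `io.StringIO` sample. Ties are resolved in favor of whichever name appears first in the file.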

The third data visualization explores unisex names:
unisex names

Finally, think of these as data. We have a link to research on black first names as well as a post on the declining popularity of Mary.

Resources:
Big Data Tutorials, Neal Caren (University of North Carolina, Chapel Hill).

Google NGrams Viewer

Google NGrams Data
Note, we have downloaded quite a bit of this. See Data Service before you download another copy.

Social Security Names Database

A Wondrous GIF Shows the Most Popular Baby Names for Girls Since 1960
Rebecca Rosen | The Atlantic
October 18, 2013

America’s Most Popular Boys’ Names Since 1960, in 1 Spectacular GIF
Megan Garber | The Atlantic
October 24, 2013

The most unisex names in US history
Data Underload | FlowingData Blog
September 25, 2013

Visualizing Births and Deaths in Real-Time

Data visualizations are becoming more and more popular and sometimes they include demographic concepts. The following are two simulations of births and deaths – one for the US and the other for the world.

Click on the images to start the simulations. To read more about how these were made see references below:

us_map world_map

Watch This Anxiety-Provoking Simulation of U.S. Births and Deaths
John Metcalfe | The Atlantic Cities
December 11, 2012

This Map Shows Where in the World People Are Dying and Being Born
John Metcalfe | The Atlantic Cities
October 14, 2013

World Births/Deaths Simulation – Adding World Cities
Brad Lyon | Nowhere Near Ithaca Blog
October 9, 2013

Open Access Week: The Science Sting & Response

In celebration of Open Access Week, it is instructive to revisit the recent sting of open access journals reported in Science earlier this month. The purpose of the sting was to expose shoddy peer review in open access journals. The sting has been criticized on many points, mostly by open access advocates: (a) it was not a fair experiment, e.g., the sample was composed predominantly of predatory open access journals; (b) open access ≠ no peer review; (c) did the sting have IRB approval?; and (d) Science has a pretty poor record of publishing flawed papers and has a higher than average retraction rate.

Who’s Afraid of Peer Review?
John Bohannon | Science
October 4, 2013
A spoof paper concocted by Science reveals little or no scrutiny at many open-access journals.

Some Online Journals Will Publish Fake Science, For a Fee
Richard Knox | NPR
October 3, 2013
NPR was not critical of the study. It did interview Jeffrey Beall, an open access watchdog who maintains a list of predatory publishers and predatory journals:

Predatory Publishers | Predatory Journals

I confess, I wrote the Arsenic DNA paper to expose flaws in peer-review in subscription based journals
Michael Eisen | it is NOT junk blog
October 3, 2013
This starts out as a sarcastic post about a recent episode in Science’s history, when it published an extraordinary paper about a species that uses arsenic in its DNA instead of phosphorus. Eisen then criticizes the author of the sting for not including controls in the experiment, such as subscription-based journals. Eisen agrees that the peer review process is broken, but says the problem is not open access journals.

Who’s Afraid of Open Access?
Ernesto Priego | The Comics Grid Blog
October 4, 2013
This article reiterates the unscientific nature of the Science sting and then discusses open access journals in more detail.

Science Magazine Rejects Data, Publishes Anecdote
Bjorn Brembs | bjorn.brembs.blog
October 4, 2013
Brembs’s claim is that Science published a news story, not a peer-reviewed paper. He provides evidence that Science has one of the highest retraction rates in the entire industry (read for link) and does not want to publish scientific evidence of this. He also paints Nature with the same brush in a separate post.

The Troubled . . . & . . . . & the Blurry Line Between Human Subjects Research & Investigative Journalism
The Faculty Lounge
October 4, 2013
The title of this blog entry is way too long, but it has an IRB angle. The author suspects that Science regards this sting as investigative journalism rather than human subjects research.

Reproducibility Initiative: It’s not just for cancer

Reproducibility Initiative logo

The following are links to related efforts in Open Science. The first is about funding for a “Reproducibility Initiative” to validate 50 landmark cancer studies. Frankly, this can/should apply to population research as well. Included are links from The Economist and Nature about the importance of replication.

In general, there is a move towards “Open Science” across all disciplines. In fact, a different initiative, “The Reproducibility Project” is an effort to identify the predictors of reproducibility among published studies in psychology – a field that contributes far too much to the “Retraction Watch” website.

Reproducibility Initiative
Science Exchange News
October 16, 2013

Initiative gets $1.3 million to verify findings of 50 high-profile cancer papers
Richard Van Noorden | Nature News Blog
October 16, 2013

Unreliable research: Trouble at the lab
The Economist
October 19, 2013
Scientists like to think of science as self-correcting. To an alarming degree, it is not.

The governments of the OECD, a club of mostly rich countries, spent $59 billion on biomedical research in 2012, nearly double the figure in 2000. One of the justifications for this is that basic-science results provided by governments form the basis for private drug-development work. If companies cannot rely on academic research, that reasoning breaks down. When an official at America’s National Institutes of Health (NIH) reckons, despairingly, that researchers would find it hard to reproduce at least three-quarters of all published biomedical findings, the public part of the process seems to have failed.

If a job is worth doing, it is worth doing twice
Jonathan Russell | Nature
April 3, 2013

Reproducibility Project
Large-scale open collaboration to estimate the reproducibility of a sample of studies in psychology

Retraction Watch
Tracking retractions as a window into the scientific process

Center for Open Science
A non-profit organization, which provides infrastructure tools for open science.

Dear Congress: Why are you doing this?

It is likely that the government shutdown and debt ceiling crisis will be resolved this week, but there has still been harm to the data and research infrastructure. Bookmark this post and use it as notes for your next letter to your representatives.

The Government Shutdown was Temporary, Its Damage to Science Permanent
Andrew Rosenberg | Scientific American
October 18, 2013

Federally funded science allows us to do things as a country that we could never do alone. But the threat of shutdown, combined with inconsistent funding from Congress, leaves America’s scientific enterprise in the lurch.

Shutdown: It ain’t over when it’s over
Jeff Neal | Federal News Radio
October 15, 2013
The author notes that the shutdown is not a toggle switch, where we can easily switch the government back to “on.” There are many repercussions of the shutdown, detailed in the post.

Sunday Shutdown Reader: Harold Varmus on Self-Destruction in the Sciences
James Fallows | The Atlantic
October 13, 2013

Closed Question
Editorial | Nature
October 9, 2013
The US shutdown is damaging science, and Congress must be called to account.
There are more specific stories, linked at the end of this editorial. In case they don’t remain linked, here they are:
NASA missions struggle to cope with shutdown
08 October 2013
US Antarctic research season is in jeopardy
04 October 2013
NIH shutdown effects multiply
02 October 2013
US government shuts down
01 October 2013

Cancelled NIH study sections: a subtle, yet disastrous, effect of the government shutdown
Rafael Irizarry | StatsBlogs
October 10, 2013
(This article was originally published at Simply Statistics, and syndicated at StatsBlogs.)

The New York Times has a series of editorials, all tagged with “Government Shutdown.” I’ll link to one of them on funding the Census Bureau.

To Stop the Craziness in Washington, Fund the Census
Teresa Tritch | New York Times
October 4, 2013

And, finally, most readers of this blog probably received an Action Alert from the Population Association of America (PAA). When it shows up on the PAA website, I’ll link to it here.

Time for a Legal Prohibition on Data Re-identification?

This is a very thorough blog post on respondent re-identification issues. The author takes to task the re-identification rainmakers, who have made careers out of exposing re-identification risks – often overstating the risks. He calls for a well-designed legal prohibition on data re-identification.

In fact, many of the restricted data contracts PSC users operate under have an “inadvertent discovery” clause. Here’s the language from LAFANS, which prohibits broadcasting the “find” to others.

Ethical Concerns, Conduct and Public Policy for Re-Identification and De-identification Practice: Part 3 (Re-Identification Symposium)
Daniel Barth-Jones | Columbia University
October 2, 2013

A Sad Day for Demographers

The government shutdown has affected many government websites and data operations. And, in fact, it is useful to look at this sortable table to see how many furloughed workers there are by agency (Commerce is 87%). Below are links to some compilations – some might be useful as images to use in presentations:

[Slideshow of shuttered government websites] via @phylogenomics
[Fact-checking websites shutdown] via @PolitiFact

Here’s a re-cap by the WSJ on how to keep track of the economy during the government shutdown. Losing access to the federal statistical system is quite crippling.

How to Track the Economy During A Government Shutdown
Josh Mitchell and Jeffrey Sparshott | Wall Street Journal
October 1, 2013

Finally, there is a work-around for reaching shuttered government websites via the Internet Archive. However, the access is slow and some of the data-access tools don’t work. For instance, CDC Wonder gets stuck in an “I agree” loop.

internet archive feature

The internet archive has even hard-coded archival links to many of the shuttered websites, but I find this access slow – perhaps because everyone is using the same portal. Still, this is a useful link for posterity:

Blacked Out Government Websites Available Through Wayback Machine
Posted on October 2, 2013 by brewster
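If you would rather build an archival link yourself than go through the portal, the Wayback Machine accepts addresses of the form https://web.archive.org/web/TIMESTAMP/ORIGINAL-URL, where the timestamp is written as YYYYMMDDhhmmss. The Python sketch below only assembles such an address. It does not check that a capture exists, and the archive generally redirects to the capture nearest the requested time. The function name `wayback_url` and the default timestamp (the day before the shutdown began) are illustrative assumptions.

```python
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url, timestamp="20130930000000"):
    """Build a Wayback Machine address for a capture near `timestamp`.

    `timestamp` is a string of digits in YYYYMMDDhhmmss order. The default
    is an ASSUMED pre-shutdown moment, not a known capture time.
    """
    if not timestamp.isdigit() or len(timestamp) > 14:
        raise ValueError("timestamp must be up to 14 digits (YYYYMMDDhhmmss)")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


# Example: an archival address for the Census Bureau home page.
print(wayback_url("http://www.census.gov/"))
```

As noted above, interactive tools behind an archived page may still fail, since only the static pages are preserved.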

Counting Prisoners

The New York Times had another editorial on this issue:
Prison-Based Gerrymandering
Editorial Board | New York Times
September 26, 2013

A search on its site shows that this has been a common editorial/story topic:
[Counting Prisoners Editorials/Stories]

The PSC Infoblog has had a previous post on this topic as well, which included the Census Bureau’s response to the issue. The Census Bureau released group quarters data in time for redistricting.

Another excellent source on this topic is the National Academy of Sciences book, which is available in the PSC library:
Once, Only Once and in the Right Place: Residence Rules in the Decennial Census
Daniel Cork and Paul Voss, Editors | The National Academies
2006