Monthly Archive for December, 2017

Deja vu: Citizenship question or not?

This question comes up almost every decade. Should there be a citizenship question on the census short form? See previous posts on this:
A Trump Executive Order
Legislative districts should be based on voters not people
Vitter amendment
Louisiana loses a seat and a suit
Louisiana’s arguments for excluding non-citizens
Constitutionality of Excluding Aliens via CRS

This time around, the argument is that the Trump administration needs counts of citizens to make sure the Voting Rights Act is properly administered. See below for details:
Trump Justice Department Pushes for Citizenship Question on Census, Alarming Experts
Justin Elliott | ProPublica
December 29, 2017
“This is a recipe for sabotaging the census”

Re-printed in HuffingtonPost and Salon

DOJ pushing for citizenship question on census forms: report
Julia Manchester | The Hill
December 29, 2017

The Department of Justice (DOJ) is asking the Census Bureau if a question on citizenship status could be added to 2020 census forms, according to a letter first reported by ProPublica on Friday.

The DOJ letter, dated Dec. 12, said including a question on citizenship would allow the department to better enforce the Voting Rights Act.

“To fully enforce those requirements, the Department needs a reliable calculation of the citizen voting-age population in localities where voting rights violations are alleged or suspected,” the letter said.

However, critics say including a question on immigration could prevent immigrants from participating in the census due to fears the government could use the information against them.

And, here’s the letter – notice the addressee’s title. The Census Bureau needs a director.

Data Visualization: Intro with Data

Data Visualization in Social Science
Kieran Healy
Princeton University Press (2018) or on-line via

This book is a hands-on introduction to the principles and practice of looking at and presenting data using R and ggplot. The on-line version includes links to the data used in the exercises – as well as how to fetch it and load all the other R packages you’ll need. The book will be published by Princeton University Press in 2018.

The author will be visiting UM this winter as part of the Computational Social Science Initiative at Michigan. [See post on “text as data”].

The book’s cover uses one of the posters from the Negro Exhibit of the American Section at the Paris Exposition. If you are not familiar with this early data visualization work by W.E.B. Du Bois, we have an earlier PSC Info Post about it.

[Link to W.E.B. DuBois infographics at the Library of Congress]

Time for some 2020 Apportionment Scenarios

The Census Bureau released its 2017 Population Estimates yesterday, and the results show that the Mountain West states are the fastest-growing states. Watch out, Rhode Island. You may not hang on to that 2nd Congressional seat much longer. But, will it go to Montana, which lost its 2nd Congressional seat after the 1990 Census?

Try out the PSC Census Apportionment Calculator and experiment with various scenarios. One nice scenario is giving Florida 500,000 Puerto Ricans. That tips it to gaining 2 Congressional Seats – assuming July 2017 = December 2020. Here’s a spreadsheet to start with – be sure to delete D.C.

Click here for the PSC Apportionment Calculator
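
If you would rather script a scenario than click through one, here is a minimal sketch of the method of equal proportions (Huntington-Hill), the method used by law to apportion the U.S. House. It assumes the PSC calculator follows the same method. The function name `apportion` and the three toy "states" with their populations are made up for illustration; swap in the 50 state estimates (no D.C.) and the default of 435 seats to try a real scenario.

```python
import heapq
import math


def apportion(populations, seats=435):
    """Allocate seats by the method of equal proportions (Huntington-Hill).

    populations: dict mapping state name -> population (hypothetical inputs).
    Every state starts with one seat; each remaining seat goes to the state
    with the highest priority value pop / sqrt(n * (n + 1)), where n is the
    number of seats that state already holds.
    """
    alloc = {state: 1 for state in populations}
    # heapq is a min-heap, so priorities are stored as negatives.
    heap = [(-pop / math.sqrt(1 * 2), state) for state, pop in populations.items()]
    heapq.heapify(heap)
    for _ in range(seats - len(populations)):
        _, state = heapq.heappop(heap)
        alloc[state] += 1
        n = alloc[state]
        heapq.heappush(heap, (-populations[state] / math.sqrt(n * (n + 1)), state))
    return alloc


# Toy scenario with made-up numbers: 10 seats, three states.
baseline = apportion({"A": 6_000_000, "B": 3_000_000, "C": 1_000_000}, seats=10)
# Same scenario after adding 1,000,000 people to state B.
scenario = apportion({"A": 6_000_000, "B": 4_000_000, "C": 1_000_000}, seats=10)
print(baseline)
print(scenario)
```

In the toy numbers, the extra million people in state B pull a seat away from state A, which is the same kind of shift the Florida scenario above explores.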

Data & Tools to study Global Inequality

The following is a news post based on the rich database from the World Wealth & Income Database. Read the article and then replicate it. And, then do some independent work.

It’s an Unequal World. It Doesn’t Have to Be.
Eduardo Porter and Karl Russell | New York Times
December 14, 2017

More coverage of the data in the popular press here.

World Wealth & Income Database
Click for access to data, methodology, country trends, working papers, FAQ, etc.

The World Inequality Report comes from this team of researchers.

Archiving Social Media Data

Documenting the Now
This website has user-friendly means of collecting and preserving digital content. There are several tools & resources:
Hydrator | Twarc | Diff Engine | Tweet Catalog

This blurb from the Tweet Catalog link says it all:

Twitter’s terms of service don’t allow tweet datasets to be published on the web, but they do allow tweet identifier datasets to be shared. This speaks to users’ rights as content creators, while also allowing researchers to share their data with others.

This site is a catalog of datasets that are publicly available on the web. If you would like to turn these tweet identifier datasets back into the original JSON, first download the dataset and then use the Hydrator desktop application, or Twarc if you are comfortable working at the command line.

You can add your own datasets to the catalog by following these instructions. If you’d like updates when datasets are added please subscribe to the RSS feed. All metadata listed here is licensed CC0. You may want to refer to our code of conduct if you have questions or concerns about the datasets we list here.
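
Going the other direction (from full tweet JSON down to a shareable list of identifiers) is simple enough to sketch by hand. The following is a minimal illustration, not the Twarc or Hydrator implementation. It assumes one tweet per line of JSON with an `id_str` field, as in Twitter's v1.1 tweet format; the function name `dehydrate` and the two sample tweets are made up.

```python
import json


def dehydrate(jsonl_lines):
    """Yield the tweet identifier from each line of line-delimited tweet JSON.

    Assumes each non-blank line is one tweet object carrying an "id_str" field.
    """
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        yield tweet["id_str"]


# Two made-up tweets standing in for a collected dataset.
sample = [
    '{"id_str": "1001", "text": "first hypothetical tweet"}',
    '',
    '{"id_str": "1002", "text": "second hypothetical tweet"}',
]
tweet_ids = list(dehydrate(sample))
print("\n".join(tweet_ids))
```

The resulting list of identifiers is the part that can be shared; someone else would then use the Hydrator or Twarc to fetch the full tweets again.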

Bonus Material
Here are two recent articles that address the ethics of archiving data from Twitter as well as strategies for ethically archiving social media posts:

Towards an Ethical Framework for Publishing Twitter Data in Social Research: Taking into Account Users’ Views, Online Context and Algorithmic Estimation
M. Williams, P. Burnap and L. Sloan | Sociology
May 26, 2017

Archiving information from geotagged tweets to promote reproducibility and comparability in social media research
K. Kinder-Kurlanda, K. Weller and M. Zenk-Möltgen | Big Data & Society
November 1, 2017

Is your research protected?

Demography is not as politicized a field as climate science, but sometimes demographers/social scientists study topics that the public does not like. Or sometimes the public does not like the findings.

The following is a state-by-state report card on how states treat data and communication by researchers in the research process. How much do states protect researchers from open data requests above and beyond the FOIA protections included in some DUAs?

This is not bedtime reading, but good to have in your back pocket for reference. And, the author(s) are a good resource, even if their subject matter is climate science.

50 state report card
Executive Summary
Full report

Text as Data

The UM campus was lucky to have one of the developers of the Structural Topic Model (stm) come for a 1-day workshop on this R package. Participants got access to more than can be publicly posted on the PSC Info Blog, but there is still plenty to explore below. And, if I find I can post more of the workshop materials, I will update this post at a later time.

stm Vignette
This vignette is an annotated guide to using the stm package. Participants in the workshop got a more interactive vignette, which didn’t assume as much knowledge about R. But, assuming you have familiarity with R, this vignette walks the user through stm.

Publications based on stm
[scroll about halfway down for methods papers and two thirds for substantive papers]

The stm package assumes that several other packages have been installed. Here are a few:
Getting started with quanteda – An R package for managing and analyzing text

Additional Quanteda vignettes
These are additional vignettes. The more you work through these, the more you learn. Even better is to substitute your own data.

Getting started with tidyverse in R

Corpus – an R package for managing text

New Directions in Analyzing Text as Data Workshop
Materials for a 2-day workshop by Ken Benoit

Quantitative Text Analysis
Materials and exercises for a 4 session short course by Ken Benoit

Vital Statistics: the Puerto Rico edition

When President Trump visited Puerto Rico after Hurricane Maria, he noted that only 16 people had died and compared that to the death toll after Hurricane Katrina:

“Sixteen people certified,” Trump said on October 3 during his visit to the island, repeating a figure confirmed by the territory’s governor. “Everybody watching can be very proud of what’s taken place in Puerto Rico.”

This was an improbably low number and immediately there were some posts suggesting that looking at vital statistics records could clarify this:

Everything that’s been reported about deaths in Puerto Rico is at odds with the official count
Eliza Barclay and Alexia Fernandez Campbell | Vox
October 11, 2017
According to this article, this is the method used to certify the 16 deaths:

. . . “every death must be confirmed by the Institute of Forensic Science, which means either the bodies have to be brought to San Juan to do an autopsy or a medical examiner must be dispatched to the local municipality to verify the death”

Methodology suggested by John Mutter, Columbia University

. . . count all the deaths in the time since the event, and then compare that number to the average number of deaths in the same time period from previous years. Subtract the average number from the current number and that’s the death toll.
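
The arithmetic behind this method fits in a few lines. Here is a minimal sketch; the function name `excess_deaths` and all of the death counts are made-up numbers for illustration, not figures from Puerto Rico's vital registration system.

```python
from statistics import mean


def excess_deaths(current_deaths, prior_year_deaths):
    """Deaths in the period since the event, minus the average for the
    same period in previous years."""
    baseline = mean(prior_year_deaths)
    return current_deaths - baseline


# Hypothetical counts for the same calendar period in three earlier years.
prior = [2400, 2500, 2600]
# Hypothetical count for the period following the event.
current = 2900

toll = excess_deaths(current, prior)
print(toll)
```

With these made-up inputs the baseline is 2,500 deaths, so the estimated toll is the 400 deaths above that average.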

Rather than waiting for a year for Puerto Rican deaths to show up on the CDC website, investigators went to Puerto Rico to look at the vital registration system. Here’s a sampling.

Estimates of excess deaths in Puerto Rico following Hurricane Maria
Alexis Santos-Lozada and Jeffrey Howard | SocArXiv Papers
November 21, 2017

Nearly 1,000 More People Died in Puerto Rico after Hurricane Maria
Center for Investigative Journalism
December 7, 2017

Official Toll in Puerto Rico: 64. Actual Deaths May be 1,052
Frances Robles, Kenan Davis, Sheri Fink, and Sarah Almukhtar | New York Times
December 9, 2017
And, this one wins the prize for the best graphics. The first graph shows the excess deaths over previous years; the second is a table that shows causes of death – not just drownings and electrocutions, which are typical for hurricane events.

[Graph: excess deaths over previous years]

[Table: causes of death]

PAA writes a letter

The Population Association of America and the Association of Population Centers wrote a letter to the Trump administration on behalf of 40 federally funded population centers and 3,000 scientists in opposition to the potential nomination of Thomas Brunell to be the Deputy Director of the Census Bureau. For more details about the politicization of this appointment, see this earlier post.

[Deputy Director Letter: December 5, 2017]

We join other scientific organizations in calling on the Administration to promptly submit to the United States Senate a qualified nominee to serve as the Director of the U.S. Census Bureau and to reserve the agency’s Deputy Director position for a qualified candidate who can help lead the agency during these critical years leading up to the 2020 Census.

This letter was also noted in the Washington Post, which is good as most people don’t go trolling the PAA website.

Apparent White House pick to lead census sparks concern about partisanship
Tara Bahrampour | Washington Post
December 7, 2017