Author Archive for lisan

Missing girls in China maybe weren’t missing after all

China has had a highly unbalanced sex ratio at birth for years, leading to an estimate of 30 to 60 million missing girls. The traditional explanation was male preference, exacerbated by the one-child policy, which led to sex-selective abortion and/or infanticide. New research presents evidence that the missing girls may never have been missing after all.

Researchers may have ‘found’ many of China’s 30 million missing girls
Simon Denyer | Washington Post
November 30, 2016

Delayed Registration and Identifying the “Missing Girls” in China
Yaojiang Shi and John James Kennedy | China Daily
November 15, 2016

Detroitography Mapping Seminar

A 2-hour workshop on mapping data from Detroit is offered on campus on Thursday, November 3rd. Perhaps just as useful is meeting the presenter, the founder of Detroitography.com, a group that is all about the maps and geography of Detroit. That also means geographically referenced data.

Useful Links
Workshop: Link to workshop
Website: Detroitography.com
Twitter: https://twitter.com/detroitography
Detroit Opendata: http://detroitdata.org/

Creating a travel time polygon

Using the TravelTime Search API to Generate an Isochrone
GIS Lounge | GIS contributor
July 9, 2016
Using the TravelTime platform and some simple code, researchers can map how far people can travel in 30 minutes by public transportation from a specific address. This is more realistic than a radius circle, which does not take into account roads, bus routes, etc.

The TravelTime platform covers several countries, including the US coasts.
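
To illustrate why a travel-time polygon differs from a radius circle, here is a minimal sketch that does not call the TravelTime API at all. It assumes a tiny made-up transit network (the stop names and minutes are hypothetical) and uses Dijkstra's shortest-path algorithm to find the stops reachable within a time budget. An isochrone is essentially the outline drawn around such a reachable set.

```python
import heapq

# Hypothetical transit network: travel minutes between made-up stops.
NETWORK = {
    "home": {"stop_a": 10, "stop_b": 25},
    "stop_a": {"home": 10, "stop_c": 15, "stop_d": 30},
    "stop_b": {"home": 25, "stop_c": 10},
    "stop_c": {"stop_a": 15, "stop_b": 10},
    "stop_d": {"stop_a": 30},
}


def reachable_within(network, origin, budget_minutes):
    """Return {stop: minutes} for every stop reachable within the budget."""
    best = {origin: 0}
    queue = [(0, origin)]
    while queue:
        minutes, stop = heapq.heappop(queue)
        if minutes > best.get(stop, float("inf")):
            continue  # stale queue entry, a faster route was already found
        for neighbor, leg in network[stop].items():
            total = minutes + leg
            if total <= budget_minutes and total < best.get(neighbor, float("inf")):
                best[neighbor] = total
                heapq.heappush(queue, (total, neighbor))
    return best


within_30 = reachable_within(NETWORK, "home", 30)
print(within_30)
```

In this toy network, stop_d is left out because the only route to it takes 40 minutes, however close it might sit on a map. A radius circle would have no way of knowing that.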

Accidental Data Librarian Webinar Series

Help! I’m an Accidental Government Information Librarian Webinars

These monthly webinars out of the North Carolina Library Association provide a good introduction to all sorts of data products by subject experts: APIs, mapping, UN data, global trade, court records, etc. Folks can sign up and watch the presentation in real time or as a recording. Slides are available for all presentations.

Jeremy Darrington’s webinar on election data is up on YouTube. You can also see his slides and links on Slideshare.

Altmetrics: What are they and/or should I care?

Altmetrics are metrics and qualitative data that are complementary to citation-based metrics. Some argue that these metrics should be considered in tenure decisions, along with the more traditional metrics of publishing in a high-impact journal with many citations. Altmetrics cover a wider range of materials than just those in professional journals: websites, blogs, and materials in repositories like figshare or GitHub. They also cover more than citations, such as views and downloads.

How to Use Altmetrics to Showcase Engagement Efforts for Promotion and Tenure
Stacy Konkiel | Altmetric Blog
October 18, 2016

This blog post from the Altmetric site shows how Altmetrics can be incorporated into a traditional tenure document.

And an even more informative article on Altmetrics is a summary written by Yan Fu in a PSC news report:
Altmetrics: New Ways to Measure the Impact of Research Products
Yan Fu | PSC Center News
April 2014

The Undercover Historian

[Link to Undercover Historian blog]

The Undercover Historian
Beatrice Cherrier | blog
since 2011

This is a blog by Beatrice Cherrier, a historian of economics. It has been in existence since 2011 and has a wealth of information about the history of the field of economics. And no, I don’t know what her quote about “pig-headed” is referencing.

Weighting Makes a Difference

How One 19-Year-Old Illinois Man Is Distorting National Polling Averages
New York Times | Nate Cohn @Nate_Cohn
October 12, 2016

Trump support

[Link to NYT article]

This is a nice illustration of the decisions pollsters make when they weight their respondents. The author disagrees with the decisions of the USC/LA Times pollsters, but applauds them for transparency:

It’s worth noting that this analysis is possible only because the poll is extremely and admirably transparent: It has published a data set and the documentation necessary to replicate the survey.

The article has multiple illustrations of what the trend of national Trump support would have been with different weighting decisions. Check it out.
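
To make the weighting point concrete, here is a minimal sketch with made-up numbers. It is not the poll's data or its weighting scheme. The panel, the weights, and the function name `weighted_share` are all hypothetical. It only shows how a single heavily weighted respondent can move a weighted average.

```python
def weighted_share(responses, weights):
    """Weighted share of respondents whose response is True."""
    total = sum(weights)
    return sum(w for r, w in zip(responses, weights) if r) / total


# Hypothetical panel: 10 respondents, 4 of whom support the candidate.
supports = [True, True, True, True, False, False, False, False, False, False]

# Everyone counts the same.
equal_weights = [1.0] * 10

# Same panel, but the first supporter stands in for an underrepresented
# group and is given 10 times the weight of everyone else.
heavy_weights = [10.0] + [1.0] * 9

unweighted = weighted_share(supports, equal_weights)  # 4 / 10
reweighted = weighted_share(supports, heavy_weights)  # 13 / 19

print(f"equal weights: {unweighted:.1%}")
print(f"one heavy weight: {reweighted:.1%}")
```

With equal weights, support is 4 out of 10. With one respondent weighted tenfold, the same answers give 13 out of 19, so a change of mind by that one person would swing the estimate far more than anyone else's would.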

Big Data is not about the Data

Gary King, Director of the Institute for Quantitative Social Science at Harvard University, spoke at a recent Michigan Institute for Data Science (MIDAS) symposium. Below are links to the slides and a video of the presentation.
Slides | Video

For those who don’t want to watch the entire presentation, here are links to specific papers and/or software he mentions in the presentation.

Automated Text Analysis
VA: Verbal Autopsy [software]
Evaluating U.S. Social Security Administration Forecasts
Learning Catalytics [commercial start-up]
Crimson Hexagon: Social Media Insights [commercial start-up]
Perusall [commercial start-up, e-book platform to increase student engagement]

And, it might be more productive to just go through King’s personal website to find the content yourself. The above is just a fraction of his productivity.

Data Sleuths at the Census Bureau

The Census Bureau gathered data on fertility by asking a “children ever born” question from 1940 to 1990 in the decennial census. The 2000 Census did not ask a fertility question at all. With the advent of the American Community Survey, fertility was covered but with a different question. It asked if a woman had given birth to a child in the past year. This allows researchers to compute a total fertility rate. It performs reasonably well against the measure produced from the vital statistics system. And, given that geography is not readily available with the natality detail files anymore, this is a welcome solution. The main drawback to the ACS question is that the reference year will not span the calendar year that the vital statistics system is based on. Only the December respondents are referencing a January to December calendar year. See the Background section below for a further discussion of this.

Recently, however, the Census Bureau noticed some anomalies in the data for selected areas and determined that some interviewers had been sloppy and asked “Have you given birth” rather than “Have you given birth in the last year.” Many more women will answer yes to the former, which inflates the numerator. This is a good illustration of how much effort the Census Bureau puts into producing accurate and robust statistics.
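
As a rough sketch of the arithmetic involved, a total fertility rate can be computed from a "birth in the past year" question by summing age-specific fertility rates and multiplying by the width of the age groups. The counts below are invented for illustration, not ACS estimates, and the age groups and the name `total_fertility_rate` are assumptions. The second calculation mimics what happens when one age group is effectively asked "Have you given birth" instead.

```python
# Hypothetical counts by five-year age group:
# (women in the group, women reporting a birth in the past year).
COUNTS = {
    "15-19": (1000, 20),
    "20-24": (1000, 80),
    "25-29": (1000, 110),
    "30-34": (1000, 100),
    "35-39": (1000, 50),
    "40-44": (1000, 10),
    "45-49": (1000, 0),
}


def total_fertility_rate(counts, width=5):
    """Sum the age-specific fertility rates and scale by the age-group width."""
    return width * sum(births / women for women, births in counts.values())


tfr = total_fertility_rate(COUNTS)

# Same data, except women aged 40-44 answer "ever given birth" instead
# of "gave birth in the past year" (800 say yes instead of 10).
miscoded = dict(COUNTS)
miscoded["40-44"] = (1000, 800)
tfr_miscoded = total_fertility_rate(miscoded)

print(f"TFR: {tfr:.2f}")
print(f"TFR with the miscoded question: {tfr_miscoded:.2f}")
```

In this made-up example the rate jumps from about 1.85 to about 5.8 children per woman, which is the kind of anomaly that would stand out against previous years for the same area.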

Data Sleuthing
Addressing Data Collection Errors in the Fertility Question in the American Community Survey
Tavia Simmons | Census Bureau
August 2016

In recent years, a few geographic areas in the American Community Survey (ACS) data had unusually high percentages of women reported as giving birth in the past year, quite unlike what was seen in previous years for those areas. This paper describes the issue that was discovered, and the measures taken to address it.

Background
Indicators of Marriage and Fertility in the United States from the American Community Survey: 2000 to 2004
T. Johnson and J. Dye | Census Bureau
May 2005
[ppt]

Slides 23 to 26 discuss and illustrate how the ACS and Vital Statistics estimates diverge from each other.

IRS Migration Data Report Tool

IRS map

This is a nice tool for getting net migration reports based on IRS tax return data. Note that because these data are based on tax returns, one can also tell whether, on average, a state is losing/gaining wealthier residents. One can generate reports for counties by state or for states. The former is really tedious because one has to generate the county reports one by one.

Tool Link
Counties | States

And here’s the link to raw data for those who find widgets tedious. Note that the site has nice explanations for the methodology, including changes over time in how these files are created: SOI Tax Stats – Migration Data
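
For those working from the raw files, the core calculation is simple. Here is a minimal sketch, assuming you have already pulled inflow and outflow totals for one state. The `Flow` class, its field names, and the numbers are hypothetical, not the layout of the IRS files.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    returns: int        # number of tax returns that moved
    agi_thousands: int  # aggregate adjusted gross income, in $1,000s


def summarize(inflow, outflow):
    """Net migration in returns, plus average income per return each way."""
    return {
        "net_returns": inflow.returns - outflow.returns,
        "avg_agi_in": inflow.agi_thousands / inflow.returns,
        "avg_agi_out": outflow.agi_thousands / outflow.returns,
    }


# Made-up totals for one state in one year.
state_in = Flow(returns=50_000, agi_thousands=3_000_000)
state_out = Flow(returns=60_000, agi_thousands=4_200_000)

summary = summarize(state_in, state_out)
print(summary)
```

In this invented case the state loses 10,000 returns on net, and the average return leaving reports more income than the average return arriving, which is the "losing wealthier residents" pattern.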

And do you want to know how to make something like the map above? Here’s a link from Flowing Data on how to make a similar map based on five years of county-to-county IRS data:
Article | How To Guide