Navigation Warning:
Use the navigation tools in the program to move between forms. Do not use your browser's back button, as the results will be inconsistent across different browsers and configuration settings.
User ID
The registration information is necessary for communication. When the extract is complete, an e-mail will be sent to the address you registered. The other information we collect is for development purposes: we need to know what kinds of users we are serving so that we can serve you better.
Data Selection
The census is collected on a decennial basis (2000, 1990, 1980, etc.). Within each decennial year, there is a choice of sampling frames (.1%, 1%, and 5%). The two major distinctions between these sampling frames are size and geography.
The 5% files are large. For example, the numbers of person records in the 1990 samples are as follows:
Sample    Person records
.1%       250,013
1%        2,500,052
5%        12,501,046
If one is looking at a fairly specialized population, such as black female nurses between ages 40 and 49 (n=1,426 in the 1990 5% file) or the white population within a selected geography, such as Central Harlem (n=2,864 in the 1990 5% file), one probably will want to use the 5% file. Otherwise, one often will have enough records with the 1% or .1% files.
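To see why a specialized population usually calls for the 5% file, one can scale the 5% counts quoted above to the other sampling fractions. The following is a rough sketch in Python that assumes counts scale in proportion to the sampling fraction; the function name is made up for this illustration, and actual counts in the 1% and .1% files will differ because of sampling variability.

```python
# Observed counts in the 1990 5% file, taken from the text above.
observed_5pct = {
    "black female nurses aged 40-49": 1426,
    "white population, Central Harlem": 2864,
}

# Sampling fractions of the three files.
fractions = {".1%": 0.001, "1%": 0.01, "5%": 0.05}


def expected_records(count_in_5pct, target_fraction):
    """Scale a 5% count to the count expected at another sampling fraction.

    Assumes simple proportional scaling (an approximation).
    """
    return count_in_5pct * target_fraction / fractions["5%"]


for group, n in observed_5pct.items():
    for name, frac in fractions.items():
        estimate = expected_records(n, frac)
        print(f"{group}: about {estimate:.0f} records in the {name} file")
```

Under this assumption, the nurse example drops to roughly 285 records in the 1% file and fewer than 30 in the .1% file, which is often too few for detailed analysis.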
The second distinction between the 1% and the 5% files is census geography. The 1% files are best used for metropolitan or national analyses. The smaller building blocks for the sub-metropolitan areas (PUMAs in 1990 and county groups in 1970 and 1980) do not usually cross metropolitan boundaries. However, these sub-metropolitan areas can cross state boundaries, which makes the 1% files less suited for state-level analyses.
The 5% and .1% files are state files. If one is interested in making comparisons across states, use the 5% or .1% files. If the 5% file is too large and the .1% file is too small, contact us about sampling from the 5% file to create a 1% state file.
The Census Bureau's Plans for the Census 2000 Public Use Microdata Sample Files describes the differences between the 1% and 5% files in 2000. Specific examples of some of the item differences can be found in Differences in Item Detail across the National (1%) and State (5%) Files.
State Selection
You can select all states, a single state, or several states, whichever best meets your needs.
Variable Display Options
There are over 200 items in the census. To simplify the variable display, we have provided three sort orders and two levels of detail on variables (some items or all items).
Subject order groups the variables in categories so that similar variables are grouped with each other.
Codebook order follows the order that the items are presented in the technical documentation.
Alphabetical order puts the items in ascending order, using the item name as the sorting key.
The depth choice controls the number of items presented:
Favorites is a pre-selection of the most popular or widely used variables, about half of the items. It is the recommended choice because it cuts down on the number of items in the display, but users are encouraged to look at the all items choice to see what sorts of variables are being excluded. For instance, allocation items are not in favorites.
All items includes all the variables.
Select the items you wish to extract by clicking in the box next to the variable name. If you are unsure of the coding of an item, you can get the coding by clicking on the variable name. This will give a link to the documentation associated with an item.
For instance, using items from the 2000 data, if one clicks on RELATE, the following information appears in a pop-up window:
RELAT Relationship
01 Householder
02 Husband/wife
03 Natural born son/daughter
04 Adopted son/daughter
05 Stepson/Stepdaughter
06 Brother/sister
07 Father/mother
08 Grandchild
09 Parent-in-law
10 Son-in-law/daughter-in-law
11 Other relative
12 Brother-in-law/sister-in-law
13 Nephew/niece
14 Grandparent
15 Uncle/aunt
16 Cousin
17 Roomer/boarder
18 Housemate/roommate
19 Unmarried partner
20 Foster child
21 Other nonrelative
22 Institutionalized GQ person
23 Noninstitutionalized GQ person
If you are completely unfamiliar with the data, you should probably print the data dictionary out or scroll through the on-line data dictionary.
The full technical documentation [PDF 4.3MB] provides definitions of items, explains geographic terms and concepts, and describes the accuracy of the data, including a discussion of disclosure limitation, data swapping, allocation, and sampling errors. It also contains the questionnaire, the data dictionary, and all appendices.
The filtering step allows you to restrict the extract to a sub-population. For instance, one can restrict the population to males between 25 and 64. To do so, you must have selected sex and age in the variable selection step.
In the filtering step, enter the code that represents males (1) and the lower and upper bounds of the age range (25 and 64):
Sex   1    1
Age   25   64
Note that codes must be entered at the full width of the item as shown in the documentation. To restrict the sample to householders (01) and their husbands/wives (02) on RELATE, enter the following:
RELATE 01 02
The program is making character comparisons, not numeric comparisons, so the following would not work:
RELATE 1 2
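The difference between character and numeric comparison can be illustrated with a short Python sketch. It shows only how string comparison behaves in general; the extract program's internals are not documented here, and the helper names `in_range` and `pad` are made up for this illustration.

```python
def in_range(value, low, high):
    """Inclusive range test on strings, compared character by character."""
    return low <= value <= high


def pad(code, width):
    """Zero-pad a numeric code to the item's documented width."""
    return str(code).zfill(width)


# RELATE is two characters wide, so its codes are "01", "02", ... "23".
print(in_range("01", "01", "02"))  # True: householder is kept
print(in_range("02", "01", "02"))  # True: husband/wife is kept
print(in_range("03", "01", "02"))  # False: other codes are excluded

# Bounds entered without the leading zero behave very differently.
print(in_range("01", "1", "2"))    # False: "0" sorts before "1"
print(in_range("10", "1", "2"))    # True: an unintended code slips through

# Padding to the documented width gives the intended bounds.
print(pad(1, 2), pad(2, 2))        # 01 02
```

With unpadded bounds, a string comparison would miss codes 01 and 02 entirely and instead match codes 10 through 19, which is why the full item width matters.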
If you forgot to include a variable that you want to select on, use the navigation tool at the bottom of the filtering form.
Select the additional filtering variable(s) that you need and then continue on to [Sample Selection].
In this section, you have one more opportunity to edit your set-up file. If everything is correct, submit the job. Otherwise, use the navigation tools to modify your choices.
When your job is complete, you will receive an e-mail notification. Download the data and data map. The data map describes the location of the variables in the extract.
At the moment, we do not provide SAS, SPSS, or Stata statements.
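Without ready-made statements, one way to read the extract is to slice each record according to the data map. The following Python sketch assumes the extract is a fixed-width text file and that the data map gives a start column and a width for each variable. The layout, the column positions, and the sample records below are entirely hypothetical; substitute the values from your own data map.

```python
import io

# Hypothetical data map: variable name -> (start column, width), 1-based.
# Replace these positions with the ones listed in your own data map.
data_map = {"SEX": (1, 1), "AGE": (2, 2), "RELATE": (4, 2)}


def parse_record(line, layout):
    """Slice one fixed-width record into a dict of string fields."""
    return {
        name: line[start - 1:start - 1 + width]
        for name, (start, width) in layout.items()
    }


# Two made-up records standing in for the downloaded extract file.
sample = io.StringIO("13501\n22902\n")
records = [parse_record(line.rstrip("\n"), data_map) for line in sample]

for record in records:
    print(record)
```

Fields are kept as strings so that zero-padded codes such as 01 keep their documented width; convert to numbers only for items that are truly numeric, such as age.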
If your job fails, read the suggested solutions in the e-mail notification. If you still have problems or would like to make suggestions to us, send e-mail to psc-census@umich.edu.
Next: Create an Extract Job