Provider: Inter-university Consortium for Political and Social Research (ICPSR)/National Archive of Criminal Justice Data
Publication: Disclosure Requirements
Language from Contract:
VI. G. To avoid inadvertent disclosure of private persons by being knowledgeable about what factors constitute disclosure risk and by using disclosure risk guidelines, such as but not limited to, the following guidelines in the release of statistics or other content derived from the Confidential Data.
1. No release of a sample unique for which only one record in the Confidential Data obtained through sampling (e.g., not a census) provides a certain combination of values from key variables. For example, in no table should all cases in any row or column be found in a single cell.
2. No release of a sample rare for which only a small number of records (e.g., 3, 5, or 10 depending on sample characteristics) in the Confidential Data provide a certain combination of values from key variables. For example, in no instance should the cell frequency of a cross-tabulation, a total for a row or column of a cross-tabulation, or a quantity figure be fewer than the appropriate threshold as determined from the sample characteristics. In general, assess empty cells and full cells for disclosure risk stemming from sampled records of a defined group reporting the same characteristics.
3. No release of a population unique for which only one record in the Confidential Data that represents the entire population (e.g., from a census) provides a certain combination of values from key variables. For example, in no table should all cases in any row or column be found in a single cell.
4. No release of the statistic if the total, mean, or average is based on fewer cases than the appropriate threshold as determined from the sample characteristics.
5. No release of the statistic if the contribution of a few observations dominates the estimate of a particular cell. For example, in no instance should the quantity figures be released if one case contributes more than 60 percent of the quantity amount.
6. No release of data that permits disclosure when used in combination with other known data. For example, unique values or counts below the appropriate threshold for key variables in the Confidential Data that are continuous and link to other data from ICPSR or elsewhere.
7. No release of minimum and maximum values of identifiable characteristics (e.g., income, age, household size, etc.) or reporting of values in the "tails," e.g., the 5th or 95th percentile, from a variable(s) representing highly skewed populations.
8. Release only weighted results if specified in the data documentation.
9. No release of ANOVAs and regression equations when the analytic model that includes categorical covariates is saturated or nearly saturated. In general, variables in analytic models should conform to disclosure rules for descriptive statistics (e.g., see #7 above) and appropriate weights should be applied.
10. In no instance should data on an identifiable case, or any of the kinds of data listed in preceding items 1-9, be derivable through subtraction or other calculation from the combination of
11. No release of sample population information or characteristics in greater detail than released or published by the researchers who collected the Confidential Data. This includes but is not limited to publication of maps.
12. No release of anecdotal information about a specific private person(s) or case study without prior approval.
13. The above guidelines also apply to charts as they are graphical representations of cross-tabulations. In addition, graphical outputs (e.g., scatterplots, box plots, plots of residuals) should adhere to the above guidelines.
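
As a rough illustration of how guidelines 1 through 5 might be screened before release, the following Python sketch checks for small cells, for rows or columns whose cases all fall in a single cell, and for dominance by a single case. The function names, the record layout (a list of dictionaries), and the default threshold of 5 are assumptions made for this example only. The contract sets the 60 percent dominance limit but leaves the small-cell threshold to be determined from the sample characteristics. The sketch also assumes non-negative counts and quantities, and it covers only these three checks, not the full set of guidelines.

```python
from collections import Counter


def small_cells(records, keys, threshold=5):
    """Return combinations of key-variable values held by fewer than
    `threshold` records (guidelines 1-4). `threshold` is an assumed
    default; the contract says to derive it from sample characteristics."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return {combo: n for combo, n in counts.items() if n < threshold}


def concentrated_rows(table):
    """Return indices of rows whose cases are all found in a single cell
    (the example given in guidelines 1 and 3). `table` is a list of rows
    of non-negative counts. For columns, pass the transpose."""
    return [
        i for i, row in enumerate(table)
        if sum(row) > 0 and max(row) == sum(row)
    ]


def dominated(values, max_share=0.60):
    """True if one case contributes more than `max_share` of the total
    quantity (guideline 5). Assumes non-negative values."""
    total = sum(values)
    if total <= 0:
        return False
    return max(values) / total > max_share


# Hypothetical records: six cases share one combination, two share another.
records = [{"sex": "F", "region": "N"}] * 6 + [{"sex": "M", "region": "S"}] * 2
print(small_cells(records, ["sex", "region"]))   # {('M', 'S'): 2}

# Hypothetical 2x2 cross-tabulation.
table = [[5, 0], [3, 4]]
print(concentrated_rows(table))                              # [0]
print(concentrated_rows([list(c) for c in zip(*table)]))     # [1]

print(dominated([70, 20, 10]))   # True: one case holds 70 percent
print(dominated([40, 30, 30]))   # False
```

Any cell, row, column, or quantity flagged by such a check would need to be suppressed, collapsed into a broader category, or otherwise reviewed before release. Guideline 10 additionally requires confirming that suppressed values cannot be recovered by subtraction from released totals.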