Merging data files with duplicate variable names

If you merge two files that contain variables with the same name, SAS keeps only one variable with that name in the merged file, so the first file's values are overwritten by the second file's. This happens even if the two variables hold different values.

To prevent this, check the variable names in both data files before merging. If a name appears in both files but the values differ, rename the variable in one of them.
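One way to do the renaming is the RENAME= data set option on the MERGE statement, which changes the name only as the file is read. The following is a minimal sketch, assuming two data sets a and b that are already sorted by state and that both contain a variable named black; the new name pct_black is hypothetical.

```sas
/* Rename b's copy of black as it is read, so it cannot
   overwrite a's copy. pct_black is a hypothetical name. */
data c;
merge a b(rename=(black=pct_black));
by state;
run;
```

After this step, c should contain both black (from a) and pct_black (from b).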

The following is an example of two data files that both have a variable named 'black'. In file a, a new variable, bdummy, is created as a copy of black before the merge, so that nothing is lost when the files are merged.

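data a;
infile 'a.dat';
input id state race black age sex earnings;
bdummy = black; /* keep a copy of a's value of black */
run;
data b;
infile 'aggreg.dat';
input state white black asian namer other;
run;
/* a merge with a BY statement needs both files sorted by the BY variable */
proc sort data=a; by state; run;
proc sort data=b; by state; run;
data c;
merge a b; by state;
run;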

Note that the file order is not determined by the order in which the files are read by SAS. The file order is determined by the merge statement.

data merge1;
merge a b; by state;
run;
data merge2;
merge b a; by state;
run;

In the first example, 'a' is the first data file and 'b' is the second. In the second example, the reverse is true.
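The effect of file order can be seen with a small test. The sketch below uses made-up inline data with one row per state in each file (the values are illustrative only, not taken from a.dat or aggreg.dat), and the data set names ab and ba are hypothetical.

```sas
/* Hypothetical one-row-per-state files; both contain black. */
data a;
input state black;
datalines;
10 1
20 0
;
run;

data b;
input state black;
datalines;
10 0.25
20 0.40
;
run;

/* b is listed last, so black in ab comes from b. */
data ab;
merge a b; by state;
run;

/* a is listed last, so black in ba comes from a. */
data ba;
merge b a; by state;
run;

proc print data=ab; run;
proc print data=ba; run;
```

Comparing the two printouts shows which file's values of black survived each merge.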

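If you read your log file carefully, you can spot duplicate variable names across data files. The log reports the number of variables in each data set. When the files share only a single BY variable, the merged file should have one fewer variable than the two data sets have in total, because the BY variable is counted once. If the merged file has fewer variables than that, you have a duplicate variable problem. In the example above, a has 8 variables and b has 6, so c would have 13 if only state were shared; it has 12 because black is also in both files.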
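Rather than counting variables in the log, you can also list the shared names before merging. The following is a sketch, assuming the two data sets are named a and b and live in the WORK library; it queries the DICTIONARY.COLUMNS table that PROC SQL provides, and varname is just a label chosen for the output column.

```sas
/* List variable names that appear in both WORK.A and WORK.B.
   The BY variable is expected to show up; any other name
   listed is a duplicate that will be overwritten in a merge. */
proc sql;
select upcase(name) as varname
from dictionary.columns
where libname = 'WORK' and memname in ('A', 'B')
group by calculated varname
having count(*) > 1;
quit;
```

For the example files above, this should list state and black.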