Data Services Knowledge Base


I am merging two data sets and end up with too few variables. What is going on?


The most likely problem is that the two files share variable names. When a merge finds the same variable name in both files, it keeps only one copy, so the merged file ends up with fewer variables than you expect. It is best to add a prefix or suffix to variable names so that you can tell which items come from which file. For instance, p_hght might represent the person’s self-reported height and m_hght might represent height from a medical record.

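In SAS, when the same variable name appears in both files, the value from the last data set named in the merge statement overwrites the earlier one, so in effect the duplicate variable from the first file is dropped. The following provides more detail about how SAS handles duplicate variables with merges: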


Stata handles duplicate variable names in the reverse way from SAS. In other words, it keeps the variable from the file already in memory and drops the duplicate from the second file named, the ‘using’ data file.

  use temps
  merge id using tempm

In this case, the height variable from the temporary medical file (tempm) would be dropped and the height variable from the self-reported file would be kept.

  use tempm
  merge id using temps

In this case, the height variable from the temporary self-report file (temps) would be dropped and the height variable from the medical file would be kept.
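
One way to keep both height variables is to rename the duplicate in each file before merging, following the prefix advice above. The lines below are a minimal sketch, not a tested recipe. They assume the shared variable is called hght in both files (a hypothetical name), that id uniquely identifies records in each file, and that you are running Stata 11 or later, where merge takes the 1:1 form. The file name tempm_renamed is also made up for illustration.

```stata
* Rename the medical-record height and save a copy
* (hght and tempm_renamed are hypothetical names)
use tempm, clear
rename hght m_hght
save tempm_renamed, replace

* Rename the self-reported height, then merge on id
use temps, clear
rename hght p_hght
merge 1:1 id using tempm_renamed

* Both p_hght and m_hght should now be present
describe p_hght m_hght
```

Because the two height variables no longer share a name, neither one is dropped, and the order in which the files are named no longer decides which value survives.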
