Creating Dummy Variables from Continuous Variables in Stata

Dec 28

Introduction

A dummy variable is a 0/1 indicator that designates subgroups within your analysis. In Stata, one simple way to create dummy variables is to use the i. prefix, as we have shown you. However, you cannot use this prefix to automatically generate a single dummy variable from a continuous variable. In this post, we’ll show you how to create a dummy variable from a continuous variable in Stata.

Load Data

Let’s load Stata’s prebuilt auto dataset:

sysuse auto
describe

Create a Dummy Variable from a Continuous Variable

In this dataset, weight is a continuously distributed variable with the following characteristics:

sum weight, det
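
If the cutoff you have in mind is a summary statistic such as the median, you do not have to copy it from the output by hand. The following is a minimal sketch, assuming you want to split at the median: summarize with the detail option leaves the median in r(p50), which can be saved in a local macro. The names cutoff and above_median are hypothetical and chosen only for illustration.

```stata
sysuse auto, clear
* detail adds percentiles to the stored results
summarize weight, detail
* Save the median before another command overwrites r()
local cutoff = r(p50)
display "Median weight: `cutoff'"
* Hypothetical indicator: 1 if above the median, 0 otherwise,
* and missing wherever weight is missing
generate above_median = weight > `cutoff' if !missing(weight)
tabulate above_median
```

Run the block in one go from a do-file, since local macros disappear once the run ends.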

Let’s say you wanted to create a dummy variable for all cars that weigh over 3,190 pounds. In such an approach, you might designate all cars over 3,190 pounds as 1 and all other cars as 0. Try the following code:

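* Start from a copy of weight so that missing values carry over
clonevar weight_dummy = weight
* Stata treats missing as larger than any number, so exclude it explicitly
replace weight_dummy = 1 if weight > 3190 & !missing(weight)
* All remaining cars are coded 0 (note: <= must be written without a space)
replace weight_dummy = 0 if weight <= 3190
list weight weight_dummy in 1/30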
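
The recipe above can also be collapsed into a single line, because a logical expression in Stata evaluates to 1 when true and 0 when false. Here is a minimal sketch of that alternative; the variable name heavy and the label name heavy_lbl are hypothetical and not part of the original example.

```stata
sysuse auto, clear
* weight > 3190 evaluates to 1 (true) or 0 (false);
* the if qualifier leaves the result missing wherever weight is missing
generate heavy = weight > 3190 if !missing(weight)
* Optional value labels make tables easier to read
label define heavy_lbl 0 "3,190 lbs or less" 1 "Over 3,190 lbs"
label values heavy heavy_lbl
* Cross-check the coding against the original variable
tabulate heavy
summarize weight if heavy == 1
```

The one-line form avoids cloning the original variable, so the new variable does not inherit the variable label and display format of weight.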

The listing confirms that the new dummy variable, weight_dummy, follows your directions: cars over 3,190 pounds are coded 1 and all other cars are coded 0.

You can now use this dummy variable for statistical analysis. For example, to conduct an independent-samples t-test comparing price between heavier and lighter cars, you could try: