Introduction
Sometimes you need to combine two or more existing variables into a single new variable in Stata. In this post, we’ll show you several ways to do so.
Create Data
Let’s assume that you administered a survey with four questions that measure a construct such as confidence. Now you need to add the numerical answers to these questions together in order to generate a single confidence score for each participant. First, we’ll create mock data, then we’ll show you how to manipulate them.
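* Start from an empty dataset and fix the seed so the mock data are reproducible
clear
set seed 12345
set obs 30
gen subj = _n
label variable subj "Subject #"
* runiformint() draws each integer from 1 to 7 with equal probability;
* rounding runiform(1,7) would make 1 and 7 half as likely as the other values
gen q1 = runiformint(1,7)
gen q2 = runiformint(1,7)
gen q3 = runiformint(1,7)
gen q4 = runiformint(1,7)
* Open the data viewer (read-only) to inspect the result
browse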
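Before combining the answers, it can be worth confirming that the mock data look the way you expect. The following is a minimal sketch, assuming Stata 14 or later (for runiformint()) and an arbitrary seed value; it clears any data in memory, so run it in a fresh session.

```stata
* Mock data for illustration; clear replaces any data in memory
clear
set seed 2024
set obs 30
forvalues i = 1/4 {
    gen q`i' = runiformint(1, 7)
}

* Stop with an error if any answer is not a whole number from 1 to 7
foreach v of varlist q1-q4 {
    assert inrange(`v', 1, 7) & `v' == floor(`v')
}

* Frequency table of the answers to the first question
tabulate q1
```

If every assert passes silently, all four variables contain only whole numbers from 1 to 7.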
Add Data Values
You can create a new variable, total, that sums each participant’s answers to the four survey questions as follows:
gen total = q1 + q2 + q3 + q4
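The egen function rowtotal() gives the same result when no answers are missing. Unlike the + operator, which returns a missing total if any answer is missing, rowtotal() treats missing values as 0: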
egen total_a = rowtotal(q1 q2 q3 q4)
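The two approaches differ only when some answers are missing. The sketch below illustrates this on a small hand-entered dataset; the values are hypothetical, total_b is an illustrative name, and clear replaces any data in memory.

```stata
* Toy data (hypothetical values); clear replaces any data in memory
clear
input subj q1 q2 q3 q4
1 5 3 6 2
2 4 . 7 1
3 . . . .
end

* The + operator returns missing if any answer is missing
gen total = q1 + q2 + q3 + q4

* rowtotal() treats missing answers as 0 by default
egen total_a = rowtotal(q1 q2 q3 q4)

* With the missing option, the result is missing only when every answer is missing
egen total_b = rowtotal(q1 q2 q3 q4), missing

list
```

For subject 2, total is missing while total_a and total_b are both 12. For subject 3, total_a is 0 but total_b is missing, which is usually the safer choice when a participant skipped the whole scale.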
Average Data Values
You can create a new variable, average, that averages each participant’s answers to the four survey questions as follows:
gen average = (q1 + q2 + q3 + q4) / 4
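The egen function rowmean() gives the same result when no answers are missing. It averages over the nonmissing answers only, whereas the formula above returns a missing value if any answer is missing: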
egen average_a = rowmean(q1 q2 q3 q4)
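To see how the two averages behave with an unanswered question, here is a minimal sketch on hypothetical hand-entered data. The variable n_answered is an illustrative addition that counts how many questions each participant answered, using the egen function rownonmiss().

```stata
* Toy data (hypothetical values); clear replaces any data in memory
clear
input subj q1 q2 q3 q4
1 5 3 6 2
2 4 . 7 1
end

* Fixed divisor of 4: missing if any answer is missing
gen average = (q1 + q2 + q3 + q4) / 4

* rowmean() divides by the number of nonmissing answers
egen average_a = rowmean(q1 q2 q3 q4)

* Number of questions answered, useful for judging how much to trust each mean
egen n_answered = rownonmiss(q1 q2 q3 q4)

list
```

Subject 2 answered three questions, so average is missing while average_a is (4 + 7 + 1) / 3 = 4.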
Retain the Highest Data Value
Now let’s assume that the values for these four questions are scores on a test, and you want to keep the highest score that each participant achieved. Try the following code:
egen max = rowmax(q1 q2 q3 q4)
Retain the Lowest Data Value
What if you want to retain the lowest score that each participant achieved? Try the following code:
egen min = rowmin(q1 q2 q3 q4)
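Both rowmax() and rowmin() ignore missing values, and the built-in max() and min() functions do the same when used with gen. The sketch below compares the two routes on hypothetical hand-entered data; the names max_a, max_b, min_a, and min_b are illustrative.

```stata
* Toy data (hypothetical values); clear replaces any data in memory
clear
input subj q1 q2 q3 q4
1 5 3 6 2
2 4 . 7 1
end

* egen versions: take a variable list
egen max_a = rowmax(q1 q2 q3 q4)
egen min_a = rowmin(q1 q2 q3 q4)

* gen versions: take comma-separated arguments
gen max_b = max(q1, q2, q3, q4)
gen min_b = min(q1, q2, q3, q4)

list
```

For subject 2, the missing answer to q2 is skipped, so the highest score is 7 and the lowest is 1 under both approaches.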
Multiply Data Values
You can create a new variable, mult, that multiplies each participant’s answers to the four survey questions as follows:
gen mult = q1 * q2 * q3 * q4
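With many questions, typing out every variable becomes tedious. One possible alternative, sketched here on hypothetical hand-entered data with an illustrative variable name mult_a, is to build the product in a loop over the variable list.

```stata
* Toy data (hypothetical values); clear replaces any data in memory
clear
input subj q1 q2 q3 q4
1 5 3 6 2
2 4 . 7 1
end

* Start at 1 and multiply in each question's answer in turn
gen mult_a = 1
foreach v of varlist q1-q4 {
    replace mult_a = mult_a * `v'
}

list
```

Subject 1 gets 5 * 3 * 6 * 2 = 180. As with the one-line version, a single missing answer makes the whole product missing, as it does for subject 2.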