Introduction
Analysis of variance (ANOVA) tests whether an independent variable with more than two levels has an effect on a continuous dependent variable. A significant ANOVA can be followed by Tukey’s post hoc test to identify which pairs of levels differ from each other. Don’t worry if that sounds complicated: this blog entry shows how simple it is to carry out an ANOVA followed by Tukey’s test in Stata.
Example
Let’s say you want to compare the earnings of college graduates from three majors: philosophy, business, and English. The code below creates a simulated Stata dataset of 150 graduates (50 per major) and then runs an ANOVA followed by Tukey’s test. Because the incomes are drawn at random and no seed is set, your numbers will differ slightly from the ones reported below.
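* Start from an empty dataset with 150 observations (50 per major)
clear
set obs 150
* Assign majors: 1 = English, 2 = Philosophy, 3 = Business
gen major = 1
replace major = 2 in 51/100
replace major = 3 in 101/150
label define major 1 "English" 2 "Philosophy" 3 "Business"
label values major major
label variable major "College major"
* Simulate income: English and philosophy majors share one distribution,
* business majors are drawn from a distribution with a higher mean
drawnorm a, mean(40000) sd(5000)
drawnorm b, mean(70000) sd(10000)
replace a = . in 101/150
replace b = . in 1/100
egen income = rowmax(a b)
drop a b
label variable income "Income"
* One-way ANOVA, then Tukey pairwise comparisons of the group means
anova income major
pwmean income, over(major) mcompare(tukey) effects
* ciplot is a user-written command; install it once with: ssc install ciplot
ciplot income, by(major)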
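Because the simulation above does not set a random-number seed, each run produces slightly different numbers. As a minimal sketch, the same data structure can be generated reproducibly with rnormal() in place of drawnorm, and the group means checked before running the ANOVA. The seed 12345 is an arbitrary, hypothetical choice; it is not the seed behind the results reported in this entry, and value labels are left out for brevity.

```stata
* Sketch only: the seed 12345 is an arbitrary, hypothetical choice
clear
set seed 12345
set obs 150
* Majors in blocks of 50: 1 = English, 2 = Philosophy, 3 = Business
gen major = ceil(_n/50)
* Business incomes come from N(70000, 10000); the other two majors from N(40000, 5000)
gen income = cond(major == 3, rnormal(70000, 10000), rnormal(40000, 5000))
* Descriptive check: mean, standard deviation, and count per major
tabstat income, by(major) statistics(mean sd n)
```

With a fixed seed, rerunning these lines should give the same dataset each time, so any ANOVA results computed from it can be reproduced.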
Results and Interpretation
The ANOVA model is significant, F(2, 147) = 250.09, p < .0001.
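The degrees of freedom follow from the design: with 3 majors and 150 graduates, the between-groups degrees of freedom are 3 - 1 = 2 and the within-groups degrees of freedom are 150 - 3 = 147. As a quick check, Stata’s Ftail() function returns the upper-tail probability for an F statistic; the sketch below simply plugs in the values reported above.

```stata
* Upper-tail p-value for F(2, 147) = 250.09, the values reported above
display Ftail(2, 147, 250.09)
```

The displayed probability is far smaller than .0001, which is why the p-value is reported as p < .0001.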
However, on its own, this result only tells us that there is some effect of college major on income. We need the Tukey pairwise comparison in order to identify the precise income differences between the college majors:
Here we see that (a) the incomes of philosophy and English majors are not significantly different from each other, (b) business majors make, on average, $30,204.96 more than English majors, and (c) business majors make, on average, $30,828.51 more than philosophy majors. The 95% confidence interval plot below illustrates the income differences between majors:
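The numbers behind a plot of this kind can also be listed directly. A minimal sketch, assuming the income and major variables created earlier are still in memory:

```stata
* Group means with standard errors and 95% confidence intervals, one row per major
mean income, over(major)
```

The intervals reported here may differ slightly from those drawn by ciplot, depending on how each command computes the standard error and degrees of freedom.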
BridgeText can help you with all of your statistical analysis needs.