Introduction
One of the standard post-regression diagnostic tests is a test for multicollinearity. In this blog, we’ll show you how to test for multicollinearity after a regression using Stata’s vif command. We’ll show you how to run and interpret vif, then adjust your model to eliminate multicollinearity.
Load Data
Let’s load Stata’s prebuilt auto dataset:
sysuse auto
describe
Run the Regression
Try:
regress mpg weight length i.foreign
Here’s what you get:
Test VIF
Next, test the multicollinearity of this regression model with the vif command:
vif
Here’s what you get:
By a common rule of thumb, a variance inflation factor (VIF) of over 5 strongly suggests multicollinearity. In the context of this regression, multicollinearity simply means that the two predictors with high VIFs (weight and length) are likely to be highly correlated with each other, which we can check as follows:
pwcorr weight length, sig
Weight and length are highly positively correlated, r = .946, p < .0001.
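To see where a VIF comes from, you can reproduce one by hand. The VIF for a predictor equals 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors in the model. The lines below are a minimal sketch for weight, assuming the auto dataset and the same predictors used above; the label text in the display command is our own wording:

```stata
* Sketch: reproduce the VIF for weight by hand
sysuse auto, clear

* Auxiliary regression: weight on the other predictors in the model
quietly regress weight length i.foreign

* VIF = 1 / (1 - R-squared of the auxiliary regression)
display "VIF for weight: " 1 / (1 - e(r2))
```

Note that the auxiliary regression replaces the active estimation results, so rerun the original regress command before calling vif again.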
Rerun the Model with Only One High-VIF Variable
Now, because weight and length are highly positively correlated with each other, we can remove one and retest for multicollinearity. Try:
regress mpg weight i.foreign
vif
As you can confirm from the regress output, the regression remains significant:
And, this time, the vif output shows that multicollinearity is not a problem:
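If you are unsure which of the two correlated predictors to keep, one option is to compare the fit of both reduced models. The lines below are a rough sketch, assuming adjusted R-squared is an acceptable yardstick for your purposes; it is only one of several possible criteria, and subject-matter reasoning should also inform the choice:

```stata
* Sketch: compare the two candidate reduced models
sysuse auto, clear

* Model that keeps weight and drops length
quietly regress mpg weight i.foreign
display "Adj. R-squared, keeping weight: " e(r2_a)

* Model that keeps length and drops weight
quietly regress mpg length i.foreign
display "Adj. R-squared, keeping length: " e(r2_a)
```

After settling on a model, rerun it without quietly and follow it with vif, as above, to confirm that the VIFs are acceptable.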