In this paper, we address data collinearity in multiple linear regression from an optimization perspective. We propose a novel linearly constrained quadratic programming model based on the variance inflation factor (VIF). The model uses a perturbation method that imposes a general symmetric, non-diagonal perturbation matrix on the correlation matrix. It reduces the largest VIF while minimizing the resulting biases. The VIF-based model mitigates the harm from data collinearity by reducing both the condition number and the VIFs, while also improving statistical significance. The resulting estimator has bounded biases under an iterative framework and is hence termed the least accumulative bias estimator. Additional statistical properties can be incorporated into the proposed model as side constraints. Various numerical examples validate the proposed approach.
- Convex optimization
- Variance inflation factor
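
The two collinearity diagnostics named in the abstract, the VIFs and the condition number of the correlation matrix, can be illustrated with a minimal sketch. It is not the proposed quadratic programming model: it only computes the quantities that the model aims to reduce. The function name `vif_and_condition_number` and the synthetic data are assumptions introduced here for illustration. The sketch relies on the standard identity that the VIF of the j-th predictor equals the j-th diagonal entry of the inverse correlation matrix.

```python
import numpy as np

def vif_and_condition_number(X):
    """Return (VIFs, condition number) for the predictor correlation matrix.

    Hypothetical helper for illustration; not part of the proposed model.
    """
    # Correlation matrix of the predictors (columns of X are variables).
    R = np.corrcoef(X, rowvar=False)
    # VIF_j is the j-th diagonal entry of the inverse correlation matrix.
    vifs = np.diag(np.linalg.inv(R))
    # Condition number as the ratio of largest to smallest eigenvalue of R.
    eigenvalues = np.linalg.eigvalsh(R)  # returned in ascending order
    cond = eigenvalues[-1] / eigenvalues[0]
    return vifs, cond

# Synthetic data (an assumption): x2 is nearly collinear with x1, x3 is independent.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

vifs, cond = vif_and_condition_number(X)
print("VIFs:", vifs)
print("Condition number:", cond)
```

In this sketch the near-collinear pair produces large VIFs and a large condition number, while the independent predictor keeps a VIF close to 1. A perturbation of the correlation matrix, as described in the abstract, would be judged by how far it lowers these two diagnostics.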