A heuristic algorithm to combat outliers and multicollinearity in regression model analysis

Document Type: Original Article

Authors

1 Faculty of Mathematics, Statistics and Computer Science, P.O. Box: 35195-363, Semnan University, Semnan, Iran.

2 Faculty of Mathematics, Statistics and Computer Science, Semnan University, Semnan, Iran

Abstract

As is well known, outliers and multicollinearity in the data set are among the main difficulties in regression models, and both badly affect the least-squares estimators. When multicollinearity and outliers are present together, the prediction performance of the least-squares regression method decreases dramatically. Here, by proposing an approximation of the condition number, we suggest a nonlinear mixed-integer programming model that simultaneously controls the adverse effects of both problems. The model can be solved effectively by popular metaheuristic algorithms. To shed light on the importance of our optimization approach, we carry out numerical experiments on a classic real data set as well as a simulated data set.
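The two difficulties named in the abstract can be illustrated with a minimal sketch. It is not the authors' mixed-integer model or their condition-number approximation; it only shows, on synthetic data invented for this example, how near-collinear columns inflate the condition number of X'X and how a single outlier shifts an ordinary least-squares fit. The helper names `condition_number` and `ols` are hypothetical.

```python
import numpy as np

def condition_number(X):
    """Condition number of X'X: largest eigenvalue divided by the smallest."""
    eigvals = np.linalg.eigvalsh(X.T @ X)  # returned in ascending order
    return eigvals[-1] / eigvals[0]

def ols(X, y):
    """Ordinary least-squares coefficients (no robustness, no shrinkage)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 50

# Multicollinearity: the second column is almost a copy of the first.
x1 = rng.normal(size=n)
X_good = np.column_stack([x1, rng.normal(size=n)])
X_bad = np.column_stack([x1, x1 + 1e-3 * rng.normal(size=n)])
kappa_good = condition_number(X_good)
kappa_bad = condition_number(X_bad)

# Outliers: one contaminated response on an otherwise exact line y = 1 + 2x.
x = np.linspace(0.0, 1.0, n)
X_line = np.column_stack([np.ones(n), x])
y_clean = 1.0 + 2.0 * x
y_outlier = y_clean.copy()
y_outlier[-1] += 100.0  # assumed contamination, chosen only for illustration

beta_clean = ols(X_line, y_clean)
beta_outlier = ols(X_line, y_outlier)

print("condition number, independent columns:", kappa_good)
print("condition number, near-collinear columns:", kappa_bad)
print("OLS coefficients, clean data:", beta_clean)
print("OLS coefficients, one outlier:", beta_outlier)
```

In this sketch the near-collinear design has a condition number several orders of magnitude larger than the independent one, and the single contaminated observation moves the fitted slope far from its clean value. A model of the kind the abstract describes would aim to limit both effects at once.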

