Most recent studies on credit risk assessment models for financial institutions focus either on handling imbalanced data or on improving classification accuracy through multistage modeling. Although multistage modeling and data pre-processing can improve accuracy to some extent, the heterogeneous nature of the data may still affect the accuracy of the classifiers. This paper uses the eXtreme gradient boosting tree (XGBoost) classifier to construct a credit risk assessment model for financial institutions. Cluster-based under-sampling is used to process the imbalanced data. The area under the receiver operating characteristic (ROC) curve and classification accuracy serve as the assessment indicators in a comparison with three other frequently used single-stage classifiers: logistic regression, self-organizing algorithms, and support vector machines. The results indicate that the XGBoost classifier achieves better results than the other three and can serve as a superior tool for developing credit risk models for financial institutions.
- Credit risk assessment model
- eXtreme gradient boosting tree
- Receiver operating characteristic curve
- Support vector machine
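
As an illustration of the two non-classifier steps named in the abstract, the following is a minimal sketch of cluster-based under-sampling and of the two assessment indicators (area under the ROC curve and accuracy). The abstract does not specify which under-sampling variant is used, so the choices here are assumptions: plain k-means on the majority class, with the number of clusters set to the minority-class size and the real sample nearest each centroid retained. All function names (`kmeans`, `cluster_undersample`, `auc`, `accuracy`) are hypothetical, and the classifier itself is left out; scores from any model, XGBoost included, could be passed to `auc`.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of tuples; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[idx].append(p)
        # Recompute each centroid as its cluster mean; keep the old one if empty.
        centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids


def cluster_undersample(majority, minority, seed=0):
    """Assumed variant: shrink the majority class to the minority-class size
    by keeping the real majority sample nearest to each k-means centroid."""
    k = min(len(minority), len(majority))
    centroids = kmeans(majority, k, seed=seed)
    remaining = list(majority)
    kept = []
    for c in centroids:
        nearest = min(remaining, key=lambda p: sq_dist(p, c))
        kept.append(nearest)
        remaining.remove(nearest)  # avoid selecting the same sample twice
    return kept


def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive is scored above a random negative (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of samples whose thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


if __name__ == "__main__":
    # Synthetic, purely illustrative data: 200 "good" vs 20 "default" cases.
    rng = random.Random(42)
    majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    minority = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(20)]
    balanced = cluster_undersample(majority, minority)
    print(len(balanced), "majority samples kept for", len(minority), "minority samples")
```

The pairwise form of `auc` is quadratic in the sample count and is chosen here for readability, not speed; a rank-based computation would suit larger data sets.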