The varying threshold values of logistic regression and linear discriminant for classifying fraudulent firm

Samingun Handoyo*, Ying Ping Chen, Gugus Irianto, Agus Widodo

*Corresponding author for this work

Research output: Article, peer-reviewed

14 Citations (Scopus)

Abstract

This research aims to find the best performance of both logistic regression and linear discriminant classifiers when their decision thresholds are varied over a range of values. The classifier models are evaluated using the confusion matrix, precision-recall, the F1 score, and the receiver operating characteristic (ROC) curve. The Audit-risk data set is used to implement the proposed method. Data screening and dimension reduction by principal component analysis (PCA) are carried out first, before the data are divided into training and testing sets. After the training process has produced the classifier model parameters, the performance measures are calculated only on the testing set, with various constants added to the threshold value of both classifier models. The logistic regression classifier achieves its best performance of 94% on precision-recall, 91.7% on the F1 score, and 0.906 on the area under the curve (AUC) for threshold values in the interval between 0.002 and 0.018. The linear discriminant classifier, by contrast, performs best at a threshold value of 0.035, with a precision-recall of 94%, an F1 score of 91.7%, and an AUC of 0.846.
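The threshold-varying evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the labels, scores, and threshold grid below are hypothetical stand-ins for the test-set labels and the predicted scores of a fitted classifier, and the sketch sweeps the cutoff directly rather than adding constants to a base threshold as the paper does.

```python
def confusion_counts(y_true, scores, threshold):
    """Count (TP, FP, FN, TN) when scores >= threshold are labelled positive."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1, returning 0.0 for empty denominators."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def sweep_thresholds(y_true, scores, thresholds):
    """Evaluate the same scores at each candidate threshold."""
    rows = []
    for t in thresholds:
        tp, fp, fn, _tn = confusion_counts(y_true, scores, t)
        precision, recall, f1 = precision_recall_f1(tp, fp, fn)
        rows.append({"threshold": t, "precision": precision,
                     "recall": recall, "f1": f1})
    return rows


# Hypothetical test-set labels (1 = fraudulent firm) and classifier scores.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.92, 0.81, 0.65, 0.58, 0.20, 0.70, 0.45, 0.30, 0.10, 0.05]
thresholds = [0.1, 0.3, 0.5, 0.7]  # arbitrary illustrative grid

rows = sweep_thresholds(y_true, scores, thresholds)
best = max(rows, key=lambda row: row["f1"])  # threshold with the highest F1

for row in rows:
    print(f"t={row['threshold']:.1f}  precision={row['precision']:.3f}  "
          f"recall={row['recall']:.3f}  f1={row['f1']:.3f}")
print("best threshold by F1:", best["threshold"])
```

Because the model is fitted once and only the cutoff changes, the sweep is cheap: each threshold simply re-labels the same scores and recomputes the confusion matrix.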

Original language: English
Pages (from-to): 135-143
Number of pages: 9
Journal: Mathematics and Statistics
Volume: 9
Issue number: 2
Publication status: Published - 2021
