Abstract
The number of novel biomarkers is growing rapidly. In practice, however, a simple predictive score that combines them is more feasible for evaluating clinical outcomes and can provide better accuracy. Finding the optimal linear combination of correlated biomarkers still demands comprehensive methodological research. This research aims to develop a novel approach for interpretable optimization and proposes the gradient boosting machine with the Youden Index as the target function (GBYI). The rationale is that the gradient boosting machine demonstrates superior predictive ability and offers good interpretability in terms of a linear model. In addition, the Youden Index readily estimates the optimal cutoff point of a diagnostic test and evaluates its overall accuracy. Simulation studies evaluate the performance of GBYI on datasets with linear and nonlinear structure. An application to the Bupa Liver Disease Data shows that the resulting optimal combination of correlated biomarkers improves prediction and achieves higher accuracy. The proposed machine-learning strategy, which applies statistical boosting to the Youden Index, can optimize combinations of high-dimensional data and provide interpretable coefficients.
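As a minimal sketch of the quantity the abstract targets, the Youden Index at a cutoff c is J(c) = sensitivity(c) + specificity(c) - 1, and the optimal cutoff is the c that maximizes J. The Python example below evaluates J for a fixed linear combination of two biomarkers. It is not the GBYI algorithm itself: the function name `youden_index`, the toy data, and the coefficients `beta` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def youden_index(scores, labels):
    """Return (max J, cutoff), where J(c) = sensitivity(c) + specificity(c) - 1
    and a subject is classified positive when score >= c."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_j, best_cut = -1.0, None
    # Every observed score is a candidate cutoff.
    for c in np.unique(scores):
        pred = scores >= c
        sens = np.mean(pred[labels])      # true positive rate
        spec = np.mean(~pred[~labels])    # true negative rate
        j = sens + spec - 1.0
        if j > best_j:
            best_j, best_cut = j, c
    return best_j, best_cut

# Toy data, made up for illustration: six subjects, two biomarkers.
X = np.array([[1.0, 0.2], [0.8, 0.1], [0.3, 0.9],
              [0.2, 0.7], [0.9, 0.8], [0.1, 0.1]])
y = np.array([1, 1, 0, 0, 1, 0])

# Hypothetical coefficients of the linear combination (not fitted).
beta = np.array([1.0, -0.5])

j, cut = youden_index(X @ beta, y)
print(f"Youden Index: {j:.3f}, cutoff: {cut:.3f}")
```

In an approach such as the one described, the coefficients would be chosen to maximize J rather than fixed in advance; here they are fixed only to keep the example short.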
| Original language | English |
|---|---|
| Pages (from-to) | 7515-7526 |
| Number of pages | 12 |
| Journal | Communications in Statistics - Theory and Methods |
| Volume | 54 |
| Issue number | 23 |
| DOIs | |
| Publication status | Published - 2025 |