Machine learning-based method for obesity risk evaluation using single-nucleotide polymorphisms derived from next-generation sequencing

Hsin Yao Wang, Shih Cheng Chang, Wan Ying Lin, Chun Hsien Chen, Szu Hsien Chiang, Kai Yao Huang, Bo Yu Chu, Jang Jih Lu*, Tzong Yi Lee

*Corresponding author for this work

Research output: Article, peer-reviewed

28 citations (Scopus)

Abstract

Obesity is a major risk factor for many metabolic diseases. Single-nucleotide polymorphisms (SNPs) derived from next-generation sequencing (NGS) offer genome-wide insight into the genetic characteristics of obese individuals. However, interpreting these SNP data for clinical application is difficult given the high complexity of NGS data. Hence, in this study, obesity risk prediction models based on SNPs were designed using machine learning (ML) methods, namely support vector machine (SVM), k-nearest neighbor, and decision tree (DT). Clinicopathological features, including 130 SNPs, sex, and age, were obtained from 139 eligible individuals. Various feature selection methods, such as stepwise multivariate linear regression (MLR), DT, and genetic algorithms, were applied to select informative features for generating obesity prediction models. Multivariate logistic regression was used to evaluate the importance of the selected features. The predictive performance of the models trained on the various feature sets was evaluated by fivefold cross-validation. Three measures, namely accuracy, sensitivity, and specificity, were used to examine and compare the predictive power of the models. Nine SNPs, namely rs10501087, rs17700144, rs2287019, rs534870, rs660339, rs7081678, rs718314, rs9816226, and rs984222, were selected by stepwise MLR for building the ML-based obesity prediction models. In the evaluation of model performance, the SVM model significantly outperformed the other classifiers trained on the same features. The SVM model achieved 70.77% accuracy, 80.09% sensitivity, and 63.02% specificity. This investigation demonstrated that the selected SNPs were effective in the detection of obesity risk. Additionally, the ML-based method provides a feasible means of conducting preliminary analyses of the genetic characteristics of obesity.
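The evaluation procedure described in the abstract (fivefold cross-validation scored by accuracy, sensitivity, and specificity) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: the synthetic cohort, the 0/1/2 genotype coding, the helper names (`confusion_metrics`, `fivefold_indices`, `knn_predict`, `cross_validate`, `make_synthetic_cohort`), the choice of k = 3, and the pooling of predictions across folds. A plain k-nearest-neighbor classifier stands in for the three classifiers named in the abstract, and the numbers it prints bear no relation to the reported results.

```python
import random


def confusion_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels (1 = obese)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true negative rate
    return accuracy, sensitivity, specificity


def fivefold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training rows (L1 distance on genotypes)."""
    dists = sorted(
        (sum(abs(a - b) for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = sum(label for _, label in dists[:k])
    return 1 if votes * 2 > k else 0


def cross_validate(X, y, k_folds=5, seed=0):
    """Pool held-out predictions over all folds, then score them once (assumption)."""
    y_true, y_pred = [], []
    for held_out in fivefold_indices(len(X), k_folds, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        for i in held_out:
            y_true.append(y[i])
            y_pred.append(knn_predict(train_X, train_y, X[i]))
    return confusion_metrics(y_true, y_pred)


def make_synthetic_cohort(n=139, n_snps=9, seed=1):
    """Hypothetical data: genotypes coded 0/1/2 and a label loosely tied to them."""
    rng = random.Random(seed)
    X = [[rng.choice([0, 1, 2]) for _ in range(n_snps)] for _ in range(n)]
    y = [1 if sum(row[:3]) + rng.random() > 3.5 else 0 for row in X]
    return X, y


X, y = make_synthetic_cohort()
acc, sens, spec = cross_validate(X, y)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Sensitivity and specificity are reported separately because accuracy alone can hide a classifier that favors one class; the abstract's SVM figures (80.09% sensitivity against 63.02% specificity) show exactly this kind of asymmetry.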

Original language: English
Pages (from-to): 1347-1360
Number of pages: 14
Journal: Journal of Computational Biology
Volume: 25
Issue number: 12
DOIs
Publication status: Published - Dec 2018
