TY - JOUR
T1 - Evaluating the performance of machine learning models for automatic diagnosis of patients with schizophrenia based on a single site dataset of 440 participants
AU - Lee, Lung Hao
AU - Chen, Chang Hao
AU - Chang, Wan Chen
AU - Lee, Po Lei
AU - Shyu, Kuo Kai
AU - Chen, Mu Hong
AU - Hsu, Ju Wei
AU - Bai, Ya Mei
AU - Su, Tung Ping
AU - Tu, Pei Chi
PY - 2022/12/23
Y1 - 2022/12/23
AB - Background: Support vector machines (SVMs) based on brain-wise functional connectivity (FC) have been widely adopted for single-subject prediction of patients with schizophrenia, but most such studies had small sample sizes. This study aimed to evaluate the performance of SVMs based on a large single-site dataset and to investigate the effects of demographic homogeneity and training sample size on classification accuracy. Methods: The resting-state functional magnetic resonance imaging (fMRI) dataset comprised 220 patients with schizophrenia and 220 healthy controls. Brain-wise FCs were calculated for each participant, and linear SVMs were developed for automatic classification of patients and controls. First, we evaluated the SVMs based on all participants and on homogeneous subsamples of men, women, younger (18-30 years), and older (31-50 years) participants by 10-fold nested cross-validation. Then, we held out a fixed test set of 40 participants (20 patients and 20 controls) and evaluated the SVMs based on incremental training sample sizes (N = 40, 80, ..., 400). Results: The SVMs based on all participants had an accuracy of 85.05%. The SVMs based on male, female, younger, and older participants yielded accuracies of 84.66, 81.56, 80.50, and 86.13%, respectively. Although the SVMs based on the older subsample performed better than those based on all participants, they generalized poorly to younger participants (77.24%). For incremental training sizes, the classification accuracy increased stepwise from 72.6 to 83.3%, with >80% accuracy achieved with sample size >240. Conclusions: The findings indicate that SVMs based on a large dataset yield high classification accuracy, and that models established using a large sample with heterogeneous properties are recommended for single-subject prediction of schizophrenia.
KW - Automatic classification
KW - functional connectivity
KW - homogeneous
KW - schizophrenic disorder
KW - support vector machine
KW - training sample size
UR - http://www.scopus.com/inward/record.url?scp=85121905893&partnerID=8YFLogxK
U2 - 10.1192/j.eurpsy.2021.2248
DO - 10.1192/j.eurpsy.2021.2248
M3 - Article
C2 - 34937587
AN - SCOPUS:85121905893
SN - 0924-9338
VL - 65
JO - European Psychiatry
JF - European Psychiatry
IS - 1
M1 - e1
ER -