TY - GEN
T1 - Evolutionary learning-derived lncRNA signature with biomarker discovery for predicting stage of colon adenocarcinoma
AU - Ho, Yann Lin
AU - Ho, Yann Jen
AU - Ko, Fang Yu
AU - Ho, Shinn Ying
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - In recent years, long non-coding RNAs (lncRNAs) have emerged as potential regulators of biological processes and genes, and as candidate biomarkers for cancer diagnosis and prognosis prediction. This work proposes an evolutionary learning-based method, EL-COAD, to identify a robust lncRNA signature with biomarker discovery for predicting stages of colon adenocarcinoma (COAD). The COAD patient cohorts were obtained from both The Cancer Genome Atlas and Gene Expression Omnibus (GSE17536) databases. EL-COAD incorporates a bi-objective combinatorial genetic algorithm with a support vector machine to select a minimal number of lncRNAs while maximizing prediction accuracy. EL-COAD identified a 15-lncRNA signature and achieved a five-fold cross-validation accuracy of 79.4% and an area under the receiver operating characteristic curve of 0.792. Using the 10 lncRNAs from the signature on the independent dataset GSE17536, the Sequential Minimal Optimization model achieved a test accuracy of 64.15%. The lncRNAs of the signature were then prioritized, with the top five being TMEM105, DUXAP8, APCDD1L-DT, PCAT6, and a novel transcript, ENSG00000226308. Furthermore, both Kyoto Encyclopedia of Genes and Genomes pathway and Disease Ontology analyses provided strong support for the viability of this model-independent signature, emphasizing ENSG00000226308 as a promising biomarker.
AB - In recent years, long non-coding RNAs (lncRNAs) have emerged as potential regulators of biological processes and genes, and as candidate biomarkers for cancer diagnosis and prognosis prediction. This work proposes an evolutionary learning-based method, EL-COAD, to identify a robust lncRNA signature with biomarker discovery for predicting stages of colon adenocarcinoma (COAD). The COAD patient cohorts were obtained from both The Cancer Genome Atlas and Gene Expression Omnibus (GSE17536) databases. EL-COAD incorporates a bi-objective combinatorial genetic algorithm with a support vector machine to select a minimal number of lncRNAs while maximizing prediction accuracy. EL-COAD identified a 15-lncRNA signature and achieved a five-fold cross-validation accuracy of 79.4% and an area under the receiver operating characteristic curve of 0.792. Using the 10 lncRNAs from the signature on the independent dataset GSE17536, the Sequential Minimal Optimization model achieved a test accuracy of 64.15%. The lncRNAs of the signature were then prioritized, with the top five being TMEM105, DUXAP8, APCDD1L-DT, PCAT6, and a novel transcript, ENSG00000226308. Furthermore, both Kyoto Encyclopedia of Genes and Genomes pathway and Disease Ontology analyses provided strong support for the viability of this model-independent signature, emphasizing ENSG00000226308 as a promising biomarker.
KW - biomarker discovery
KW - colon adenocarcinoma
KW - evolutionary learning
KW - lncRNA
KW - signature
UR - http://www.scopus.com/inward/record.url?scp=85215008809&partnerID=8YFLogxK
U2 - 10.1109/EMBC53108.2024.10781761
DO - 10.1109/EMBC53108.2024.10781761
M3 - Conference contribution
AN - SCOPUS:85215008809
T3 - Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS
BT - 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2024
Y2 - 15 July 2024 through 19 July 2024
ER -