TY - GEN
T1 - Deep learning with evolutionary and genomic profiles for identifying cancer subtypes
AU - Lin, Chun-Yu
AU - Li, Ruiming
AU - Akutsu, Tatsuya
AU - Ruan, Peiying
AU - See, Simon
AU - Yang, Jinn-Moon
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/12/6
Y1 - 2018/12/6
N2 - Cancer subtype identification is an unmet need in precision diagnosis. Recently, evolutionary conservation has been reported to contain interpretable signatures of functional significance in cancers. However, the importance of evolutionary conservation in distinguishing cancer subtypes remains unclear. Here, we identified evolutionarily conserved genes (i.e., core genes) and observed that they are mainly involved in pathways relevant to cell growth and metabolism. Using these core genes, we integrated their evolutionary and genomic profiles with deep learning to develop a feature-based strategy (FES) and an image-based strategy (IMS). Compared with FES using a random gene set and with the PAM50 classifier, core gene set-based FES achieves higher accuracy in identifying breast cancer subtypes. Moreover, IMS with data augmentation performs better than the other strategies. Comprehensive analysis of eight TCGA cancer datasets demonstrates that our evolutionary conservation-based models provide a valid and helpful approach to identifying cancer subtypes, and that the core gene set offers distinguishing clues to cancer subtypes.
AB - Cancer subtype identification is an unmet need in precision diagnosis. Recently, evolutionary conservation has been reported to contain interpretable signatures of functional significance in cancers. However, the importance of evolutionary conservation in distinguishing cancer subtypes remains unclear. Here, we identified evolutionarily conserved genes (i.e., core genes) and observed that they are mainly involved in pathways relevant to cell growth and metabolism. Using these core genes, we integrated their evolutionary and genomic profiles with deep learning to develop a feature-based strategy (FES) and an image-based strategy (IMS). Compared with FES using a random gene set and with the PAM50 classifier, core gene set-based FES achieves higher accuracy in identifying breast cancer subtypes. Moreover, IMS with data augmentation performs better than the other strategies. Comprehensive analysis of eight TCGA cancer datasets demonstrates that our evolutionary conservation-based models provide a valid and helpful approach to identifying cancer subtypes, and that the core gene set offers distinguishing clues to cancer subtypes.
KW - Cancer genomics
KW - Cancer subtype
KW - Convolutional neural network
KW - Copy number alteration
KW - Deep learning
KW - Evolutionary conservation
KW - Gene expression
UR - http://www.scopus.com/inward/record.url?scp=85060375487&partnerID=8YFLogxK
U2 - 10.1109/BIBE.2018.00035
DO - 10.1109/BIBE.2018.00035
M3 - Conference contribution
AN - SCOPUS:85060375487
T3 - Proceedings - 2018 IEEE 18th International Conference on Bioinformatics and Bioengineering, BIBE 2018
SP - 147
EP - 150
BT - Proceedings - 2018 IEEE 18th International Conference on Bioinformatics and Bioengineering, BIBE 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 18th IEEE International Conference on Bioinformatics and Bioengineering, BIBE 2018
Y2 - 29 October 2018 through 31 October 2018
ER -