TY - JOUR
T1 - Incorporating significant amino acid pairs and protein domains to predict RNA splicing-related proteins with functional roles
AU - Hsu, Justin Bo Kai
AU - Huang, Kai Yao
AU - Weng, Tzu Ya
AU - Huang, Chien Hsun
AU - Lee, Tzong Yi
N1 - Funding Information:
The authors would like to sincerely thank the National Science Council of the Republic of China for financially supporting this research under Contract Nos. 101-2628-E-155-002-MY2 and 102-2221-E-155-069.
PY - 2014/1
Y1 - 2014/1
N2 - Pre-mRNA splicing is carried out through the interaction of RNA sequence elements and a variety of RNA splicing-related proteins (SRPs) (e.g. spliceosome and splicing factors). Alternative splicing, an important post-transcriptional regulatory mechanism in eukaryotes, gives rise to multiple mature mRNA isoforms, which encode proteins with diverse functions. However, the regulation of RNA splicing is not yet fully elucidated, partly because SRPs have not yet been exhaustively identified and their experimental identification is labor-intensive. Therefore, we were motivated to design a new method for identifying SRPs together with their functional roles in the regulation of RNA splicing. Experimentally verified SRPs were manually curated from research articles. According to the functional annotation of the Splicing Related Gene Database, the collected SRPs were further categorized into four functional groups: Small Nuclear Ribonucleoprotein, Splicing Factor, Splicing Regulation Factor and Novel Spliceosome Protein. The composition of amino acid pairs reveals remarkable differences among the four functional groups of SRPs. Support vector machines (SVMs) were then used to learn predictive models for identifying SRPs as well as their functional roles. Cross-validation shows that the SVM models trained with significant amino acid pairs and functional domains provide better predictive performance. In addition, independent testing demonstrates that the proposed method can accurately identify SRPs in mammals and plants and effectively distinguish SRPs from RNA-binding proteins. This investigation provides a practical means of identifying potential SRPs and a perspective for exploring the regulation of RNA splicing.
AB - Pre-mRNA splicing is carried out through the interaction of RNA sequence elements and a variety of RNA splicing-related proteins (SRPs) (e.g. spliceosome and splicing factors). Alternative splicing, an important post-transcriptional regulatory mechanism in eukaryotes, gives rise to multiple mature mRNA isoforms, which encode proteins with diverse functions. However, the regulation of RNA splicing is not yet fully elucidated, partly because SRPs have not yet been exhaustively identified and their experimental identification is labor-intensive. Therefore, we were motivated to design a new method for identifying SRPs together with their functional roles in the regulation of RNA splicing. Experimentally verified SRPs were manually curated from research articles. According to the functional annotation of the Splicing Related Gene Database, the collected SRPs were further categorized into four functional groups: Small Nuclear Ribonucleoprotein, Splicing Factor, Splicing Regulation Factor and Novel Spliceosome Protein. The composition of amino acid pairs reveals remarkable differences among the four functional groups of SRPs. Support vector machines (SVMs) were then used to learn predictive models for identifying SRPs as well as their functional roles. Cross-validation shows that the SVM models trained with significant amino acid pairs and functional domains provide better predictive performance. In addition, independent testing demonstrates that the proposed method can accurately identify SRPs in mammals and plants and effectively distinguish SRPs from RNA-binding proteins. This investigation provides a practical means of identifying potential SRPs and a perspective for exploring the regulation of RNA splicing.
KW - Amino acid pair composition
KW - RNA splicing
KW - Spliceosome
KW - Splicing-related protein
KW - Support vector machine
UR - http://www.scopus.com/inward/record.url?scp=84896738473&partnerID=8YFLogxK
U2 - 10.1007/s10822-014-9706-6
DO - 10.1007/s10822-014-9706-6
M3 - Article
C2 - 24442949
AN - SCOPUS:84896738473
SN - 0920-654X
VL - 28
SP - 49
EP - 60
JO - Journal of Computer-Aided Molecular Design
JF - Journal of Computer-Aided Molecular Design
IS - 1
ER -