MDD-carb: A combinatorial model for the identification of protein carbonylation sites with substrate motifs

Hui Ju Kao, Shun Long Weng, Kai Yao Huang, Fergie Joanda Kaunang, Justin Bo Kai Hsu, Chien Hsun Huang*, Tzong Yi Lee

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

20 Scopus citations

Abstract

Background: Carbonylation, which takes place through oxidation of reactive oxygen species (ROS) on specific residues, is an irreversibly oxidative modification of proteins. It has been reported that the carbonylation is related to a number of metabolic or aging diseases including diabetes, chronic lung disease, Parkinson's disease, and Alzheimer's disease. Due to the lack of computational methods dedicated to exploring motif signatures of protein carbonylation sites, we were motivated to exploit an iterative statistical method to characterize and identify carbonylated sites with motif signatures. Results: By manually curating experimental data from research articles, we obtained 332, 144, 135, and 140 verified substrate sites for K (lysine), R (arginine), T (threonine), and P (proline) residues, respectively, from 241 carbonylated proteins. In order to examine the informative attributes for classifying between carbonylated and non-carbonylated sites, multifarious features including composition of twenty amino acids (AAC), composition of amino acid pairs (AAPC), position-specific scoring matrix (PSSM), and positional weighted matrix (PWM) were investigated in this study. Additionally, in an attempt to explore the motif signatures of carbonylation sites, an iterative statistical method was adopted to detect statistically significant dependencies of amino acid compositions between specific positions around substrate sites. Profile hidden Markov model (HMM) was then utilized to train a predictive model from each motif signature. Moreover, based on the method of support vector machine (SVM), we adopted it to construct an integrative model by combining the values of bit scores obtained from profile HMMs. The combinatorial model could provide an enhanced performance with evenly predictive sensitivity and specificity in the evaluation of cross-validation and independent testing. Conclusion: This study provides a new scheme for exploring potential motif signatures at substrate sites of protein carbonylation. The usefulness of the revealed motifs in the identification of carbonylated sites is demonstrated by their effective performance in cross-validation and independent testing. Finally, these substrate motifs were adopted to build an available online resource (MDD-Carb, http://csb.cse.yzu.edu.tw/MDDCarb/) and are also anticipated to facilitate the study of large-scale carbonylated proteomes.

Original languageEnglish
Article number137
JournalBMC Systems Biology
Volume11
DOIs
StatePublished - 21 Dec 2017

Keywords

  • Maximal dependence decomposition
  • Profile hidden Markov model
  • Protein carbonylation
  • Reactive oxygen species (ROS)
  • Substrate motifs

Fingerprint

Dive into the research topics of 'MDD-carb: A combinatorial model for the identification of protein carbonylation sites with substrate motifs'. Together they form a unique fingerprint.

Cite this