Computing Motif Correlations in Proteins

Jorng Tzong Horng*, Hsien Da Huang, Shih Hsien Wang, Ming You Chen, Shir Ly Huang, Jenn Kang Hwang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Scopus citations


Protein motifs are specific, conserved regions found by comparing multiple protein sequences. These conserved regions generally play an important role in protein function and protein folding, for example through their binding properties or enzymatic activities. The aim here is to find correlations in the occurrence of protein motifs. Knowledge of protein motif/domain sharing should shed new light on the biological functions of proteins and offer a basis for analyzing evolution in the human genome and other genomes. The protein sequences used here are obtained from the PIR-NREF database, and the protein motifs are retrieved from the PROSITE database. We apply a data mining approach to discover occurrence correlations of motifs in protein sequences. The mined correlations can be used in evolutionary analyses and in protein structure prediction; this study discusses the latter. The mined correlations are stored and maintained in a database system. The database is now available at
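The abstract does not specify the mining algorithm, so the following is only a minimal sketch of one common way to quantify motif occurrence correlations: association-rule style support and confidence over motif pairs. The function name `motif_pair_rules`, the thresholds, and the protein and motif identifiers (`P1`, `M1`, and so on) are hypothetical and are not taken from the paper, PIR-NREF, or PROSITE.

```python
from collections import Counter
from itertools import combinations


def motif_pair_rules(protein_motifs, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) rules for co-occurring motif pairs.

    protein_motifs maps a protein ID to the collection of motif IDs found in it.
    support    = fraction of proteins containing both motifs
    confidence = fraction of proteins containing lhs that also contain rhs
    """
    n = len(protein_motifs)
    if n == 0:
        return []

    single = Counter()  # number of proteins containing each motif
    pair = Counter()    # number of proteins containing each unordered motif pair
    for motifs in protein_motifs.values():
        unique = sorted(set(motifs))  # count each motif once per protein
        single.update(unique)
        pair.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair.items():
        support = count / n
        if support < min_support:
            continue
        # Evaluate the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / single[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Toy input with made-up identifiers (assumption, for illustration only).
toy = {
    "P1": {"M1", "M2"},
    "P2": {"M1", "M2", "M3"},
    "P3": {"M1"},
    "P4": {"M2", "M3"},
}

for lhs, rhs, support, confidence in motif_pair_rules(toy):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy input, every protein that contains `M3` also contains `M2`, so the rule `M3 -> M2` has confidence 1.0, while the pair `M1`, `M3` is dropped because it appears in only one of four proteins.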

Original language: English
Pages (from-to): 2032-2043
Number of pages: 12
Journal: Journal of Computational Chemistry
Issue number: 16
State: Published - Dec 2003


  • Data mining
  • Database
  • Motif
  • Protein
  • Structural genomics

