TY - JOUR
T1 - Method for identifying transcription factor binding sites in yeast
AU - Tsai, Huai Kuang
AU - Huang, Grace Tzu Wei
AU - Chou, Meng Yuan
AU - Lu, Henry Horng Shing
AU - Li, Wen Hsiung
PY - 2006/07/15
Y1 - 2006/07/15
N2 - Motivation: Identifying transcription factor binding sites (TFBSs) is helpful for understanding the mechanism of transcriptional regulation. The abundance and the diversity of genomic data provide an excellent opportunity for identifying TFBSs. Developing methods to integrate various types of data has become a major trend in this pursuit. Results: We develop a TFBS identification method, TFBSfinder, which utilizes several data sources, including DNA sequences, phylogenetic information, microarray data and ChIP-chip data. For a TF, TFBSfinder rigorously selects a set of reliable target genes and a set of non-target genes (as a background set) to find overrepresented and conserved motifs in target genes. A new metric for measuring the degree of conservation at a binding site across species and methods for clustering motifs and for inferring position weight matrices are proposed. For synthetic data and yeast cell cycle TFs, TFBSfinder identifies motifs that are highly similar to known consensuses. Moreover, TFBSfinder outperforms well-known methods.
UR - http://www.scopus.com/inward/record.url?scp=33747871502&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btl160
DO - 10.1093/bioinformatics/btl160
M3 - Article
C2 - 16644789
AN - SCOPUS:33747871502
SN - 1367-4803
VL - 22
SP - 1675
EP - 1681
JO - Bioinformatics
JF - Bioinformatics
IS - 14
ER -