TY - JOUR
T1 - RRSM with a data-dependent threshold for miRNA target prediction
AU - Hsieh, Wan J.
AU - Wang, Hsiuying
PY - 2013/11/1
Y1 - 2013/11/1
N2 - Predicting miRNA target genes is an important problem in bioinformatics. Correlation analysis is a widely used method for exploring miRNA targets through microarray data. However, experimental results show that correlation analysis produces a large number of false positive or false negative results. In addition, correlation analysis is not appropriate when multiple miRNAs simultaneously regulate a gene. Recently, the relative R squared method (RRSM) has been proposed for miRNA target prediction and has been shown to be superior to some existing methods. To apply the RRSM, thresholds must first be set to select a proportion of potential targets. In previous studies, the threshold was fixed and did not depend on the characteristics of a gene. Because gene functions are diverse, a data-dependent threshold may be more suitable for real data applications than a data-independent threshold. In this study, we propose a threshold selection method based on the distribution of the relative R squared statistic. The proposed method is shown to significantly improve on previous prediction results by selecting more experimentally validated targets.
AB - Predicting miRNA target genes is an important problem in bioinformatics. Correlation analysis is a widely used method for exploring miRNA targets through microarray data. However, experimental results show that correlation analysis produces a large number of false positive or false negative results. In addition, correlation analysis is not appropriate when multiple miRNAs simultaneously regulate a gene. Recently, the relative R squared method (RRSM) has been proposed for miRNA target prediction and has been shown to be superior to some existing methods. To apply the RRSM, thresholds must first be set to select a proportion of potential targets. In previous studies, the threshold was fixed and did not depend on the characteristics of a gene. Because gene functions are diverse, a data-dependent threshold may be more suitable for real data applications than a data-independent threshold. In this study, we propose a threshold selection method based on the distribution of the relative R squared statistic. The proposed method is shown to significantly improve on previous prediction results by selecting more experimentally validated targets.
KW - Correlation analysis
KW - P-value
KW - Regression model
KW - The relative R squared method
UR - http://www.scopus.com/inward/record.url?scp=84883338971&partnerID=8YFLogxK
U2 - 10.1016/j.jtbi.2013.08.002
DO - 10.1016/j.jtbi.2013.08.002
M3 - Article
C2 - 23948551
AN - SCOPUS:84883338971
SN - 0022-5193
VL - 337
SP - 54
EP - 60
JO - Journal of Theoretical Biology
JF - Journal of Theoretical Biology
ER -