Abstract
In the nucleotide substitution model for molecular evolution, a major task in exploring an evolutionary process is to estimate the number of substitutions per site of a protein or DNA sequence. The usual estimators are based on the observed proportion of sites that differ between two nucleotide sequences. However, a more objective approach is to report a confidence interval, which conveys the precision of the estimate, rather than a point estimate alone. The conventional confidence intervals used in the literature for the substitution number are constructed using the normal approximation. The construction and performance of confidence intervals for evolutionary models have received little attention in the literature. In this article, the performance of these conventional confidence intervals for the one-parameter and two-parameter models is explored. Results show that the coverage probabilities of these intervals are unsatisfactory when the true substitution number is small. Since the substitution number may be small in many situations for an evolutionary process, the conventional confidence interval cannot provide accurate information in these cases. Improved confidence intervals with desirable coverage probability are proposed for the one-parameter model. A numerical calculation shows that the new confidence intervals substantially improve on the conventional ones.
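As a rough illustration of the coverage calculation described above, the following sketch assumes that the one-parameter model is the Jukes-Cantor model, with estimator d = -(3/4) ln(1 - 4p/3) and delta-method variance p(1 - p) / (n (1 - 4p/3)^2), and that the number of differing sites is binomially distributed. The function names are hypothetical and the code is not taken from the article. It covers only the conventional normal-approximation interval, not the improved intervals proposed there.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def jc_distance(p):
    """Jukes-Cantor estimate of substitutions per site from difference proportion p."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def normal_interval(k, n, z=Z_95):
    """Normal-approximation interval from k differing sites out of n.

    Returns None when the estimator is undefined (k/n >= 3/4).
    """
    p = k / n
    if p >= 0.75:
        return None
    d_hat = jc_distance(p)
    # Delta-method standard error of d_hat (assumed form, see lead-in).
    se = math.sqrt(p * (1.0 - p) / n) / (1.0 - 4.0 * p / 3.0)
    return d_hat - z * se, d_hat + z * se


def coverage(d_true, n, z=Z_95):
    """Exact coverage probability of the interval for a true distance d_true > 0."""
    # Expected difference proportion under the assumed Jukes-Cantor model.
    p = 0.75 * (1.0 - math.exp(-4.0 * d_true / 3.0))
    total = 0.0
    for k in range(n + 1):
        ci = normal_interval(k, n, z)
        if ci is None or not (ci[0] <= d_true <= ci[1]):
            continue
        # Binomial(n, p) probability of k, computed on the log scale for stability.
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log(1.0 - p))
        total += math.exp(log_pmf)
    return total


if __name__ == "__main__":
    for d in (0.005, 0.05, 0.5):
        print(d, round(coverage(d, n=100), 4))
```

Under these assumptions, when no differing sites are observed (k = 0) the interval collapses to the single point 0 and cannot contain any positive true value, which is one way the coverage can fall well below the nominal level when the true substitution number is small.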
Original language | English |
---|---|
Pages (from-to) | 472-479 |
Number of pages | 8 |
Journal | Molecular Phylogenetics and Evolution |
Volume | 60 |
Issue number | 3 |
State | Published - 1 Sep 2011 |
Keywords
- Binomial distribution
- Confidence interval
- Coverage probability
- One-parameter model
- Substitution rate
- Two-parameter model