Confidence intervals for the substitution number in the nucleotide substitution models

Hsiuying Wang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


In the nucleotide substitution model for molecular evolution, a major task in exploring an evolutionary process is to estimate the number of substitutions per site of a protein or DNA sequence. The usual estimators are based on the observed proportion of sites at which two nucleotide sequences differ. However, a more objective approach is to report a confidence interval, which conveys precision, rather than only a point estimate. The conventional confidence intervals used in the literature for the substitution number are constructed by the normal approximation. The construction and performance of confidence intervals for evolutionary models have received little attention in the literature. In this article, the performance of these conventional confidence intervals for the one-parameter and two-parameter models is explored. Results show that the coverage probabilities of these intervals are unsatisfactory when the true substitution number is small. Since the substitution number may be small in many evolutionary settings, the conventional confidence interval cannot provide accurate information in these cases. Improved confidence intervals for the one-parameter model with desirable coverage probability are proposed in this article. A numerical calculation shows that the new confidence intervals improve substantially on the conventional ones.
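As a rough illustration of the coverage problem described above, the following sketch computes the exact coverage probability of a normal-approximation interval. It rests on assumptions that are not stated in the abstract: the one-parameter model is taken to be the Jukes-Cantor model, the number of differing sites is treated as binomial, and the standard error comes from the delta method. The function names are hypothetical, and this is not the article's improved interval, which the abstract does not specify.

```python
import math

Z_95 = 1.959963984540054  # 97.5% quantile of the standard normal


def jc_distance(p):
    """Jukes-Cantor substitutions per site from difference proportion p (p < 0.75)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_p_from_distance(d):
    """Expected difference proportion for a true substitution number d."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def wald_interval(k, n, z=Z_95):
    """Normal-approximation interval for d given k differing sites out of n.

    Returns None when the point estimate is undefined (k/n >= 0.75).
    """
    p = k / n
    if p >= 0.75:
        return None
    d = jc_distance(p)
    # Delta-method standard error: sqrt(p(1-p)/n) / (1 - 4p/3)
    se = math.sqrt(p * (1.0 - p) / n) / (1.0 - 4.0 * p / 3.0)
    return (d - z * se, d + z * se)


def coverage(d_true, n, z=Z_95):
    """Exact coverage: sum binomial probabilities of all k whose interval contains d_true."""
    p = jc_p_from_distance(d_true)
    total = 0.0
    for k in range(n + 1):
        interval = wald_interval(k, n, z)
        if interval is not None and interval[0] <= d_true <= interval[1]:
            total += math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
    return total


if __name__ == "__main__":
    for d in (0.005, 0.05, 0.3):
        print(d, round(coverage(d, 100), 3))
```

Under these assumptions, when no differences are observed (k = 0) the interval collapses to the single point 0 and cannot contain any positive true value. For small substitution numbers that outcome carries much of the probability mass, which is one way the low coverage can arise.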

Original language: English
Pages (from-to): 472-479
Number of pages: 8
Journal: Molecular Phylogenetics and Evolution
Issue number: 3
State: Published - 1 Sep 2011


  • Binomial distribution
  • Confidence interval
  • Coverage probability
  • One-parameter model
  • Substitution rate
  • Two-parameter model

