Categorizing host-dependent RNA viruses by principal component analysis of their codon usage preferences.

Ming Wei Su*, Hsiu Man Lin, Hanna S. Yuan, Woei Chyn Chu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

25 Scopus citations


Viruses must exploit host transcription and translation mechanisms to replicate in a hostile cellular environment, so the infected host is likely to impose pressure on viral evolution. In this study, we investigated differences in codon usage preferences among highly mutable single-stranded RNA viruses that infect vertebrate or invertebrate hosts. We combined principal component analysis (PCA) and k-means clustering to group viruses according to the type of host they infect. The relative synonymous codon usage (RSCU) indices of all genes in 32 RNA viruses were calculated, and the correlation of the RSCU indices among different viruses was analyzed by PCA. Our results show a positive correlation in codon usage preferences among viruses that target the same host category. K-means clustering analysis further confirmed the statistical significance of this result, demonstrating that viruses infecting vertebrate hosts have codon usage preferences different from those of viruses infecting invertebrate hosts. Based on an analysis of the effective number of codons (ENC) in relation to the GC content at the synonymous third codon position (GC3s), we further identified mutational pressure as the dominant evolutionary force shaping the different codon usage preferences. This study suggests a new and effective way to characterize host-dependent RNA viruses based on their codon usage patterns.
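The abstract names RSCU and GC3s without defining them. As a minimal sketch, assuming the commonly used definitions (RSCU as a codon's observed count divided by the mean count across its synonymous family, and GC3s as the G/C fraction at the third position of synonymous codons), the two indices could be computed as below. The function names `rscu` and `gc3s`, the standard genetic code, and the choice to report 0.0 for amino acids absent from a sequence are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (assumption).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons into synonymous families, dropping stops and the
# single-codon amino acids (Met, Trp), which carry no usage preference.
FAMILIES = {}
for codon, aa in CODON_TABLE.items():
    FAMILIES.setdefault(aa, []).append(codon)
FAMILIES = {aa: fam for aa, fam in FAMILIES.items() if aa != "*" and len(fam) > 1}


def count_codons(seq):
    """Count in-frame codons; RNA input (U) is mapped to DNA letters (T)."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def rscu(seq):
    """Return {codon: RSCU} for the 59 synonymous sense codons."""
    counts = count_codons(seq)
    result = {}
    for family in FAMILIES.values():
        total = sum(counts[c] for c in family)
        for c in family:
            # Observed count over the expected count under uniform usage.
            # Illustrative choice: 0.0 when the amino acid never occurs.
            result[c] = counts[c] * len(family) / total if total else 0.0
    return result


def gc3s(seq):
    """Fraction of G or C at the third position of synonymous codons."""
    counts = count_codons(seq)
    synonymous = {c for family in FAMILIES.values() for c in family}
    total = sum(n for c, n in counts.items() if c in synonymous)
    gc = sum(n for c, n in counts.items() if c in synonymous and c[2] in "GC")
    return gc / total if total else 0.0
```

Under these assumptions, each virus would yield one 59-element RSCU vector, and stacking those vectors row-wise gives the kind of matrix on which PCA and k-means clustering operate.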

Original language: English
Pages (from-to): 1539-1547
Number of pages: 9
Journal: Journal of Computational Biology
Issue number: 11
State: Published - Nov 2009

