TY - JOUR
T1 - Categorizing host-dependent RNA viruses by principal component analysis of their codon usage preferences.
AU - Su, Ming Wei
AU - Lin, Hsiu Man
AU - Yuan, Hanna S.
AU - Chu, Woei Chyn
PY - 2009/11
Y1 - 2009/11
N2 - Viruses must exploit host transcription and translation mechanisms to replicate in a hostile host cellular environment; it is therefore likely that the infected host imposes pressure on viral evolution. In this study, we investigated differences in codon usage preferences among highly mutable single-stranded RNA viruses that infect vertebrate or invertebrate hosts. We combined principal component analysis (PCA) and k-means methods to cluster viruses that infect different types of hosts. The relative synonymous codon usage (RSCU) indices of all genes in 32 RNA viruses were calculated, and the correlation of the RSCU indices among different viruses was analyzed by PCA. Our results show a positive correlation in codon usage preferences among viruses that target the same host category. Results of k-means clustering analysis further confirmed the statistical significance of this result, demonstrating that viruses infecting vertebrate hosts have different codon usage preferences from those of viruses infecting invertebrate hosts. Based on an analysis of the effective number of codons (ENC) in relation to the GC content at the synonymous third codon position (GC3s), we further identified mutational pressure as the dominant evolutionary driving force behind the different codon usage preferences. This study suggests a new and effective way to characterize host-dependent RNA viruses based on their codon usage patterns.
AB - Viruses must exploit host transcription and translation mechanisms to replicate in a hostile host cellular environment; it is therefore likely that the infected host imposes pressure on viral evolution. In this study, we investigated differences in codon usage preferences among highly mutable single-stranded RNA viruses that infect vertebrate or invertebrate hosts. We combined principal component analysis (PCA) and k-means methods to cluster viruses that infect different types of hosts. The relative synonymous codon usage (RSCU) indices of all genes in 32 RNA viruses were calculated, and the correlation of the RSCU indices among different viruses was analyzed by PCA. Our results show a positive correlation in codon usage preferences among viruses that target the same host category. Results of k-means clustering analysis further confirmed the statistical significance of this result, demonstrating that viruses infecting vertebrate hosts have different codon usage preferences from those of viruses infecting invertebrate hosts. Based on an analysis of the effective number of codons (ENC) in relation to the GC content at the synonymous third codon position (GC3s), we further identified mutational pressure as the dominant evolutionary driving force behind the different codon usage preferences. This study suggests a new and effective way to characterize host-dependent RNA viruses based on their codon usage patterns.
UR - http://www.scopus.com/inward/record.url?scp=77952299145&partnerID=8YFLogxK
U2 - 10.1089/cmb.2009.0046
DO - 10.1089/cmb.2009.0046
M3 - Article
C2 - 19958082
AN - SCOPUS:77952299145
SN - 1066-5277
VL - 16
SP - 1539
EP - 1547
JO - Journal of Computational Biology
JF - Journal of Computational Biology
IS - 11
ER -