TY - JOUR
T1 - Genotype imputation accuracy with different reference panels in admixed populations
AU - Huang, Guan-Hua
AU - Tseng, Yi Chi
N1 - Publisher Copyright: © 2014 Huang and Tseng; licensee BioMed Central Ltd.
PY - 2014/6/17
Y1 - 2014/6/17
N2 - Genome-wide association studies have successfully identified common variants that are associated with complex diseases. However, the majority of genetic variants contributing to disease susceptibility are yet to be discovered. It is now widely believed that multiple rare variants are likely to be associated with complex diseases. Using custom-made chips or next-generation sequencing to uncover the effects of rare variants on disease can be very expensive with current technology. Consequently, many researchers use genotype imputation to predict genotypes at rare variants that are not directly genotyped in the study sample. One important question in genotype imputation is how to choose a reference panel that will produce high imputation accuracy in a population of interest. Using whole genome sequence data from the Genetic Analysis Workshop 18 data set, this report compares genotype imputation accuracy among reference panels representing different degrees of genetic similarity to a study sample of admixed Mexican Americans. Results show that a reference panel that closely matches the ancestry of the study population can increase imputation accuracy, but it can also result in more missing genotype calls. A larger reference panel can reduce imputation error and missing genotypes, but the improvement may be limited. We also find that, for the admixed study sample, simply selecting a single best reference panel from among the HapMap African, European, or Asian populations is not appropriate. Instead, a composite reference panel combining all available reference data should be used.
UR - http://www.scopus.com/inward/record.url?scp=85018193345&partnerID=8YFLogxK
U2 - 10.1186/1753-6561-8-S1-S64
DO - 10.1186/1753-6561-8-S1-S64
M3 - Article
AN - SCOPUS:85018193345
SN - 1753-6561
VL - 8
JO - BMC Proceedings
JF - BMC Proceedings
M1 - S64
ER -