Improved cell composition deconvolution method of bulk gene expression profiles to quantify subsets of immune cells

Yen Jung Chiu, Yi Hsuan Hsieh, Yen Hua Huang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

18 Scopus citations


Background: To facilitate the investigation of the pathogenic roles played by various immune cells in complex tissues such as tumors, a few computational methods have been created that deconvolute bulk gene expression profiles to predict cell composition. However, the available methods were usually developed along with a set of reference gene expression profiles in which the number of replicates is imbalanced across cell types. Therefore, the objective of this study was to create a new deconvolution method equipped with a new set of reference gene expression profiles that incorporates more microarray replicates of the immune cells frequently implicated in the poor prognosis of cancers, such as T helper cells, regulatory T cells and M1/M2 macrophages.
Methods: Our deconvolution method was developed by choosing ϵ-support vector regression (ϵ-SVR) as the core algorithm, with a loss function subject to the L1-norm penalty. To construct the reference gene expression signature matrix for regression, a subset of differentially expressed genes was chosen from 148 microarray-based gene expression profiles for 9 types of immune cells by using ANOVA and minimizing the condition number. Agreement analyses, including mean absolute percentage errors and Bland-Altman plots, were carried out to compare the performance of our method with that of CIBERSORT.
Results: In silico cell mixtures, simulated bulk tissues, and real human samples with known immune-cell fractions were used as the test datasets for benchmarking. Our method outperformed CIBERSORT in the benchmarks using in silico breast tissue-immune cell mixtures in the proportions of 30:70 and 50:50, and in the benchmark using 164 human PBMC samples. Our results suggest that the performance of our method was at least comparable to that of a state-of-the-art tool, CIBERSORT.
Conclusions: We developed a new cell composition deconvolution method, implemented entirely with publicly available R and Python packages. In addition, we compiled a new set of reference gene expression profiles, which might allow for a more robust prediction of immune cell fractions from the expression profiles of cell mixtures. The source code of our method could be downloaded from
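
The regression step described under Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's `LinearSVR`, which pairs the ϵ-insensitive (L1) loss with an L2 penalty on the coefficients and so may differ from the formulation used in the paper. The signature matrix, the three cell types, the noise level and the helper names `deconvolve` and `mape` are all hypothetical and exist only for illustration.

```python
import numpy as np
from sklearn.svm import LinearSVR


def deconvolve(signature, bulk, epsilon=0.0, C=1.0):
    """Estimate cell-type fractions for one bulk expression profile.

    signature: array of shape (n_genes, n_cell_types), reference expression
    bulk:      array of shape (n_genes,), expression of the mixed sample
    """
    # Assumption: linear epsilon-SVR without intercept; genes act as samples.
    model = LinearSVR(epsilon=epsilon, C=C, loss="epsilon_insensitive",
                      fit_intercept=False, max_iter=100000, random_state=0)
    model.fit(signature, bulk)
    # Negative weights have no biological meaning; set them to zero.
    coef = np.clip(model.coef_, 0.0, None)
    total = coef.sum()
    # Rescale so that the estimated fractions sum to one.
    return coef / total if total > 0 else coef


def mape(true, pred):
    """Mean absolute percentage error, in percent (true values must be non-zero)."""
    true = np.asarray(true, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return 100.0 * np.mean(np.abs((true - pred) / true))


# Synthetic example: 200 hypothetical signature genes, 3 hypothetical cell types.
rng = np.random.default_rng(0)
signature = rng.uniform(0.0, 1.0, size=(200, 3))
true_fractions = np.array([0.5, 0.3, 0.2])
bulk = signature @ true_fractions + rng.normal(0.0, 0.01, size=200)

# A lower condition number indicates a better-conditioned signature matrix.
print("condition number:", np.linalg.cond(signature))

estimated = deconvolve(signature, bulk)
print("estimated fractions:", estimated)
print("MAPE (%):", mape(true_fractions, estimated))
```

On real data the signature matrix would come from the selected differentially expressed genes rather than random numbers, and `epsilon` and `C` would need tuning; the values above are placeholders.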

Original language: English
Article number: 169
Journal: BMC Medical Genomics
State: Published - 20 Dec 2019


  • Bulk gene expression profiles
  • Deconvolution
  • Immune cells

