Efficient construction and utilization of k-ordered FM-indexes with kISS for ultra-fast read mapping in large genomes

Zheng Dao Yang, Hsuan Yu Kuo, Po Wei Hsieh, Jui Hung Hung*

*Corresponding author for this work

Research output: Article (peer-reviewed)

Abstract

Motivation: The Full-text index in Minute space (FM-index) is a memory-efficient data structure widely used in bioinformatics for the fundamental pattern-matching task of searching for short patterns within a long reference. To meet the demand for short query patterns, the k-ordered concept has been proposed for FM-indexes. However, few state-of-the-art construction algorithms fully exploit this idea to achieve significant speedups in the pan-genome era.

Results: We introduce k-ordered induced suffix sorting (kISS) for efficient construction and utilization of k-ordered FM-indexes. We present an algorithmic workflow for building k-ordered suffix arrays, incorporating two novel strategies to improve time and memory efficiency. We also demonstrate that k-ordered FM-indexes are compatible with the locate operations in FMtree. Experiments show that kISS can improve construction time, and that the generated k-ordered suffix array can be applied to FMtree without any additional computation or memory usage.
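To make the k-ordered idea concrete, here is a minimal sketch of a k-ordered suffix array: suffixes are sorted by their first k characters only, which is enough to answer queries no longer than k. This is a naive illustration, not the kISS algorithm and not an FM-index backward search. The function names, the `$` sentinel, and the choice to leave ties in positional order are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right

SENTINEL = "$"  # assumed terminator; sorts before A, C, G, T in ASCII


def k_ordered_suffix_array(text: str, k: int) -> list[int]:
    """Sort suffix start positions by their first k characters only."""
    t = text + SENTINEL
    # sorted() is stable, so suffixes sharing a k-prefix keep positional order.
    return sorted(range(len(t)), key=lambda i: t[i:i + k])


def locate(text: str, sa: list[int], pattern: str, k: int) -> list[int]:
    """Return all start positions of pattern; requires len(pattern) <= k."""
    if not 0 < len(pattern) <= k:
        raise ValueError("pattern length must be between 1 and k")
    t = text + SENTINEL
    m = len(pattern)
    # Truncating k-sorted keys to m <= k characters keeps them sorted,
    # so binary search over these prefixes is valid.
    prefixes = [t[i:i + m] for i in sa]
    lo = bisect_left(prefixes, pattern)
    hi = bisect_right(prefixes, pattern)
    return sorted(sa[lo:hi])


sa = k_ordered_suffix_array("ACACAC", 2)
print(sa)
print(locate("ACACAC", sa, "AC", 2))
```

For "ACACAC" with k = 2, the three suffixes starting with "AC" stay in positional order (0, 2, 4) rather than full lexicographic order (4, 2, 0), yet every query of length at most 2 still maps to one contiguous range of the array.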

Original language: English
Article number: btae409
Journal: Bioinformatics
Volume: 40
Issue number: 7
Publication status: Published - 1 Jul 2024
