DP2: A Highly Parallel Range Join for Genome Analysis on Distributed Computing Platform

Aman Sinha, Bo Cheng Lai

Research output: Conference contribution, peer-reviewed

Abstract

The rapid growth of genome data and the intensive computation it requires pose great challenges for downstream genome analytics. Efficient parallel processing and distributed computing are two effective schemes for analyzing big data. Range join is a widely used, effective, yet time-consuming operation that finds the overlaps between two sets of genome features. The widely adopted BEDTools [6] pipeline uses a single-node binary tree approach, while the distributed GenAp scheme fails to exploit the massively parallel computation available on modern throughput processors such as GPUs (Graphics Processing Units). This paper proposes a novel Distributed Parallel P-ary search (DP2) that applies P-ary analysis to enable high parallelism at the algorithmic level and extensively utilizes multiple GPUs at the system and architecture levels. Efficient computation allocation is implemented to leverage distributed computing on clusters. The proposed framework integrates well with the current BEDTools [6] pipeline and achieves an average 25x speedup for the range-join operation itself compared with the binary tree approach of GenAp, and a 13x end-to-end (total execution time) speedup compared with ADAM.
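To make the two ideas in the abstract concrete, the following is a minimal sketch of a range join driven by a P-ary search. It is not the authors' DP2 implementation: the function names `p_ary_lower_bound` and `range_join` are hypothetical, the P pivot comparisons that a GPU would run in parallel threads are simulated here with a sequential loop, and the reference intervals are assumed to be sorted by start, non-overlapping, and given as half-open integer ranges `[start, end)`.

```python
def p_ary_lower_bound(keys, target, p=4):
    """Return the first index i with keys[i] >= target, or len(keys) if none.

    Each round splits the remaining range into up to p segments and checks
    one pivot per segment. On a GPU, each pivot would be checked by its own
    thread; here the pivots are visited in a plain loop (an assumption made
    for illustration only).
    """
    if p < 2:
        raise ValueError("p must be at least 2")
    lo, hi = 0, len(keys)  # the answer always lies in [lo, hi]
    while lo < hi:
        step = (hi - lo + p - 1) // p  # ceil((hi - lo) / p), at least 1
        new_lo, new_hi = lo, hi
        # Pivots are the last index of each segment.
        for idx in range(lo + step - 1, hi, step):
            if keys[idx] >= target:
                new_hi = idx  # answer is at or before this pivot
                break
            new_lo = idx + 1  # answer is strictly after this pivot
        lo, hi = new_lo, new_hi
    return lo


def range_join(queries, refs, p=4):
    """Return (query_index, ref_index) pairs of overlapping intervals.

    Assumes refs is sorted by start and non-overlapping, so both the list
    of starts and the list of ends are sorted and the matches for a query
    form one contiguous slice of refs.
    """
    starts = [s for s, _ in refs]
    ends = [e for _, e in refs]
    pairs = []
    for qi, (qs, qe) in enumerate(queries):
        first = p_ary_lower_bound(ends, qs + 1, p)  # first ref with end > qs
        last = p_ary_lower_bound(starts, qe, p)     # first ref with start >= qe
        pairs.extend((qi, ri) for ri in range(first, last))
    return pairs
```

As a usage example under the same assumptions, calling `range_join([(5, 25), (30, 40), (45, 100)], [(0, 10), (20, 30), (40, 50), (60, 70)])` pairs the first query with references 0 and 1, finds no overlap for the second query, and pairs the third query with references 2 and 3. With P pivots per round, the search needs roughly log base P of N rounds instead of log base 2 of N, which is where the extra parallelism at the algorithmic level comes from.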

Original language: English
Title of host publication: 2019 International Conference on High Performance Computing and Simulation, HPCS 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 358-362
Number of pages: 5
ISBN (electronic): 9781728144849
Publication status: Published - Jul 2019
Event: 2019 International Conference on High Performance Computing and Simulation, HPCS 2019 - Dublin, Ireland
Duration: 15 Jul 2019 to 19 Jul 2019

Publication series

Name: 2019 International Conference on High Performance Computing and Simulation, HPCS 2019

Conference

Conference: 2019 International Conference on High Performance Computing and Simulation, HPCS 2019
Country/Territory: Ireland
City: Dublin
Period: 15/07/19 to 19/07/19
