Thread affinity mapping for irregular data access on shared cache GPGPU

Hsien Kai Kuo*, Kuan Ting Chen, Bo-Cheng Lai, Jing Yang Jou

*Corresponding author for this work

    Research output: Conference contribution, peer-reviewed

    10 citations (Scopus)

    Abstract

    Memory coalescing and on-chip shared cache are two effective techniques for alleviating the memory bottleneck in modern GPGPUs. Both work well for applications with regular memory accesses. However, they become ineffective when concurrent threads issue large numbers of uncoordinated accesses, and the potential performance benefit can be significantly degraded. This paper proposes a thread affinity mapping methodology to coordinate irregular data accesses on shared-cache GPGPUs. Based on the proposed affinity metrics, threads are congregated into execution groups that fully exploit the memory coalescing and data sharing within an application. An average runtime speedup of 3.5x is achieved on a Fermi GPGPU. The speedup scales with the size of the test cases, which makes the proposed methodology an effective and promising solution for the continually increasing complexity of applications on future many-core systems.

    Original language: English
    Title of host publication: ASP-DAC 2012 - 17th Asia and South Pacific Design Automation Conference
    Pages: 659-664
    Number of pages: 6
    Publication status: Published - 2012
    Event: 17th Asia and South Pacific Design Automation Conference, ASP-DAC 2012 - Sydney, NSW, Australia
    Duration: 30 Jan 2012 to 2 Feb 2012

    Publication series

    Name: Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC

    Conference

    Conference: 17th Asia and South Pacific Design Automation Conference, ASP-DAC 2012
    Country/Territory: Australia
    City: Sydney, NSW
    Period: 30/01/12 to 2/02/12
