TY - GEN
T1 - Scalable multi-layer barrier synchronization on NoC
AU - Tseng, Yu Lun
AU - Huang, Kun Hua
AU - Lai, Bo-Cheng
PY - 2016/5/31
Y1 - 2016/5/31
N2 - A barrier is a widely used synchronization mechanism adopted in parallel systems of different scales. Because a barrier is a global operation in a system, scalability has become a critical design concern for its implementation. Reducing the number of messages and the hop count are the main challenges in attaining a scalable barrier design. This paper proposes an efficient control mechanism and communication scheme for barrier operations and explores novel multi-layer barrier algorithms on NoC (Network on Chip) based multiprocessor systems. A novel barrier controller and communication unit are introduced to enable efficient barrier synchronization on NoC. The proposed modules improve the cooperative communication between synchronization messages and can be easily integrated into a general NoC switch. For a 32×32 network, the proposed 2-layer barrier reduces latency and hop count by up to 61.7% and 99.3%, respectively. The experimental results also provide an in-depth analysis of different design options.
AB - A barrier is a widely used synchronization mechanism adopted in parallel systems of different scales. Because a barrier is a global operation in a system, scalability has become a critical design concern for its implementation. Reducing the number of messages and the hop count are the main challenges in attaining a scalable barrier design. This paper proposes an efficient control mechanism and communication scheme for barrier operations and explores novel multi-layer barrier algorithms on NoC (Network on Chip) based multiprocessor systems. A novel barrier controller and communication unit are introduced to enable efficient barrier synchronization on NoC. The proposed modules improve the cooperative communication between synchronization messages and can be easily integrated into a general NoC switch. For a 32×32 network, the proposed 2-layer barrier reduces latency and hop count by up to 61.7% and 99.3%, respectively. The experimental results also provide an in-depth analysis of different design options.
UR - http://www.scopus.com/inward/record.url?scp=84978436674&partnerID=8YFLogxK
U2 - 10.1109/VLSI-DAT.2016.7482560
DO - 10.1109/VLSI-DAT.2016.7482560
M3 - Conference contribution
AN - SCOPUS:84978436674
T3 - 2016 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2016
BT - 2016 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2016 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2016
Y2 - 25 April 2016 through 27 April 2016
ER -