TY - GEN
T1 - Energy and performance analysis of mapping parallel multi-threaded tasks for an on-chip multi-processor system
AU - Lai, Bo-Cheng
AU - Schaumont, Patrick
AU - Qin, Wei
AU - Verbauwhede, Ingrid
PY - 2005
Y1 - 2005
N2 - Multiprocessor systems offer superior performance and potentially better energy reduction than single-processor systems. It all depends, however, on how well the application can be mapped onto the architecture. Indeed, a careful tradeoff of energy and performance requires a thorough understanding of the energy consumption pattern of the application across the architecture. We develop a simulation platform, MultiPo-Sim, which returns the cycle-accurate performance and energy consumption of a multiprocessor system, for both hardware components and software primitives. On the hardware level, energy scaling techniques can be modeled and each processing core can operate at different energy modes. MultiPo-Sim achieves a simulation speed of 331K cycles per second for a four-processor system on a 3 GHz, 512 MByte Fedora-2 PC. On the software level, data parallelization and task parallelization are two common models of multi-threaded programming. Using MultiPo-Sim, we show that they exhibit different energy and performance characteristics when mapped onto a multiprocessor system.
AB - Multiprocessor systems offer superior performance and potentially better energy reduction than single-processor systems. It all depends, however, on how well the application can be mapped onto the architecture. Indeed, a careful tradeoff of energy and performance requires a thorough understanding of the energy consumption pattern of the application across the architecture. We develop a simulation platform, MultiPo-Sim, which returns the cycle-accurate performance and energy consumption of a multiprocessor system, for both hardware components and software primitives. On the hardware level, energy scaling techniques can be modeled and each processing core can operate at different energy modes. MultiPo-Sim achieves a simulation speed of 331K cycles per second for a four-processor system on a 3 GHz, 512 MByte Fedora-2 PC. On the software level, data parallelization and task parallelization are two common models of multi-threaded programming. Using MultiPo-Sim, we show that they exhibit different energy and performance characteristics when mapped onto a multiprocessor system.
UR - http://www.scopus.com/inward/record.url?scp=33748531286&partnerID=8YFLogxK
U2 - 10.1109/ICCD.2005.47
DO - 10.1109/ICCD.2005.47
M3 - Conference contribution
AN - SCOPUS:33748531286
SN - 0769524516
SN - 9780769524511
T3 - Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors
SP - 102
EP - 104
BT - Proceedings - 2005 IEEE International Conference on Computer Design
T2 - 2005 IEEE International Conference on Computer Design: VLSI in Computers and Processors, ICCD 2005
Y2 - 2 October 2005 through 5 October 2005
ER -