Multiprocessor systems offer superior performance and potentially greater energy reduction than single-processor systems. This depends, however, on how well the application can be mapped onto the architecture. Indeed, a careful tradeoff between energy and performance requires a thorough understanding of the application's energy consumption pattern across the architecture. We develop a simulation platform, MultiPo-Sim, which reports the cycle-accurate performance and energy consumption of a multiprocessor system, for both hardware components and software primitives. At the hardware level, energy scaling techniques can be modeled, and each processing core can operate in a different energy mode. MultiPo-Sim achieves a simulation speed of 331K cycles per second for a four-processor system on a 3 GHz, 512 MByte Fedora-2 PC. At the software level, data parallelism and task parallelism are two common models of multithreaded programming. Using MultiPo-Sim, we show that the two models exhibit different energy and performance characteristics when mapped onto a multiprocessor system.
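
For readers less familiar with the two programming models, the following minimal sketch contrasts them on a toy two-stage computation. It is an illustration only and is not taken from MultiPo-Sim or its benchmarks: the stage functions `stage_a` and `stage_b`, the worker count, and the use of Python threads are all assumptions made for the example. In the data-parallel version every worker runs the whole chain on its own slice of the input; in the task-parallel version each worker owns one stage and the stages are connected by a queue.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def stage_a(x):
    """Hypothetical first processing stage."""
    return x * 2


def stage_b(x):
    """Hypothetical second processing stage."""
    return x + 1


def data_parallel(items, workers=4):
    """Data parallelism: each worker applies both stages to its own slice."""
    # Interleaved slices: worker w gets items w, w + workers, w + 2*workers, ...
    chunks = [items[w::workers] for w in range(workers)]

    def run(chunk):
        return [stage_b(stage_a(x)) for x in chunk]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run, chunks))

    # Put each worker's results back at the positions its slice came from.
    out = [None] * len(items)
    for w, chunk_result in enumerate(results):
        out[w::workers] = chunk_result
    return out


def task_parallel(items):
    """Task parallelism: one thread per stage, connected by a queue."""
    handoff = queue.Queue()
    done = object()  # sentinel marking the end of the stream
    out = []

    def first_stage():
        for x in items:
            handoff.put(stage_a(x))
        handoff.put(done)

    def second_stage():
        while True:
            value = handoff.get()
            if value is done:
                break
            out.append(stage_b(value))

    threads = [threading.Thread(target=first_stage),
               threading.Thread(target=second_stage)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Both functions compute the same result, but they distribute work differently: the data-parallel version keeps all workers busy with identical code, while the task-parallel version introduces inter-stage communication and can leave a core idle when the stages are unbalanced. Differences of this kind are what one would expect to surface as distinct energy and performance profiles on a multiprocessor.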