Data pipelining is a widely used model for representing streaming applications. Incrementally decomposing and optimizing a data-pipelined application for a multi-processor platform spans multiple design layers: the application layer, the system software layer, the architecture layer, and the micro-architecture layer. For best results, designers have to consider multiple design layers (vertical exploration) and multiple architecture options (horizontal exploration). Using a data-pipelined JPEG encoder as the application driver, this paper presents a comprehensive analysis of mapping a data-pipelined application through multiple design layers onto a shared-memory SMP (Symmetric Multi-Processing) system. We show that single-layer optimization yields a design that is 110% worse if system effects from the other layers are not taken into account. Compared to the nominal case, an appropriate mapping of the application achieves a 47.5% improvement for high-performance design and a 77.6% energy reduction for energy-efficient design under constant performance.