The multimedia SoC usually integrates programmable digital signal processors (DSP) to accelerate data-intensive computations. But the DSP and the host processor (e.g. ARM) are both designed for standalone uses, and they must have overlapped functionalities and thus some redundant components. In this paper, we propose a compact DSP core for dual-core multimedia SoC and its complete software development tools. The DSP core contains a dataflow engine that is composed of off-the-shelf memory modules with limited ports, and we have investigated software techniques extensively to reduce the hardware complexity as the principles of VLIW processors. Moreover, the DSP is equipped with novel static floating-point units to emulate expensive floating-point DSP operations at low cost. In our experiments, this core has about thrice the performance (estimated in execution cycles) of Analog Devices ADSP-218x with similar computing resources. Our first prototype in the 0.35μm CMOS technology operates at 100MHz and consumes 122mW power. The core size is 2.8mm2 including an embedded DMA controller and the AMBA AHB interface.