Latency-tolerant virtual cluster architecture for VLIW DSP

Pi Chen Hsiao*, Tay Jyi Lin, Chih-Wei Liu, Chein Wei Jen

*Corresponding author for this work

    Research output: Contribution to journal › Conference article › peer-review

    Abstract

    This paper proposes a virtual cluster architecture that executes multi-cluster VLIW programs with a reduced number of clusters in a time-sharing fashion. The interleaved sub-VLIWs help to hide instruction latencies significantly, so the proposed virtual cluster offers three advantages: (1) reduced forwarding complexity in the processor datapath, (2) an improved programming model for further code optimizations, and (3) support for composite instructions without any extra functional unit. In our experiments with a 4-cluster VLIW DSP, the 28 forwarding paths inside a cluster are completely eliminated, which saves 21.71% in delay and 17.56% in silicon area. Moreover, thanks to its improved programming model, the virtual cluster has been verified to achieve better code sizes and execution times on various DSP kernels.

    Original language: English
    Article number: 4253436
    Pages (from-to): 3506-3509
    Number of pages: 4
    Journal: Proceedings - IEEE International Symposium on Circuits and Systems
    State: Published - 2007
    Event: 2007 IEEE International Symposium on Circuits and Systems, ISCAS 2007 - New Orleans, LA, United States
    Duration: 27 May 2007 to 30 May 2007
