Data locality optimization for a parallel object detection on embedded multi-core systems

Bo-Cheng Lai*, Chih Hsuan Chiang, Guan Ru Li

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    6 Scopus citations

    Abstract

    Object detection is an important application for modern smart embedded devices: it enables a device to recognize its surrounding environment and perform intelligent tasks. Its intensive computation requirements make object detection expensive to run on resource-constrained embedded devices. Parallel processing on multi-core systems provides a platform to boost performance, but the memory bottleneck limits performance scalability. Improving the data locality of the on-chip cache has therefore become a critical design concern. This paper analyzes the memory behavior of a parallel Viola-Jones algorithm and proposes a scheme to enhance the data locality of the on-chip cache. In experiments running a multi-threaded object detection algorithm on a cycle-accurate multi-core simulator, the proposed approach achieves up to 58% better performance than the original parallel program.

    Original language: English
    Title of host publication: ICSESS 2011 - Proceedings
    Subtitle of host publication: 2011 IEEE 2nd International Conference on Software Engineering and Service Science
    Pages: 576-579
    Number of pages: 4
    State: Published - 12 Sep 2011
    Event: 2011 IEEE 2nd International Conference on Software Engineering and Service Science, ICSESS 2011 - Beijing, China
    Duration: 15 Jul 2011 to 17 Jul 2011

    Publication series

    Name: ICSESS 2011 - Proceedings: 2011 IEEE 2nd International Conference on Software Engineering and Service Science

    Conference

    Conference: 2011 IEEE 2nd International Conference on Software Engineering and Service Science, ICSESS 2011
    Country/Territory: China
    City: Beijing
    Period: 15/07/11 to 17/07/11

    Keywords

    • data locality
    • embedded device
    • multi-core
    • object detection
    • parallel processing
