Performance study of software and hardware data prefetching schemes

Tien-Fu Chen*, Jean-Loup Baer

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

111 Scopus citations


Prefetching, i.e., exploiting the overlap of processor computations with data accesses, is one of several approaches for tolerating memory latencies. Prefetching can be hardware-based, software-directed, or a combination of both. Hardware-based prefetching, requiring some support unit connected to the cache, can dynamically handle prefetches at run-time without compiler intervention. Software-directed approaches rely on compiler technology to insert explicit prefetch instructions. Mowry et al.'s software scheme and our hardware approach are two representative schemes. In this paper, we evaluate approximations to these two schemes in the context of a shared-memory multiprocessor environment. Our qualitative comparisons indicate that both schemes are able to reduce cache misses in the domain of linear array references. When complex data access patterns are considered, the software approach has compile-time information to perform sophisticated prefetching whereas the hardware scheme has the advantage of manipulating dynamic information. The performance results from an instruction-level simulation of four benchmarks confirm these observations. Our simulations show that the hardware scheme introduces more memory traffic into the network and that the software scheme introduces a non-negligible instruction execution overhead. An approach combining software and hardware schemes is proposed; it shows promise in reducing the memory latency with the least overhead.
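
The contrast between the two styles on a linear array reference can be illustrated with a toy model. The sketch below is not the simulator or either scheme evaluated in the paper; it is a minimal illustration under stated assumptions: a small fully associative LRU cache, prefetches that complete instantly (no latency or network model), a naive software scheme that issues one explicit prefetch per iteration, and a hardware scheme reduced to a single stride detector for one load instruction. All names (`Cache`, `run_software`, `run_hardware`, `LINE_SIZE`) are hypothetical.

```python
from collections import OrderedDict

LINE_SIZE = 32  # bytes per cache line (assumed value for illustration)


class Cache:
    """Toy fully associative LRU cache that tracks line numbers only."""

    def __init__(self, num_lines=64):
        self.num_lines = num_lines
        self.lines = OrderedDict()
        self.misses = 0      # demand accesses that missed
        self.prefetches = 0  # prefetches that actually fetched a new line

    def _insert(self, line):
        self.lines[line] = True
        self.lines.move_to_end(line)
        if len(self.lines) > self.num_lines:
            self.lines.popitem(last=False)  # evict least recently used line

    def access(self, addr):
        """Demand access; returns True on a hit."""
        line = addr // LINE_SIZE
        if line in self.lines:
            self.lines.move_to_end(line)
            return True
        self.misses += 1
        self._insert(line)
        return False

    def prefetch(self, addr):
        """Bring a line in ahead of use (assumed to complete instantly)."""
        line = addr // LINE_SIZE
        if line not in self.lines:
            self.prefetches += 1
            self._insert(line)


def run_baseline(addrs):
    cache = Cache()
    for a in addrs:
        cache.access(a)
    return cache


def run_software(addrs, distance):
    """Software-directed: an explicit prefetch precedes every load."""
    cache = Cache()
    extra_instructions = 0
    for a in addrs:
        cache.prefetch(a + distance)  # stands in for an inserted instruction
        extra_instructions += 1
        cache.access(a)
    return cache, extra_instructions


def run_hardware(addrs):
    """Hardware-based: prefetch once the same non-zero stride repeats."""
    cache = Cache()
    last_addr = None
    last_stride = None
    for a in addrs:
        cache.access(a)
        if last_addr is not None:
            stride = a - last_addr
            if stride == last_stride and stride != 0:
                cache.prefetch(a + stride)
            last_stride = stride
        last_addr = a
    return cache


if __name__ == "__main__":
    # 1024 eight-byte elements read in order: a linear array reference.
    addrs = [i * 8 for i in range(1024)]
    sw_cache, overhead = run_software(addrs, LINE_SIZE)
    print("baseline misses:", run_baseline(addrs).misses)
    print("software misses:", sw_cache.misses, "extra instructions:", overhead)
    print("hardware misses:", run_hardware(addrs).misses)
```

In this idealized setting both variants remove nearly all misses on the sequential scan, and the software variant pays for it with one extra instruction per iteration. The model has no interconnect or timing, so it says nothing about the memory-traffic difference reported in the abstract.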

Original language: English
Pages (from-to): 223-232
Number of pages: 10
Journal: Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA
State: Published - 1 Jan 1994
Event: Proceedings of the 21st Annual International Symposium on Computer Architecture - Chicago, IL, USA
Duration: 18 Apr 1994 to 21 Apr 1994

