Abstract
This paper investigates how the use of a new data structure, the codeword, can help improve the performance of quicksort when the records to be sorted are long and the keys are alphanumeric sequences of bytes. The codeword is a compact representation of a key with respect to some codeword generator. It consists of a byte for a character count of “equal” bytes, a byte for the “first nonequal” byte, and a pointer to the record. The paper shows how the ordering of keys is preserved by an adequate choice of the code generator and how this can be applied to the quicksort algorithm. An analysis of the potential savings on various architectures and actual measurements show the improvements that can be attained by using codewords rather than pointers. Architecturally independent parameters, such as the number of bytes to be compared and the number of swaps, architecture-dependent parameters such as caches and their write policies, and compiler optimizations such as inline expansion and register allocation are considered.
Original language | English |
---|---|
Pages (from-to) | 622-631 |
Number of pages | 10 |
Journal | IEEE Transactions on Software Engineering |
Volume | 15 |
Issue number | 5 |
DOIs | |
State | Published - 1 Jan 1989 |