Tree indexing for efficient search of similar documents

Chung Min Chen, Duen-Ren Liu

Research output: Contribution to journal, Article, peer-review

3 Scopus citations

Abstract

Linear algebra-based techniques have long been used to correlate similar documents. They map documents to a multidimensional vector space in which each document is represented by a vector. Searching for related documents then translates into searching for nearest neighbors in that vector space. In this paper, we propose an indexing structure, called the cosine R-tree, which indexes a multidimensional vector space and provides efficient nearest neighbor search. Our preliminary results show that it performs better than a brute-force linear scan.
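For orientation, the brute-force linear scan that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration, not the authors' cosine R-tree: the abstract does not describe the tree's internals, so only the baseline is shown. The function names (`to_vector`, `cosine_similarity`, `linear_scan_nearest`), the raw term-count weighting, and the sample documents are all assumptions made for this example.

```python
import math
from typing import Dict, List, Tuple

# Sparse document vector: term -> weight (raw term counts, an assumed weighting).
Vector = Dict[str, float]


def to_vector(text: str) -> Vector:
    """Map a document to a sparse term-count vector (hypothetical helper)."""
    counts: Vector = {}
    for term in text.lower().split():
        counts[term] = counts.get(term, 0.0) + 1.0
    return counts


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def linear_scan_nearest(query: Vector, docs: List[Vector], k: int = 1) -> List[Tuple[int, float]]:
    """Compare the query against every document and return the k most similar.

    Returns (document index, similarity) pairs, most similar first.
    Every document is visited, which is the cost an index tries to avoid.
    """
    scored = [(cosine_similarity(query, doc), i) for i, doc in enumerate(docs)]
    # Highest similarity first; ties broken by lower document index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(i, score) for score, i in scored[:k]]


# Sample corpus, invented for this example.
corpus = [
    "tree index search",
    "cooking pasta recipes",
    "nearest neighbor search in vector space",
]
doc_vectors = [to_vector(text) for text in corpus]
nearest = linear_scan_nearest(to_vector("index tree"), doc_vectors, k=1)
```

The scan computes one similarity per stored document, so its cost grows linearly with the collection size; an index over the vector space aims to prune most of those comparisons.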

Original language: English
Article number: 884720
Pages (from-to): 210-211
Number of pages: 2
Journal: Proceedings - IEEE Computer Society's International Computer Software and Applications Conference
State: Published - 25 Oct 2000
