TY - JOUR
T1 - Tree indexing for efficient search of similar documents
AU - Chen, Chung Min
AU - Liu, Duen-Ren
PY - 2000/10/25
Y1 - 2000/10/25
N2 - Linear algebra-based techniques have long been used to correlate similar documents. They map the documents to a multidimensional vector space, in which each document is represented by a vector. Searching for related documents then translates into searching for nearest neighbors in the vector space. In this paper, we propose an indexing structure, called the cosine R-tree, which indexes a multidimensional vector space and provides efficient nearest neighbor search. Our preliminary results show that it gives better performance than a brute-force linear scan strategy.
AB - Linear algebra-based techniques have long been used to correlate similar documents. They map the documents to a multidimensional vector space, in which each document is represented by a vector. Searching for related documents then translates into searching for nearest neighbors in the vector space. In this paper, we propose an indexing structure, called the cosine R-tree, which indexes a multidimensional vector space and provides efficient nearest neighbor search. Our preliminary results show that it gives better performance than a brute-force linear scan strategy.
UR - http://www.scopus.com/inward/record.url?scp=0034497007&partnerID=8YFLogxK
U2 - 10.1109/CMPSAC.2000.884720
DO - 10.1109/CMPSAC.2000.884720
M3 - Article
AN - SCOPUS:0034497007
SN - 0730-3157
SP - 210
EP - 211
JO - Proceedings - IEEE Computer Society's International Computer Software and Applications Conference
JF - Proceedings - IEEE Computer Society's International Computer Software and Applications Conference
M1 - 884720
ER -