Abstract
We propose a new similarity-based technique for declustering data. The proposed method can adapt to available information about query distributions, data distributions, data sizes and partition-size constraints. The method is based on max-cut partitioning of a similarity graph defined over the given set of data, under constraints on the partition sizes. It maximizes the chances that a pair of data items that are accessed together by queries are allocated to distinct disks. We show that the proposed method achieves optimal speed-up for a query set whenever some declustering method exists that achieves the optimal speed-up. Experiments in parallelizing Grid Files show that the proposed method outperforms mapping-function-based methods for interesting query distributions as well as for non-uniform data distributions.
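The idea of separating co-accessed items can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the similarity between two items is the number of queries that access both, and it swaps the max-cut partitioning for a simple greedy heuristic that places each item on the disk where it adds the least within-disk similarity, subject to a per-disk capacity. The names `build_similarity`, `greedy_decluster`, `response_time`, `num_disks` and `capacity` are all hypothetical.

```python
from itertools import combinations


def build_similarity(queries):
    """Assumed similarity: weight of (a, b) = number of queries touching both."""
    weights = {}
    for query in queries:
        for a, b in combinations(sorted(query), 2):
            weights[(a, b)] = weights.get((a, b), 0) + 1
    return weights


def greedy_decluster(items, queries, num_disks, capacity):
    """Greedy heuristic (not an exact max-cut): keep co-accessed items apart.

    Maximizing the similarity weight cut between disks is the same as
    minimizing the weight that stays inside each disk, so every item goes
    to the non-full disk where it adds the least internal weight.
    """
    weights = build_similarity(queries)

    def w(a, b):
        return weights.get((min(a, b), max(a, b)), 0)

    # Place the most heavily co-accessed items first.
    degree = {i: sum(w(i, j) for j in items if j != i) for i in items}
    disks = [[] for _ in range(num_disks)]
    assignment = {}
    for item in sorted(items, key=lambda i: (-degree[i], i)):
        candidates = [d for d in range(num_disks) if len(disks[d]) < capacity]
        if not candidates:
            raise ValueError("capacity too small to place every item")
        best = min(
            candidates,
            key=lambda d: (sum(w(item, other) for other in disks[d]), len(disks[d]), d),
        )
        disks[best].append(item)
        assignment[item] = best
    return assignment


def response_time(query, assignment, num_disks):
    """Parallel cost of a query: the largest number of items read from one disk."""
    load = [0] * num_disks
    for item in query:
        load[assignment[item]] += 1
    return max(load)
```

Calling `greedy_decluster([0, 1, 2, 3], [{0, 1}, {2, 3}, {0, 1}], num_disks=2, capacity=2)` returns a dictionary mapping each item to a disk index; `response_time` can then be used to compare the resulting layout against any other allocation for the same query set.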
| Original language | English |
|---|---|
| Pages | 373-381 |
| Number of pages | 9 |
| DOIs | |
| Publication status | Published - 1 Jan 1995 |
| Event | Proceedings of the 1995 IEEE 11th International Conference on Data Engineering - Taipei, Taiwan. Duration: 6 Mar 1995 → 10 Mar 1995 |
Conference
| Conference | Proceedings of the 1995 IEEE 11th International Conference on Data Engineering |
|---|---|
| City | Taipei, Taiwan |
| Period | 6/03/95 → 10/03/95 |
Title: Similarity graph-based approach to declustering problems and its application towards parallelizing grid files