Mining top-K high utility itemsets

Cheng Wei Wu*, Bai En Shie, Vincent Shin-Mu Tseng, Philip S. Yu

*Corresponding author for this work

Research output: Conference contribution, peer-reviewed

98 citations (Scopus)

Abstract

Mining high utility itemsets from databases is an emerging topic in data mining, which refers to the discovery of itemsets with utilities higher than a user-specified minimum utility threshold min-util. Although several studies have been carried out on this topic, setting an appropriate minimum utility threshold is a difficult problem for users. If min-util is set too low, too many high utility itemsets will be generated, which may cause the mining algorithms to become inefficient or even run out of memory. On the other hand, if min-util is set too high, no high utility itemset will be found. Setting appropriate minimum utility thresholds by trial and error is a tedious process for users. In this paper, we address this problem by proposing a new framework named top-k high utility itemset mining, where k is the desired number of high utility itemsets to be mined. An efficient algorithm named TKU (Top-K Utility itemsets mining) is proposed for mining such itemsets without setting min-util. Several features were designed in TKU to solve the new challenges raised in this problem, like the absence of anti-monotone property and the requirement of lossless results. Moreover, TKU incorporates several novel strategies for pruning the search space to achieve high efficiency. Results on real and synthetic datasets show that TKU has excellent performance and scalability.
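To make the problem statement concrete, the following is a minimal brute-force sketch of top-k high utility itemset mining. It is not the TKU algorithm: it enumerates every itemset and applies none of the pruning strategies the abstract refers to. It assumes the standard definition of utility (purchased quantity times unit profit, summed over the transactions that contain the whole itemset). The toy data and the names `TRANSACTIONS`, `PROFIT`, `utility`, and `top_k_high_utility` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
import heapq

# Hypothetical toy data: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
# Hypothetical unit profit of each item.
PROFIT = {"a": 5, "b": 2, "c": 1}


def utility(itemset, transactions, profit):
    """Total utility of `itemset` over the transactions containing all its items."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def top_k_high_utility(transactions, profit, k):
    """Brute force: enumerate every itemset and keep the k highest utilities."""
    items = sorted({item for t in transactions for item in t})
    # Min-heap of (utility, itemset). Once it holds k entries, heap[0][0]
    # plays the role of a threshold that only rises, so the user never
    # has to supply min-util.
    heap = []
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, profit)
            if u == 0:
                continue  # itemset never occurs in the database
            if len(heap) < k:
                heapq.heappush(heap, (u, itemset))
            elif u > heap[0][0]:
                heapq.heapreplace(heap, (u, itemset))
    # Highest utility first; ties broken by itemset for a stable order.
    return sorted(heap, key=lambda pair: (-pair[0], pair[1]))


print(top_k_high_utility(TRANSACTIONS, PROFIT, 3))
# prints [(15, ('a',)), (11, ('a', 'c')), (9, ('a', 'b'))]
```

In this toy database the itemset `('c',)` has utility 4 while its superset `('a', 'c')` has utility 11, which illustrates why utility is not anti-monotone and why a superset cannot be discarded just because a subset scores low. The sketch avoids the issue only by checking all itemsets, which grows exponentially with the number of items.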

Original language: English
Title of host publication: KDD'12 - 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Pages: 78-86
Number of pages: 9
Publication status: Published - 14 Sep 2012
Event: 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2012 - Beijing, China
Duration: 12 Aug 2012 to 16 Aug 2012

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Conference

Conference: 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2012
Country/Territory: China
City: Beijing
Period: 12/08/12 to 16/08/12
