Efficient mining of a concise and lossless representation of high utility itemsets

Cheng Wei Wu*, Philippe Fournier-Viger, Philip S. Yu, S. Tseng

*Corresponding author for this work

Research output: Conference contribution, peer-reviewed

46 Citations (Scopus)

Abstract

Mining high utility itemsets from transactional databases is an important data mining task, which refers to the discovery of itemsets with high utilities (e.g., high profits). Although several studies have been carried out, current methods may present too many high utility itemsets for users, which degrades the performance of the mining task in terms of execution and memory efficiency. To achieve high efficiency for the mining task and provide a concise mining result to users, we propose a novel framework in this paper for mining closed+ high utility itemsets, which serves as a compact and lossless representation of high utility itemsets. We present an efficient algorithm called CHUD (Closed+ High Utility itemset Discovery) for mining closed+ high utility itemsets. Further, a method called DAHU (Derive All High Utility itemsets) is proposed to recover all high utility itemsets from the set of closed+ high utility itemsets without accessing the original database. Results of experiments on real and synthetic datasets show that CHUD and DAHU are very efficient with a massive reduction (up to 800 times in our experiments) in the number of high utility itemsets. In addition, when all high utility itemsets are recovered by DAHU, the approach combining CHUD and DAHU also outperforms the state-of-the-art algorithms in mining high utility itemsets.
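
To make the terms in the abstract concrete, the following is a minimal brute-force sketch, not CHUD or DAHU. It assumes the commonly used definitions: the utility of an itemset is the sum of quantity times unit profit over every transaction that contains the whole itemset, an itemset is "high utility" when that sum reaches a threshold, and it is "closed" when no superset has the same support. The toy transactions, the profit table, the threshold and all function names are hypothetical, and the extra per-itemset annotation that makes the paper's representation lossless is not reproduced here.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "b": 1, "c": 1},
    {"b": 3, "c": 2},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def utility(itemset, transactions, unit_profit):
    """Total profit of `itemset` over the transactions that contain all of it."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if all(item in t for item in itemset))


def high_utility_itemsets(transactions, unit_profit, min_util):
    """Enumerate every itemset (exponential!) and keep those with utility >= min_util."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            u = utility(combo, transactions, unit_profit)
            if u >= min_util:
                result[frozenset(combo)] = u
    return result


def closed_high_utility_itemsets(transactions, unit_profit, min_util):
    """Keep only the high utility itemsets that are also closed.

    Checking single-item extensions is enough: support never increases when
    items are added, so any superset with equal support implies a one-item
    extension with equal support.
    """
    items = sorted({item for t in transactions for item in t})
    closed = {}
    for itemset, u in high_utility_itemsets(transactions, unit_profit, min_util).items():
        sup = support(itemset, transactions)
        extensions = (itemset | {item} for item in items if item not in itemset)
        if all(support(ext, transactions) < sup for ext in extensions):
            closed[itemset] = u
    return closed


all_huis = high_utility_itemsets(transactions, unit_profit, min_util=12)
closed_huis = closed_high_utility_itemsets(transactions, unit_profit, min_util=12)
print(len(all_huis), "high utility itemsets,", len(closed_huis), "of them closed")
```

On this toy input the closed subset is only slightly smaller than the full set; the much larger reductions quoted in the abstract come from the authors' experiments on real and synthetic datasets, which this sketch does not attempt to reproduce.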

Original language: English
Title of host publication: Proceedings - 11th IEEE International Conference on Data Mining, ICDM 2011
Pages: 824-833
Number of pages: 10
Publication status: Published - 2011
Event: 11th IEEE International Conference on Data Mining, ICDM 2011 - Vancouver, BC, Canada
Duration: 11 Dec 2011 to 14 Dec 2011

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Conference

Conference: 11th IEEE International Conference on Data Mining, ICDM 2011
Country/Territory: Canada
City: Vancouver, BC
Period: 11/12/11 to 14/12/11
