Mining Compact High Utility Itemsets Without Candidate Generation

Cheng Wei Wu*, Philippe Fournier-Viger, Jia Yuan Gu, Vincent S. Tseng

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Chapter > peer-review

18 Scopus citations

Abstract

Although high utility itemset (HUI) mining has received extensive attention in recent years, current algorithms suffer from a crucial problem: they tend to produce too many HUIs. This seriously degrades the performance of HUI mining in terms of execution time and memory usage. Moreover, it is very hard for users to discover meaningful information in a huge number of HUIs. In this paper, we address this issue by proposing a framework with a novel algorithm named CHUI (Compact High Utility Itemset)-Mine to discover closed$^{+}$ HUIs and maximal HUIs, which are compact representations of HUIs. The main merits of CHUI-Mine lie in two aspects. First, in terms of efficiency, unlike existing algorithms that tend to produce a large number of candidates during the mining process, CHUI-Mine computes the utility of itemsets directly without generating candidates. Second, in terms of losslessness, unlike current algorithms that provide incomplete results, CHUI-Mine discovers the complete set of closed$^{+}$ or maximal HUIs without missing any. A comprehensive investigation is also presented to compare the relative advantages of the different compact representations in terms of computational cost and compactness. To the best of our knowledge, this is the first work to address the mining of compact high utility itemsets, in terms of closed$^{+}$ and maximal HUIs, without candidate generation. Experimental results show that CHUI-Mine achieves a massive reduction in the number of HUIs and is several orders of magnitude faster than benchmark algorithms.
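
To make the terms in the abstract concrete, the following is a minimal brute-force sketch of what HUIs, closed HUIs, and maximal HUIs are on a tiny database. It is not CHUI-Mine: it enumerates every itemset, which is exactly what the chapter's algorithm is designed to avoid. The toy data, the names `PROFIT`, `DB`, `utility`, `support`, and `compact_huis`, and the threshold are all hypothetical, and the sketch assumes the standard definitions (utility as quantity times unit profit summed over supporting transactions; closed as having no proper superset with the same support; maximal as having no proper superset that is an HUI). The extra annotation that turns a closed HUI into a closed$^{+}$ HUI is not modeled here.

```python
from itertools import combinations

# Hypothetical toy data (not from the chapter): unit profit per item,
# and the purchased quantity of each item in each transaction.
PROFIT = {"a": 5, "b": 2, "c": 1}
DB = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"a": 1, "b": 1, "c": 1},
    {"a": 1, "b": 3, "c": 2},
]


def utility(itemset, db=DB, profit=PROFIT):
    """Sum of quantity * unit profit over transactions containing the whole itemset."""
    return sum(
        sum(t[i] * profit[i] for i in itemset)
        for t in db
        if all(i in t for i in itemset)
    )


def support(itemset, db=DB):
    """Number of transactions containing the whole itemset."""
    return sum(1 for t in db if all(i in t for i in itemset))


def compact_huis(min_util, db=DB, profit=PROFIT):
    """Brute-force enumeration; for illustration only, exponential in the item count."""
    items = sorted(profit)
    all_sets = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
    ]
    # HUI: utility reaches the user-given threshold.
    huis = {s for s in all_sets if utility(s, db, profit) >= min_util}
    # Closed HUI: an HUI with no proper superset of equal support.
    closed = {
        s for s in huis
        if not any(s < t and support(t, db) == support(s, db) for t in all_sets)
    }
    # Maximal HUI: an HUI with no proper superset that is also an HUI.
    maximal = {s for s in huis if not any(s < t for t in huis)}
    return huis, closed, maximal


if __name__ == "__main__":
    huis, closed, maximal = compact_huis(min_util=12)
    for name, group in (("HUIs", huis), ("closed", closed), ("maximal", maximal)):
        print(name, sorted("".join(sorted(s)) for s in group))
```

On this toy input with a threshold of 12, five itemsets are HUIs, four of them are closed, and only one is maximal, which illustrates the trade-off between compactness and retained information that the abstract refers to.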

Original language: English
Title of host publication: Studies in Big Data
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 279-302
Number of pages: 24
State: Published - 2019

Publication series

Name: Studies in Big Data
Volume: 51
ISSN (Print): 2197-6503
ISSN (Electronic): 2197-6511
