TY - JOUR
T1 - Mining high-utility itemsets in dynamic profit databases
AU - Nguyen, Loan T.T.
AU - Nguyen, Phuc
AU - Nguyen, Trinh D.D.
AU - Vo, Bay
AU - Fournier-Viger, Philippe
AU - Tseng, Vincent S.
N1 - Publisher Copyright: © 2019 Elsevier B.V.
PY - 2019/7/1
Y1 - 2019/7/1
N2 - High-Utility Itemset (HUI) mining is an important data-mining task which has gained popularity in recent years due to its applications in numerous fields. HUI mining aims at discovering itemsets that have high utility (e.g., yield a high profit) in transactional databases. Although several algorithms have been designed to enumerate all HUIs, an important issue is that they assume that the utilities (e.g., unit profits) of items are static. But this simplifying assumption does not hold in real-life situations. For example, the unit profits of items often vary over time in a retail store due to fluctuating supply costs and promotions. Ignoring this important characteristic of real-life transactional databases makes current HUI-mining algorithms inapplicable in many real-world applications. To address this critical limitation of current HUI-mining techniques, this paper studies the novel problem of mining HUIs in databases having dynamic unit profits. To accurately assess the utility of any itemset in this context, a redefined utility measure is introduced. Furthermore, a novel algorithm named MEFIM (Modified EFficient high-utility Itemset Mining), which relies on a novel compact database format to discover the desired itemsets efficiently, is designed. An improved version of the MEFIM algorithm, named iMEFIM, is also introduced. This algorithm employs a novel structure called P-set to reduce the number of transaction scans and to speed up the mining process. Experimental results show that the proposed algorithms considerably outperform the state-of-the-art HUI-mining algorithms on dynamic profit databases in terms of runtime, memory usage, and scalability.
AB - High-Utility Itemset (HUI) mining is an important data-mining task which has gained popularity in recent years due to its applications in numerous fields. HUI mining aims at discovering itemsets that have high utility (e.g., yield a high profit) in transactional databases. Although several algorithms have been designed to enumerate all HUIs, an important issue is that they assume that the utilities (e.g., unit profits) of items are static. But this simplifying assumption does not hold in real-life situations. For example, the unit profits of items often vary over time in a retail store due to fluctuating supply costs and promotions. Ignoring this important characteristic of real-life transactional databases makes current HUI-mining algorithms inapplicable in many real-world applications. To address this critical limitation of current HUI-mining techniques, this paper studies the novel problem of mining HUIs in databases having dynamic unit profits. To accurately assess the utility of any itemset in this context, a redefined utility measure is introduced. Furthermore, a novel algorithm named MEFIM (Modified EFficient high-utility Itemset Mining), which relies on a novel compact database format to discover the desired itemsets efficiently, is designed. An improved version of the MEFIM algorithm, named iMEFIM, is also introduced. This algorithm employs a novel structure called P-set to reduce the number of transaction scans and to speed up the mining process. Experimental results show that the proposed algorithms considerably outperform the state-of-the-art HUI-mining algorithms on dynamic profit databases in terms of runtime, memory usage, and scalability.
KW - Candidate pruning
KW - Data mining
KW - Dynamic profit
KW - High-utility itemset mining
UR - http://www.scopus.com/inward/record.url?scp=85063892862&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2019.03.022
DO - 10.1016/j.knosys.2019.03.022
M3 - Article
AN - SCOPUS:85063892862
SN - 0950-7051
VL - 175
SP - 130
EP - 144
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
ER -