TY - JOUR
T1 - A Scalable Analytical Framework for Complex Event Episode Mining With Various Domains Applications
AU - Tseng, Jerry C.C.
AU - Hsieh, Sun Yuan
AU - Tseng, Vincent S.
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2022
Y1 - 2022
N2 - With the ubiquity of sensor networks and smart devices that continuously collect data, we face the challenge of analyzing a growing stream of data in real time. In recent years, there has been a strong need to gain useful knowledge by incrementally analyzing event sequence data. Although episode pattern mining techniques have existed for years, people have only recently become more aware of their practical value in solving real-life problems involving manufacturing records, stock markets, and weather forecasts. The effective and efficient application of episode pattern mining techniques to complex event data is becoming increasingly important for solving real-life problems across a wide range of domains. However, few studies have focused on developing a scalable framework based on episode pattern mining of complex event sequences for applications in various domains. In this work, we propose a novel framework named SAAF (Scalable Analytical Application Framework) based on complex event episode mining techniques, including batch episode mining, delta episode mining, incremental episode mining, and pattern merging, to address both efficiency and accuracy. Moreover, to enhance scalability, we adopt the lambda architecture with Apache Spark and Apache Spark Streaming as the system development framework. Finally, experimental results on three real datasets from different domains and two benchmark datasets show that the proposed SAAF framework exhibits excellent performance in terms of efficiency, accuracy, and scalability.
AB - With the ubiquity of sensor networks and smart devices that continuously collect data, we face the challenge of analyzing a growing stream of data in real time. In recent years, there has been a strong need to gain useful knowledge by incrementally analyzing event sequence data. Although episode pattern mining techniques have existed for years, people have only recently become more aware of their practical value in solving real-life problems involving manufacturing records, stock markets, and weather forecasts. The effective and efficient application of episode pattern mining techniques to complex event data is becoming increasingly important for solving real-life problems across a wide range of domains. However, few studies have focused on developing a scalable framework based on episode pattern mining of complex event sequences for applications in various domains. In this work, we propose a novel framework named SAAF (Scalable Analytical Application Framework) based on complex event episode mining techniques, including batch episode mining, delta episode mining, incremental episode mining, and pattern merging, to address both efficiency and accuracy. Moreover, to enhance scalability, we adopt the lambda architecture with Apache Spark and Apache Spark Streaming as the system development framework. Finally, experimental results on three real datasets from different domains and two benchmark datasets show that the proposed SAAF framework exhibits excellent performance in terms of efficiency, accuracy, and scalability.
KW - Complex event sequence
KW - data stream
KW - episode pattern mining
KW - incremental mining
KW - lambda architecture
UR - http://www.scopus.com/inward/record.url?scp=85144774900&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2022.3228962
DO - 10.1109/ACCESS.2022.3228962
M3 - Article
AN - SCOPUS:85144774900
SN - 2169-3536
VL - 10
SP - 130672
EP - 130685
JO - IEEE Access
JF - IEEE Access
ER -