Abstract
Data classification is an important topic in the data mining field because of its wide range of applications. A number of related methods have been proposed based on well-known learning models such as decision trees and neural networks. However, these kinds of classification methods may not perform well on time-sequence datasets such as time-series gene expression data. In this paper, we propose a new data mining method, Classify-By-Sequence (CBS), for classifying large time-series datasets. The main idea of CBS is to integrate sequential pattern mining with probabilistic induction so that the inherent sequential patterns can be extracted efficiently and the classification task can be performed more accurately. CBS also has the merit of being simple to implement. Experimental evaluation shows that CBS substantially outperforms other methods in classification accuracy.
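The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it names: mine subsequences that are frequent within a class, estimate the probability of each class given a pattern by counting, and classify a new sequence by summing those probabilities over the patterns it contains. The function names (`mine_patterns`, `classify`), the brute-force enumeration, the `max_len` and `min_support` parameters, and the additive scoring rule are all assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(symbol in it for symbol in pattern)


def mine_patterns(sequences, labels, max_len=2, min_support=0.5):
    """Collect subsequences that are frequent within at least one class.

    Returns {pattern: {label: estimated P(label | pattern)}}.
    Hypothetical helper: brute-force enumeration, suitable for toy data only.
    """
    counts = defaultdict(lambda: defaultdict(int))
    class_sizes = defaultdict(int)
    for seq, label in zip(sequences, labels):
        class_sizes[label] += 1
        # Enumerate each distinct subsequence of this sequence once.
        seen = set()
        for n in range(1, max_len + 1):
            for idx in combinations(range(len(seq)), n):
                seen.add(tuple(seq[i] for i in idx))
        for pattern in seen:
            counts[pattern][label] += 1

    model = {}
    for pattern, by_class in counts.items():
        # Keep the pattern if its support within some class is high enough.
        if any(c / class_sizes[lab] >= min_support for lab, c in by_class.items()):
            total = sum(by_class.values())
            model[pattern] = {lab: c / total for lab, c in by_class.items()}
    return model


def classify(model, sequence):
    """Sum class probabilities over matched patterns; return the top class."""
    scores = defaultdict(float)
    for pattern, probs in model.items():
        if is_subsequence(pattern, sequence):
            for lab, p in probs.items():
                scores[lab] += p
    return max(scores, key=scores.get) if scores else None


# Toy data: discretized trends ("u" = up, "d" = down, "s" = steady).
train = [("u", "u", "d"), ("u", "u", "s"), ("d", "d", "u"), ("d", "s", "d")]
train_labels = ["A", "A", "B", "B"]
model = mine_patterns(train, train_labels)
print(classify(model, ("u", "u", "u")))  # prints A
```

A practical implementation would replace the exhaustive enumeration with a proper sequential pattern mining algorithm, since the number of candidate subsequences grows quickly with sequence length.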
Original language | English |
---|---|
Pages | 596-600 |
Number of pages | 5 |
DOIs | |
Publication status | Published - Apr 2005 |
Event | 5th SIAM International Conference on Data Mining, SDM 2005 - Newport Beach, CA, United States. Duration: 21 Apr 2005 → 23 Apr 2005 |
Conference
Conference | 5th SIAM International Conference on Data Mining, SDM 2005 |
---|---|
Country/Territory | United States |
City | Newport Beach, CA |
Period | 21/04/05 → 23/04/05 |