Abstract
Given a query DNA sequence, our goal is to find all the sequence segments in a DNA sequence database that are similar to the query. In this paper we present a string-to-signal transform technique that converts a DNA sequence into a four-channel signal. When gaps are not considered, the edit distance between two DNA sequences can be calculated as the sum of absolute differences (SAD) between their corresponding four-channel signals. The algorithm proposed in this paper can then be applied to speed up the search for sequence segments that yield small SADs. In addition to being efficient, the algorithm guarantees an optimal search: every sequence segment that is sufficiently similar to the query is found, with none missed.
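The relationship between the four-channel signal and the gap-free edit distance can be illustrated with a minimal sketch. The abstract does not specify the transform, so the code below assumes a one-hot encoding (one channel per base A, C, G, T); under that assumption each mismatched position contributes exactly 2 to the SAD. The function names `to_signal`, `sad`, and `find_similar_segments` are hypothetical, and the scan shown is a plain exhaustive baseline, not the accelerated search the paper proposes.

```python
CHANNELS = "ACGT"  # assumed channel order; the abstract does not specify one


def to_signal(seq):
    """One-hot encode a DNA string as a list of 4-tuples (assumed transform)."""
    signal = []
    for base in seq.upper():
        if base not in CHANNELS:
            raise ValueError(f"unsupported base: {base!r}")
        signal.append(tuple(1 if base == ch else 0 for ch in CHANNELS))
    return signal


def sad(sig_a, sig_b):
    """Sum of absolute differences between two equal-length four-channel signals."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signals must have the same length")
    return sum(
        abs(x - y)
        for sample_a, sample_b in zip(sig_a, sig_b)
        for x, y in zip(sample_a, sample_b)
    )


def find_similar_segments(database, query, max_mismatches):
    """Exhaustive baseline: slide the query over the database and keep every
    window whose SAD corresponds to at most max_mismatches mismatches.

    Returns a list of (start_index, mismatch_count) pairs.
    """
    db_signal = to_signal(database)
    query_signal = to_signal(query)
    width = len(query_signal)
    hits = []
    for start in range(len(db_signal) - width + 1):
        score = sad(db_signal[start:start + width], query_signal)
        # With one-hot channels, each mismatch adds 2 to the SAD.
        if score <= 2 * max_mismatches:
            hits.append((start, score // 2))
    return hits


if __name__ == "__main__":
    print(find_similar_segments("ACGTACGA", "ACGT", max_mismatches=1))
```

Because every window is checked, this baseline also finds all sufficiently similar segments, but it does so at a cost proportional to the database length times the query length; that is the cost a faster search with the same guarantee would aim to reduce.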
Original language | English |
---|---|
Pages (from-to) | 1019-1022 |
Number of pages | 4 |
Journal | Proceedings - International Conference on Pattern Recognition |
Volume | 16 |
Issue number | 3 |
State | Published - 1 Dec 2002 |