Abstract
Recognition of off-line handwritten Chinese characters has long been an important problem. Because handwriting varies widely and is often ambiguous from one writer to another, it is hard to recognize handwritten characters using only statistical features obtained from a database. The purpose of this study is to use lexical, syntactic, and corpus rules to increase recognition accuracy. Our method is divided into three phases. First, we use the lexical rule “multi-syllable words priority” to predict some characters of a sentence from the candidate characters. Second, several neighboring candidate characters in which particular grammar patterns appear are treated as characters of the sentence. Finally, each pair of adjacent candidate characters is regarded as a string, and strings that occur frequently in a corpus are selected as characters of the sentence. Compared with the “highest frequency priority” approach, experimental results show that the accuracy of handwritten Chinese character recognition increases from 0.45 to 0.85.
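The lexical and corpus phases can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy lexicon, the character and bigram counts, and the tie-breaking choices are all assumptions made for illustration, and the second (grammar-pattern) phase is omitted because the abstract does not specify the patterns. Each position in the sentence is given a list of candidate characters, as a recognizer might produce.

```python
from itertools import product

def baseline_highest_frequency(candidates, char_freq):
    """Baseline: at each position pick the candidate with the highest character count."""
    return "".join(max(c, key=lambda ch: char_freq.get(ch, 0)) for c in candidates)

def lexical_phase(candidates, lexicon, max_len=4):
    """Fix positions that a multi-character lexicon word can cover, longest words first."""
    n = len(candidates)
    chosen = [None] * n
    for length in range(min(max_len, n), 1, -1):
        for start in range(n - length + 1):
            span = range(start, start + length)
            if any(chosen[i] is not None for i in span):
                continue  # do not overwrite positions fixed by an earlier word
            for combo in product(*candidates[start:start + length]):
                if "".join(combo) in lexicon:
                    for i, ch in zip(span, combo):
                        chosen[i] = ch
                    break
    return chosen

def corpus_phase(candidates, chosen, bigram_freq, char_freq):
    """Fill remaining positions by the most frequent adjacent pair in the corpus counts."""
    chosen = list(chosen)
    n = len(candidates)
    for i in range(n):
        if chosen[i] is not None:
            continue
        best, best_score = None, 0
        for c in candidates[i]:
            score = 0
            if i > 0:
                lefts = [chosen[i - 1]] if chosen[i - 1] is not None else candidates[i - 1]
                score = max(score, max(bigram_freq.get(l + c, 0) for l in lefts))
            if i + 1 < n:
                rights = [chosen[i + 1]] if chosen[i + 1] is not None else candidates[i + 1]
                score = max(score, max(bigram_freq.get(c + r, 0) for r in rights))
            if score > best_score:
                best, best_score = c, score
        if best is None:
            # No pair was seen in the corpus counts: fall back to the baseline rule.
            best = max(candidates[i], key=lambda ch: char_freq.get(ch, 0))
        chosen[i] = best
    return "".join(chosen)

def revise(candidates, lexicon, bigram_freq, char_freq):
    """Run the lexical phase, then the corpus phase (grammar phase omitted)."""
    return corpus_phase(candidates, lexical_phase(candidates, lexicon), bigram_freq, char_freq)

# Toy data (hypothetical): candidate characters per position and made-up counts.
candidates = [["地", "他"], ["喜", "善"], ["歡", "觀"], ["我", "找"], ["門", "們"]]
lexicon = {"喜歡", "我們"}
char_freq = {"地": 30, "他": 25, "喜": 5, "善": 8, "歡": 4, "觀": 9,
             "我": 50, "找": 5, "門": 20, "們": 10}
bigram_freq = {"他喜": 12, "地喜": 1}

print(baseline_highest_frequency(candidates, char_freq))  # 地善觀我門
print(revise(candidates, lexicon, bigram_freq, char_freq))  # 他喜歡我們
```

In this toy case the baseline picks each position independently and gets four of five characters wrong, while the word and pair constraints recover the intended sentence. The figures 0.45 and 0.85 in the abstract come from the authors' experiments, not from this sketch.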
Translated title of the contribution | Revision for recognizing Chinese handwritten sentences based on lexical, syntactical and corpus rules |
---|---|
Original language | Chinese (Traditional) |
Pages | 227-239 |
Number of pages | 13 |
State | Published - 8 Sep 2011 |
Event | 23rd Conference on Computational Linguistics and Speech Processing, ROCLING 2011 - Taipei, Taiwan; Duration: 8 Sep 2011 → 9 Sep 2011 |
Conference
Conference | 23rd Conference on Computational Linguistics and Speech Processing, ROCLING 2011 |
---|---|
Country/Territory | Taiwan |
City | Taipei |
Period | 8/09/11 → 9/09/11 |
Keywords
- OCR
- Natural Language Processing
- Corpus
- Handwritten Chinese Character Recognition