應用詞彙、語法與語料規則於中文手寫句辨識之校正模組

Translated title of the contribution: Revision for recognizing Chinese handwritten sentences based on lexical, syntactical and corpus rules

Tao Hsing Chang, Chia Bin Chou, Shou Yen Su, Chien-Liang Liu

Research output: Contribution to conference › Paper › peer-review

Abstract

Recognition of off-line handwritten Chinese characters has long been an important problem. Because handwriting varies between writers and is often ambiguous, it is hard to recognize handwritten characters using only statistical features obtained from a database. This study uses lexical, syntactic, and corpus rules to increase recognition accuracy. The method has three phases. First, the lexical rule “multi-syllable words priority” is used to select some characters of a sentence from the candidate characters. Second, when several neighboring candidate characters match a particular grammar pattern, they are taken as the characters of the sentence. Finally, each pair of adjacent candidate characters is treated as a string, and the strings that occur frequently in a corpus are chosen as the characters of the sentence. Compared with the “highest frequency priority” approach, experimental results show that the accuracy of Chinese handwritten character recognition can be effectively increased from 0.45 to 0.85.
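The first and third phases described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (lexicon_pass, bigram_pass, revise), the longest-match-first search, the scoring by summed neighbor counts, and the toy candidates, lexicon, and bigram counts. The second phase is left out because the abstract does not specify the grammar patterns.

```python
from itertools import product

def lexicon_pass(candidates, lexicon, max_len=4):
    """Phase 1 (assumed form): fix positions that spell a multi-character
    lexicon word, trying longer words before shorter ones."""
    n = len(candidates)
    chosen = [None] * n
    for length in range(min(max_len, n), 1, -1):
        for start in range(n - length + 1):
            span = range(start, start + length)
            if any(chosen[i] is not None for i in span):
                continue  # part of this span is already decided
            for combo in product(*(candidates[i] for i in span)):
                if "".join(combo) in lexicon:
                    for i, ch in zip(span, combo):
                        chosen[i] = ch
                    break
    return chosen

def neighbor_options(candidates, result, j):
    """Characters that may stand at position j (empty if j is out of range)."""
    if j < 0 or j >= len(candidates):
        return []
    return [result[j]] if result[j] is not None else candidates[j]

def bigram_pass(candidates, chosen, bigram_counts):
    """Phase 3 (assumed form): fill undecided positions with the candidate
    whose adjacent two-character strings are most frequent in the corpus."""
    result = list(chosen)
    for i in range(len(candidates)):
        if result[i] is not None:
            continue
        left_opts = neighbor_options(candidates, result, i - 1)
        right_opts = neighbor_options(candidates, result, i + 1)

        def score(c):
            left = max((bigram_counts.get(l + c, 0) for l in left_opts), default=0)
            right = max((bigram_counts.get(c + r, 0) for r in right_opts), default=0)
            return left + right

        # On ties, max() keeps the earliest (top-ranked) candidate.
        result[i] = max(candidates[i], key=score)
    return "".join(result)

def revise(candidates, lexicon, bigram_counts):
    return bigram_pass(candidates, lexicon_pass(candidates, lexicon), bigram_counts)

# Hypothetical toy data: ranked recognizer candidates for each character position.
candidates = [["找", "我"], ["喜", "善"], ["歡", "歎"], ["學", "覺"], ["習", "羽"]]
lexicon = {"喜歡", "學習"}
bigram_counts = {"我喜": 50, "找喜": 1}

print(revise(candidates, lexicon, bigram_counts))  # prints 我喜歡學習
```

In this toy input the lexicon pass fixes 喜歡 and 學習, and the corpus pass then prefers 我 over the top-ranked 找 because 我喜 is the more frequent string.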

Original language: Chinese (Traditional)
Pages: 227-239
Number of pages: 13
State: Published - 8 Sep 2011
Event: 23rd Conference on Computational Linguistics and Speech Processing, ROCLING 2011 - Taipei, Taiwan
Duration: 8 Sep 2011 to 9 Sep 2011

Conference

Conference: 23rd Conference on Computational Linguistics and Speech Processing, ROCLING 2011
Country/Territory: Taiwan
City: Taipei
Period: 8/09/11 to 9/09/11

Keywords

  • OCR
  • Natural Language Processing
  • Corpus
  • Handwritten Chinese Character Recognition
