Leveraging Dual Gloss Encoders in Chinese Biomedical Entity Linking

Tzu Mi Lin, Man Chen Hung, Lung Hao Lee*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Entity linking is the task of assigning a unique identity to named entities mentioned in a text. It is a form of word sense disambiguation that automatically selects a pre-defined sense for a target entity. This study proposes the DGE (Dual Gloss Encoders) model for Chinese entity linking in the biomedical domain. We model a dual-encoder architecture whose two components, a context-aware gloss encoder and a lexical gloss encoder, separately produce contextualized embedding representations. The two encoders are then jointly optimized to assign the nearest gloss, that is, the one with the highest score, to the target entity. The experimental dataset consists of 10,218 sentences manually annotated with glosses defined in BabelNet 5.0 across 40 distinct biomedical entities. Experimental results show that the DGE model achieved an F1-score of 97.81, outperforming existing methods. A series of model analyses indicates that the proposed approach is effective for Chinese biomedical entity linking.
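
The gloss-scoring idea described above can be illustrated with a minimal sketch. This is not the DGE model: the abstract does not specify the encoders' internals or how their outputs are combined, so the code below makes several assumptions. The function `embed` is a toy character n-gram stand-in for a neural encoder, the names `cosine` and `link_entity` are hypothetical, the two similarity scores are simply summed, and the example glosses are invented for illustration rather than taken from BabelNet.

```python
import math
from collections import Counter


def embed(text, n=2):
    """Toy stand-in for a neural encoder (assumption, not the DGE encoder):
    L2-normalised character n-gram counts stored as a sparse dict."""
    grams = Counter(text[i:i + n] for i in range(max(len(text) - n + 1, 1)))
    norm = math.sqrt(sum(v * v for v in grams.values()))
    return {g: v / norm for g, v in grams.items()}


def cosine(a, b):
    """Dot product of two already-normalised sparse vectors."""
    return sum(w * b.get(g, 0.0) for g, w in a.items())


def link_entity(sentence, entity, glosses):
    """Score every candidate gloss and return (best_gloss_id, scores).

    The context view embeds the whole sentence and the lexical view embeds
    the entity string alone. Summing the two similarities is an assumption
    made for this sketch only.
    """
    context_vec = embed(sentence)
    lexical_vec = embed(entity)
    scores = {}
    for gloss_id, gloss in glosses.items():
        gloss_vec = embed(gloss)
        scores[gloss_id] = cosine(context_vec, gloss_vec) + cosine(lexical_vec, gloss_vec)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical candidate glosses for the ambiguous entity "病毒" (virus).
candidate_glosses = {
    "virus_biology": "在活細胞內複製的感染性病原體",
    "virus_software": "會自我複製的惡意電腦程式",
}
best, scores = link_entity("病人感染了在肺部細胞內複製的病毒", "病毒", candidate_glosses)
print(best, scores)
```

In a trained system the two encoders would be learned jointly so that the correct gloss scores highest; here the ranking depends only on surface character overlap, which is enough to show the select-the-highest-scoring-gloss step.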

Original language: English
Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Volume: 23
Issue number: 2
State: Published - 8 Feb 2024

Keywords

  • biomedical informatics
  • language transformers
  • lexical semantics
  • natural language understanding
  • word sense disambiguation
