TY - JOUR
T1 - Multiple Embeddings Enhanced Multi-Graph Neural Networks for Chinese Healthcare Named Entity Recognition
AU - Lee, Lung Hao
AU - Lu, Yi
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2021/7
Y1 - 2021/7
AB - Named Entity Recognition (NER) is a natural language processing task for recognizing named entities in a given sentence. Chinese NER is difficult due to the lack of delimited spaces and conventional features for determining named entity boundaries and categories. This study proposes the ME-MGNN (Multiple Embeddings enhanced Multi-Graph Neural Networks) model for Chinese NER in the healthcare domain. We integrate multiple embeddings at different granularities from the radical, character to word levels for an extended character representation, and this is fed into multiple gated graph sequence neural networks to identify named entities and classify their types. The experimental datasets were collected from health-related news, digital health magazines and medical question/answer forums. Manual annotation was conducted for a total of 68,460 named entities across 10 entity types (body, symptom, instrument, examination, chemical, disease, drug, supplement, treatment and time) in 30,692 sentences. Experimental results indicated our ME-MGNN model achieved an F1-score result of 75.69, outperforming previous methods. In practice, a series of model analysis implied that our method is effective and efficient for Chinese healthcare NER.
KW - Embedding representation
KW - graph neural networks
KW - information extraction
KW - named entity recognition
UR - http://www.scopus.com/inward/record.url?scp=85099108496&partnerID=8YFLogxK
U2 - 10.1109/JBHI.2020.3048700
DO - 10.1109/JBHI.2020.3048700
M3 - Article
C2 - 33385314
AN - SCOPUS:85099108496
SN - 2168-2194
VL - 25
SP - 2801
EP - 2810
JO - IEEE Journal of Biomedical and Health Informatics
JF - IEEE Journal of Biomedical and Health Informatics
IS - 7
M1 - 9312396
ER -