AUGMENTATION STRATEGY OPTIMIZATION FOR LANGUAGE UNDERSTANDING

Chang Ting Chu*, Mahdin Rohmatillah, Ching Hsien Lee, Jen Tzung Chien

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

2 Scopus citations

Abstract

This paper presents a new approach to language processing and understanding in which an adaptive data augmentation strategy is proposed for individual documents instead of one universal policy for the whole dataset. Importantly, a reinforcement learning and understanding method is exploited for document classification, where the document encoder, augmenter and classifier are jointly optimized. In particular, a new reward function based on consistency loss maximization is presented to ensure the diversity of the generated documents. With this method, the reward for the adaptive augmentation policy is calculated immediately for every augmented instance, without waiting for the child model's performance metrics to serve as the reward. Experiments on various classification tasks with a strong baseline model show that augmentation strategy optimization can improve model training by providing meaningful augmented data, which eventually results in desirable evaluation performance. Furthermore, extensive studies of the policy's behavior in different settings are provided to confirm the diversity of the augmented data obtained by the proposed method.
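The per-instance reward described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the consistency loss, so this example assumes a KL divergence between the classifier's predictions on an original document and on its augmented version; the function names and the sample logits are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one instance's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_rewards(
    orig_logits: Sequence[Sequence[float]],
    aug_logits: Sequence[Sequence[float]],
) -> List[float]:
    """Return one reward per (original, augmented) pair.

    Assumption: the reward is the consistency loss itself, so an augmented
    document whose prediction diverges more from the original earns more.
    """
    return [
        kl_divergence(softmax(o), softmax(a))
        for o, a in zip(orig_logits, aug_logits)
    ]


# Hypothetical classifier logits for two documents and their augmentations.
orig = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]]
aug = [[2.0, 0.5, -1.0], [1.5, -0.5, 0.0]]
rewards = consistency_rewards(orig, aug)
```

Because each reward depends only on a single pair of predictions, it is available as soon as the augmented instance has been classified, which is the property the abstract contrasts with waiting for a child model's performance metrics.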

Original language: English
Title of host publication: 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 7952-7956
Number of pages: 5
ISBN (Electronic): 9781665405409
State: Published - 2022
Event: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Virtual, Online, Singapore
Duration: 23 May 2022 to 27 May 2022

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2022-May
ISSN (Print): 1520-6149

Conference

Conference: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
Country/Territory: Singapore
City: Virtual, Online
Period: 23/05/22 to 27/05/22

Keywords

  • Data augmentation
  • document representation
  • natural language understanding
  • policy optimization
