TY - GEN
T1 - Augmentation Strategy Optimization for Language Understanding
AU - Chu, Chang Ting
AU - Rohmatillah, Mahdin
AU - Lee, Ching Hsien
AU - Chien, Jen Tzung
N1 - Publisher Copyright: © 2022 IEEE
PY - 2022
Y1 - 2022
N2 - This paper presents a new approach to language processing and understanding in which an adaptive data augmentation strategy is proposed for individual documents instead of one universal policy for the whole dataset. Importantly, a reinforcement learning and understanding method is exploited for document classification, where the document encoder, augmenter and classifier are jointly optimized. In particular, a new reward function based on consistency loss maximization is presented to assure the diversity of the generated documents. With this method, the reward for the adaptive augmentation policy is calculated immediately for every augmented instance, without waiting for the child model's performance metrics to serve as the reward. Experiments on various classification tasks with a strong baseline model show that augmentation strategy optimization can improve model training by providing meaningful augmented data, which eventually results in desirable evaluation performance. Furthermore, extensive studies of the policy's behavior in different settings are provided to confirm the diversity of the augmented data obtained by the proposed method.
AB - This paper presents a new approach to language processing and understanding in which an adaptive data augmentation strategy is proposed for individual documents instead of one universal policy for the whole dataset. Importantly, a reinforcement learning and understanding method is exploited for document classification, where the document encoder, augmenter and classifier are jointly optimized. In particular, a new reward function based on consistency loss maximization is presented to assure the diversity of the generated documents. With this method, the reward for the adaptive augmentation policy is calculated immediately for every augmented instance, without waiting for the child model's performance metrics to serve as the reward. Experiments on various classification tasks with a strong baseline model show that augmentation strategy optimization can improve model training by providing meaningful augmented data, which eventually results in desirable evaluation performance. Furthermore, extensive studies of the policy's behavior in different settings are provided to confirm the diversity of the augmented data obtained by the proposed method.
KW - Data augmentation
KW - document representation
KW - natural language understanding
KW - policy optimization
UR - http://www.scopus.com/inward/record.url?scp=85131248831&partnerID=8YFLogxK
U2 - 10.1109/ICASSP43922.2022.9746696
DO - 10.1109/ICASSP43922.2022.9746696
M3 - Conference contribution
AN - SCOPUS:85131248831
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 7952
EP - 7956
BT - 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
Y2 - 23 May 2022 through 27 May 2022
ER -