Towards Pre-trained Language Model for Dynamic Disturbance

Z He, Y Yang, S Zhao - 2021 3rd International Academic Exchange Conference on Science and …, 2021 - ieeexplore.ieee.org
Disturbing token embeddings can effectively prevent model overfitting and serve as a form of data augmentation. Simply injecting disturbances, however, can adversely affect the model, making it difficult to learn the correct sample features. In this paper, we propose a novel method for injecting disturbances, called Dynamic Disturbance Embedding (DDEM). Once the model has been trained to a certain extent, the token embeddings are disturbed according to changes in the training error. The disturbance methods mainly include adversarial training and gradient penalty. To verify the effectiveness of the proposed method, we conduct experiments on several natural language processing tasks. Experiments indicate that DDEM yields a pronounced improvement over simple disturbance injection; in particular, on the summary generation task the ROUGE-1 score increases by 2.3%. Additionally, dynamic disturbance embedding helps avoid overfitting while improving model accuracy and helping the model learn the correct sample features.
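The abstract names the two disturbance mechanisms (adversarial training and a gradient penalty) and the trigger (a change in training error), but gives no implementation details. Below is a minimal, hypothetical PyTorch sketch of how such a scheme could look; the toy model, the FGM-style perturbation, and every hyperparameter (epsilon, loss_delta_threshold, the penalty weight) are illustrative assumptions, not the authors' code.

# Hypothetical sketch of DDEM-style dynamic disturbance; not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyClassifier(nn.Module):
    """Toy stand-in for a pre-trained language model with a token embedding."""
    def __init__(self, vocab_size=1000, dim=64, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, token_ids, embed_noise=None):
        emb = self.embedding(token_ids)
        if embed_noise is not None:           # inject the disturbance here
            emb = emb + embed_noise
        return self.head(emb.mean(dim=1))     # mean-pool, then classify

def fgm_noise(model, token_ids, labels, epsilon=1.0):
    """FGM-style adversarial disturbance: a step along the loss gradient
    w.r.t. the embedding output, normalized to length epsilon."""
    emb = model.embedding(token_ids).detach().requires_grad_(True)
    loss = F.cross_entropy(model.head(emb.mean(dim=1)), labels)
    grad = torch.autograd.grad(loss, emb)[0]
    return epsilon * grad / (grad.norm() + 1e-12)

def gradient_penalty(model, token_ids, labels, weight=0.1):
    """Penalize the gradient norm of the loss w.r.t. the embeddings, which
    discourages sharp sensitivity to small embedding disturbances."""
    emb = model.embedding(token_ids)          # keep the graph so the penalty trains the table
    loss = F.cross_entropy(model.head(emb.mean(dim=1)), labels)
    grad = torch.autograd.grad(loss, emb, create_graph=True)[0]
    return weight * grad.pow(2).mean()

# Schematic training loop: disturb only once the change in training error
# falls below a (made-up) threshold, i.e. once the model is "trained to a
# certain extent" in the abstract's wording.
model = TinyClassifier()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
prev_loss, loss_delta_threshold = None, 0.01

for step in range(100):
    token_ids = torch.randint(0, 1000, (8, 16))   # toy batch of token ids
    labels = torch.randint(0, 2, (8,))

    base_loss = F.cross_entropy(model(token_ids), labels)
    disturb = (prev_loss is not None
               and abs(prev_loss - base_loss.item()) < loss_delta_threshold)
    prev_loss = base_loss.item()

    loss = base_loss
    if disturb:
        noise = fgm_noise(model, token_ids, labels)
        loss = loss + F.cross_entropy(model(token_ids, embed_noise=noise), labels)
        loss = loss + gradient_penalty(model, token_ids, labels)

    opt.zero_grad()
    loss.backward()
    opt.step()

The gate here uses the absolute change in the plain training loss as a proxy for "trained to a certain extent"; the paper may define the trigger and the disturbance schedule differently.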