A data augmentation method for English-Vietnamese neural machine translation
The translation quality of machine translation systems depends on the parallel corpus used
for training, particularly on the quantity and quality of the corpus. However, building a high …
Bridging the data gap between training and inference for unsupervised neural machine translation
Back-translation is a critical component of Unsupervised Neural Machine Translation
(UNMT), which generates pseudo parallel data from target monolingual data. A UNMT …
Exploring All-In-One Knowledge Distillation Framework for Neural Machine Translation
Conventional knowledge distillation (KD) approaches are commonly employed to compress
neural machine translation (NMT) models. However, they only obtain one lightweight …
Refining low-resource unsupervised translation by language disentanglement of multilingual translation model
Numerous recent work on unsupervised machine translation (UMT) implies that competent
unsupervised translations of low-resource and unrelated languages, such as Nepali or …
Latent constraints on unsupervised text-graph alignment with information asymmetry
Unsupervised text-graph alignment (UTGA) is a fundamental task that bidirectionally
generates texts and graphs without parallel data. Most available models of UTGA suffer from …
Inquiries into Farmers' Perception of Biodiversity in Vietnam: A Systematic Analysis
Conserving biodiversity has become more important for tropical countries, where agricultural
production is featured by a large number of small farms scattered in wide areas conducting …
GA-SCS: Graph-augmented source code summarization
M Zhang, G Zhou, W Yu, N Huang, W Liu - ACM Transactions on Asian …, 2023 - dl.acm.org
Automatic source code summarization system aims to generate a valuable natural language
description for a program, which can facilitate software development and maintenance, code …
CantonMT: Cantonese to English NMT Platform with Fine-Tuned Models Using Synthetic Back-Translation Data
Neural Machine Translation (NMT) for low-resource languages is still a challenging task in
front of NLP researchers. In this work, we deploy a standard data augmentation …
BaSFormer: A Balanced Sparsity Regularized Attention Network for Transformer
Attention networks often make decisions relying solely on a few pieces of tokens, even if
those reliances are not truly indicative of the underlying meaning or intention of the full …
AugVic: Exploiting bitext vicinity for low-resource NMT
The success of Neural Machine Translation (NMT) largely depends on the availability of
large bitext training corpora. Due to the lack of such large corpora in low-resource language …