ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation
Applying Reinforcement Learning (RL) to sequence generation models enables the direct
optimization of long-term rewards (e.g., BLEU and human feedback), but typically …
Introduction to Transformers: an NLP Perspective
Transformers have dominated empirical machine learning models of natural language
processing. In this paper, we introduce basic concepts of Transformers and present key …
TranSFormer: Slow-fast transformer for machine translation
Learning multiscale Transformer models has been evidenced as a viable approach to
augmenting machine translation systems. Prior research has primarily focused on treating …
Learning Evaluation Models from Large Language Models for Sequence Generation
Large language models achieve state-of-the-art performance on sequence generation
evaluation, but typically have a large number of parameters. This is a computational …
Pluggable Neural Machine Translation Models via Memory-augmented Adapters
Although neural machine translation (NMT) models perform well in the general domain, it
remains rather challenging to control their generation behavior to satisfy the requirement of …
RSMformer: an efficient multiscale transformer-based framework for long sequence time-series forecasting
G Tong, Z Ge, D Peng - Applied Intelligence, 2024 - Springer
Long sequence time-series forecasting (LSTF) is a significant and challenging task. Many
real-world applications require long-term forecasting of time series. In recent years …
Enhancing Neural Machine Translation with Semantic Units
Conventional neural machine translation (NMT) models typically use subwords and words
as the basic units for model input and comprehension. However, complete words and …
End-to-end Planner Training for Language Modeling
Through end-to-end training to predict the next token, LLMs have become valuable tools for
various tasks. Enhancing their core training in language modeling can improve numerous …
EIT: Enhanced interactive transformer
In this paper, we propose a novel architecture, the Enhanced Interactive Transformer (EIT),
to address the issue of head degradation in self-attention mechanisms. Our approach …
Compressive Strength Prediction of Fly Ash-Based Concrete Using Single and Hybrid Machine Learning Models
H Li, H Chung, Z Li, W Li - Buildings, 2024 - mdpi.com
The compressive strength of concrete is a crucial parameter in structural design, yet its
determination in a laboratory setting is both time-consuming and expensive. The prediction …