Semantic communications for future internet: Fundamentals, applications, and challenges
With the increasing demand for intelligent services, the sixth-generation (6G) wireless
networks will shift from a traditional architecture that focuses solely on a high transmission …
Artificial intelligence and internet of things in small and medium-sized enterprises: A survey
Internet of things (IoT) and artificial intelligence (AI) are popular topics of Industry 4.0. Many
publications regarding these topics have been published, but they are primarily focused on …
ImageReward: Learning and evaluating human preferences for text-to-image generation
We present a comprehensive solution to learn and improve text-to-image models from
human preference feedback. To begin with, we build ImageReward---the first general …
CodeRL: Mastering code generation through pretrained models and deep reinforcement learning
Program synthesis or code generation aims to generate a program that satisfies a problem
specification. Recent approaches using large-scale pretrained language models (LMs) have …
Training language models to follow instructions with human feedback
Making language models bigger does not inherently make them better at following a user's
intent. For example, large language models can generate outputs that are untruthful, toxic, or …
Aligning text-to-image models using human feedback
Deep generative models have shown impressive results in text-to-image synthesis.
However, current text-to-image models often generate images that are inadequately aligned …
BRIO: Bringing order to abstractive summarization
Abstractive summarization models are commonly trained using maximum likelihood
estimation, which assumes a deterministic (one-point) target distribution in which an ideal …
Learning to summarize with human feedback
As language models become more powerful, training and evaluation are increasingly
bottlenecked by the data and metrics used for a particular task. For example, summarization …
Transfer learning in deep reinforcement learning: A survey
Reinforcement learning is a learning paradigm for solving sequential decision-making
problems. Recent years have witnessed remarkable progress in reinforcement learning …
Recursively summarizing books with human feedback
A major challenge for scaling machine learning is training models to perform tasks that are
very difficult or time-consuming for humans to evaluate. We present progress on this …