Reinforcement learning for generative AI: State of the art, opportunities and open research challenges
G Franceschelli, M Musolesi - Journal of Artificial Intelligence Research, 2024 - jair.org
Abstract Generative Artificial Intelligence (AI) is one of the most exciting developments in
Computer Science of the last decade. At the same time, Reinforcement Learning (RL) has …
Pick-a-pic: An open dataset of user preferences for text-to-image generation
The ability to collect a large dataset of human preferences from text-to-image users is
usually limited to companies, making such datasets inaccessible to the public. To address …
Reinforcement learning for fine-tuning text-to-image diffusion models
Learning from human feedback has been shown to improve text-to-image models. These
techniques first learn a reward function that captures what humans care about in the task …
Holistic evaluation of text-to-image models
The stunning qualitative improvement of text-to-image models has led to their widespread
attention and adoption. However, we lack a comprehensive quantitative understanding of …
Training diffusion models with reinforcement learning
Diffusion models are a class of flexible generative models trained with an approximation to
the log-likelihood objective. However, most use cases of diffusion models are not concerned …
Diffusion model alignment using direct preference optimization
Large language models (LLMs) are fine-tuned using human comparison data with
Reinforcement Learning from Human Feedback (RLHF) methods to make them better …
T³Bench: Benchmarking Current Progress in Text-to-3D Generation
Recent methods in text-to-3D leverage powerful pretrained diffusion models to optimize
NeRF. Notably, these methods are able to produce high-quality 3D scenes without training …
Hive: Harnessing human feedback for instructional visual editing
Incorporating human feedback has been shown to be crucial to align text generated by large
language models to human preferences. We hypothesize that state-of-the-art instructional …
A survey of safety and trustworthiness of large language models through the lens of verification and validation
Large language models (LLMs) have exploded a new heatwave of AI for their ability to
engage end-users in human-level conversations with detailed and articulate answers across …
Power hungry processing: Watts driving the cost of AI deployment?
Recent years have seen a surge in the popularity of commercial AI products based on
generative, multi-purpose AI systems promising a unified approach to building machine …