Reinforcement learning for generative AI: State of the art, opportunities and open research challenges

G Franceschelli, M Musolesi - Journal of Artificial Intelligence Research, 2024 - jair.org
Generative Artificial Intelligence (AI) is one of the most exciting developments in
Computer Science of the last decade. At the same time, Reinforcement Learning (RL) has …

Pick-a-pic: An open dataset of user preferences for text-to-image generation

Y Kirstain, A Polyak, U Singer… - Advances in …, 2023 - proceedings.neurips.cc
The ability to collect a large dataset of human preferences from text-to-image users is
usually limited to companies, making such datasets inaccessible to the public. To address …

Reinforcement learning for fine-tuning text-to-image diffusion models

Y Fan, O Watkins, Y Du, H Liu, M Ryu… - Advances in …, 2024 - proceedings.neurips.cc
Learning from human feedback has been shown to improve text-to-image models. These
techniques first learn a reward function that captures what humans care about in the task …

Holistic evaluation of text-to-image models

T Lee, M Yasunaga, C Meng, Y Mai… - Advances in …, 2024 - proceedings.neurips.cc
The stunning qualitative improvement of text-to-image models has led to their widespread
attention and adoption. However, we lack a comprehensive quantitative understanding of …

Training diffusion models with reinforcement learning

K Black, M Janner, Y Du, I Kostrikov… - arXiv preprint arXiv …, 2023 - arxiv.org
Diffusion models are a class of flexible generative models trained with an approximation to
the log-likelihood objective. However, most use cases of diffusion models are not concerned …

Diffusion model alignment using direct preference optimization

B Wallace, M Dang, R Rafailov… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large language models (LLMs) are fine-tuned using human comparison data with
Reinforcement Learning from Human Feedback (RLHF) methods to make them better …

T³Bench: Benchmarking Current Progress in Text-to-3D Generation

Y He, Y Bai, M Lin, W Zhao, Y Hu, J Sheng, R Yi… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent methods in text-to-3D leverage powerful pretrained diffusion models to optimize
NeRF. Notably, these methods are able to produce high-quality 3D scenes without training …

Hive: Harnessing human feedback for instructional visual editing

S Zhang, X Yang, Y Feng, C Qin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Incorporating human feedback has been shown to be crucial to align text generated by large
language models to human preferences. We hypothesize that state-of-the-art instructional …

A survey of safety and trustworthiness of large language models through the lens of verification and validation

X Huang, W Ruan, W Huang, G Jin, Y Dong… - Artificial Intelligence …, 2024 - Springer
Large language models (LLMs) have exploded a new heatwave of AI for their ability to
engage end-users in human-level conversations with detailed and articulate answers across …

Power hungry processing: Watts driving the cost of AI deployment?

S Luccioni, Y Jernite, E Strubell - The 2024 ACM Conference on …, 2024 - dl.acm.org
Recent years have seen a surge in the popularity of commercial AI products based on
generative, multi-purpose AI systems promising a unified approach to building machine …