ImageReward: Learning and evaluating human preferences for text-to-image generation

Reinforcement learning for generative AI: State of the art, opportunities and open research challenges

G Franceschelli, M Musolesi - Journal of Artificial Intelligence Research, 2024 - jair.org

Abstract Generative Artificial Intelligence (AI) is one of the most exciting developments in
Computer Science of the last decade. At the same time, Reinforcement Learning (RL) has …

Cited by 10 Related articles All 10 versions

Pick-a-Pic: An open dataset of user preferences for text-to-image generation

Y Kirstain, A Polyak, U Singer… - Advances in …, 2023 - proceedings.neurips.cc

The ability to collect a large dataset of human preferences from text-to-image users is
usually limited to companies, making such datasets inaccessible to the public. To address …

Cited by 126 Related articles All 5 versions

Reinforcement learning for fine-tuning text-to-image diffusion models

Y Fan, O Watkins, Y Du, H Liu, M Ryu… - Advances in …, 2024 - proceedings.neurips.cc

Learning from human feedback has been shown to improve text-to-image models. These
techniques first learn a reward function that captures what humans care about in the task …

Cited by 77 Related articles All 7 versions

Holistic evaluation of text-to-image models

T Lee, M Yasunaga, C Meng, Y Mai… - Advances in …, 2024 - proceedings.neurips.cc

The stunning qualitative improvement of text-to-image models has led to their widespread
attention and adoption. However, we lack a comprehensive quantitative understanding of …

Cited by 46 Related articles All 6 versions

Training diffusion models with reinforcement learning

K Black, M Janner, Y Du, I Kostrikov… - arXiv preprint arXiv …, 2023 - arxiv.org

Diffusion models are a class of flexible generative models trained with an approximation to
the log-likelihood objective. However, most use cases of diffusion models are not concerned …

Cited by 97 Related articles All 6 versions

Diffusion model alignment using direct preference optimization

B Wallace, M Dang, R Rafailov… - Proceedings of the …, 2024 - openaccess.thecvf.com

Large language models (LLMs) are fine-tuned using human comparison data with
Reinforcement Learning from Human Feedback (RLHF) methods to make them better …

Cited by 40 Related articles All 3 versions

T³Bench: Benchmarking Current Progress in Text-to-3D Generation

Y He, Y Bai, M Lin, W Zhao, Y Hu, J Sheng, R Yi… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent methods in text-to-3D leverage powerful pretrained diffusion models to optimize
NeRF. Notably, these methods are able to produce high-quality 3D scenes without training …

Cited by 14 Related articles All 2 versions

HIVE: Harnessing human feedback for instructional visual editing

S Zhang, X Yang, Y Feng, C Qin… - Proceedings of the …, 2024 - openaccess.thecvf.com

Incorporating human feedback has been shown to be crucial to align text generated by large
language models to human preferences. We hypothesize that state-of-the-art instructional …

Cited by 49 Related articles All 4 versions

A survey of safety and trustworthiness of large language models through the lens of verification and validation

X Huang, W Ruan, W Huang, G Jin, Y Dong… - Artificial Intelligence …, 2024 - Springer

Large language models (LLMs) have exploded a new heatwave of AI for their ability to
engage end-users in human-level conversations with detailed and articulate answers across …

Cited by 54 Related articles All 6 versions

Power hungry processing: Watts driving the cost of AI deployment?

S Luccioni, Y Jernite, E Strubell - The 2024 ACM Conference on …, 2024 - dl.acm.org

Recent years have seen a surge in the popularity of commercial AI products based on
generative, multi-purpose AI systems promising a unified approach to building machine …

Cited by 38 Related articles All 5 versions