Ryan S Park
Stanford Student
Verified email at stanford.edu
Title
Cited by
Year
From r to Q*: Your Language Model is Secretly a Q-Function
R Rafailov, J Hejna, R Park, C Finn
arXiv preprint arXiv:2404.12358, 2024
Cited by 21 (2024)
Disentangling length from quality in direct preference optimization
R Park, R Rafailov, S Ermon, C Finn
arXiv preprint arXiv:2403.19159, 2024
Cited by 17 (2024)
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
R Rafailov, Y Chittepu, R Park, H Sikchi, J Hejna, B Knox, C Finn, ...
arXiv preprint arXiv:2406.02900, 2024
Cited by 3 (2024)
Preference Optimization for Molecular Language Models
R Park, R Theisen, N Sahni, M Patek, A Cichońska, R Rahman
arXiv preprint arXiv:2310.12304, 2023
Cited by 2 (2023)