Yifei Zhou
Verified email at berkeley.edu
Title
Cited by
Year
Hybrid RL: Using both offline and online data can make RL efficient
Y Song, Y Zhou, A Sekhari, JA Bagnell, A Krishnamurthy, W Sun
ICLR 2023, 2022
70
2022
Test-time distribution normalization for contrastively learned visual-language models
Y Zhou, J Ren, F Li, R Zabih, SN Lim
Advances in Neural Information Processing Systems 36, 2024
14*
2024
ArCHer: Training language model agents via hierarchical multi-turn RL
Y Zhou, A Zanette, J Pan, S Levine, A Kumar
arXiv preprint arXiv:2402.19446, 2024
11
2024
Autonomous evaluation and refinement of digital agents
J Pan, Y Zhang, N Tomlin, Y Zhou, S Levine, A Suhr
First Conference on Language Modeling, 2024
10
2024
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning
Y Zhai, H Bai, Z Lin, J Pan, S Tong, Y Zhou, A Suhr, S Xie, Y LeCun, Y Ma, ...
arXiv preprint arXiv:2405.10292, 2024
9
2024
Offline data enhanced on-policy policy gradient with provable guarantees
Y Zhou, A Sekhari, Y Song, W Sun
arXiv preprint arXiv:2311.08384, 2023
6
2023
BT^2: Backward-compatible Training with Basis Transformation
Y Zhou, Z Li, A Shrivastava, H Zhao, A Torralba, T Tian, SN Lim
ICCV 2023, 2022
5
2022
DigiRL: Training in-the-wild device-control agents with autonomous reinforcement learning
H Bai, Y Zhou, M Cemri, J Pan, A Suhr, S Levine, A Kumar
arXiv preprint arXiv:2406.11896, 2024
4
2024
Improve discourse dependency parsing with contextualized representations
Y Zhou, Y Feng
ACL 2022 findings, 2022
3
2022
Aligning Large Language Models with Representation Editing: A Control Perspective
L Kong, H Wang, W Mu, Y Du, Y Zhuang, Y Zhou, Y Song, R Zhang, ...
arXiv preprint arXiv:2406.05954, 2024
1
2024
KALIE: Fine-Tuning Vision-Language Models for Open-World Manipulation without Robot Data
G Tang, S Rajkumar, Y Zhou, HR Walke, S Levine, K Fang
arXiv preprint arXiv:2409.14066, 2024
2024
GAPX: generalized autoregressive paraphrase-identification X
Y Zhou, R Li, H Housen, SN Lim
Advances in Neural Information Processing Systems 35, 2211-2225, 2022
2022