Hybrid RL: Using both offline and online data can make RL efficient Y Song, Y Zhou, A Sekhari, JA Bagnell, A Krishnamurthy, W Sun ICLR 2023, 2022 | 70 | 2022 |
Test-time distribution normalization for contrastively learned visual-language models Y Zhou, J Ren, F Li, R Zabih, SN Lim Advances in Neural Information Processing Systems 36, 2024 | 14* | 2024 |
ArCHer: Training language model agents via hierarchical multi-turn RL Y Zhou, A Zanette, J Pan, S Levine, A Kumar arXiv preprint arXiv:2402.19446, 2024 | 11 | 2024 |
Autonomous evaluation and refinement of digital agents J Pan, Y Zhang, N Tomlin, Y Zhou, S Levine, A Suhr First Conference on Language Modeling, 2024 | 10 | 2024 |
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning Y Zhai, H Bai, Z Lin, J Pan, S Tong, Y Zhou, A Suhr, S Xie, Y LeCun, Y Ma, ... arXiv preprint arXiv:2405.10292, 2024 | 9 | 2024 |
Offline data enhanced on-policy policy gradient with provable guarantees Y Zhou, A Sekhari, Y Song, W Sun arXiv preprint arXiv:2311.08384, 2023 | 6 | 2023 |
BT^2: Backward-compatible Training with Basis Transformation Y Zhou, Z Li, A Shrivastava, H Zhao, A Torralba, T Tian, SN Lim ICCV 2023, 2022 | 5 | 2022 |
DigiRL: Training in-the-wild device-control agents with autonomous reinforcement learning H Bai, Y Zhou, M Cemri, J Pan, A Suhr, S Levine, A Kumar arXiv preprint arXiv:2406.11896, 2024 | 4 | 2024 |
Improve discourse dependency parsing with contextualized representations Y Zhou, Y Feng ACL 2022 findings, 2022 | 3 | 2022 |
Aligning Large Language Models with Representation Editing: A Control Perspective L Kong, H Wang, W Mu, Y Du, Y Zhuang, Y Zhou, Y Song, R Zhang, ... arXiv preprint arXiv:2406.05954, 2024 | 1 | 2024 |
KALIE: Fine-Tuning Vision-Language Models for Open-World Manipulation without Robot Data G Tang, S Rajkumar, Y Zhou, HR Walke, S Levine, K Fang arXiv preprint arXiv:2409.14066, 2024 | | 2024 |
GAPX: generalized autoregressive paraphrase-identification X Y Zhou, R Li, H Housen, SN Lim Advances in Neural Information Processing Systems 35, 2211-2225, 2022 | | 2022 |