[Github] Diffusers: State-of-the-art diffusion models P von Platen, S Patil, A Lozhkov, P Cuenca, N Lambert, K Rasul, ... https://github.com/huggingface/diffusers, 2022 | 256* | 2022 |
Zephyr: Direct distillation of lm alignment L Tunstall, E Beeching, N Lambert, N Rajani, K Rasul, Y Belkada, ... arXiv preprint arXiv:2310.16944, 2023 | 200 | 2023 |
Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning N Lambert, DS Drew, J Yaconelli, R Calandra, S Levine, KSJ Pister IEEE Robotics and Automation Letters 4 (4), 4224-4230, 2019 | 167 | 2019 |
Open LLM Leaderboard E Beeching, C Fourrier, N Habib, S Han, N Lambert, N Rajani, ... URL https://huggingface. co/spaces/HuggingFaceH4/open_llm_leaderboard, 2023 | 158 | 2023 |
On the importance of hyperparameter optimization for model-based reinforcement learning B Zhang, R Rajan, L Pineda, N Lambert, A Biedenkapp, K Chua, F Hutter, ... International Conference on Artificial Intelligence and Statistics, 4015-4023, 2021 | 106 | 2021 |
Objective Mismatch in Model-based Reinforcement Learning N Lambert, B Amos, O Yadan, R Calandra Learning for Dynamics and Control (L4DC), 2020 | 91 | 2020 |
[Github] Trl: Transformer reinforcement learning L von Werra, Y Belkada, L Tunstall, E Beeching, T Thrush, N Lambert https://github.com/lvwerra/trl, 2020 | 82* | 2020 |
[Blog] Illustrating reinforcement learning from human feedback (RLHF) N Lambert, L Castricato, L von Werra, A Havrilla https://hf.co/blog/rlhf, 2022 | 77* | 2022 |
Toward controlled flight of the ionocraft: a flying microrobot using electrohydrodynamic thrust with onboard sensing and no moving parts D Drew, N Lambert, C Schindler, K Pister IEEE Robotics and Automation Letters 3 (4), 2807-2813, 2018 | 76 | 2018 |
Camels in a changing climate: Enhancing lm adaptation with tulu 2 H Ivison, Y Wang, V Pyatkin, N Lambert, M Peters, P Dasigi, J Jang, ... arXiv preprint arXiv:2311.10702, 2023 | 64 | 2023 |
Mbrl-lib: A modular library for model-based reinforcement learning L Pineda, B Amos, A Zhang, NO Lambert, R Calandra arXiv preprint arXiv:2104.10159, 2021 | 49 | 2021 |
Learning generalizable locomotion skills with hierarchical reinforcement learning T Li, N Lambert, R Calandra, F Meier, A Rai IEEE International Conference on Robotics and Automation (ICRA), 413-419, 2020 | 47 | 2020 |
The challenges of exploration for offline reinforcement learning N Lambert, M Wulfmeier, W Whitney, A Byravan, M Bloesch, V Dasagi, ... arXiv preprint arXiv:2201.11861, 2022 | 40 | 2022 |
Reward reports for reinforcement learning TK Gilbert, N Lambert, S Dean, T Zick, A Snoswell, S Mehta Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, 84-130, 2023 | 31 | 2023 |
Olmo: Accelerating the science of language models D Groeneveld, I Beltagy, P Walsh, A Bhagia, R Kinney, O Tafjord, AH Jha, ... arXiv preprint arXiv:2402.00838, 2024 | 30 | 2024 |
Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning N Lambert, A Wilcox, H Zhang, K Pister, R Calandra IEEE Conference on Decision and Control (CDC), 2880-2887, 2021 | 25 | 2021 |
Dolma: An Open Corpus of Three Trillion Tokens for Language Model Pretraining Research L Soldaini, R Kinney, A Bhagia, D Schwenk, D Atkinson, R Authur, ... arXiv preprint arXiv:2402.00159, 2024 | 23 | 2024 |
The alignment handbook L Tunstall, E Beeching, N Lambert, N Rajani, AM Rush, T Wolf GitHub repository, 2023 | 22 | 2023 |
[HuggingFace] H4 Stack Exchange Preference Dataset N Lambert, NR Lewis Tunstall, T Thrush https://huggingface.co/datasets/HuggingFaceH4/stack-exchange-preferences, 2023 | 18* | 2023 |
Investigating compounding prediction errors in learned dynamics models N Lambert, K Pister, R Calandra arXiv preprint arXiv:2203.09637, 2022 | 18 | 2022 |