Andrea Michi
Google DeepMind
Verified email at google.com
Title · Cited by · Year
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
G Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ...
arXiv preprint arXiv:2403.05530, 2024
Cited by: 633 · Year: 2024
Hyperparameter selection for offline reinforcement learning
TL Paine, C Paduraru, A Michi, C Gulcehre, K Zolna, A Novikov, Z Wang, ...
arXiv preprint arXiv:2007.09055, 2020
Cited by: 167 · Year: 2020
Faster sorting algorithms discovered using deep reinforcement learning
DJ Mankowitz, A Michi, A Zhernov, M Gelmi, M Selvi, C Paduraru, ...
Nature 618 (7964), 257-263, 2023
Cited by: 164 · Year: 2023
Nash learning from human feedback
R Munos, M Valko, D Calandriello, MG Azar, M Rowland, ZD Guo, Y Tang, ...
arXiv preprint arXiv:2312.00886, 2023
Cited by: 78 · Year: 2023
A generic human–machine annotation framework based on dynamic cooperative learning
Y Zhang, A Michi, J Wagner, E André, B Schuller, F Weninger
IEEE transactions on cybernetics 50 (3), 1230-1239, 2019
Cited by: 19 · Year: 2019
Bond: Aligning llms with best-of-n distillation
PG Sessa, R Dadashi, L Hussenot, J Ferret, N Vieillard, A Ramé, ...
arXiv preprint arXiv:2407.14622, 2024
Cited by: 13 · Year: 2024
Conditional Language Policy: A General Framework for Steerable Multi-Objective Finetuning
K Wang, R Kidambi, R Sullivan, A Agarwal, C Dann, A Michi, M Gelmi, ...
arXiv preprint arXiv:2407.15762, 2024
Cited by: 7 · Year: 2024
Towards practical reinforcement learning for tokamak magnetic control
BD Tracey, A Michi, Y Chervonyi, I Davies, C Paduraru, N Lazic, F Felici, ...
Fusion Engineering and Design 200, 114161, 2024
Cited by: 5 · Year: 2024
Towards practical reinforcement learning for tokamak magnetic control
BD Tracey, A Michi, Y Chervonyi, I Davies, C Paduraru, N Lazic, F Felici, ...
arXiv preprint arXiv:2307.11546, 2023
Cited by: 4 · Year: 2023
Offline hyperparameter selection for offline reinforcement learning
T Le Paine, C Paduraru, A Michi, C Gulcehre, K Zołna, A Novikov, ...