Gemini: a family of highly capable multimodal models. G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, et al. arXiv preprint arXiv:2312.11805, 2023. Cited by 1197.
Scaling Language Models: Methods, Analysis & Insights from Training Gopher. JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, et al. arXiv preprint arXiv:2112.11446, 2021. Cited by 890.
Randomized Prior Functions for Deep Reinforcement Learning. I Osband, J Aslanides, A Cassirer. Neural Information Processing Systems 32, 2018. Cited by 423.
Red Teaming Language Models with Language Models. E Perez, S Huang, F Song, T Cai, R Ring, J Aslanides, A Glaese, et al. arXiv preprint arXiv:2202.03286, 2022. Cited by 399.
Improving alignment of dialogue agents via targeted human judgements. A Glaese, N McAleese, M Trębacz, J Aslanides, V Firoiu, T Ewalds, et al. arXiv preprint arXiv:2209.14375, 2022. Cited by 366.
Acme: A Research Framework for Distributed Reinforcement Learning. M Hoffman, B Shahriari, J Aslanides, G Barth-Maron, F Behbahani, et al. arXiv preprint arXiv:2006.00979, 2020. Cited by 246.
When to use parametric models in reinforcement learning? H van Hasselt, M Hessel, J Aslanides. Neural Information Processing Systems 33, 2019. Cited by 216.
Behaviour Suite for Reinforcement Learning. I Osband, Y Doron, M Hessel, J Aslanides, E Sezener, A Saraiva, et al. International Conference on Learning Representations 8, 2020. Cited by 182.
Teaching language models to support answers with verified quotes. J Menick, M Trebacz, V Mikulik, J Aslanides, F Song, M Chadwick, et al. arXiv preprint arXiv:2203.11147, 2022. Cited by 169.
Fine-tuning language models to find agreement among humans with diverse preferences. M Bakker, M Chadwick, H Sheahan, M Tessler, L Campbell-Gillingham, et al. Advances in Neural Information Processing Systems 35, 38176-38189, 2022. Cited by 149.
Relativity concept inventory: Development, analysis, and results. JS Aslanides, CM Savage. Physical Review Special Topics - Physics Education Research 9 (1), 010118, 2013. Cited by 80.
A general approach to fairness with optimal transport. S Chiappa, R Jiang, T Stepleton, A Pacchiano, H Jiang, J Aslanides. AAAI, 2020. Cited by 76*.
Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning. G Parascandolo, L Buesing, J Merel, L Hasenclever, J Aslanides, et al. arXiv preprint arXiv:2004.11410, 2020. Cited by 31.
Universal Reinforcement Learning Algorithms: Survey and Experiments. J Aslanides, J Leike, M Hutter. International Joint Conference on Artificial Intelligence 26, 1403-1410, 2017. Cited by 25.
TF-Replicator: Distributed Machine Learning for Researchers. P Buchlovsky, D Budden, D Grewe, C Jones, J Aslanides, F Besse, et al. arXiv preprint arXiv:1902.00465, 2019. Cited by 24.
Fine-Tuning Language Models via Epistemic Neural Networks. I Osband, SM Asghari, B Van Roy, N McAleese, J Aslanides, G Irving. arXiv preprint arXiv:2211.01568, 2022. Cited by 9.
AIXIjs: A software demo for general reinforcement learning. J Aslanides. arXiv preprint arXiv:1705.07615, 2017. Cited by 6.
Generalised discount functions applied to a Monte-Carlo AIμ implementation. S Lamont, J Aslanides, J Leike, M Hutter. Autonomous Agents and Multiagent Systems, 2017. Cited by 5.