Scaling language models: Methods, analysis & insights from training Gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021 | 826 | 2021 |
Localizing syntactic predictions using recurrent neural network grammars JR Brennan, C Dyer, A Kuncoro, JT Hale Neuropsychologia 146, 107479, 2020 | 733 | 2020 |
DyNet: The dynamic neural network toolkit G Neubig, C Dyer, Y Goldberg, A Matthews, W Ammar, A Anastasopoulos, ... arXiv preprint, 2017 | 443* | 2017 |
What do recurrent neural network grammars learn about syntax? A Kuncoro, M Ballesteros, L Kong, C Dyer, G Neubig, NA Smith Proceedings of EACL 2017 1, 1249-1258, 2017 | 164 | 2017 |
LSTMs can learn syntax-sensitive dependencies well, but modeling structure makes them better A Kuncoro, C Dyer, J Hale, D Yogatama, S Clark, P Blunsom Proceedings of the 56th Annual Meeting of the Association for Computational …, 2018 | 160 | 2018 |
Finding syntax in human encephalography with beam search J Hale, C Dyer, A Kuncoro, JR Brennan arXiv preprint arXiv:1806.04127, 2018 | 151 | 2018 |
Unsupervised recurrent neural network grammars Y Kim, AM Rush, L Yu, A Kuncoro, C Dyer, G Melis arXiv preprint arXiv:1904.03746, 2019 | 145 | 2019 |
Mind the gap: Assessing temporal generalization in neural language models A Lazaridou, A Kuncoro, E Gribovskaya, D Agrawal, A Liska, T Terzi, ... Advances in Neural Information Processing Systems 34, 29348-29363, 2021 | 132* | 2021 |
Distilling an ensemble of greedy dependency parsers into one MST parser A Kuncoro, M Ballesteros, L Kong, C Dyer, NA Smith Proceedings of EMNLP, 1744-1753, 2016 | 84 | 2016 |
IndoNLG: Benchmark and resources for evaluating Indonesian natural language generation S Cahyawijaya, GI Winata, B Wilie, K Vincentio, X Li, A Kuncoro, S Ruder, ... arXiv preprint arXiv:2104.08200, 2021 | 66 | 2021 |
Memory architectures in recurrent neural network language models D Yogatama, Y Miao, G Melis, W Ling, A Kuncoro, C Dyer, P Blunsom International Conference on Learning Representations, 2018 | 61 | 2018 |
Syntactic structure distillation pretraining for bidirectional encoders A Kuncoro, L Kong, D Fried, D Yogatama, L Rimell, C Dyer, P Blunsom Transactions of the Association for Computational Linguistics 8, 776-794, 2020 | 47* | 2020 |
A systematic investigation of commonsense knowledge in large language models XL Li, A Kuncoro, J Hoffmann, CM d'Autume, P Blunsom, A Nematzadeh arXiv preprint arXiv:2111.00607, 2021 | 43 | 2021 |
Mind the gap: Assessing temporal generalization in … A Lazaridou, A Kuncoro, E Gribovskaya, D Agrawal, A Liska, T Terzi, ... Advances in Neural Information Processing Systems 34, 6-14 | 41 | |
Transformer grammars: Augmenting transformer language models with syntactic inductive biases at scale L Sartran, S Barrett, A Kuncoro, M Stanojević, P Blunsom, C Dyer Transactions of the Association for Computational Linguistics 10, 1423-1439, 2022 | 40 | 2022 |
Scalable syntax-aware language models using knowledge distillation A Kuncoro, C Dyer, L Rimell, S Clark, P Blunsom arXiv preprint arXiv:1906.06438, 2019 | 40 | 2019 |
The perils of natural behaviour tests for unnatural models: the case of number agreement A Kuncoro, C Dyer, J Hale, P Blunsom Poster presented at Learning Language in Humans and in Machines, Paris, Fr …, 2018 | 9 | 2018 |
DiLoCo: Distributed Low-Communication Training of Language Models A Douillard, Q Feng, AA Rusu, R Chhaparia, Y Donchev, A Kuncoro, ... arXiv preprint arXiv:2311.08105, 2023 | 8 | 2023 |