Don’t count, predict! a systematic comparison of context-counting vs. context-predicting semantic vectors M Baroni, G Dinu, G Kruszewski Proceedings of the 52nd Annual Meeting of the Association for Computational …, 2014 | 2067 | 2014 |
BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1344 | 2023 |
What you can cram into a single vector: Probing sentence embeddings for linguistic properties A Conneau, G Kruszewski, G Lample, L Barrault, M Baroni Proceedings of the 56th Annual Meeting of the Association for Computational …, 2018 | 953 | 2018 |
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 877 | 2022 |
The LAMBADA dataset: Word prediction requiring a broad discourse context D Paperno, G Kruszewski, A Lazaridou, QN Pham, R Bernardi, S Pezzelle, ... Proceedings of the 54th Annual Meeting of the Association for Computational …, 2016 | 447 | 2016 |
How cosmopolitan are emojis? Exploring emojis usage and meaning over different languages with distributional semantics F Barbieri, G Kruszewski, F Ronzano, H Saggion Proceedings of the 24th ACM international conference on Multimedia, 531-535, 2016 | 228 | 2016 |
The emergence of number and syntax units in LSTM language models Y Lakretz, G Kruszewski, T Desbordes, D Hupkes, S Dehaene, M Baroni Proceedings of the 2019 Conference of the North American Chapter of the …, 2019 | 186 | 2019 |
Memorize or generalize? Searching for a compositional RNN in a haystack A Liška, G Kruszewski, M Baroni arXiv preprint arXiv:1802.06467, 2018 | 79 | 2018 |
Convolutional neural network language models NQ Pham, G Kruszewski, G Boleda Proceedings of the 2016 conference on empirical methods in natural language …, 2016 | 73 | 2016 |
Cooperative learning of disjoint syntax and semantics S Havrylov, G Kruszewski, A Joulin Proceedings of the 2019 Conference of the North American Chapter of the …, 2019 | 59 | 2019 |
Deriving boolean structures from distributional vectors G Kruszewski, D Paperno, M Baroni Transactions of the Association for Computational Linguistics 3, 375-388, 2015 | 51 | 2015 |
Generating grammar exercises L Perez-Beltrachini, C Gardent, G Kruszewski Proceedings of the Seventh Workshop on Building Educational Applications …, 2012 | 45 | 2012 |
Aligning Foundation Models for Language with Preferences through f-divergence Minimization D Go, T Korbak, G Kruszewski, J Rozen, N Ryu, M Dymetman ICLR 2023 Workshop on Mathematical and Empirical Understanding of Foundation …, 2023 | 44* | 2023 |
On reinforcement learning and distribution matching for fine-tuning language models with no catastrophic forgetting T Korbak, H Elsahar, G Kruszewski, M Dymetman Advances in Neural Information Processing Systems 35, 16203-16220, 2022 | 32 | 2022 |
Learning compositionally through attentive guidance D Hupkes, A Singh, K Korrel, G Kruszewski, E Bruni arXiv preprint arXiv:1805.09657, 2018 | 31 | 2018 |
There is no logical negation here, but there are alternatives: modeling conversational negation with distributional semantics G Kruszewski, D Paperno, R Bernardi, M Baroni Computational Linguistics special issue: Formal Distributional Semantics, 2016 | 28 | 2016 |
Jointly optimizing word representations for lexical and sentential tasks with the C-PHRASE model N Pham, G Kruszewski, A Lazaridou, M Baroni Proceedings of the 53rd Annual Meeting of the Association for Computational …, 2015 | 27* | 2015 |
Controlling conditional language models without catastrophic forgetting T Korbak, H Elsahar, G Kruszewski, M Dymetman International Conference on Machine Learning, 11499-11528, 2022 | 25 | 2022 |
So similar and yet incompatible: Toward the automated identification of semantically compatible words G Kruszewski, M Baroni Proceedings of the 2015 Conference of the North American Chapter of the …, 2015 | 23 | 2015 |
Unsupervised and distributional detection of machine-generated text M Gallé, J Rozen, G Kruszewski, H Elsahar arXiv preprint arXiv:2111.02878, 2021 | 21 | 2021 |