BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1434 | 2023 |
Gender coreference and bias evaluation at WMT 2020 T Kocmi, T Limisiewicz, G Stanovsky Proceedings of the Fifth Conference on Machine Translation, 357-364, 2020 | 36 | 2020 |
Universal dependencies according to BERT: both more specific and more general T Limisiewicz, R Rosa, D Mareček arXiv preprint arXiv:2004.14620, 2020 | 17 | 2020 |
A balanced data approach for evaluating cross-lingual transfer: Mapping the linguistic blood bank D Malkin, T Limisiewicz, G Stanovsky arXiv preprint arXiv:2205.04086, 2022 | 15 | 2022 |
Don't Forget About Pronouns: Removing Gender Bias in Language Models Without Losing Factual Gender Information T Limisiewicz, D Mareček arXiv preprint arXiv:2206.10744, 2022 | 11 | 2022 |
Introducing orthogonal constraint in structural probes T Limisiewicz, D Mareček arXiv preprint arXiv:2012.15228, 2020 | 11 | 2020 |
Tokenization impacts multilingual language modeling: Assessing vocabulary allocation and overlap across languages T Limisiewicz, J Balhar, D Mareček arXiv preprint arXiv:2305.17179, 2023 | 10 | 2023 |
Syntax Representation in Word Embeddings and Neural Networks--A Survey T Limisiewicz, D Mareček arXiv preprint arXiv:2010.01063, 2020 | 10 | 2020 |
Breaking the curse of multilinguality with cross-lingual expert language models T Blevins, T Limisiewicz, S Gururangan, M Li, H Gonen, NA Smith, ... arXiv preprint arXiv:2401.10440, 2024 | 7 | 2024 |
Debiasing algorithm through model adaptation T Limisiewicz, D Mareček, T Musil arXiv preprint arXiv:2310.18913, 2023 | 7 | 2023 |
You can have your data and balance it too: towards balanced and efficient multilingual models T Limisiewicz, D Malkin, G Stanovsky arXiv preprint arXiv:2210.07135, 2022 | 4 | 2022 |
MYTE: Morphology-driven byte encoding for better and fairer multilingual language modeling T Limisiewicz, T Blevins, H Gonen, O Ahia, L Zettlemoyer arXiv preprint arXiv:2403.10691, 2024 | 3 | 2024 |
Exploring the impact of training data distribution and subword tokenization on gender bias in machine translation B Iluz, T Limisiewicz, G Stanovsky, D Mareček arXiv preprint arXiv:2309.12491, 2023 | 2 | 2023 |
ÚFAL submission for SIGTYP supervised cognate detection task T Limisiewicz Proceedings of the 5th Workshop on Research in Computational Linguistic …, 2023 | 1 | 2023 |
Examining Cross-lingual Contextual Embeddings with Orthogonal Structural Probes T Limisiewicz, D Mareček arXiv preprint arXiv:2109.04921, 2021 | 1 | 2021 |
Teaching LLMs at Charles University: Assignments and Activities J Helcl, Z Kasner, O Dušek, T Limisiewicz, D Macháček, T Musil, ... arXiv preprint arXiv:2407.19798, 2024 | | 2024 |
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization O Ahia, S Kumar, H Gonen, V Hoffman, T Limisiewicz, Y Tsvetkov, ... arXiv preprint arXiv:2407.08818, 2024 | | 2024 |
Hidden in the Layers D Mareček, J Libovický, R Rosa, T Musil, T Limisiewicz | | 2020 |
Interpreting and Controlling Linguistic Features in Neural Networks’ Representations T Limisiewicz | | |