Probing pretrained language models for lexical semantics

I Vulić, EM Ponti, R Litschko, G Glavaš… - Proceedings of the …, 2020 - aclanthology.org
The success of large pretrained language models (LMs) such as BERT and RoBERTa has
sparked interest in probing their representations, in order to unveil what types of knowledge …

Visually grounded reasoning across languages and cultures

F Liu, E Bugliarello, EM Ponti, S Reddy… - arXiv preprint arXiv …, 2021 - arxiv.org
The design of widespread vision-and-language datasets and pre-trained encoders directly
adopts, or draws inspiration from, the concepts and images of ImageNet. While one can …

Sustainable modular debiasing of language models

A Lauscher, T Lueken, G Glavaš - arXiv preprint arXiv:2109.03646, 2021 - arxiv.org
Unfair stereotypical biases (e.g., gender, racial, or religious biases) encoded in modern
pretrained language models (PLMs) have negative ethical implications for widespread …

XCOPA: A multilingual dataset for causal commonsense reasoning

EM Ponti, G Glavaš, O Majewska, Q Liu, I Vulić… - arXiv preprint arXiv …, 2020 - arxiv.org
In order to simulate human language capacity, natural language processing systems must
be able to reason about the dynamics of everyday situations, including their possible causes …

Measuring fairness with biased rulers: A comparative study on bias metrics for pre-trained language models

P Delobelle, EK Tokpo, T Calders… - Proceedings of the 2022 …, 2022 - lirias.kuleuven.be
An increasing awareness of biased patterns in natural language processing resources such
as BERT has motivated many metrics to quantify 'bias' and 'fairness' in these resources …

Fast, effective, and self-supervised: Transforming masked language models into universal lexical and sentence encoders

F Liu, I Vulić, A Korhonen, N Collier - arXiv preprint arXiv:2104.08027, 2021 - arxiv.org
Pretrained Masked Language Models (MLMs) have revolutionised NLP in recent years.
However, previous work has indicated that off-the-shelf MLMs are not effective as universal …

On the independence of association bias and empirical fairness in language models

L Cabello, AK Jørgensen, A Søgaard - … of the 2023 ACM Conference on …, 2023 - dl.acm.org
The societal impact of pre-trained language models has prompted researchers to probe
them for strong associations between protected attributes and value-loaded terms, from slur …

Revisiting non-English text simplification: A unified multilingual benchmark

MJ Ryan, T Naous, W Xu - arXiv preprint arXiv:2305.15678, 2023 - arxiv.org
Recent advancements in high-quality, large-scale English resources have pushed the
frontier of English Automatic Text Simplification (ATS) research. However, less work has …

From word types to tokens and back: A survey of approaches to word meaning representation and interpretation

M Apidianaki - Computational Linguistics, 2023 - direct.mit.edu
Vector-based word representation paradigms situate lexical meaning at different levels of
abstraction. Distributional and static embedding models generate a single vector per word …

SimRelUz: Similarity and relatedness scores as a semantic evaluation dataset for the Uzbek language

U Salaev, E Kuriyozov, C Gómez-Rodríguez - arXiv preprint arXiv …, 2022 - arxiv.org
Semantic relatedness between words is one of the core concepts in natural language
processing, making semantic evaluation an important task. In this paper, we present a …