How to keep text private? A systematic review of deep learning methods for privacy-preserving natural language processing

S Sousa, R Kern - Artificial Intelligence Review, 2023 - Springer
Deep learning (DL) models for natural language processing (NLP) tasks often handle
private data, demanding protection against breaches and disclosures. Data protection laws …

Null it out: Guarding protected attributes by iterative nullspace projection

S Ravfogel, Y Elazar, H Gonen, M Twiton… - arXiv preprint arXiv …, 2020 - arxiv.org
The ability to control for the kinds of information encoded in neural representation has a
variety of use cases, especially in light of the challenge of interpreting these models. We …
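As a rough illustration of the iterative nullspace projection (INLP) idea named in the title, the sketch below repeatedly fits a linear classifier for a protected attribute and projects the representations onto the classifier's nullspace. The toy data, function names, and hyperparameters are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal INLP-style sketch: iteratively remove linearly decodable
# information about a binary protected attribute z from representations X.
import numpy as np
from sklearn.linear_model import LogisticRegression

def inlp(X, z, n_iters=5):
    """Returns a composed projection P such that X @ P.T hides z from
    linear classifiers. X: (n, dim) floats; z: (n,) binary labels."""
    dim = X.shape[1]
    P = np.eye(dim)
    X_proj = X.copy()
    for _ in range(n_iters):
        clf = LogisticRegression(max_iter=1000).fit(X_proj, z)
        w = clf.coef_[0]
        w = w / np.linalg.norm(w)              # direction that predicts z
        P_step = np.eye(dim) - np.outer(w, w)  # projection onto nullspace of w
        P = P_step @ P                         # compose with earlier steps
        X_proj = X_proj @ P_step               # remove the direction
    return P

# Toy usage: one coordinate leaks the attribute; after projection a fresh
# classifier should drop toward chance accuracy.
rng = np.random.default_rng(0)
z = rng.integers(0, 2, size=500)
X = rng.normal(size=(500, 16))
X[:, 0] += 3.0 * z
P = inlp(X, z)
guard = LogisticRegression(max_iter=1000).fit(X @ P.T, z)
print("post-projection attribute accuracy:", guard.score(X @ P.T, z))
```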

The text anonymization benchmark (TAB): A dedicated corpus and evaluation framework for text anonymization

I Pilán, P Lison, L Øvrelid, A Papadopoulou… - Computational Linguistics, 2022 - direct.mit.edu
We present a novel benchmark and associated evaluation metrics for assessing the
performance of text anonymization methods. Text anonymization, defined as the task of …
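To make the evaluation task concrete, here is a generic span-level precision/recall sketch for masked text. The benchmark itself defines richer, risk-aware metrics; the span format and helper below are assumptions for illustration only.

```python
# Illustrative scoring for text anonymization: compare gold spans that
# should be masked against the spans a system actually masked.
def span_scores(gold_spans, predicted_spans):
    """Spans are (start, end) character offsets. Returns (precision, recall)."""
    gold, pred = set(gold_spans), set(predicted_spans)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

# Toy usage: a missed identifier lowers recall (a privacy risk), while an
# over-masked span lowers precision (a utility cost).
gold = [(0, 10), (25, 33), (60, 72)]
pred = [(0, 10), (25, 33), (40, 45)]
print(span_scores(gold, pred))  # (0.666..., 0.666...)
```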

Investigating gender bias in language models using causal mediation analysis

J Vig, S Gehrmann, Y Belinkov… - Advances in Neural Information Processing Systems, 2020 - proceedings.neurips.cc
Many interpretation methods for neural models in natural language processing investigate
how information is encoded inside hidden representations. However, these methods can …
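The core move in causal mediation analysis is a neuron-level intervention: hold the input fixed, set one hidden unit to the value it takes under a counterfactual input, and measure how the output shifts. The sketch below shows that pattern on a toy two-layer network standing in for a language model; all shapes and inputs are illustrative assumptions.

```python
# Toy neuron intervention in the spirit of causal mediation analysis.
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(8, 4))   # input -> hidden (the candidate mediators)
W2 = rng.normal(size=(4, 1))   # hidden -> scalar "bias measure"

def forward(x, unit=None, value=None):
    h = np.tanh(x @ W1)
    if unit is not None:
        h = h.copy()
        h[unit] = value        # do(h_unit = value): intervene on one mediator
    return (h @ W2).item()

x_base = rng.normal(size=8)                      # e.g. a prompt with "he"
x_cf = x_base + rng.normal(scale=0.5, size=8)    # counterfactual, e.g. "she"

total_effect = forward(x_cf) - forward(x_base)
h_cf = np.tanh(x_cf @ W1)      # mediator values under the counterfactual
for i in range(4):
    indirect = forward(x_base, unit=i, value=h_cf[i]) - forward(x_base)
    print(f"unit {i}: indirect effect {indirect:+.3f} (total {total_effect:+.3f})")
```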

Measuring and reducing gendered correlations in pre-trained models

K Webster, X Wang, I Tenney, A Beutel, E Pitler… - arXiv preprint arXiv …, 2020 - arxiv.org
Pre-trained models have revolutionized natural language understanding. However,
researchers have found they can encode artifacts undesired in many applications, such as …
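One simple way such gendered correlations are often probed is by comparing embedding associations between target words and gendered anchors. The sketch below uses random toy vectors purely as stand-ins; with a real pre-trained model you would substitute its word or sentence embeddings.

```python
# Toy gendered-correlation probe over an embedding lookup table.
import numpy as np

rng = np.random.default_rng(2)
vocab = ["nurse", "engineer", "teacher", "doctor", "he", "she"]
emb = {w: rng.normal(size=32) for w in vocab}  # illustrative random embeddings

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Positive gaps suggest a male-leaning association, negative a female-leaning one.
for occupation in ["nurse", "engineer", "teacher", "doctor"]:
    gap = cosine(emb[occupation], emb["he"]) - cosine(emb[occupation], emb["she"])
    print(f"{occupation:10s} gendered-correlation gap: {gap:+.3f}")
```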

A novel estimator of mutual information for learning to disentangle textual representations

P Colombo, C Clavel, P Piantanida - arXiv preprint arXiv:2105.02685, 2021 - arxiv.org
Learning disentangled representations of textual data is essential for many natural language
tasks such as fair classification, style transfer and sentence generation, among others. The …
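The paper proposes its own mutual-information estimator; as an illustrative stand-in for how an MI term enters a disentanglement objective, the sketch below uses the standard MINE / Donsker-Varadhan lower bound on I(Z; A) between representations Z and a protected attribute A, which a training loop would then minimize. The critic architecture and shapes are assumptions.

```python
# MINE-style lower bound on I(Z; A), as a stand-in MI estimator.
import math
import torch
import torch.nn as nn

class MineCritic(nn.Module):
    def __init__(self, z_dim, a_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + a_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, z, a):
        return self.net(torch.cat([z, a], dim=-1)).squeeze(-1)

def mine_lower_bound(critic, z, a):
    """Donsker-Varadhan bound: E_joint[T] - log E_marginals[exp(T)]."""
    joint = critic(z, a).mean()
    a_shuffled = a[torch.randperm(a.size(0))]  # break pairing -> marginals
    marginal = torch.logsumexp(critic(z, a_shuffled), dim=0) - math.log(z.size(0))
    return joint - marginal

# Toy usage: the attribute is decodable from z, so the bound climbs above zero.
z = torch.randn(256, 16)
a = (z[:, :1] > 0).float()
critic = MineCritic(16, 1)
opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
for _ in range(300):
    loss = -mine_lower_bound(critic, z, a)
    opt.zero_grad()
    loss.backward()
    opt.step()
print("estimated lower bound on I(Z; A):", -loss.item())
```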

Societal biases in retrieved contents: Measurement framework and adversarial mitigation of BERT rankers

N Rekabsaz, S Kopeinik, M Schedl - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021 - dl.acm.org
Societal biases resonate in the retrieved contents of information retrieval (IR) systems,
reinforcing existing stereotypes. Approaching this issue requires established …
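Adversarial mitigation of this kind typically relies on the gradient-reversal pattern (Ganin and Lempitsky, 2015): an adversary learns to predict the protected attribute from the encoder's representation while reversed gradients push the encoder to strip that information. The sketch below shows the pattern with a toy encoder standing in for a BERT ranker; the dimensions, losses, and head names are illustrative assumptions, not the authors' training setup.

```python
# Gradient-reversal adversarial debiasing sketch for a toy ranker.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # flip the adversary's gradient

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU())  # stand-in for BERT
rank_head = nn.Linear(64, 1)   # predicts relevance
adv_head = nn.Linear(64, 1)    # tries to predict the protected attribute

x = torch.randn(128, 32)
relevance = torch.rand(128, 1)
attribute = torch.randint(0, 2, (128, 1)).float()

h = encoder(x)
rank_loss = F.mse_loss(rank_head(h), relevance)
adv_logits = adv_head(GradReverse.apply(h, 1.0))
adv_loss = F.binary_cross_entropy_with_logits(adv_logits, attribute)
# One joint backward pass trains the adversary while the reversed gradient
# pushes the encoder toward attribute-invariant representations.
(rank_loss + adv_loss).backward()
```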

Learning disentangled textual representations via statistical measures of similarity

P Colombo, G Staerman, N Noiry… - arXiv preprint arXiv …, 2022 - arxiv.org
When working with textual data, a natural application of disentangled representations is fair
classification where the goal is to make predictions without being biased (or influenced) by …
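The approach here penalizes statistical similarity gaps between the distributions of representations conditioned on the protected attribute. As one concrete instance of that idea (the paper studies its own family of similarity measures), the sketch below computes an RBF-kernel maximum mean discrepancy between two groups of representations; names and data are illustrative.

```python
# RBF-kernel MMD^2 between attribute-conditioned representation samples
# (a biased estimator, sufficient for a sketch).
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma**2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

# Toy usage: attribute-dependent representations get a large penalty;
# attribute-independent ones score near zero.
rng = np.random.default_rng(3)
Z0 = rng.normal(size=(200, 8))
Z1 = rng.normal(size=(200, 8)) + 1.0   # mean shift -> distributions differ
print("entangled MMD^2:", rbf_mmd2(Z0, Z1))
print("matched   MMD^2:", rbf_mmd2(Z0, rng.normal(size=(200, 8))))
```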

Causal mediation analysis for interpreting neural NLP: The case of gender bias

J Vig, S Gehrmann, Y Belinkov, S Qian, D Nevo… - arXiv preprint arXiv …, 2020 - arxiv.org
Common methods for interpreting neural models in natural language processing typically
examine either their structure or their behavior, but not both. We propose a methodology …

A survey on out-of-distribution evaluation of neural NLP models

X Li, M Liu, S Gao, W Buntine - arXiv preprint arXiv:2306.15261, 2023 - arxiv.org
Adversarial robustness, domain generalization and dataset biases are three active lines of
research contributing to out-of-distribution (OOD) evaluation of neural NLP models …