A survey on fairness in large language models
Large language models (LLMs) have shown powerful performance and development
prospects and are widely deployed in the real world. However, LLMs can capture social …
Anonymisation models for text data: State of the art, challenges and future directions
This position paper investigates the problem of automated text anonymisation, which is a
prerequisite for secure sharing of documents containing sensitive information about …
Auto-debias: Debiasing masked language models with automated biased prompts
Human-like biases and undesired social stereotypes exist in large pretrained language
models. Given the wide adoption of these models in real-world applications, mitigating such …
Leace: Perfect linear concept erasure in closed form
N Belrose, D Schneider-Joseph… - Advances in …, 2024 - proceedings.neurips.cc
Concept erasure aims to remove specified features from a representation. It can
improve fairness (e.g., preventing a classifier from using gender or race) and interpretability …
Winogrande: An adversarial winograd schema challenge at scale
Commonsense reasoning remains a major challenge in AI, and yet recent progress on
benchmarks may seem to suggest otherwise. In particular, the recent neural language …
Plug and play language models: A simple approach to controlled text generation
Large transformer-based language models (LMs) trained on huge text corpora have shown
unparalleled generation capabilities. However, controlling attributes of the generated …
Think locally, act globally: Federated learning with local and global representations
Federated learning is a method of training models on private data distributed over multiple
devices. To keep device data private, the global model is trained by only communicating …
Differential privacy has disparate impact on model accuracy
E Bagdasaryan, O Poursaeed… - Advances in neural …, 2019 - proceedings.neurips.cc
Differential privacy (DP) is a popular mechanism for training machine learning models with
bounded leakage about the presence of specific points in the training data. The cost of …
Null it out: Guarding protected attributes by iterative nullspace projection
The ability to control for the kinds of information encoded in neural representation has a
variety of use cases, especially in light of the challenge of interpreting these models. We …
Evaluating gender bias in machine translation
We present the first challenge set and evaluation protocol for the analysis of gender bias in
machine translation (MT). Our approach uses two recent coreference resolution datasets …