A review on fairness in machine learning

D Pessach, E Shmueli - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
An increasing number of decisions regarding the daily lives of human beings are being
controlled by artificial intelligence and machine learning (ML) algorithms in spheres ranging …

Trustworthy artificial intelligence: a review

D Kaur, S Uslu, KJ Rittichier, A Durresi - ACM Computing Surveys (CSUR …, 2022 - dl.acm.org
Artificial intelligence (AI) and algorithmic decision making are having a profound impact on
our daily lives. These systems are vastly used in different high-stakes applications like …

[PDF] DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

B Wang, W Chen, H Pei, C Xie, M Kang, C Zhang, C Xu… - NeurIPS, 2023 - blogs.qub.ac.uk
Generative Pre-trained Transformer (GPT) models have exhibited exciting progress
in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the …

Holistic evaluation of language models

P Liang, R Bommasani, T Lee, D Tsipras… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models (LMs) are becoming the foundation for almost all major language
technologies, but their capabilities, limitations, and risks are not well understood. We present …

The capacity for moral self-correction in large language models

D Ganguli, A Askell, N Schiefer, TI Liao… - arXiv preprint arXiv …, 2023 - arxiv.org
We test the hypothesis that language models trained with reinforcement learning from
human feedback (RLHF) have the capability to "morally self-correct"--to avoid producing …

Auditing large language models: a three-layered approach

J Mökander, J Schuett, HR Kirk, L Floridi - AI and Ethics, 2023 - Springer
Large language models (LLMs) represent a major advance in artificial intelligence (AI)
research. However, the widespread use of LLMs is also coupled with significant ethical and …

A survey on the fairness of recommender systems

Y Wang, W Ma, M Zhang, Y Liu, S Ma - ACM Transactions on …, 2023 - dl.acm.org
Recommender systems are an essential tool to relieve the information overload challenge
and play an important role in people's daily lives. Since recommendations involve …

[BOOK] Towards a standard for identifying and managing bias in artificial intelligence

R Schwartz, A Vassilev, K Greene… - 2022 - dwt.com
As individuals and communities interact in and with an environment that is increasingly
virtual, they are often vulnerable to the commodification of their digital footprint. Concepts …

A holistic approach to undesired content detection in the real world

T Markov, C Zhang, S Agarwal, FE Nekoul… - Proceedings of the …, 2023 - ojs.aaai.org
We present a holistic approach to building a robust and useful natural language
classification system for real-world content moderation. The success of such a system relies …

Toward causal representation learning

B Schölkopf, F Locatello, S Bauer, NR Ke… - Proceedings of the …, 2021 - ieeexplore.ieee.org
The two fields of machine learning and graphical causality arose and are developed
separately. However, there is, now, cross-pollination and increasing interest in both fields to …