Repairing the cracked foundation: A survey of obstacles in evaluation practices for generated text
S Gehrmann, E Clark, T Sellam - Journal of Artificial Intelligence Research, 2023 - jair.org
Evaluation practices in natural language generation (NLG) have many known flaws,
but improved evaluation approaches are rarely widely adopted. This issue has become …
The path toward equal performance in medical machine learning
To ensure equitable quality of care, differences in machine learning model performance
between patient groups must be addressed. Here, we argue that two separate mechanisms …
The measure and mismeasure of fairness
The field of fair machine learning aims to ensure that decisions guided by algorithms are
equitable. Over the last decade, several formal, mathematical definitions of fairness have …
Beyond the safeguards: exploring the security risks of ChatGPT
E Derner, K Batistič - arXiv preprint arXiv:2305.08005, 2023 - arxiv.org
The increasing popularity of large language models (LLMs) such as ChatGPT has led to
growing concerns about their safety, security risks, and ethical implications. This paper aims …
Representation in AI evaluations
Calls for representation in artificial intelligence (AI) and machine learning (ML) are
widespread, with "representation" or "representativeness" generally understood to be both …
A security risk taxonomy for large language models
As large language models (LLMs) permeate more and more applications, an assessment of
their associated security risks becomes increasingly necessary. The potential for exploitation …
Designing equitable algorithms
Predictive algorithms are now commonly used to distribute society's resources and
sanctions. But these algorithms can entrench and exacerbate inequities. To guard against …
“You Can't Fix What You Can't Measure”: Privately Measuring Demographic Performance Disparities in Federated Learning
M Juarez, A Korolova - … through the Lens of Causality and …, 2023 - proceedings.mlr.press
As in traditional machine learning models, models trained with federated learning may
exhibit disparate performance across demographic groups. Model holders must identify …
Seeing through the data: A statistical evaluation of prohibited item detection benchmark datasets for X-ray security screening
BKS Isaac-Medina, S Yucer… - Proceedings of the …, 2023 - openaccess.thecvf.com
The rapid progress in automatic prohibited object detection within the context of X-ray
security screening, driven forward by advances in deep learning, has resulted in the first …
Making It Possible for the Auditing of AI: A Systematic Review of AI Audits and AI Auditability
Artificial intelligence (AI) technologies have become the key driver of innovation in society.
However, numerous vulnerabilities of AI systems can lead to negative consequences for …