The illusion of artificial inclusion

W Agnew, AS Bergman, J Chien, M Díaz… - Proceedings of the CHI …, 2024 - dl.acm.org
Human participants play a central role in the development of modern artificial intelligence
(AI) technology, in psychological science, and in user research. Recent advances in …

Representation in AI evaluations

AS Bergman, LA Hendricks, M Rauh, B Wu… - Proceedings of the …, 2023 - dl.acm.org
Calls for representation in artificial intelligence (AI) and machine learning (ML) are
widespread, with "representation" or "representativeness" generally understood to be both …

Lost in translation: large language models in non-English content analysis

G Nicholas, A Bhatia - arXiv preprint arXiv:2306.07377, 2023 - arxiv.org
In recent years, large language models (e.g., OpenAI's GPT-4, Meta's LLaMa, Google's
PaLM) have become the dominant approach for building AI systems to analyze and …

Toxic language detection: A systematic review of Arabic datasets

I Bensalem, P Rosso, H Zitouni - Expert Systems, 2024 - Wiley Online Library
The detection of toxic language in the Arabic language has emerged as an active area of
research in recent years, and reviewing the existing datasets employed for training the …

ALDi: Quantifying the Arabic level of dialectness of text

A Keleg, S Goldwater, W Magdy - arXiv preprint arXiv:2310.13747, 2023 - arxiv.org
Transcribed speech and user-generated text in Arabic typically contain a mixture of Modern
Standard Arabic (MSA), the standardized language taught in schools, and Dialectal Arabic …

Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation

M Boubdir, E Kim, B Ermis, M Fadaee… - arXiv preprint arXiv …, 2023 - arxiv.org
Human evaluation is increasingly critical for assessing large language models, capturing
linguistic nuances, and reflecting user preferences more accurately than traditional …

Toxic language detection: a systematic review of Arabic datasets

I Bensalem, P Rosso, H Zitouni - arXiv preprint arXiv:2312.07228, 2023 - arxiv.org
The detection of toxic language in the Arabic language has emerged as an active area of
research in recent years, and reviewing the existing datasets employed for training the …

The Geopolitics of Deplatforming: A Study of Suspensions of Politically-Interested Iranian Accounts on Twitter

A Casas - Political Communication, 2024 - Taylor & Francis
Social media companies increasingly play a role in regulating freedom of speech. Debates
over ideological motivations behind suspension policies of major platforms are on the rise …

ChatGPT Rates Natural Language Explanation Quality Like Humans: But on Which Scales?

F Huang, H Kwak, K Park, J An - arXiv preprint arXiv:2403.17368, 2024 - arxiv.org
As AI becomes more integral in our lives, the need for transparency and responsibility
grows. While natural language explanations (NLEs) are vital for clarifying the reasoning …

Estimating the Level of Dialectness Predicts Interannotator Agreement in Multi-dialect Arabic Datasets

A Keleg, W Magdy, S Goldwater - arXiv preprint arXiv:2405.11282, 2024 - arxiv.org
On annotating multi-dialect Arabic datasets, it is common to randomly assign the samples
across a pool of native Arabic speakers. Recent analyses recommended routing dialectal …