Name-based demographic inference and the unequal distribution of misrecognition

JW Lockhart, MM King, C Munsch - Nature Human Behaviour, 2023 - nature.com
Academics and companies increasingly draw on large datasets to understand the social
world, and name-based demographic ascription tools are widespread for imputing …

Avoiding bias when inferring race using name-based approaches

D Kozlowski, DS Murray, A Bell, W Hulsey, V Larivière… - PLOS ONE, 2022 - journals.plos.org
Racial disparity in academia is a widely acknowledged problem. The quantitative
understanding of racial-based systemic inequalities is an important step towards a more …

Estimating the success of re-identifications in incomplete datasets using generative models

L Rocher, JM Hendrickx, YA De Montjoye - Nature communications, 2019 - nature.com
While rich medical, behavioral, and socio-demographic data are key to modern data-driven
research, their collection and use raise legitimate privacy concerns. Anonymizing datasets …

What's in a name? Reducing bias in bios without access to protected attributes

A Romanov, M De-Arteaga, H Wallach… - arXiv preprint arXiv …, 2019 - arxiv.org
There is a growing body of work that proposes methods for mitigating bias in machine
learning systems. These methods typically rely on access to protected attributes such as …

Addressing census data problems in race imputation via fully Bayesian Improved Surname Geocoding and name supplements

K Imai, S Olivella, ETR Rosenman - Science Advances, 2022 - science.org
Prediction of individuals' race and ethnicity plays an important role in studies of racial
disparity. Bayesian Improved Surname Geocoding (BISG), which relies on detailed census …

Using first name information to improve race and ethnicity classification

I Voicu - Statistics and Public Policy, 2018 - Taylor & Francis
This article uses a recent first name list to develop an improvement to an existing Bayesian
classifier, namely the Bayesian Improved Surname Geocoding (BISG) method, which …

MABEL: Attenuating gender bias using textual entailment data

J He, M Xia, C Fellbaum, D Chen - arXiv preprint arXiv:2210.14975, 2022 - arxiv.org
Pre-trained language models encode undesirable social biases, which are further
exacerbated in downstream use. To this end, we propose MABEL (a Method for Attenuating …

Measuring model biases in the absence of ground truth

O Aka, K Burke, A Bauerle, C Greer… - Proceedings of the 2021 …, 2021 - dl.acm.org
The measurement of bias in machine learning often focuses on model performance across
identity subgroups (such as man and woman) with respect to groundtruth labels. However …

Lessons from archives: Strategies for collecting sociocultural data in machine learning

ES Jo, T Gebru - Proceedings of the 2020 conference on fairness …, 2020 - dl.acm.org
A growing body of work shows that many problems in fairness, accountability, transparency,
and ethics in machine learning systems are rooted in decisions surrounding the data …

A cross-verified database of notable people, 3500BC-2018AD

M Laouenan, P Bhargava, JB Eyméoud, O Gergaud… - Scientific Data, 2022 - nature.com
A new strand of literature aims at building the most comprehensive and accurate database
of notable individuals. We collect a massive amount of data from various editions of …