Lexical diversity in kinship across languages and dialects
Languages are known to describe the world in diverse ways. Across lexicons, diversity is
pervasive, appearing through phenomena such as lexical gaps and untranslatability …
A large and evolving cognate database
We present CogNet, a large-scale, automatically-built database of sense-tagged cognates—
words of common origin and meaning across languages. CogNet is continuously evolving …
Representing interlingual meaning in lexical databases
In today's multilingual lexical databases, the majority of the world's languages are under-
represented. Beyond a mere issue of resource incompleteness, we show that existing lexical …
Diversity and language technology: how techno-linguistic bias can cause epistemic injustice
It is well known that AI-based language technology--large language models, machine
translation systems, multilingual dictionaries, and corpora--is currently limited to 2 to 3 …
Language diversity: Visible to humans, exploitable by machines
G Bella, E Byambadorj, Y Chandrashekar… - arXiv preprint arXiv …, 2022 - arxiv.org
The Universal Knowledge Core (UKC) is a large multilingual lexical database with a focus
on language diversity and covering over a thousand languages. The aim of the database, as …
Diversity and language technology: how language modeling bias causes epistemic injustice
It is well known that AI-based language technology—large language models, machine
translation systems, multilingual dictionaries, and corpora—is currently limited to three …
Using linguistic typology to enrich multilingual lexicons: the case of lexical gaps in kinship
This paper describes a method to enrich lexical resources with content relating to linguistic
diversity, based on knowledge from the field of lexical typology. We capture the …
Towards bridging the digital language divide
It is a well-known fact that current AI-based language technology--language models,
machine translation systems, multilingual dictionaries and corpora--focuses on the world's 2 …
Tackling Language Modelling Bias in Support of Linguistic Diversity
Current AI-based language technologies—language models, machine translation systems,
multilingual dictionaries and corpora—are known to focus on the world's 2–3% most widely …
Linguistic diversity and bias in online dictionaries
Traditional bilingual dictionaries, once pivotal translation tools, have been superseded on
the Web by multilingual lexical databases that interconnect the lexicons of hundreds of …