Sharp asymptotic and finite-sample rates of convergence of empirical measures in Wasserstein distance

J Weed, F Bach - 2019 - projecteuclid.org
Bernoulli 25(4A), 2019, 2620–2648. https://doi.org/10.3150/18-BEJ1065

Earth mover's distance minimization for unsupervised bilingual lexicon induction

M Zhang, Y Liu, H Luan, M Sun - Proceedings of the 2017 …, 2017 - aclanthology.org
Cross-lingual natural language processing hinges on the premise that there exists
invariance across languages. At the word level, researchers have identified such invariance …

XQA: A cross-lingual open-domain question answering dataset

J Liu, Y Lin, Z Liu, M Sun - … of the 57th Annual Meeting of the …, 2019 - aclanthology.org
Open-domain question answering (OpenQA) aims to answer questions through text retrieval
and reading comprehension. Recently, lots of neural network-based models have been …

Distances between probability distributions of different dimensions

Y Cai, LH Lim - IEEE Transactions on Information Theory, 2022 - ieeexplore.ieee.org
Comparing probability distributions is an indispensable and ubiquitous task in machine
learning and statistics. The most common way to compare a pair of Borel probability …

Minimax distribution estimation in Wasserstein distance

S Singh, B Póczos - arXiv preprint arXiv:1802.08855, 2018 - arxiv.org
The Wasserstein metric is an important measure of distance between probability
distributions, with applications in machine learning, statistics, probability theory, and data …

Re-evaluating word mover's distance

R Sato, M Yamada, H Kashima - … Conference on Machine …, 2022 - proceedings.mlr.press
The word mover's distance (WMD) is a fundamental technique for measuring the similarity of
two documents. As the crux of WMD, it can take advantage of the underlying geometry of the …

A hybrid semantic query expansion approach for Arabic information retrieval

H ALMarwi, M Ghurab, I Al-Baltah - Journal of Big Data, 2020 - Springer
Most information retrieval systems retrieve documents based on keyword
matching, which certainly fails at retrieving documents that have similar meaning with …

A distribution-based model to learn bilingual word embeddings

H Cao, T Zhao, S Zhang, Y Meng - Proceedings of COLING 2016 …, 2016 - aclanthology.org
We introduce a distribution based model to learn bilingual word embeddings from
monolingual data. It is simple, effective and does not require any parallel data or any seed …

Inference for projection-based Wasserstein distances on finite spaces

R Okano, M Imaizumi - arXiv preprint arXiv:2202.05495, 2022 - arxiv.org
The Wasserstein distance is a distance between two probability distributions and has
recently gained increasing popularity in statistics and machine learning, owing to its …

How can large language models become more human?

D Wang, M Sadrzadeh, M Stanojević… - Proceedings of the …, 2024 - aclanthology.org
Psycholinguistic experiments reveal that efficiency of human language use is founded on
predictions at both syntactic and lexical levels. Previous models of human prediction …