A survey of the usages of deep learning for natural language processing

DW Otter, JR Medina, JK Kalita - IEEE transactions on neural …, 2020 - ieeexplore.ieee.org
Over the last several years, the field of natural language processing has been propelled
forward by an explosion in the use of deep learning models. This article provides a brief …

Evolution of semantic similarity—a survey

D Chandrasekaran, V Mago - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
Estimating the semantic similarity between texts is one of the challenging and open
research problems in the field of Natural Language Processing (NLP). The versatility of …

C-Pack: Packed resources for general Chinese embeddings

S Xiao, Z Liu, P Zhang, N Muennighoff, D Lian… - Proceedings of the 47th …, 2024 - dl.acm.org
We introduce C-Pack, a package of resources that significantly advances the field of general
text embeddings for Chinese. C-Pack includes three critical resources. 1) C-MTP is a …

MTEB: Massive text embedding benchmark

N Muennighoff, N Tazi, L Magne, N Reimers - arXiv preprint arXiv …, 2022 - arxiv.org
Text embeddings are commonly evaluated on a small set of datasets from a single task, not
covering their possible applications to other tasks. It is unclear whether state-of-the-art …
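MTEB ships as an open-source harness, so a hedged usage sketch follows. It assumes the mteb Python package's classic interface together with a sentence-transformers encoder; the task names, model, and output path are illustrative, and the exact API may vary between mteb releases.

```python
# Hedged sketch: scoring a sentence encoder on a couple of MTEB tasks.
# Task names, the model, and the output folder are assumptions for
# illustration; consult the mteb docs for your installed version.
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any model exposing .encode()
evaluation = MTEB(tasks=["Banking77Classification", "STSBenchmark"])
evaluation.run(model, output_folder="results/all-MiniLM-L6-v2")
```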

SimCSE: Simple contrastive learning of sentence embeddings

T Gao, X Yao, D Chen - arXiv preprint arXiv:2104.08821, 2021 - arxiv.org
This paper presents SimCSE, a simple contrastive learning framework that greatly advances
state-of-the-art sentence embeddings. We first describe an unsupervised approach, which …
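The unsupervised variant's key trick is using dropout as minimal data augmentation: encoding the same sentence twice under different dropout masks yields a positive pair, while other in-batch sentences serve as negatives. A minimal PyTorch sketch of that idea follows, assuming a Hugging Face BERT encoder; the model name, [CLS] pooling, and temperature are illustrative choices rather than the paper's exact recipe.

```python
# Minimal sketch of SimCSE-style unsupervised contrastive learning.
# Assumes the transformers and torch packages; pooling and temperature
# are illustrative choices, not the paper's verified configuration.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.train()  # keep dropout active so two passes give two "views"

sentences = ["A man is playing a guitar.", "The weather is nice today."]
batch = tokenizer(sentences, padding=True, return_tensors="pt")

def embed(b):
    return encoder(**b).last_hidden_state[:, 0]  # [CLS] pooling

z1, z2 = embed(batch), embed(batch)  # same input, different dropout masks

# InfoNCE loss: each sentence's second view is its positive,
# every other sentence in the batch acts as a negative.
sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / 0.05
loss = F.cross_entropy(sim, torch.arange(len(sentences)))
loss.backward()
```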

ConSERT: A contrastive framework for self-supervised sentence representation transfer

Y Yan, R Li, S Wang, F Zhang, W Wu, W Xu - arXiv preprint arXiv …, 2021 - arxiv.org
Learning high-quality sentence representations benefits a wide range of natural language
processing tasks. Though BERT-based pre-trained language models achieve high …

Sentence-T5: Scalable sentence encoders from pre-trained text-to-text models

J Ni, GH Abrego, N Constant, J Ma, KB Hall… - arXiv preprint arXiv …, 2021 - arxiv.org
We provide the first exploration of sentence embeddings from text-to-text transformers (T5).
Sentence embeddings are broadly useful for language processing tasks. While T5 achieves …
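One strategy the paper explores is building a sentence vector directly from the T5 encoder by mean-pooling its outputs. A hedged sketch under that reading follows; the checkpoint name and pooling details are illustrative assumptions.

```python
# Hedged sketch: mean-pooling a T5 encoder's outputs into one vector.
# Checkpoint and pooling details are illustrative, not the paper's setup.
import torch
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("t5-base")
encoder = T5EncoderModel.from_pretrained("t5-base")

batch = tokenizer(["Sentence embeddings from text-to-text models."],
                  return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**batch).last_hidden_state      # (1, seq_len, dim)
mask = batch["attention_mask"].unsqueeze(-1)         # zero out padding
embedding = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # (1, dim)
```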

On the sentence embeddings from pre-trained language models

B Li, H Zhou, J He, M Wang, Y Yang, L Li - arXiv preprint arXiv:2011.05864, 2020 - arxiv.org
Pre-trained contextual representations like BERT have achieved great success in natural
language processing. However, the sentence embeddings from the pre-trained language …

Whitening sentence representations for better semantics and faster retrieval

J Su, J Cao, W Liu, Y Ou - arXiv preprint arXiv:2103.15316, 2021 - arxiv.org
Pre-training models such as BERT have achieved great success in many natural language
processing tasks. However, how to obtain better sentence representations through these pre …
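The core operation here is a simple linear map: mean-centre the embeddings, then rotate and rescale them via the SVD of their covariance matrix so the transformed vectors have identity covariance, optionally keeping only the top-k dimensions for faster retrieval. A NumPy sketch of that general recipe follows; variable names and the demo data are illustrative.

```python
# Sketch of the whitening transform for sentence embeddings: after the
# map, the vectors are mean-zero with (near-)identity covariance.
import numpy as np

def whiten(embeddings, k=None):
    mu = embeddings.mean(axis=0, keepdims=True)
    cov = np.cov((embeddings - mu).T)        # (dim, dim) covariance
    u, s, _ = np.linalg.svd(cov)
    w = u @ np.diag(1.0 / np.sqrt(s))        # maps covariance to identity
    if k is not None:
        w = w[:, :k]                         # keep top-k dims for retrieval
    return (embeddings - mu) @ w

vecs = np.random.randn(1000, 768)            # stand-in for real embeddings
print(np.allclose(np.cov(whiten(vecs).T), np.eye(768)))  # near-identity covariance
```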

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

N Reimers, I Gurevych - arXiv preprint arXiv:1908.10084, 2019 - fq.pkwyx.com
BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019) have set a new state-of-the-art
performance on sentence-pair regression tasks like semantic textual similarity (STS) …
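The siamese setup this paper introduced survives today as the sentence-transformers library, so a brief usage sketch is a natural companion; the model name is an illustrative choice, not one from the paper.

```python
# Hedged sketch: cosine similarity between siamese-encoded sentences
# via the sentence-transformers library that grew out of this work.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative checkpoint
emb = model.encode(["A man is playing a guitar.",
                    "Someone strums an acoustic guitar."])
print(util.cos_sim(emb[0], emb[1]))  # STS-style similarity score
```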