C-Pack: Packed resources for general Chinese embeddings

S Xiao, Z Liu, P Zhang, N Muennighoff, D Lian… - Proceedings of the 47th …, 2024 - dl.acm.org
We introduce C-Pack, a package of resources that significantly advances the field of general
text embeddings for Chinese. C-Pack includes three critical resources. 1) C-MTP is a …

ConFit: Improving resume-job matching using data augmentation and contrastive learning

X Yu, J Zhang, Z Yu - Proceedings of the 18th ACM Conference on …, 2024 - dl.acm.org
A reliable resume-job matching system helps a company find suitable candidates from a
pool of resumes, and helps a job seeker find relevant jobs from a list of job posts. However …

End-to-End Retrieval with Learned Dense and Sparse Representations Using Lucene

H Chen, C Lassance, J Lin - arXiv preprint arXiv:2311.18503, 2023 - arxiv.org
The bi-encoder architecture provides a framework for understanding machine-learned
retrieval models based on dense and sparse vector representations. Although these …

Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control

T Nguyen, M Hendriksen, A Yates, M de Rijke - European Conference on …, 2024 - Springer
Learned sparse retrieval (LSR) is a family of neural methods that encode queries and
documents into sparse lexical vectors that can be indexed and retrieved efficiently with an …

Over-penalization for Extra Information in Neural IR Models

K Usuha, MP Kato, S Fujita - Proceedings of the 33rd ACM International …, 2024 - dl.acm.org
This paper presents our analysis of neural IR models, particularly focusing on
over-penalization for extra information (OPEX), a phenomenon where addition of a sentence to a …

Link, Synthesize, Retrieve: Universal Document Linking for Zero-Shot Information Retrieval

DY Hwang, B Taha, H Pande, Y Nechaev - arXiv preprint arXiv …, 2024 - arxiv.org
Despite the recent advancements in information retrieval (IR), zero-shot IR remains a
significant challenge, especially when dealing with new domains, languages, and newly …

Comparatively Assessing Large Language Models for Query Expansion in Information Retrieval via Zero-Shot and Chain-of-Thought Prompting

D Rizzo, A Raganato, M Viviani - 2024 - ceur-ws.org
In our research, we aim to assess the effectiveness of Large Language Models (LLMs) in
performing query expansion in the context of Information Retrieval (IR). Some recent …

An Effective Framework for Legal Entailment Retrieval with Large Language Models and Optimal Transport

TC TRAN - 2024 - dspace.jaist.ac.jp
Legal case entailment is a fundamental principle of the legal system in which the verdict of
previous cases serves as a guiding precedent for later cases with similar factual …

Multimodal Machine Learning for Information Retrieval

MY Hendriksen - researchgate.net
Suppose you want to learn more about a certain topic. For the sake of argument, let us
assume this topic is multimodal machine learning for information retrieval. How would you …