SAILER: structure-aware pre-trained language model for legal case retrieval

H Li, Q Ai, J Chen, Q Dong, Y Wu, Y Liu… - Proceedings of the 46th …, 2023 - dl.acm.org
Legal case retrieval, which aims to find relevant cases for a query case, plays a core role in
the intelligent legal system. Despite the success that pre-training has achieved in ad-hoc …

Constructing tree-based index for efficient and effective dense retrieval

H Li, Q Ai, J Zhan, J Mao, Y Liu, Z Liu… - Proceedings of the 46th …, 2023 - dl.acm.org
Recent studies have shown that Dense Retrieval (DR) techniques can significantly improve
the performance of first-stage retrieval in IR systems. Despite its empirical effectiveness, the …

An intent taxonomy of legal case retrieval

Y Shao, H Li, Y Wu, Y Liu, Q Ai, J Mao, Y Ma… - ACM Transactions on …, 2023 - dl.acm.org
Legal case retrieval is a special Information Retrieval (IR) task focusing on legal case
documents. Depending on the downstream tasks of the retrieved case documents, users' …

THUIR@COLIEE 2023: Incorporating structural knowledge into pre-trained language models for legal case retrieval

H Li, W Su, C Wang, Y Wu, Q Ai, Y Liu - arXiv preprint arXiv:2305.06812, 2023 - arxiv.org
Legal case retrieval techniques play an essential role in modern intelligent legal systems. As
an annually well-known international competition, COLIEE is aiming to achieve the state-of …

Unsupervised real-time hallucination detection based on the internal states of large language models

W Su, C Wang, Q Ai, Y Hu, Z Wu, Y Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
Hallucinations in large language models (LLMs) refer to the phenomenon of LLMs
producing responses that are coherent yet factually inaccurate. This issue undermines the …

DRAGIN: Dynamic retrieval augmented generation based on the real-time information needs of large language models

W Su, Y Tang, Q Ai, Z Wu, Y Liu - arXiv preprint arXiv:2403.10081, 2024 - arxiv.org
Dynamic retrieval augmented generation (RAG) paradigm actively decides when and what
to retrieve during the text generation process of Large Language Models (LLMs). There are …

Unbiased Learning to Rank Meets Reality: Lessons from Baidu's Large-Scale Search Dataset

P Hager, R Deffayet, JM Renders, O Zoeter… - Proceedings of the 47th …, 2024 - dl.acm.org
Unbiased learning-to-rank (ULTR) is a well-established framework for learning from user
clicks, which are often biased by the ranker collecting the data. While theoretically justified …

When Search Engine Services meet Large Language Models: Visions and Challenges

H Xiong, J Bian, Y Li, X Li, M Du… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
Combining Large Language Models (LLMs) with search engine services marks a significant
shift in the field of services computing, opening up new possibilities to enhance how we …

THUIR@COLIEE 2023: more parameters and legal knowledge for legal case entailment

H Li, C Wang, W Su, Y Wu, Q Ai, Y Liu - arXiv preprint arXiv:2305.06817, 2023 - arxiv.org
This paper describes the approach of the THUIR team at the COLIEE 2023 Legal Case
Entailment task. This task requires the participant to identify a specific paragraph from a …

Mitigating Entity-Level Hallucination in Large Language Models

W Su, Y Tang, Q Ai, C Wang, Z Wu, Y Liu - arXiv preprint arXiv:2407.09417, 2024 - arxiv.org
The emergence of Large Language Models (LLMs) has revolutionized how users access
information, shifting from traditional search engines to direct question-and-answer …