Lost in the middle: How language models use long contexts
While recent language models have the ability to take long contexts as input, relatively little
is known about how well they use longer context. We analyze the performance of language …
UL2: Unifying language learning paradigms
Existing pre-trained models are generally geared towards a particular class of problems. To
date, there still seems to be no consensus on what the right architecture and pre-training …
Personality traits in large language models
The advent of large language models (LLMs) has revolutionized natural language
processing, enabling the generation of coherent and contextually relevant text. As LLMs …
Nugget: Neural agglomerative embeddings of text
G Qin, B Van Durme - International Conference on Machine …, 2023 - proceedings.mlr.press
Embedding text sequences is a widespread requirement in modern language
understanding. Existing approaches focus largely on constant-size representations. This is …
MARG: Multi-agent review generation for scientific papers
We study the ability of LLMs to generate feedback for scientific papers and develop MARG, a
feedback generation approach using multiple LLM instances that engage in internal …
Selective Perception: Learning Concise State Descriptions for Language Model Actors
The latest large language models (LMs) support increasingly longer contexts. While this
trend permits using substantial amounts of text with SOTA LMs, requiring these large LMs to …
Length-Aware Multi-Kernel Transformer for Long Document Classification
Lengthy documents pose a unique challenge to neural language models due to substantial
memory consumption. While existing state-of-the-art (SOTA) models segment long texts into …
Multilingual needle in a haystack: Investigating long-context behavior of multilingual large language models
While recent large language models (LLMs) demonstrate remarkable abilities in responding
to queries in diverse languages, their ability to handle long multilingual contexts is …
CLERC: A Dataset for Legal Case Retrieval and Retrieval-Augmented Analysis Generation
Legal professionals need to write analyses that rely on citations to relevant precedents, i.e.,
previous case decisions. Intelligent systems assisting legal professionals in writing such …
Attention instruction: Amplifying attention in the middle via prompting
The context window of large language models has been extended to 128k tokens or more.
However, language models still suffer from position bias and have difficulty accessing and …