A survey on the memory mechanism of large language model based agents

Z Zhang, X Bo, C Ma, R Li, X Chen, Q Dai, J Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language model (LLM) based agents have recently attracted much attention from the
research and industry communities. Compared with original LLMs, LLM-based agents are …

Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum

H Pouransari, CL Li, JHR Chang, PKA Vasu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are commonly trained on datasets consisting of fixed-length
token sequences. These datasets are created by randomly concatenating documents of …

LoGra-Med: Long context multi-graph alignment for medical vision-language model

DMH Nguyen, NT Diep, TQ Nguyen, HB Le… - arXiv preprint arXiv …, 2024 - arxiv.org
State-of-the-art medical multi-modal large language models (med-MLLM), like LLaVA-Med
or BioMedGPT, leverage instruction-following data in pre-training. However, those models …

TULIP: Token-Length Upgraded CLIP

I Najdenkoska, MM Derakhshani, YM Asano… - arXiv preprint arXiv …, 2024 - arxiv.org
We address the challenge of representing long captions in vision-language models, such as
CLIP. By design these models are limited by fixed, absolute positional encodings, restricting …

CNNSum: Exploring Long-Context Summarization with Large Language Models in Chinese Novels

L Wei, H Yan, X Lu, J Zhu, J Wang, W Zhang - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have been well-researched in many long-context tasks.
However, due to high annotation costs, high-quality long-context summary datasets for …

Language Models can Self-Lengthen to Generate Long Texts

S Quan, T Tang, B Yu, A Yang, D Liu, B Gao… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in Large Language Models (LLMs) have significantly enhanced their
ability to process long contexts, yet a notable gap remains in generating long, aligned …

Efficiently Exploring Large Language Models for Document-Level Machine Translation with In-context Learning

M Cui, J Du, S Zhu, D Xiong - arXiv preprint arXiv:2406.07081, 2024 - arxiv.org
Large language models (LLMs) exhibit outstanding performance in machine translation via
in-context learning. In contrast to sentence-level translation, document-level translation …

A Study on Context Length and Efficient Transformers for Biomedical Image Analysis

SM Hooper, H Xue - arXiv preprint arXiv:2501.00619, 2024 - arxiv.org
Biomedical imaging modalities often produce high-resolution, multi-dimensional images that
pose computational challenges for deep neural networks. These computational challenges …

Advancing Bug Detection in Fastjson2 with Large Language Models Driven Unit Test Generation

Z Zhong, S Wang, H Wang, S Wen, H Guan… - arXiv preprint arXiv …, 2024 - arxiv.org
Data-serialization libraries are essential tools in software development, responsible for
converting between programmable data structures and data persistence formats. Among …

The CAP Principle for LLM Serving

P Zeng, Z Ning, J Zhao, W Cui, M Xu, L Guo… - arXiv preprint arXiv …, 2024 - arxiv.org
We survey the large language model (LLM) serving area to understand the intricate
dynamics between cost-efficiency and accuracy, which is magnified by the growing need for …