Cllms: Consistency large language models

Y Li, F Wei, C Zhang, H Zhang - arXiv preprint arXiv:2406.16858, 2024 - arxiv.org

Inference with modern Large Language Models (LLMs) is expensive and time-consuming,
and speculative sampling has proven to be an effective solution. Most speculative sampling …

Cited by 13 · Related articles · All 3 versions

[PDF] arxiv.org

From Natural Language to SQL: Review of LLM-based Text-to-SQL Systems

A Mohammadjafari, AS Maida… - arXiv preprint arXiv …, 2024 - arxiv.org

Since the onset of LLMs, translating natural language queries to structured SQL commands
is assuming increasing. Unlike the previous reviews, this survey provides a comprehensive …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org

Accelerating auto-regressive text-to-image generation with training-free speculative jacobi decoding

Y Teng, H Shi, X Liu, X Ning, G Dai, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

The current large auto-regressive models can generate high-quality, high-resolution images,
but these models require hundreds or even thousands of steps of next-token prediction …

Cited by 5 · Related articles · All 2 versions

[PDF] arxiv.org

Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL

Z Hong, Z Yuan, Q Zhang, H Chen, J Dong… - arXiv preprint arXiv …, 2024 - arxiv.org

Generating accurate SQL according to natural language questions (text-to-SQL) is a
long-standing problem since it is challenging in user question understanding, database schema …

Cited by 27 · Related articles · All 2 versions

[PDF] arxiv.org

ZipAR: Accelerating Autoregressive Image Generation through Spatial Locality

Y He, F Chen, Y He, S He, H Zhou, K Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

In this paper, we propose ZipAR, a training-free, plug-and-play parallel decoding framework
for accelerating auto-regressive (AR) visual generation. The motivation stems from the …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org

Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling

X Luo, Y Wang, Q Zhu, Z Zhang, X Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

The rapid growth in the parameters of large language models (LLMs) has made inference
latency a fundamental bottleneck, limiting broader application of LLMs. Speculative …

Cited by 1 · Related articles · All 3 versions

[PDF] arxiv.org

Closed-loop long-horizon robotic planning via equilibrium sequence modeling

J Li, Z Sun, F Li, C Sheng, J Yu, Y Mu - arXiv preprint arXiv:2410.01440, 2024 - arxiv.org

In the endeavor to make autonomous robots take actions, task planning is a major challenge
that requires translating high-level task descriptions into long-horizon action sequences …

Cited by 1 · Related articles · All 2 versions

[PDF] acm.org