Knowledge-augmented reasoning distillation for small language models in knowledge-intensive tasks

M Kang, S Lee, J Baek… - Advances in Neural …, 2024 - proceedings.neurips.cc
Large Language Models (LLMs) have shown promising performance in knowledge-
intensive reasoning tasks that require a compound understanding of knowledge. However …

Retrieval-Augmented Generation for Natural Language Processing: A Survey

S Wu, Y Xiong, Y Cui, H Wu, C Chen, Y Yuan… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have demonstrated great success in various fields,
benefiting from their huge amount of parameters that store knowledge. However, LLMs still …

Model compression and efficient inference for large language models: A survey

W Wang, W Chen, Y Luo, Y Long, Z Lin… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformer-based large language models have achieved tremendous success. However,
the significant memory and computational costs incurred during the inference process make …

MLLM-FL: Multimodal Large Language Model Assisted Federated Learning on Heterogeneous and Long-tailed Data

J Zhang, HF Yang, A Li, X Guo, P Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Previous studies on federated learning (FL) often encounter performance degradation due
to data heterogeneity among different clients. In light of the recent advances in multimodal …

Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application

C Yang, W Lu, Y Zhu, Y Wang, Q Chen, C Gao… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have showcased exceptional capabilities in various
domains, attracting significant interest from both academia and industry. Despite their …

Teaching Small Language Models to Reason for Knowledge-Intensive Multi-Hop Question Answering

X Li, S He, F Lei, JY JunYang, T Su… - Findings of the …, 2024 - aclanthology.org
Large Language Models (LLMs) can teach small language models (SLMs) to solve
complex reasoning tasks (e.g., mathematical question answering) by Chain-of-thought …

Knowledge boosting during low-latency inference

V Srinivas, M Itani, T Chen, ES Eskimez… - arXiv preprint arXiv …, 2024 - arxiv.org
Models for low-latency, streaming applications could benefit from the knowledge capacity of
larger models, but edge devices cannot run these models due to resource constraints. A …

Event Temporal Relation Extraction based on Retrieval-Augmented on LLMs

X Zhang, L Zang, Q Liu, S Wei, S Hu - arXiv preprint arXiv:2403.15273, 2024 - arxiv.org
Event temporal relation (TempRel) is a primary subject of the event relation extraction task.
However, the inherent ambiguity of TempRel increases the difficulty of the task. With the rise …

SolMover: Smart Contract Code Translation Based on Concepts

R Karanjai, L Xu, W Shi - Proceedings of the 1st ACM International …, 2024 - dl.acm.org
Large language models (LLMs) have showcased remarkable skills, rivaling or even
exceeding human intelligence in certain areas. Their proficiency in translation is notable, as …