LLM-Pruner: On the Structural Pruning of Large Language Models
Large language models (LLMs) have shown remarkable capabilities in language
understanding and generation. However, such impressive capability typically comes with a …
Fake It Till You Make It: Learning Transferable Representations from Synthetic ImageNet Clones
Recent image generation models such as Stable Diffusion have exhibited an impressive
ability to generate fairly realistic images starting from a simple text prompt. Could such …
Towards Trustworthy AI: A Review of Ethical and Robust Large Language Models
MM Ferdaus, M Abdelguerfi, E Ioup, KN Niles… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid progress in Large Language Models (LLMs) could transform many fields, but their
fast development creates significant challenges for oversight, ethical creation, and building …
IAG: Induction-Augmented Generation Framework for Answering Reasoning Questions
Retrieval-Augmented Generation (RAG), by incorporating external knowledge with
parametric memory of language models, has become the state-of-the-art architecture for …
PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning
Recent advancements in large language models (LLMs) have raised concerns about
inference costs, increasing the need for research into model compression. While knowledge …
Data-Free Distillation of Language Model by Text-to-Text Transfer
Data-Free Knowledge Distillation (DFKD) plays a vital role in compressing the model when
original training data is unavailable. Previous works for DFKD in NLP mainly focus on …
Self-Regulated Data-Free Knowledge Amalgamation for Text Classification
P Vijayaraghavan, H Wang, L Shi, T Baldwin… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, there has been a growing availability of pre-trained text models on various model
repositories. These models greatly reduce the cost of training new models from scratch as …