LLM-Pruner: On the structural pruning of large language models

X Ma, G Fang, X Wang - Advances in neural information …, 2023 - proceedings.neurips.cc
Large language models (LLMs) have shown remarkable capabilities in language
understanding and generation. However, such impressive capability typically comes with a …

Fake it till you make it: Learning transferable representations from synthetic ImageNet clones

MB Sarıyıldız, K Alahari, D Larlus… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent image generation models such as Stable Diffusion have exhibited an impressive
ability to generate fairly realistic images starting from a simple text prompt. Could such …

Towards Trustworthy AI: A Review of Ethical and Robust Large Language Models

MM Ferdaus, M Abdelguerfi, E Ioup, KN Niles… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid progress in Large Language Models (LLMs) could transform many fields, but their
fast development creates significant challenges for oversight, ethical creation, and building …

IAG: Induction-augmented generation framework for answering reasoning questions

Z Zhang, X Zhang, Y Ren, S Shi, M Han… - Proceedings of the …, 2023 - aclanthology.org
Retrieval-Augmented Generation (RAG), by incorporating external knowledge with
parametric memory of language models, has become the state-of-the-art architecture for …

PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning

G Kim, D Jang, E Yang - arXiv preprint arXiv:2402.12842, 2024 - arxiv.org
Recent advancements in large language models (LLMs) have raised concerns about
inference costs, increasing the need for research into model compression. While knowledge …

Data-Free Distillation of Language Model by Text-to-Text Transfer

Z Bai, X Liu, H Hu, T Guo, Q Zhang, Y Wang - arXiv preprint arXiv …, 2023 - arxiv.org
Data-Free Knowledge Distillation (DFKD) plays a vital role in compressing the model when
original training data is unavailable. Previous works for DFKD in NLP mainly focus on …

Self-Regulated Data-Free Knowledge Amalgamation for Text Classification

P Vijayaraghavan, H Wang, L Shi, T Baldwin… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, there has been a growing availability of pre-trained text models on various model
repositories. These models greatly reduce the cost of training new models from scratch as …