LLM-Pruner: On the Structural Pruning of Large Language Models
Large language models (LLMs) have shown remarkable capabilities in language
understanding and generation. However, such impressive capability typically comes with a …
Fake It Till You Make It: Learning Transferable Representations from Synthetic ImageNet Clones
Recent image generation models such as Stable Diffusion have exhibited an impressive
ability to generate fairly realistic images starting from a simple text prompt. Could such …
Towards Trustworthy AI: A Review of Ethical and Robust Large Language Models
MM Ferdaus, M Abdelguerfi, E Ioup, KN Niles… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid progress in Large Language Models (LLMs) could transform many fields, but their
fast development creates significant challenges for oversight, ethical creation, and building …
IAG: Induction-Augmented Generation Framework for Answering Reasoning Questions
Retrieval-Augmented Generation (RAG), by incorporating external knowledge with
parametric memory of language models, has become the state-of-the-art architecture for …
PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning
Recent advancements in large language models (LLMs) have raised concerns about
inference costs, increasing the need for research into model compression. While knowledge …
Data-Free Distillation of Language Model by Text-to-Text Transfer
Data-Free Knowledge Distillation (DFKD) plays a vital role in compressing the model when
original training data is unavailable. Previous works for DFKD in NLP mainly focus on …
Self-Regulated Data-Free Knowledge Amalgamation for Text Classification
P Vijayaraghavan, H Wang, L Shi, T Baldwin… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, there has been a growing availability of pre-trained text models on various model
repositories. These models greatly reduce the cost of training new models from scratch as …