LLM-Pruner: On the structural pruning of large language models
Large language models (LLMs) have shown remarkable capabilities in language
understanding and generation. However, such impressive capability typically comes with a …
Up to 100x faster data-free knowledge distillation
Data-free knowledge distillation (DFKD) has recently been attracting increasing attention
from research communities, attributed to its capability to compress a model only using …
Contrastive model inversion for data-free knowledge distillation
Model inversion, whose goal is to recover training data from a pre-trained model, has been
recently proved feasible. However, existing inversion methods usually suffer from the mode …
Robust and resource-efficient data-free knowledge distillation by generative pseudo replay
Data-Free Knowledge Distillation (KD) allows knowledge transfer from a trained
neural network (teacher) to a more compact one (student) in the absence of original training …
Data-free knowledge transfer: A survey
In the last decade, many deep learning models have been well trained and made a great
success in various fields of machine intelligence, especially for computer vision and natural …
When gradient descent meets derivative-free optimization: A match made in black-box scenario
Large pre-trained language models (PLMs) have garnered significant attention for their
versatility and potential for solving a wide spectrum of natural language processing (NLP) …
Prompting to distill: Boosting data-free knowledge distillation via reinforced prompt
Data-free knowledge distillation (DFKD) conducts knowledge distillation via eliminating the
dependence of original training data, and has recently achieved impressive results in …
Data-Free Distillation of Language Model by Text-to-Text Transfer
Data-Free Knowledge Distillation (DFKD) plays a vital role in compressing the model when
original training data is unavailable. Previous works for DFKD in NLP mainly focus on …
Feature-rich audio model inversion for data-free knowledge distillation towards general sound classification
Z Kang, Y He, J Wang, J Peng, X Qu… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Data-Free Knowledge Distillation (DFKD) has recently attracted growing attention in the
academic community, especially with major breakthroughs in computer vision. Despite …
Narrowing the language gap: domain adaptation guided cross-lingual passage re-ranking
D Chen, X Zhang, S Zhang - Neural Computing and Applications, 2023 - Springer
For a given query, the objective of Cross-lingual Passage Re-ranking (XPR) is to rank a list
of candidate passages in multiple languages, where only a portion of the passages are in …