LLM-Pruner: On the structural pruning of large language models

X Ma, G Fang, X Wang - Advances in neural information …, 2023 - proceedings.neurips.cc
Large language models (LLMs) have shown remarkable capabilities in language
understanding and generation. However, such impressive capability typically comes with a …

DepGraph: Towards any structural pruning

G Fang, X Ma, M Song, MB Mi… - Proceedings of the …, 2023 - openaccess.thecvf.com
Structural pruning enables model acceleration by removing structurally-grouped parameters
from neural networks. However, the parameter-grouping patterns vary widely across …

Patch diffusion: Faster and more data-efficient training of diffusion models

Z Wang, Y Jiang, H Zheng, P Wang… - Advances in neural …, 2024 - proceedings.neurips.cc
Diffusion models are powerful, but they require a lot of time and data to train. We propose
Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training …

Diffusion probabilistic model made slim

X Yang, D Zhou, J Feng… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Despite the visually-pleasing results achieved, the massive computational cost has been a
long-standing flaw for diffusion probabilistic models (DPMs), which, in turn, greatly limits …

Diffusion model as representation learner

X Yang, X Wang - … of the IEEE/CVF International Conference …, 2023 - openaccess.thecvf.com
Diffusion Probabilistic Models (DPMs) have recently demonstrated impressive
results on various generative tasks. Despite their promise, the learned representations of pre …

GraphAdapter: Tuning vision-language models with dual knowledge graph

X Li, D Lian, Z Lu, J Bai, Z Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Adapter-style efficient transfer learning (ETL) has shown excellent performance in the tuning
of vision-language models (VLMs) under the low-data regime, where only a few additional …

SG-Former: Self-guided transformer with evolving token reallocation

S Ren, X Yang, S Liu, X Wang - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Vision Transformer has demonstrated impressive success across various vision tasks.
However, its heavy computation cost, which grows quadratically with respect to the token …

Task residual for tuning vision-language models

T Yu, Z Lu, X Jin, Z Chen… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Large-scale vision-language models (VLMs) pre-trained on billion-level data have learned
general visual representations and broad visual concepts. In principle, the well-learned …

Towards lossless dataset distillation via difficulty-aligned trajectory matching

Z Guo, K Wang, G Cazenavette, H Li, K Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
The ultimate goal of Dataset Distillation is to synthesize a small synthetic dataset such that a
model trained on this synthetic set will perform as well as a model trained on the full …

Zero-shot video grounding with pseudo query lookup and verification

Y Lu, R Quan, L Zhu, Y Yang - IEEE Transactions on Image …, 2024 - ieeexplore.ieee.org
Video grounding, the process of identifying a specific moment in an untrimmed video based
on a natural language query, has become a popular topic in video understanding. However …