Efficient acceleration of deep learning inference on resource-constrained edge devices: A review

MMH Shuvo, SK Islam, J Cheng… - Proceedings of the …, 2022 - ieeexplore.ieee.org
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …

Few-shot object detection on aerial imagery via deep metric learning and knowledge inheritance

W Li, J Zhou, X Li, Y Cao, G Jin - … Journal of Applied Earth Observation and …, 2023 - Elsevier
Object detection is crucial in aerial imagery analysis. Previous methods based on
convolutional neural networks (CNNs) require large-scale labeled datasets for training to …

Knowledge distillation with the reused teacher classifier

D Chen, JP Mei, H Zhang, C Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Knowledge distillation aims to compress a powerful yet cumbersome teacher model
into a lightweight student model without much sacrifice of performance. For this purpose …
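For context, the vanilla logit-based objective that this line of distillation work builds on can be sketched in a few lines of PyTorch. This is background rather than the paper's reused-classifier method; the temperature T and weight alpha are illustrative values.

import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soften both distributions with temperature T; the T*T factor keeps
    # the soft-target gradients on the same scale as the hard-label term.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard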

Tokens-to-token ViT: Training vision transformers from scratch on ImageNet

L Yuan, Y Chen, T Wang, W Yu, Y Shi… - Proceedings of the …, 2021 - openaccess.thecvf.com
Transformers, which are popular for language modeling, have been explored for solving
vision tasks recently, e.g., the Vision Transformer (ViT) for image classification. The ViT model …

VOLO: Vision outlooker for visual recognition

L Yuan, Q Hou, Z Jiang, J Feng… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Recently, Vision Transformers (ViTs) have been broadly explored in visual recognition.
Because they are inefficient at encoding fine-level features, the performance of ViTs is still inferior to the …

L2G: A simple local-to-global knowledge transfer framework for weakly supervised semantic segmentation

PT Jiang, Y Yang, Q Hou, Y Wei - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Mining precise class-aware attention maps, a.k.a. class activation maps, is essential for
weakly supervised semantic segmentation. In this paper, we present L2G, a simple online …
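As background for this entry, class activation maps in their original form are classifier-weighted sums of the final convolutional features. A minimal sketch, assuming a plain CNN with a (K, C) linear classifier head; L2G itself refines such maps rather than computing them this way:

import torch

def class_activation_map(feat, fc_weight, class_idx):
    # feat: (N, C, H, W) final conv features; fc_weight: (K, C) classifier weights.
    # The map for one class is the channel-weighted sum of the feature maps,
    # rectified and normalized per image.
    cam = torch.einsum("nchw,c->nhw", feat, fc_weight[class_idx])
    cam = torch.relu(cam)
    return cam / cam.amax(dim=(1, 2), keepdim=True).clamp(min=1e-6)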

Cross-layer distillation with semantic calibration

D Chen, JP Mei, Y Zhang, C Wang, Z Wang… - Proceedings of the …, 2021 - ojs.aaai.org
Recently proposed knowledge distillation approaches based on feature-map transfer
validate that intermediate layers of a teacher model can serve as effective targets for training …
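A sketch of the basic feature-map transfer objective underlying such approaches; the paper's semantic-calibration step, which softly assigns student layers to teacher layers, is not reproduced here, and the 1x1-conv adapter is an assumed component for matching channel counts.

import torch.nn as nn
import torch.nn.functional as F

class FeatureTransfer(nn.Module):
    def __init__(self, student_channels, teacher_channels):
        super().__init__()
        # 1x1 conv projects student features into the teacher's channel space.
        self.adapter = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, student_feat, teacher_feat):
        # Plain regression of adapted student features onto a fixed teacher layer.
        return F.mse_loss(self.adapter(student_feat), teacher_feat.detach())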

Channel-wise knowledge distillation for dense prediction

C Shu, Y Liu, J Gao, Z Yan… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Knowledge distillation (KD) has proven to be a simple and effective tool for training
compact dense prediction models. Lightweight student networks are trained by extra …
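A minimal sketch of the channel-wise idea named in the title, assuming teacher and student feature maps of identical (N, C, H, W) shape (an adapter would be needed otherwise) and an illustrative temperature T:

import torch.nn.functional as F

def channel_wise_kd(student_feat, teacher_feat, T=4.0):
    # Flatten each channel's activation map and treat it as a distribution
    # over spatial locations, then match student to teacher channel by channel.
    n, c, h, w = student_feat.shape
    s = student_feat.reshape(n * c, h * w)
    t = teacher_feat.reshape(n * c, h * w)
    return F.kl_div(
        F.log_softmax(s / T, dim=1),
        F.softmax(t / T, dim=1),
        reduction="batchmean",
    ) * (T * T)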

All tokens matter: Token labeling for training better vision transformers

ZH Jiang, Q Hou, L Yuan, D Zhou… - Advances in neural …, 2021 - proceedings.neurips.cc
In this paper, we present token labeling, a new objective for training high-performance
vision transformers (ViTs). Different from the standard training objective of ViTs …
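A minimal sketch of a token-level training objective in this spirit, assuming per-patch soft targets have already been produced by some dense annotator; the weight beta and the target source are illustrative assumptions, not the paper's exact recipe.

import torch.nn.functional as F

def token_labeling_loss(cls_logits, token_logits, image_label, token_targets, beta=0.5):
    # cls_logits: (N, K) from the class token; token_logits: (N, P, K) per patch.
    # token_targets: (N, P, K) soft per-patch labels, assumed precomputed.
    cls_loss = F.cross_entropy(cls_logits, image_label)
    token_loss = -(token_targets * F.log_softmax(token_logits, dim=-1)).sum(-1).mean()
    return cls_loss + beta * token_loss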

General instance distillation for object detection

X Dai, Z Jiang, Z Wu, Y Bao, Z Wang… - Proceedings of the …, 2021 - openaccess.thecvf.com
In recent years, knowledge distillation has proven to be an effective solution for model
compression. This approach can make lightweight student models acquire the knowledge …