Knowledge distillation and student-teacher learning for visual intelligence: A review and new outlooks

L Wang, KJ Yoon - IEEE transactions on pattern analysis and …, 2021 - ieeexplore.ieee.org
Deep neural models, in recent years, have been successful in almost every field, even
solving the most complex problems. However, these models are huge in size, with …
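
As context for the surveyed student-teacher methods, here is a minimal sketch of the classic soft-target distillation objective (Hinton et al., 2015) that this line of work builds on; the temperature T and weight alpha are illustrative hyperparameters, not values from the survey:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Classic soft-target knowledge distillation: KL between temperature-
    softened teacher and student distributions, plus cross-entropy on the
    ground-truth labels. T and alpha are illustrative, not from the survey."""
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    # T**2 rescales gradients so the soft term stays comparable across temperatures
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```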

Leveraging recent advances in deep learning for audio-visual emotion recognition

L Schoneveld, A Othmani, H Abdelkawy - Pattern Recognition Letters, 2021 - Elsevier
Emotional expressions are the behaviors that communicate our emotional state or attitude to
others. They are expressed through verbal and non-verbal communication. Complex human …

AI models collapse when trained on recursively generated data

I Shumailov, Z Shumaylov, Y Zhao, N Papernot… - Nature, 2024 - nature.com
Stable Diffusion revolutionized image creation from descriptive text. GPT-2, GPT-3(.5)
and GPT-4 demonstrated high performance across a variety of language tasks …
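
The collapse phenomenon in the title can be illustrated with a toy recursion: repeatedly fit a Gaussian to samples drawn from the previous fit, and the estimated spread drifts toward zero as the tails are lost. A hypothetical NumPy sketch, not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, generations = 100, 200
data = rng.normal(loc=0.0, scale=1.0, size=n_samples)  # generation 0: real data

for gen in range(1, generations + 1):
    mu, sigma = data.mean(), data.std()      # "train" a Gaussian on current data
    data = rng.normal(mu, sigma, n_samples)  # next generation sees only samples
    if gen % 50 == 0:
        print(f"generation {gen:3d}: std = {sigma:.3f}")
# The std drifts toward 0: each refit narrows the tails, a toy version of collapse
```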

YOLOv6: A single-stage object detection framework for industrial applications

C Li, L Li, H Jiang, K Weng, Y Geng, L Li, Z Ke… - arXiv preprint arXiv …, 2022 - arxiv.org
For years, the YOLO series has been the de facto industry standard for efficient object
detection. The YOLO community has grown tremendously, enriching its use in a …

R-drop: Regularized dropout for neural networks

L Wu, J Li, Y Wang, Q Meng, T Qin… - Advances in …, 2021 - proceedings.neurips.cc
Dropout is a powerful and widely used technique for regularizing the training of deep neural
networks. Though dropout is effective and performs well, the randomness it introduces …
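
R-Drop's core idea can be sketched in a few lines: feed the same batch through the network twice so dropout samples two sub-models, then add a symmetric KL penalty between their predictive distributions. A minimal PyTorch sketch; the weight alpha is illustrative and tuned per task in the paper:

```python
import torch
import torch.nn.functional as F

def r_drop_loss(model, x, labels, alpha=1.0):
    """R-Drop training objective for a classifier. The model must be in
    train mode so the two forward passes draw different dropout masks."""
    logits1, logits2 = model(x), model(x)  # two dropout-sampled sub-models
    ce = 0.5 * (F.cross_entropy(logits1, labels) + F.cross_entropy(logits2, labels))
    p1 = F.log_softmax(logits1, dim=-1)
    p2 = F.log_softmax(logits2, dim=-1)
    # Symmetric (bidirectional) KL between the two predictive distributions
    kl = 0.5 * (F.kl_div(p1, p2, log_target=True, reduction="batchmean")
                + F.kl_div(p2, p1, log_target=True, reduction="batchmean"))
    return ce + alpha * kl
```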

Knowledge distillation with the reused teacher classifier

D Chen, JP Mei, H Zhang, C Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Knowledge distillation aims to compress a powerful yet cumbersome teacher model
into a lightweight student model without much sacrifice of performance. For this purpose …
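
One way the reuse idea can be rendered, under assumed feature dimensions (512 for the student, 2048 for the teacher, both hypothetical): the student backbone plus a small projector learns to mimic the teacher's penultimate features, and the teacher's frozen classification head does the predicting:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReusedClassifierStudent(nn.Module):
    """Sketch of reusing the teacher's classifier: the student only learns
    to match the teacher's penultimate features through a projector, while
    the teacher's frozen head is kept for prediction."""

    def __init__(self, student_backbone, teacher_classifier,
                 student_dim=512, teacher_dim=2048):
        super().__init__()
        self.backbone = student_backbone
        self.projector = nn.Linear(student_dim, teacher_dim)
        self.classifier = teacher_classifier
        for p in self.classifier.parameters():  # reuse the head, keep it frozen
            p.requires_grad = False

    def forward(self, x):
        feat = self.projector(self.backbone(x))
        return feat, self.classifier(feat)

def feature_mimic_loss(student_feat, teacher_feat):
    # The training signal is purely feature alignment with the teacher
    return F.mse_loss(student_feat, teacher_feat)
```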

Knowledge distillation: A survey

J Gou, B Yu, SJ Maybank, D Tao - International Journal of Computer Vision, 2021 - Springer
In recent years, deep neural networks have been successful in both industry and academia,
especially for computer vision tasks. The great success of deep learning is mainly due to its …

Towards understanding ensemble, knowledge distillation and self-distillation in deep learning

Z Allen-Zhu, Y Li - arXiv preprint arXiv:2012.09816, 2020 - arxiv.org
We formally study how ensembles of deep learning models can improve test accuracy, and
how the superior performance of an ensemble can be distilled into a single model using …
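
A toy rendering of the distillation setup the paper analyzes: the teacher signal is the average of the ensemble members' predictive distributions, and a single student is trained to match it. The temperature T is an illustrative knob:

```python
import torch
import torch.nn.functional as F

def ensemble_distill_loss(student_logits, member_logits_list, T=1.0):
    """Distill an ensemble into one model: the target is the mean of the
    members' (temperature-softened) predictive distributions."""
    with torch.no_grad():
        probs = torch.stack([F.softmax(l / T, dim=-1) for l in member_logits_list])
        ensemble_target = probs.mean(dim=0)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_student, ensemble_target, reduction="batchmean") * (T * T)
```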

Rethinking few-shot image classification: a good embedding is all you need?

Y Tian, Y Wang, D Krishnan, JB Tenenbaum… - Computer Vision–ECCV …, 2020 - Springer
The focus of recent meta-learning research has been on the development of learning
algorithms that can quickly adapt to test time tasks with limited data and low computational …
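
The baseline recipe the paper argues for fits in a few lines: freeze an embedding pretrained on the base classes, then fit a simple linear classifier on each task's support set. A sketch where `embed` is a stand-in for any pretrained feature extractor, with the identity map as a toy embedding:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def few_shot_classify(embed, support_x, support_y, query_x):
    """Embed the support and query sets with a frozen feature extractor,
    then fit a linear classifier on the few labeled support examples."""
    z_support = np.stack([embed(x) for x in support_x])
    z_query = np.stack([embed(x) for x in query_x])
    clf = LogisticRegression(max_iter=1000).fit(z_support, support_y)
    return clf.predict(z_query)

# Toy usage just to show the shapes of a 5-way 5-shot episode
rng = np.random.default_rng(0)
embed = lambda x: x                    # identity embedding over 64-d toy features
support_x = rng.normal(size=(25, 64))  # 5 classes x 5 shots
support_y = np.repeat(np.arange(5), 5)
query_x = rng.normal(size=(15, 64))
print(few_shot_classify(embed, support_x, support_y, query_x))
```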

Theoretical analysis of self-training with deep networks on unlabeled data

C Wei, K Shen, Y Chen, T Ma - arXiv preprint arXiv:2010.03622, 2020 - arxiv.org
Self-training algorithms, which train a model to fit pseudolabels predicted by another
previously-learned model, have been very successful for learning with unlabeled data using …
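
The self-training loop being analyzed can be sketched directly: the previously learned model assigns pseudolabels to unlabeled inputs, and only confident predictions are kept for training the next model. The 0.95 confidence threshold is illustrative:

```python
import torch
import torch.nn.functional as F

def pseudolabel_batch(teacher, unlabeled_x, threshold=0.95):
    """Data side of one self-training round: the previously learned model
    predicts labels for unlabeled inputs, and only predictions above the
    confidence threshold are kept as pseudolabels."""
    teacher.eval()
    with torch.no_grad():
        probs = F.softmax(teacher(unlabeled_x), dim=-1)
        conf, pseudo_y = probs.max(dim=-1)
    keep = conf >= threshold
    return unlabeled_x[keep], pseudo_y[keep]

# The student is then trained with cross-entropy on the kept pairs (alongside
# any labeled data) and can replace the teacher for the next round.
```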