Knowledge distillation and student-teacher learning for visual intelligence: A review and new outlooks
L Wang, KJ Yoon - IEEE transactions on pattern analysis and …, 2021 - ieeexplore.ieee.org
Deep neural models, in recent years, have been successful in almost every field, even
solving the most complex problem statements. However, these models are huge in size with …
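For orientation, here is a minimal sketch of the classic logit-matching distillation objective that surveys like this one cover: the student is trained against the teacher's temperature-softened predictions alongside the usual hard labels. The temperature and weighting below are illustrative choices, not values from the paper.

```python
# Minimal sketch of logit-based knowledge distillation
# (Hinton-style softened cross-entropy); hyperparameters are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Blend a soft loss against teacher probabilities with a hard CE loss."""
    # Soften both distributions with temperature T; scale by T^2 so the
    # soft-loss gradients stay comparable in magnitude to the hard loss.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```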
Leveraging recent advances in deep learning for audio-visual emotion recognition
L Schoneveld, A Othmani, H Abdelkawy - Pattern Recognition Letters, 2021 - Elsevier
Emotional expressions are the behaviors that communicate our emotional state or attitude to
others. They are expressed through verbal and non-verbal communication. Complex human …
AI models collapse when trained on recursively generated data
Stable diffusion revolutionized image creation from descriptive text. GPT-2 (ref.), GPT-3(.5) (ref.) and GPT-4 (ref.) demonstrated high performance across a variety of language tasks …
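The collapse phenomenon is easy to reproduce in miniature. Below is a toy analogue (not the paper's experimental setup): each "generation" fits a Gaussian to samples drawn from the previous generation's fit, and the estimated variance drifts and shrinks, mirroring how recursively trained models lose the tails of the original distribution.

```python
# Toy analogue of recursive training on generated data: generation n is
# fit only to samples produced by generation n-1. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=1000)  # generation 0: "real" data
mu, sigma = data.mean(), data.std()

for gen in range(1, 11):
    data = rng.normal(mu, sigma, size=1000)  # train only on generated data
    mu, sigma = data.mean(), data.std()      # refit the "model"
    print(f"gen {gen}: mu={mu:+.3f}, sigma={sigma:.3f}")  # sigma drifts downward
```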
YOLOv6: A single-stage object detection framework for industrial applications
For years, the YOLO series has been the de facto industry-level standard for efficient object
detection. The YOLO community has prospered overwhelmingly to enrich its use in a …
R-drop: Regularized dropout for neural networks
Dropout is a powerful and widely used technique to regularize the training of deep neural
networks. Though effective and performing well, the randomness introduced by dropout …
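The idea named in the title is simple to sketch: pass the same batch through the network twice so dropout yields two different sub-models, then penalize the divergence between their predictions. A minimal version, assuming a classification `model` with dropout active and an illustrative weight `alpha`:

```python
# Sketch of the R-Drop objective: two stochastic forward passes plus a
# symmetric KL consistency term. `model` and `alpha` are placeholders.
import torch
import torch.nn.functional as F

def r_drop_loss(model, x, labels, alpha=1.0):
    logits1 = model(x)  # dropout is active, so the two passes differ
    logits2 = model(x)
    ce = 0.5 * (F.cross_entropy(logits1, labels) + F.cross_entropy(logits2, labels))
    p1 = F.log_softmax(logits1, dim=-1)
    p2 = F.log_softmax(logits2, dim=-1)
    # Symmetric KL between the two sub-model predictive distributions.
    kl = 0.5 * (
        F.kl_div(p1, p2, log_target=True, reduction="batchmean")
        + F.kl_div(p2, p1, log_target=True, reduction="batchmean")
    )
    return ce + alpha * kl
```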
Knowledge distillation with the reused teacher classifier
Knowledge distillation aims to compress a powerful yet cumbersome teacher model
into a lightweight student model without much sacrifice of performance. For this purpose …
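A hedged sketch of the approach the abstract describes: instead of matching logits, align student features to teacher features through a small projector and reuse the teacher's frozen classifier head unchanged. The projector architecture and dimensions below are illustrative, not the paper's exact design.

```python
# Sketch of reusing the teacher's classifier: a projector maps student
# features into teacher feature space; the teacher head is never updated.
import torch
import torch.nn as nn

student_dim, teacher_dim, num_classes = 512, 2048, 1000  # illustrative sizes

projector = nn.Sequential(               # trained alongside the student
    nn.Linear(student_dim, teacher_dim),
    nn.ReLU(),
    nn.Linear(teacher_dim, teacher_dim),
)
teacher_classifier = nn.Linear(teacher_dim, num_classes)
teacher_classifier.requires_grad_(False)  # reused as-is, kept frozen

def reused_classifier_step(student_feat, teacher_feat):
    proj = projector(student_feat)
    # Only feature alignment is trained; the frozen teacher head then
    # produces the student's class predictions.
    align_loss = nn.functional.mse_loss(proj, teacher_feat)
    logits = teacher_classifier(proj)
    return align_loss, logits
```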
Knowledge distillation: A survey
In recent years, deep neural networks have been successful in both industry and academia,
especially for computer vision tasks. The great success of deep learning is mainly due to its …
Towards understanding ensemble, knowledge distillation and self-distillation in deep learning
Z Allen-Zhu, Y Li - arXiv preprint arXiv:2012.09816, 2020 - arxiv.org
We formally study how ensemble of deep learning models can improve test accuracy, and
how the superior performance of ensemble can be distilled into a single model using …
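The setting analyzed here can be summarized in a few lines: the ensemble's averaged, temperature-softened predictions serve as the distillation target for a single student. A minimal sketch with an illustrative temperature:

```python
# Sketch of distilling an ensemble into one model: the target distribution
# is the mean of the members' tempered probabilities. T is illustrative.
import torch
import torch.nn.functional as F

def ensemble_distill_loss(student_logits, ensemble_logits_list, T=2.0):
    probs = [F.softmax(l / T, dim=-1) for l in ensemble_logits_list]
    target = torch.stack(probs).mean(dim=0)  # average over ensemble members
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        target,
        reduction="batchmean",
    ) * (T * T)
```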
Rethinking few-shot image classification: a good embedding is all you need?
The focus of recent meta-learning research has been on the development of learning
algorithms that can quickly adapt to test time tasks with limited data and low computational …
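The recipe the title alludes to: freeze a pretrained embedding and fit a simple linear classifier on each task's small labeled support set. A sketch, where `embed` is a placeholder for any frozen feature extractor returning NumPy arrays:

```python
# Sketch of few-shot classification via a fixed embedding plus a linear head.
# `embed` is a hypothetical frozen backbone, not an API from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression

def solve_few_shot_task(embed, support_x, support_y, query_x):
    z_support = embed(support_x)   # (n_support, d) frozen features
    z_query = embed(query_x)       # (n_query, d) frozen features
    # Fit a logistic-regression head on the handful of support examples.
    clf = LogisticRegression(max_iter=1000).fit(z_support, support_y)
    return clf.predict(z_query)
```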
Theoretical analysis of self-training with deep networks on unlabeled data
Self-training algorithms, which train a model to fit pseudolabels predicted by another
previously-learned model, have been very successful for learning with unlabeled data using …
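The loop under analysis: a previously learned model assigns pseudolabels to unlabeled data, typically filtered by confidence, and a new model is then fit to them. A minimal sketch; the confidence threshold and the `train` routine are illustrative placeholders.

```python
# Sketch of one self-training round: pseudolabel unlabeled data with a
# previously trained model, keep confident predictions, fit the new model.
import torch
import torch.nn.functional as F

@torch.no_grad()
def pseudolabel(teacher, unlabeled_x, threshold=0.9):
    probs = F.softmax(teacher(unlabeled_x), dim=-1)
    conf, labels = probs.max(dim=-1)
    keep = conf >= threshold          # retain only confident pseudolabels
    return unlabeled_x[keep], labels[keep]

# One round, given hypothetical `teacher`, `student`, and `train` helpers:
# x_pl, y_pl = pseudolabel(teacher, unlabeled_x)
# student = train(student, x_pl, y_pl)
```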