Knowledge distillation from a stronger teacher
Unlike existing knowledge distillation methods, which focus on baseline settings where the teacher models and training strategies are not as strong and competitive as state-of-the-art …
Deep transfer learning for intelligent vehicle perception: A survey
Deep learning-based intelligent vehicle perception has been developing prominently in
recent years to provide a reliable source for motion planning and decision making in …
Automated knowledge distillation via monte carlo tree search
In this paper, we present Auto-KD, the first automated search framework for optimal
knowledge distillation design. Traditional distillation techniques typically require handcrafted …
Knowledge diffusion for distillation
The representation gap between teacher and student is an emerging topic in knowledge distillation (KD). To reduce the gap and improve performance, current methods often …
Kd-zero: Evolving knowledge distiller for any teacher-student pairs
Knowledge distillation (KD) has emerged as an effective technique for compressing models and enhancing lightweight models. Conventional KD methods propose various …
Norm: Knowledge distillation via n-to-one representation matching
Existing feature distillation methods commonly adopt the One-to-one Representation
Matching between any pre-selected teacher-student layer pair. In this paper, we present N …
TransKD: Transformer knowledge distillation for efficient semantic segmentation
Large pre-trained transformers are on top of contemporary semantic segmentation benchmarks, but come with high computational cost and lengthy training. To lift this …
Directional connectivity-based segmentation of medical images
Anatomical consistency in biomarker segmentation is crucial for many medical image
analysis tasks. A promising paradigm for achieving anatomically consistent segmentation …
Mixskd: Self-knowledge distillation from mixup for image recognition
Unlike conventional Knowledge Distillation (KD), Self-KD allows a network to learn knowledge from itself without any guidance from extra networks. This paper proposes …