Review of large vision models and visual prompt engineering
Visual prompt engineering is a fundamental methodology in the field of visual and image
artificial general intelligence. As the development of large vision models progresses, the …
SCConv: Spatial and channel reconstruction convolution for feature redundancy
J Li, Y Wen, L He - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Convolutional Neural Networks (CNNs) have achieved remarkable performance in
various computer vision tasks, but this comes at the cost of tremendous computational …
Effective whole-body pose estimation with two-stages distillation
Whole-body pose estimation localizes the human body, hand, face, and foot keypoints in an
image. This task is challenging due to multi-scale body parts, fine-grained localization for …
EfficientSAM: Leveraged masked image pretraining for efficient segment anything
Segment Anything Model (SAM) has emerged as a powerful tool for numerous
vision applications. A key component that drives the impressive performance for zero-shot …
Decoupled multimodal distilling for emotion recognition
Human multimodal emotion recognition (MER) aims to perceive human emotions via
language, visual and acoustic modalities. Despite the impressive performance of previous …
A survey on model compression for large language models
Large Language Models (LLMs) have revolutionized natural language processing tasks with
remarkable success. However, their formidable size and computational demands present …
Multi-level logit distillation
Knowledge Distillation (KD) aims at distilling the knowledge from the large teacher
model to a lightweight student model. Mainstream KD methods can be divided into two …
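For context, a minimal sketch of the classic logit-based KD objective that this line of work builds on, written in PyTorch. The temperature T, weight alpha, and function name are illustrative choices; this is the vanilla Hinton-style loss, not the paper's multi-level mechanism:

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft term: KL divergence between temperature-softened distributions;
    # the T*T factor keeps gradient magnitudes comparable to the hard term.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard term: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```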
Curriculum temperature for knowledge distillation
Most existing distillation methods ignore the flexible role of the temperature in the loss
function and fix it as a hyper-parameter that can be decided by an inefficient grid search. In …
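To make the abstract's point concrete, here is a naive alternative to grid-searching a fixed T: scheduling it over training. The linear warm-up and its endpoints are assumptions for illustration only; the paper itself learns the temperature with an adversarial curriculum rather than a hand-set schedule:

```python
def scheduled_temperature(epoch, total_epochs, t_start=1.0, t_end=4.0):
    # Linearly anneal the distillation temperature instead of fixing it.
    frac = min(epoch / max(total_epochs - 1, 1), 1.0)
    return t_start + frac * (t_end - t_start)

# Plugged into the kd_loss sketch above:
#   T = scheduled_temperature(epoch, num_epochs)
#   loss = kd_loss(student_logits, teacher_logits, labels, T=T)
```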
From knowledge distillation to self-knowledge distillation: A unified approach with normalized loss and customized soft labels
Knowledge Distillation (KD) uses the teacher's prediction logits as soft labels to
guide the student, while self-KD does not need a real teacher to acquire the soft labels. This …
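As a rough illustration of teacher-free distillation, the sketch below uses an exponential-moving-average (EMA) snapshot of the student as its own teacher, a common self-KD pattern. The decay value and helper names are hypothetical, and this is not the paper's normalized loss or customized soft labels:

```python
import copy
import torch
import torch.nn.functional as F

def make_ema_teacher(model):
    # Frozen deep copy of the student; it supplies the soft labels.
    teacher = copy.deepcopy(model)
    for p in teacher.parameters():
        p.requires_grad_(False)
    return teacher

@torch.no_grad()
def update_ema(teacher, student, decay=0.999):
    # Slowly track the student's weights after each optimizer step.
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(decay).add_(ps, alpha=1.0 - decay)

def self_kd_loss(student_logits, ema_logits, labels, T=4.0, alpha=0.5):
    # Same loss shape as teacher-based KD, but the soft labels come from
    # the student's own EMA snapshot instead of a separate teacher model.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(ema_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```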