DASO: Distribution-aware semantics-oriented pseudo-label for imbalanced semi-supervised learning

Y Oh, DJ Kim, IS Kweon - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com
The capability of the traditional semi-supervised learning (SSL) methods is far from real-
world application due to severely biased pseudo-labels caused by (1) class imbalance and …

MCDAL: Maximum classifier discrepancy for active learning

JW Cho, DJ Kim, Y Jung… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Recent state-of-the-art active learning methods have mostly leveraged generative
adversarial networks (GANs) for sample acquisition; however, GAN is usually known to …

Generative bias for robust visual question answering

JW Cho, DJ Kim, H Ryu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
The task of Visual Question Answering (VQA) is known to be plagued by the issue
of VQA models exploiting biases within the dataset to make their final predictions. Various …

Visible-infrared person re-identification using privileged intermediate information

M Alehdaghi, A Josi, RMO Cruz, E Granger - European Conference on …, 2022 - Springer
Visible-infrared person re-identification (ReID) aims to recognize the same person of interest
across a network of RGB and IR cameras. Some deep learning (DL) models have directly …

CLIP-TD: CLIP targeted distillation for vision-language tasks

Z Wang, N Codella, YC Chen, L Zhou, J Yang… - arXiv preprint arXiv …, 2022 - arxiv.org
Contrastive language-image pretraining (CLIP) links vision and language modalities into a
unified embedding space, yielding tremendous potential for vision-language (VL) tasks …

Dense relational image captioning via multi-task triple-stream networks

DJ Kim, TH Oh, J Choi, IS Kweon - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
We introduce dense relational captioning, a novel image captioning task which aims to
generate multiple captions with respect to relational information between objects in a visual …

Enhancing modality-agnostic representations via meta-learning for brain tumor segmentation

A Konwer, X Hu, J Bae, X Xu… - Proceedings of the …, 2023 - openaccess.thecvf.com
In medical vision, different imaging modalities provide complementary information. However,
in practice, not all modalities may be available during inference or even training. Previous …

ACP++: Action co-occurrence priors for human-object interaction detection

DJ Kim, X Sun, J Choi, S Lin… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
A common problem in the task of human-object interaction (HOI) detection is that numerous
HOI classes have only a small number of labeled examples, resulting in training sets with a …

Signing outside the studio: Benchmarking background robustness for continuous sign language recognition

Y Jang, Y Oh, JW Cho, DJ Kim, JS Chung… - arXiv preprint arXiv …, 2022 - arxiv.org
The goal of this work is background-robust continuous sign language recognition. Most
existing Continuous Sign Language Recognition (CSLR) benchmarks have fixed …

Towards robust multimodal sentiment analysis under uncertain signal missing

M Li, D Yang, L Zhang - IEEE Signal Processing Letters, 2023 - ieeexplore.ieee.org
Multimodal Sentiment Analysis (MSA) has attracted widespread research attention recently.
Most MSA studies are based on the assumption of signal completeness. However, many …