AutoAD II: The sequel - who, when, and what in movie audio description

T Han, M Bain, A Nagrani, G Varol… - Proceedings of the …, 2023 - openaccess.thecvf.com
Audio Description (AD) is the task of generating descriptions of visual content, at suitable
time intervals, for the benefit of visually impaired audiences. For movies, this presents …

DeepDPM: Deep clustering with an unknown number of clusters

M Ronen, SE Finder, O Freifeld - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Deep Learning (DL) has shown great promise in the unsupervised task of clustering. That
said, while in classical (i.e., non-deep) clustering the benefits of the nonparametric approach …

Deep open intent classification with adaptive decision boundary

H Zhang, H Xu, TE Lin - Proceedings of the AAAI Conference on …, 2021 - ojs.aaai.org
Open intent classification is a challenging task in dialogue systems. On the one hand, it
should ensure the quality of known intent identification. On the other hand, it needs to detect …

The one where they reconstructed 3D humans and environments in TV shows

G Pavlakos, E Weber, M Tancik… - European Conference on …, 2022 - Springer
TV shows depict a wide variety of human behaviors and have been studied extensively for
their potential to be a rich source of data for many applications. However, the majority of the …

A theoretical analysis of the number of shots in few-shot learning

T Cao, M Law, S Fidler - arXiv preprint arXiv:1909.11722, 2019 - arxiv.org
Few-shot classification is the task of predicting the category of an example from a set of few
labeled examples. The number of labeled examples per category is called the number of …

Automatic classification of photos by tourist attractions using deep learning model and image feature vector clustering

J Kim, Y Kang - ISPRS International Journal of Geo-Information, 2022 - mdpi.com
With the rise of social media platforms, tourists tend to share their experiences in the form of
texts, photos, and videos on social media. These user-generated contents (UGC) play an …

Learning interactions and relationships between movie characters

A Kukleva, M Tapaswi, I Laptev - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com
Interactions between people are often governed by their relationships. On the flip side,
social relationships are built upon several interactions. Two strangers are more likely to …

AVA-AVD: Audio-visual speaker diarization in the wild

EZ Xu, Z Song, S Tsutsui, C Feng, M Ye… - Proceedings of the 30th …, 2022 - dl.acm.org
Audio-visual speaker diarization aims at detecting "who spoke when" using both auditory
and visual signals. Existing audio-visual diarization datasets are mainly focused on indoor …

Deep face clustering using residual graph convolutional network

C Qi, J Zhang, H Jia, Q Mao, L Wang, H Song - Knowledge-Based Systems, 2021 - Elsevier
Face clustering has important applications in image retrieval and criminal investigation.
Face images can be seen as the nodes of a graph and the possibility of links between the …

A black-box adversarial attack for poisoning clustering

AE Cinà, A Torcinovich, M Pelillo - Pattern Recognition, 2022 - Elsevier
Clustering algorithms play a fundamental role as tools in decision-making and sensible
automation processes. Due to the widespread use of these applications, a robustness …