Sthg: Spatial-temporal heterogeneous graph learning for advanced audio-visual diarization

Citing articles: 4 results

Action Scene Graphs for Long-Form Understanding of Egocentric Videos

I Rodin, A Furnari, K Min, S Tripathi… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present Egocentric Action Scene Graphs (EASGs), a new representation for long-form
understanding of egocentric videos. EASGs extend standard manually-annotated …

Cited by: 3 (3 versions)

VideoSAGE: Video Summarization with Graph Representation Learning

JMR Chaves, S Tripathi - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

We propose a graph-based representation learning framework for video summarization.
First, we convert an input video to a graph where nodes correspond to each of the video …

Cited by: 1 (2 versions)

Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges

V Mingote, A Ortega, A Miguel, E Lleida - arXiv preprint arXiv:2409.05659, 2024 - arxiv.org

Nowadays, the large amount of audio-visual content available has fostered the need to
develop new robust automatic speaker diarization systems to analyse and characterise it …

Beyond Words: Enhancing Natural Interaction by Recognizing Social Conversation Contexts in HRI

J Jang, Y Yoon - 2024 21st International Conference on …, 2024 - ieeexplore.ieee.org

With the ongoing advancements in AI technology, human-robot interactions have become
increasingly prevalent, extending across diverse domains such as AI speakers and service …