AutoAD II: The sequel - who, when, and what in movie audio description
Audio Description (AD) is the task of generating descriptions of visual content, at suitable
time intervals, for the benefit of visually impaired audiences. For movies, this presents …
AutoAD: Movie description in context
The objective of this paper is an automatic Audio Description (AD) model that ingests movies
and outputs AD in text form. Generating high-quality movie AD is challenging due to the …
MM-VID: Advancing video understanding with GPT-4V(ision)
We present MM-VID, an integrated system that harnesses the capabilities of GPT-4V,
combined with specialized tools in vision, audio, and speech, to facilitate advanced video …
Condensed movies: Story based retrieval with contextual embeddings
Our objective in this work is the long range understanding of the narrative structure of
movies. Instead of considering the entire movie, we propose to learn from the 'key scenes' of …
How You Feelin'? Learning Emotions and Mental States in Movie Scenes
D Srivastava, AK Singh… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Movie story analysis requires understanding characters' emotions and mental states.
Towards this goal, we formulate emotion understanding as predicting a diverse and multi …
LAEO-Net: Revisiting people looking at each other in videos
MJ Marin-Jimenez, V Kalogeiton… - Proceedings of the …, 2019 - openaccess.thecvf.com
Capturing the 'mutual gaze' of people is essential for understanding and interpreting the
social interactions between them. To this end, this paper addresses the problem of detecting …
Face, body, voice: Video person-clustering with multiple modalities
A Brown, V Kalogeiton… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
The objective of this work is person-clustering in videos: grouping characters according to
their identity. Previous methods focus on the narrower task of face-clustering, and for the …
Social fabric: Tubelet compositions for video relation detection
This paper strives to classify and detect the relationship between object tubelets appearing
within a video as a <subject-predicate-object> triplet. Where existing works treat object …
Linking the characters: Video-oriented social graph generation via hierarchical-cumulative GCN
Recent years have witnessed the booming of online video platforms. Along this line, a graph
to illustrate social relation among characters has been long expected to not only benefit the …
Learning social relationship from videos via pre-trained multimodal transformer
As a crucial task for video analysis, social relation recognition from characters provides
intelligent applications with great potential to better understand the behaviors or emotions of …