Deep audio-visual learning: A survey
Audio-visual learning, aimed at exploiting the relationship between audio and visual
modalities, has drawn considerable attention since deep learning started to be used …
Learning to separate object sounds by watching unlabeled video
Perceiving a scene most fully requires all the senses. Yet modeling how objects look and
sound is challenging: most natural scenes and events contain multiple objects, and the …
2.5D visual sound
Binaural audio provides a listener with 3D sound sensation, allowing a rich perceptual
experience of the scene. However, binaural recordings are scarcely available and require …
Co-separating sounds of visual objects
Learning how objects sound from video is challenging, since they often heavily overlap in a
single audio channel. Current methods for visually-guided audio source separation sidestep …
Deep cross-modal audio-visual generation
Cross-modal audio-visual perception has been a long-lasting topic in psychology and
neurology, and various studies have discovered strong correlations in human perception of …
Creating a multitrack classical music performance dataset for multimodal music analysis: Challenges, insights, and applications
We introduce a dataset for facilitating audio-visual analysis of music performances. The
dataset comprises 44 simple multi-instrument classical music pieces assembled from …
Move2Hear: Active audio-visual source separation
S Majumder, Z Al-Halah… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
We introduce the active audio-visual source separation problem, where an agent must move
intelligently in order to better isolate the sounds coming from an object of interest in its …
Multimodal music information processing and retrieval: Survey and future challenges
F Simonetta, S Ntalampiras… - … workshop on multilayer …, 2019 - ieeexplore.ieee.org
Towards improving the performance in various music information processing tasks, recent
studies exploit different modalities able to capture diverse aspects of music. Such modalities …
Audiovisual analysis of music performances: Overview of an emerging field
In the physical sciences and engineering domains, music has traditionally been considered
an acoustic phenomenon. From a perceptual viewpoint, music is naturally associated with …
Active audio-visual separation of dynamic sound sources
S Majumder, K Grauman - European Conference on Computer Vision, 2022 - Springer
We explore active audio-visual separation for dynamic sound sources, where an embodied
agent moves intelligently in a 3D environment to continuously isolate the time-varying audio …