An audio-visual system for object-based audio: from recording to listening

An overview of machine learning and other data-based methods for spatial audio capture, processing, and reproduction

M Cobos, J Ahrens, K Kowalczyk, A Politis - EURASIP Journal on Audio …, 2022 - Springer

The domain of spatial audio comprises methods for capturing, processing, and reproducing
audio content that contains spatial information. Data-based methods are those that operate …

Cited by 26 · Related articles · All 12 versions

Multi-speaker tracking from an audio–visual sensing device

X Qian, A Brutti, O Lanz, M Omologo… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org

Compact multi-sensor platforms are portable and thus desirable for robotics and
personal-assistance tasks. However, compared to physically distributed sensors, the size of these …

Cited by 57 · Related articles · All 11 versions

Translation of a higher order ambisonics sound scene based on parametric decomposition

M Kentgens, A Behler, P Jax - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org

This paper presents a novel 3DoF+ system that allows to navigate, i.e., change position, in
scene-based spatial audio content beyond the sweet spot of a Higher Order Ambisonics …

Cited by 30 · Related articles · All 4 versions

DMMAN: A two-stage audio–visual fusion framework for sound separation and event localization

R Hu, S Zhou, ZR Tang, S Chang, Q Huang, Y Liu… - Neural Networks, 2021 - Elsevier

Videos are used widely as the media platforms for human beings to touch the physical
change of the world. However, we always receive the mixed sound from the multiple sound …

Cited by 17 · Related articles · All 3 versions

Towards generating ambisonics using audio-visual cue for virtual reality

A Rana, C Ozcinar, A Smolic - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org

Ambisonics, i.e., a full-sphere surround sound, is quintessential with 360° visual content to
provide a realistic virtual reality (VR) experience. While 360° visual content capture gained a …

Cited by 31 · Related articles · All 8 versions

Qualitative evaluation of media device orchestration for immersive spatial audio reproduction

J Francombe, J Woodcock, RJ Hughes… - Journal of the Audio …, 2018 - eprints.soton.ac.uk

The challenge of installing and setting up dedicated spatial audio systems can make it
difficult to deliver immersive listening experiences to the general public. However, the …

Cited by 28 · Related articles · All 12 versions

Audio-visual speaker tracking: Progress, challenges, and future directions

J Zhao, Y Xu, X Qian, D Berghi, P Wu, M Cui… - arXiv preprint arXiv …, 2023 - arxiv.org

Audio-visual speaker tracking has drawn increasing attention over the past few years due to
its academic values and wide application. Audio and visual modalities can provide …

Cited by 2 · Related articles · All 3 versions

Tragic Talkers: A Shakespearean sound-and light-field dataset for audio-visual machine learning research

D Berghi, M Volino, PJB Jackson - Proceedings of the 19th ACM …, 2022 - dl.acm.org

3D audio-visual production aims to deliver immersive and interactive experiences to the
consumer. Yet, faithfully reproducing real-world 3D scenes remains a challenging task. This …

Cited by 5 · Related articles · All 6 versions

Real-time low-latency music source separation using Hybrid spectrogram-TasNet

S Venkatesh, A Benilov, P Coleman… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

There have been significant advances in deep learning for music demixing in recent years.
However, there has been little attention given to how these neural networks can be adapted …

Cited by 2 · Related articles · All 4 versions

VISR: a versatile open software framework for audio signal processing

A Franck, FM Fazi - Audio Engineering Society Conference: 2018 AES …, 2018 - aes.org

Software plays an increasingly important role in spatial and object-based audio. Realtime
and interactive rendering is often needed to subjectively evaluate and demonstrate …

Cited by 23 · Related articles · All 3 versions