ASR-based speech intelligibility prediction: A review

M Karbasi, D Kolossa - Hearing Research, 2022 - Elsevier
Various types of methods and approaches are available to predict the intelligibility of speech
signals, but many of these still suffer from two major problems: first, their required prior …

Comparing human and automatic speech recognition in simple and complex acoustic scenes

C Spille, B Kollmeier, BT Meyer - Computer Speech & Language, 2018 - Elsevier
Former comparisons of human speech recognition (HSR) and automatic speech recognition
(ASR) have shown that humans outperform ASR systems in nearly all speech recognition …

Multi-tone phase coding of interaural time difference for sound source localization with spiking neural networks

Z Pan, M Zhang, J Wu, J Wang… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
Mammals exhibit a remarkable capability for detecting and localizing sound sources in
complex acoustic environments by using binaural cues in the spiking manner. Emulating the …

An audio-visual system for object-based audio: from recording to listening

P Coleman, A Franck, J Francombe… - IEEE Transactions …, 2018 - ieeexplore.ieee.org
Object-based audio is an emerging representation for audio content, where content is
represented in a reproduction-format-agnostic way and, thus, produced once for …

Identification of perceptually relevant methods of inter-aural time difference estimation

A Andreopoulou, BFG Katz - The Journal of the Acoustical Society of …, 2017 - pubs.aip.org
The inter-aural time difference (ITD) is a fundamental cue for human sound localization.
Over the past decades several methods have been proposed for its estimation from …

Fundamentals of a parametric method for virtual navigation within an array of ambisonics microphones

JG Tylka, EY Choueiri - AES: Journal of the Audio Engineering …, 2020 - oar.princeton.edu
Fundamental aspects of a method for virtual navigation of a sound field within an array of
ambisonics microphones, wherein the subset of microphones to be used for interpolation is …

Modeling binaural unmasking of speech using a blind binaural processing stage

CF Hauth, SC Berning, B Kollmeier… - Trends in …, 2020 - journals.sagepub.com
The equalization cancellation model is often used to predict the binaural masking level
difference. Previously its application to speech in noise has required separate knowledge …

Multitask learning of time-frequency CNN for sound source localization

C Pang, H Liu, X Li - IEEE Access, 2019 - ieeexplore.ieee.org
Sound source localization (SSL) is an important technique for many audio processing
systems, such as speech enhancement/recognition and human-robot interaction. Although …

A Bayesian model for human directional localization of broadband static sound sources

R Barumerli, P Majdak, M Geronazzo… - Acta …, 2023 - acta-acustica.edpsciences.org
Humans estimate sound-source directions by combining prior beliefs with sensory evidence.
Prior beliefs represent statistical knowledge about the environment, and the sensory …

Cognitive persistence of soundscape in urban parks

X Hong, G Wang, J Liu, S Lan - Sustainable Cities and Society, 2019 - Elsevier
Cognition of soundscape information affects an individual's preference for and understanding of
soundscapes in urban parks. This study explored the cognitive persistence of soundscape …