Training audio captioning models without audio
S Deshmukh, B Elizalde… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Automated Audio Captioning (AAC) is the task of generating natural language descriptions
given an audio stream. A typical AAC system requires manually curated training data of …
Perceptual–neural–physical sound matching
Sound matching algorithms seek to approximate a target waveform by parametric audio
synthesis. Deep neural networks have achieved promising results in matching sustained …
Classifying non-individual head-related transfer functions with a computational auditory model: Calibration and metrics
This study explores the use of a multi-feature Bayesian auditory sound localisation model to
classify non-individual head-related transfer functions (HRTFs). Based on predicted sound …
Semantically-informed deep neural networks for sound recognition
M Esposito, G Valente… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Deep neural networks (DNNs) for sound recognition learn to categorize a barking sound as
a "dog" and a meowing sound as a "cat" but do not exploit information inherent to the …
An Approach to Ontological Learning from Weak Labels
Ontologies encompass a formal representation of knowledge through the definition of
concepts or properties of a domain, and the relationships between those concepts. In this …
Audio Entailment: Assessing Deductive Reasoning for Audio Understanding
Recent literature uses language to build foundation models for audio. These Audio-
Language Models (ALMs) are trained on a vast number of audio-text pairs and show …
[PDF] Steering latent audio models through interactive machine learning
G Vigliensoni, R Fiebrink - 2023 - ualresearchonline.arts.ac.uk
In this paper, we present a proof-of-concept mechanism for steering latent audio models
through interactive machine learning. Our approach involves mapping the human …
Using Machine Learning to Understand the Relationships Between Audiometric Data, Speech Perception, Temporal Processing, And Cognition
Aging and hearing loss cause communication difficulties, particularly for speech perception
in demanding situations, which have been associated with factors including cognitive …
[BOOK][B] Listening: The Key Concepts
A vital and comprehensive starting place for understanding the key concepts, this book
explores 177 diverse types and styles of listening named in academic scholarship to date …
Perceptual Analysis of Speaker Embeddings for Voice Discrimination between Machine And Human Listening
This study investigates the information captured by speaker embeddings with relevance to
human speech perception. A Convolutional Neural Network was trained to perform one-shot …