TUT database for acoustic scene classification and sound event detection

MC Schiappa, YS Rawat, M Shah - ACM Computing Surveys, 2023 - dl.acm.org

The remarkable success of deep learning in various domains relies on the availability of
large-scale annotated datasets. However, obtaining annotations is expensive and requires …

Cited by 149 Related articles All 4 versions

[PDF] arxiv.org

Sound event detection: A tutorial

A Mesaros, T Heittola, T Virtanen… - IEEE Signal …, 2021 - ieeexplore.ieee.org

Imagine standing on a street corner in the city. With your eyes closed you can hear and
recognize a succession of sounds: cars passing by, people speaking, their footsteps when …

Cited by 262 Related articles All 9 versions

[PDF] arxiv.org

FSD50K: an open dataset of human-labeled sound events

E Fonseca, X Favory, J Pons, F Font… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org

Most existing datasets for sound event recognition (SER) are relatively small and/or domain-
specific, with the exception of AudioSet, based on over 2 M tracks from YouTube videos and …

Cited by 506 Related articles All 5 versions

[PDF] surrey.ac.uk

PANNs: Large-scale pretrained audio neural networks for audio pattern recognition

Q Kong, Y Cao, T Iqbal, Y Wang… - … on Audio, Speech …, 2020 - ieeexplore.ieee.org

Audio pattern recognition is an important research topic in the machine learning area, and
includes several tasks such as audio tagging, acoustic scene classification, music …

Cited by 1252 Related articles All 8 versions

[PDF] arxiv.org

Listen, think, and understand

Y Gong, H Luo, AH Liu, L Karlinsky, J Glass - arXiv preprint arXiv …, 2023 - arxiv.org

The ability of artificial intelligence (AI) systems to perceive and comprehend audio signals is
crucial for many applications. Although significant progress has been made in this area …

Cited by 139 Related articles All 6 versions

[PDF] arxiv.org

VGGSound: A large-scale audio-visual dataset

H Chen, W Xie, A Vedaldi… - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org

Our goal is to collect a large-scale audio-visual dataset with low label noise from videos
'in the wild' using computer vision techniques. The resulting dataset can be used for training …

Cited by 606 Related articles All 10 versions

[HTML] aip.org

[HTML] Machine learning in acoustics: Theory and applications

MJ Bianco, P Gerstoft, J Traer, E Ozanich… - The Journal of the …, 2019 - pubs.aip.org

Acoustic data provide scientific and engineering insights in fields ranging from biology and
communications to ocean and Earth science. We survey the recent advances and …

Cited by 550 Related articles All 14 versions

[PDF] google.com

Audio Set: An ontology and human-labeled dataset for audio events

JF Gemmeke, DPW Ellis, D Freedman… - … on acoustics, speech …, 2017 - ieeexplore.ieee.org

Audio event recognition, the human-like ability to identify and relate sounds from audio, is a
nascent problem in machine perception. Comparable problems such as object detection in …

Cited by 3873 Related articles All 11 versions

[PDF] arxiv.org

Deep transfer learning for automatic speech recognition: Towards better generalization

H Kheddar, Y Himeur, S Al-Maadeed, A Amira… - Knowledge-Based …, 2023 - Elsevier

Automatic speech recognition (ASR) has recently become an important challenge when
using deep learning (DL). It requires large-scale training datasets and high computational …

Cited by 80 Related articles All 5 versions

[PDF] arxiv.org

CNN architectures for large-scale audio classification

S Hershey, S Chaudhuri, DPW Ellis… - … on acoustics, speech …, 2017 - ieeexplore.ieee.org

Convolutional Neural Networks (CNNs) have proven very effective in image classification
and show promise for audio. We use various CNN architectures to classify the soundtracks …

Cited by 3224 Related articles All 9 versions