Self-supervised learning for videos: A survey
The remarkable success of deep learning in various domains relies on the availability of
large-scale annotated datasets. However, obtaining annotations is expensive and requires …
Sound event detection: A tutorial
Imagine standing on a street corner in the city. With your eyes closed you can hear and
recognize a succession of sounds: cars passing by, people speaking, their footsteps when …
FSD50K: An open dataset of human-labeled sound events
Most existing datasets for sound event recognition (SER) are relatively small and/or domain-
specific, with the exception of AudioSet, based on over 2 M tracks from YouTube videos and …
PANNs: Large-scale pretrained audio neural networks for audio pattern recognition
Audio pattern recognition is an important research topic in the machine learning area, and
includes several tasks such as audio tagging, acoustic scene classification, music …
Listen, think, and understand
The ability of artificial intelligence (AI) systems to perceive and comprehend audio signals is
crucial for many applications. Although significant progress has been made in this area …
VGGSound: A large-scale audio-visual dataset
Our goal is to collect a large-scale audio-visual dataset with low label noise from videos 'in
the wild' using computer vision techniques. The resulting dataset can be used for training …
Machine learning in acoustics: Theory and applications
Acoustic data provide scientific and engineering insights in fields ranging from biology and
communications to ocean and Earth science. We survey the recent advances and …
Audio Set: An ontology and human-labeled dataset for audio events
Audio event recognition, the human-like ability to identify and relate sounds from audio, is a
nascent problem in machine perception. Comparable problems such as object detection in …
Deep transfer learning for automatic speech recognition: Towards better generalization
Automatic speech recognition (ASR) has recently become an important challenge when
using deep learning (DL). It requires large-scale training datasets and high computational …
CNN architectures for large-scale audio classification
Convolutional Neural Networks (CNNs) have proven very effective in image classification
and show promise for audio. We use various CNN architectures to classify the soundtracks …