Sensor-based human activity recognition with spatio-temporal deep learning
Human activity recognition (HAR) remains a challenging yet crucial problem to address in
computer vision. HAR is primarily intended to be used with other technologies, such as the …
End-to-end waveform utterance enhancement for direct evaluation metrics optimization by fully convolutional neural networks
A speech enhancement model is used to map noisy speech to clean speech. In the
training stage, an objective function is often adopted to optimize the model parameters …
Gated residual networks with dilated convolutions for monaural speech enhancement
For supervised speech enhancement, contextual information is important for accurate mask
estimation or spectral mapping. However, commonly used deep neural networks (DNNs) are …
Feature learning for human activity recognition using convolutional neural networks: A case study for inertial measurement unit and audio data
The use of Convolutional Neural Networks (CNNs) as a feature learning method for
Human Activity Recognition (HAR) is becoming more and more common. Unlike …
Multichannel speech enhancement by raw waveform-mapping using fully convolutional networks
In recent years, waveform-mapping-based speech enhancement (SE) methods have
garnered significant attention. These methods generally use a deep learning model to …
Cat: Causal audio transformer for audio classification
The attention-based Transformers have been increasingly applied to audio classification
because of their global receptive field and ability to handle long-term dependency. However …
The cocktail fork problem: Three-stem audio separation for real-world soundtracks
The cocktail party problem aims at isolating any source of interest within a complex acoustic
scene, and has long inspired audio source separation research. Recent efforts have mainly …
FENet: a frequency extraction network for obstructive sleep apnea detection
Obstructive Sleep Apnea (OSA) is a highly prevalent but inconspicuous disease that
seriously jeopardizes the health of human beings. Polysomnography (PSG), the gold …
Deep learning approaches in topics of singing information processing
Singing, the vocal production of musical tones, is one of the most important elements of
music. Addressing the needs of real-world applications, the study of technologies related to …
Music Source Separation Using Stacked Hourglass Networks.
In this paper, we propose a simple yet effective method for multiple music source separation
using convolutional neural networks. Stacked hourglass network, which was originally …