Sensor-based human activity recognition with spatio-temporal deep learning

O Nafea, W Abdul, G Muhammad, M Alsulaiman - Sensors, 2021 - mdpi.com
Human activity recognition (HAR) remains a challenging yet crucial problem to address in
computer vision. HAR is primarily intended to be used with other technologies, such as the …

End-to-end waveform utterance enhancement for direct evaluation metrics optimization by fully convolutional neural networks

SW Fu, TW Wang, Y Tsao, X Lu… - IEEE/ACM Transactions …, 2018 - ieeexplore.ieee.org
Speech enhancement model is used to map a noisy speech to a clean speech. In the
training stage, an objective function is often adopted to optimize the model parameters …

Gated residual networks with dilated convolutions for monaural speech enhancement

K Tan, J Chen, DL Wang - IEEE/ACM transactions on audio …, 2018 - ieeexplore.ieee.org
For supervised speech enhancement, contextual information is important for accurate mask
estimation or spectral mapping. However, commonly used deep neural networks (DNNs) are …

Feature learning for human activity recognition using convolutional neural networks: A case study for inertial measurement unit and audio data

F Cruciani, A Vafeiadis, C Nugent, I Cleland… - CCF Transactions on …, 2020 - Springer
The use of Convolutional Neural Networks (CNNs) as a feature learning method for
Human Activity Recognition (HAR) is becoming more and more common. Unlike …

Multichannel speech enhancement by raw waveform-mapping using fully convolutional networks

CL Liu, SW Fu, YJ Li, JW Huang… - … /ACM Transactions on …, 2020 - ieeexplore.ieee.org
In recent years, waveform-mapping-based speech enhancement (SE) methods have
garnered significant attention. These methods generally use a deep learning model to …

Cat: Causal audio transformer for audio classification

X Liu, H Lu, J Yuan, X Li - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
The attention-based Transformers have been increasingly applied to audio classification
because of their global receptive field and ability to handle long-term dependency. However …

The cocktail fork problem: Three-stem audio separation for real-world soundtracks

D Petermann, G Wichern, ZQ Wang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
The cocktail party problem aims at isolating any source of interest within a complex acoustic
scene, and has long inspired audio source separation research. Recent efforts have mainly …

FENet: a frequency extraction network for obstructive sleep apnea detection

G Ye, H Yin, T Chen, H Chen, L Cui… - IEEE Journal of …, 2021 - ieeexplore.ieee.org
Obstructive Sleep Apnea (OSA) is a highly prevalent but inconspicuous disease that
seriously jeopardizes the health of human beings. Polysomnography (PSG), the gold …

Deep learning approaches in topics of singing information processing

C Gupta, H Li, M Goto - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
Singing, the vocal production of musical tones, is one of the most important elements of
music. Addressing the needs of real-world applications, the study of technologies related to …

Music Source Separation Using Stacked Hourglass Networks.

S Park, T Kim, K Lee, N Kwak - ISMIR, 2018 - archives.ismir.net
In this paper, we propose a simple yet effective method for multiple music source separation
using convolutional neural networks. Stacked hourglass network, which was originally …