Improved singing voice separation with chromagram-based pitch-aware remixing

W Xu, Z Chen, Z Tan, S Lv, R Han… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

A typical neural speech enhancement (SE) approach mainly handles speech and noise
mixtures, which is not optimal for singing voice enhancement scenarios where singing is …

Cited by 3 Related articles All 3 versions

Source Separation of Piano Concertos Using Musically Motivated Augmentation Techniques

Y Özer, M Müller - IEEE/ACM Transactions on Audio, Speech …, 2024 - ieeexplore.ieee.org

In this work, we address the novel and rarely considered source separation task of
decomposing piano concerto recordings into separate piano and orchestral tracks. Being a …

Cited by 9 Related articles All 4 versions

Semi-supervised time domain target speaker extraction with attention

Z Wang, R Giri, S Venkataramani, U Isik… - arXiv preprint arXiv …, 2022 - arxiv.org

In this work, we propose Exformer, a time-domain architecture for target speaker extraction. It
consists of a pre-trained speaker embedder network and a separator network based on …

Cited by 7 Related articles All 2 versions

Augmenting pre-trained language models with audio feature embedding for argumentation mining in political debates

R Mestre, SE Middleton, M Ryan… - Findings of the …, 2023 - aclanthology.org

The integration of multimodality in natural language processing (NLP) tasks seeks to exploit
the complementary information contained in two or more modalities, such as text, audio and …

Cited by 5 Related articles All 5 versions

Unsupervised Deep Unfolded Representation Learning for Singing Voice Separation

W Yuan, S Wang, J Wang, M Unoki… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org

Learning effective vocal representations from a waveform mixture is a crucial but
challenging task for deep neural network (DNN)-based singing voice separation (SVS) …

Cited by 1 Related articles All 3 versions

Audio deepfakes: feature extraction and model evaluation for detection

RK Bhukya, A Raj, DN Raja - 2024 5th International …, 2024 - ieeexplore.ieee.org

Cutting-edge AI-driven tools are currently employed for replicating human voices, leading to
the emergence of audio deepfakes. Initially designed to enhance experiences like audio …

Cited by 1 Related articles

Air Traffic Controller Fatigue Detection by Applying a Dual-Stream Convolutional Neural Network to the Fusion of Radiotelephony and Facial Data

L Xu, S Ma, Z Shen, Y Nan - Aerospace, 2024 - mdpi.com

The role of air traffic controllers is to direct and manage highly dynamic flights. Their work
requires both efficiency and accuracy. Previous studies have shown that fatigue in air traffic …

Cited by 2 Related articles All 3 versions

A study of audio mixing methods for piano transcription in violin-piano ensembles

H Kim, J Park, T Kwon, D Jeong… - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org

While piano music transcription models have shown high performance for solo piano
recordings, their performance degrades when applied to ensemble recordings. This study …

Cited by 5 Related articles All 4 versions

Unsupervised Single-Channel Singing Voice Separation with Weighted Robust Principal Component Analysis Based on Gammatone Auditory Filterbank and Vocal …

F Li, Y Hu, L Wang - Sensors, 2023 - mdpi.com

Singing-voice separation is a separation task that involves a singing voice and musical
accompaniment. In this paper, we propose a novel, unsupervised methodology for extracting …

Cited by 3 Related articles All 7 versions

An Improved Optimal Transport Kernel Embedding Method with Gating Mechanism for Singing Voice Separation and Speaker Identification

W Yuan, Y Bian, S Wang, M Unoki… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Singing voice separation (SVS) and speaker identification (SI) are two classic problems in
speech signal processing. Deep neural networks (DNNs) solve these two problems by …

Cited by 1 Related articles