A review of deep learning techniques for speech processing
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …
Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward
Easy access to audio-visual content on social media, combined with the availability of
modern tools such as TensorFlow or Keras, and open-source trained models, along with …
An overview of voice conversion and its challenges: From statistical modeling to deep learning
Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …
ContentVec: An improved self-supervised speech representation by disentangling speakers
Self-supervised learning in speech involves training a speech representation network on a
large-scale unannotated speech corpus, and then applying the learned representations to …
A review of synthetic image data and its use in computer vision
Development of computer vision algorithms using convolutional neural networks and deep
learning has necessitated ever greater amounts of annotated and labelled data to produce …
StarGAN-VC2: Rethinking conditional methods for StarGAN-based voice conversion
Non-parallel multi-domain voice conversion (VC) is a technique for learning mappings
among multiple domains without relying on parallel data. This is important but challenging …
A novel deep learning technique for morphology preserved fetal ECG extraction from mother ECG using 1D-CycleGAN
The non-invasive fetal electrocardiogram (fECG) enables easy detection of developing heart
abnormalities, leading to a significant reduction in infant mortality rate and post-natal …
AGAIN-VC: A one-shot voice conversion using activation guidance and adaptive instance normalization
Recently, voice conversion (VC) has been widely studied. Many VC systems use
disentangle-based learning techniques to separate the speaker and the linguistic content …
Data augmentation for deep neural networks model in EEG classification task: a review
Classification of electroencephalogram (EEG) is a key approach to measure the rhythmic
oscillations of neural activity, which is one of the core technologies of brain-computer …
CycleGAN-VC3: Examining and improving CycleGAN-VCs for mel-spectrogram conversion
Non-parallel voice conversion (VC) is a technique for learning mappings between source
and target speeches without using a parallel corpus. Recently, cycle-consistent adversarial …