Classification of ALS patients based on acoustic analysis of sustained vowel phonations
M Vashkevich, Y Rushkevich - Biomedical Signal Processing and Control, 2021 - Elsevier
Amyotrophic lateral sclerosis (ALS) is incurable neurological disorder with rapidly
progressive course. Common early symptoms of ALS are difficulty in swallowing and …
MSMC-TTS: Multi-stage multi-codebook VQ-VAE based neural TTS
This article aims to improve neural TTS with vector-quantized, compact speech
representations. We propose a Vector-Quantized Variational AutoEncoder (VQ-VAE) based …
Fusion of spectral and prosody modelling for multilingual speech emotion conversion
S Vekkot, D Gupta - Knowledge-Based Systems, 2022 - Elsevier
The paper proposes an integrated speech emotion conversion framework developed using
speaker-independent mixed-lingual training. The key contribution of the work is non-parallel …
Bulbar ALS detection based on analysis of voice perturbation and vibrato
M Vashkevich, A Petrovsky… - 2019 Signal Processing …, 2019 - ieeexplore.ieee.org
On average the lack of biological markers causes a one year diagnostic delay to detect
amyotrophic lateral sclerosis (ALS). To improve the diagnostic process an automatic voice …
Contributions of Jitter and Shimmer in the Voice for Fake Audio Detection
Fake audio detection (FAD) aims to identify fraudulent speech generated through advanced
speech-synthesis techniques. Most current FAD methods rely solely on a deep neural …
Emotional voice conversion using a hybrid framework with speaker-adaptive DNN and particle-swarm-optimized neural network
We propose a hybrid network-based learning framework for speaker-adaptive vocal emotion
conversion, tested on three different datasets (languages), namely, EmoDB (German) …
CAMNet: A controllable acoustic model for efficient, expressive, high-quality text-to-speech
JM Alvarez, H Francois, H Sung, S Choi, J Jeong… - Applied Acoustics, 2022 - Elsevier
Spoken language is becoming one of the key components of human–machine interaction,
both to send information to the machine – e.g. voice control – and to receive from it – e.g. virtual …
HAEPF: hybrid approach for estimating pitch frequency in the presence of reverberation
ES Hassan, B Neyazi, HS Seddeq… - Multimedia Tools and …, 2024 - Springer
In the realm of speaker identification, pitch frequency serves as a fundamental feature.
However, this feature can be compromised when a speaker records his speech in a closed …
A novel pitch detection algorithm based on instantaneous frequency for clean and noisy speech
In this paper, a novel pitch detection algorithm (PDA) is proposed. Actually, pitch detection is
a classical problem that has been investigated since the very beginning of speech …
Real-time voice conversion using artificial neural networks with rectified linear units.
This paper presents an approach to parametric voice conversion that can be used in real-
time entertainment applications. The approach is based on spectral mapping using an …