Deepfakes as a threat to a speaker and facial recognition: An overview of tools and attack vectors
Deepfakes present an emerging threat in cyberspace. Recent developments in machine
learning make deepfakes highly believable, and very difficult to differentiate between what is …
Cross-speaker emotion transfer for low-resource text-to-speech using non-parallel voice conversion with pitch-shift data augmentation
R Terashima, R Yamamoto, E Song… - arXiv preprint arXiv …, 2022 - arxiv.org
Data augmentation via voice conversion (VC) has been successfully applied to low-resource
expressive text-to-speech (TTS) when only neutral data for the target speaker are available …
Nonparallel emotional voice conversion for unseen speaker-emotion pairs using dual domain adversarial network & virtual domain pairing
Primary goal of an emotional voice conversion (EVC) system is to convert the emotion of a
given speech signal from one style to another style without modifying the linguistic content of …
PromptVC: Flexible stylistic voice conversion in latent space driven by natural language prompts
Stylistic voice conversion aims to transform the style of source speech to a desired style
according to real-world application demands. However, the current style voice conversion …
TTS-by-TTS 2: Data-selective augmentation for neural speech synthesis using ranking support vector machine with variational autoencoder
Recent advances in synthetic speech quality have enabled us to train text-to-speech (TTS)
systems by using synthetic corpora. However, merely increasing the amount of synthetic …
Learning Emotional Representations from Imbalanced Speech Data for Speech Emotion Recognition and Emotional Text-to-Speech
Effective speech emotional representations play a key role in Speech Emotion Recognition
(SER) and Emotional Text-To-Speech (TTS) tasks. However, emotional speech samples are …
Accented text-to-speech synthesis with a conditional variational autoencoder
Accent plays a significant role in speech communication, influencing understanding
capabilities and also conveying a person's identity. This paper introduces a novel and …
A High-Quality Melody-Aware Peking Opera Synthesizer Using Data Augmentation
X Zhou, W Sun, X Shi - 2023 IEEE International Conference on …, 2023 - ieeexplore.ieee.org
The performing art of Peking Opera places great demands on the singing skills of singers,
including pronunciation, melody, role, personal style and emotional expression, which …
Nonparallel expressive TTS for unseen target speaker using style-controlled adaptive layer and optimized pitch embedding
MS Al-Radhi, TG Csapó… - … Conference on Speech …, 2023 - ieeexplore.ieee.org
Recent advancements in text-to-speech (TTS) systems have focused on developing style-
controlled models that generate speech with desired characteristics such as accent, tone …
Robot reads ads: likability of calm and energetic audio advertising styles transferred to synthesized voices
H Pajupuu, J Pajupuu, R Altrov, I Kiissel - Frontiers in Communication, 2023 - frontiersin.org
The increasing prevalence of audio advertising has provided a challenge to find out more
about voices and performance styles used in advertisements. In this study, we were …