Harvest: A High-Performance Fundamental Frequency Estimator from Speech Signals.

K Zhou, B Sisman, R Rana… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Emotional voice conversion (EVC) seeks to convert the emotional state of an utterance while
preserving the linguistic content and speaker identity. In EVC, emotions are usually treated …

Cited by 46 · Related articles · All 7 versions

Speech synthesis with mixed emotions

K Zhou, B Sisman, R Rana… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Emotional speech synthesis aims to synthesize human voices with various emotional effects.
The current studies are mostly focused on imitating an averaged style belonging to a specific …

Cited by 39 · Related articles · All 7 versions

StyleTTS-VC: One-shot voice conversion by knowledge transfer from style-based TTS models

YA Li, C Han, N Mesgarani - 2022 IEEE Spoken Language …, 2023 - ieeexplore.ieee.org

One-shot voice conversion (VC) aims to convert speech from any source speaker to an
arbitrary target speaker with only a few seconds of reference speech from the target speaker …

Cited by 12 · Related articles · All 6 versions

VISinger 2: High-fidelity end-to-end singing voice synthesis enhanced by digital signal processing synthesizer

Y Zhang, H Xue, H Li, L Xie, T Guo, R Zhang… - arXiv preprint arXiv …, 2022 - arxiv.org

End-to-end singing voice synthesis (SVS) model VISinger can achieve better performance
than the typical two-stage model with fewer parameters. However, VISinger has several …

Cited by 14 · Related articles · All 4 versions

Converting foreign accent speech without a reference

G Zhao, S Ding… - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org

Foreign accent conversion (FAC) is the problem of generating a synthetic voice that has the
voice identity of a second-language (L2) learner and the pronunciation patterns of a native …

Cited by 23 · Related articles · All 4 versions

A comparative study of voice conversion models with large-scale speech and singing data: The T13 systems for the singing voice conversion challenge 2023

R Yamamoto, R Yoneyama, LP Violeta… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

This paper presents our systems (denoted as T13) for the singing voice conversion
challenge (SVCC) 2023. For both in-domain and cross-domain English singing voice …

Cited by 5 · Related articles · All 3 versions

Acoustic tracking of pitch, modal, and subharmonic vibrations of vocal folds in Parkinson's disease and parkinsonism

J Hlavnička, R Čmejla, J Klempíř, E Růžička… - IEEE Access, 2019 - ieeexplore.ieee.org

The prominent and early presence of dysphonia is considered a valuable marker for
differentiation of idiopathic Parkinson's disease and parkinsonian syndromes. Objective …

Cited by 36 · Related articles · All 3 versions

A fast high-fidelity source-filter vocoder with lightweight neural modules

R Yang, Y Peng, X Hu - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org

The quality of raw audio waveform generated by a vocoder could affect various audio
generative tasks. In recent years, the dominance of source-filter vocoders was greatly …

Cited by 3 · Related articles · All 4 versions

Validation of freely-available pitch detection algorithms across various noise levels in assessing speech captured by smartphone in Parkinson's disease

V Illner, P Sovka, J Rusz - Biomedical Signal Processing and Control, 2020 - Elsevier

Measuring the fundamental frequency of the vocal folds F0 is recognized as an important
parameter in the assessment of speech impairments in Parkinson's disease (PD). Although a …

Cited by 32 · Related articles

Traditional machine learning for pitch detection

T Drugman, G Huybrechts, V Klimkov… - IEEE Signal …, 2018 - ieeexplore.ieee.org

Pitch detection is a fundamental problem in speech processing as F0 is used in a large
number of applications. Recent papers have proposed deep learning for robust pitch …

Cited by 36 · Related articles · All 5 versions