CheapTrick, a spectral envelope estimator for high-quality speech synthesis

An overview of voice conversion systems

SH Mohammadi, A Kain - Speech Communication, 2017 - Elsevier

Voice transformation (VT) aims to change one or more aspects of a speech signal while
preserving linguistic information. A subset of VT, Voice conversion (VC) specifically aims to …

Cited by 322 · Related articles · All 6 versions

[PDF] jst.go.jp

WORLD: a vocoder-based high-quality speech synthesis system for real-time applications

M Morise, F Yokomori, K Ozawa - IEICE TRANSACTIONS on …, 2016 - search.ieice.org

A vocoder-based speech synthesis system, named WORLD, was developed in an effort to
improve the sound quality of real-time applications using speech. Speech analysis …

Cited by 1432 · Related articles · All 11 versions

[HTML] sciencedirect.com

[HTML] D4C, a band-aperiodicity estimator for high-quality speech synthesis

M Morise - Speech Communication, 2016 - Elsevier

An algorithm is proposed for estimating the band aperiodicity of speech signals, where
“aperiodicity” is defined as the power ratio between the speech signal and the aperiodic …

Cited by 224 · Related articles · All 6 versions

[PDF] isca-archive.org

[PDF] Harvest: A High-Performance Fundamental Frequency Estimator from Speech Signals.

M Morise - INTERSPEECH, 2017 - isca-archive.org

A fundamental frequency (F0) estimator named Harvest is described. The unique points of
Harvest are that it can obtain a reliable F0 contour and reduce the error that the voiced …

Cited by 107 · Related articles · All 4 versions

[PDF] arxiv.org

Evaluating voice conversion-based privacy protection against informed attackers

BML Srivastava, N Vauquier… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

Speech data conveys sensitive speaker attributes like identity or accent. With a small
amount of found data, such attributes can be inferred and exploited for malicious purposes …

Cited by 85 · Related articles · All 11 versions

[PDF] ed.ac.uk

Investigating different representations for modeling and controlling multiple emotions in DNN-based speech synthesis

J Lorenzo-Trueba, GE Henter, S Takaki… - Speech …, 2018 - Elsevier

In this paper, we investigate the simultaneous modeling of multiple emotions in DNN-based
expressive speech synthesis, and how to represent the emotional labels, such as emotional …

Cited by 95 · Related articles · All 4 versions

[PDF] researchgate.net

mmPhone: Acoustic eavesdropping on loudspeakers via mmWave-characterized piezoelectric effect

C Wang, F Lin, T Liu, Z Liu, Y Shen, Z Ba… - … -IEEE Conference on …, 2022 - ieeexplore.ieee.org

More and more people turn to online voice communication with loudspeaker-equipped
devices due to its convenience. To prevent speech leakage, soundproof rooms are often …

Cited by 25 · Related articles · All 6 versions

[PDF] arxiv.org

Deep encoder-decoder models for unsupervised learning of controllable speech synthesis

GE Henter, J Lorenzo-Trueba, X Wang… - arXiv preprint arXiv …, 2018 - arxiv.org

Generating versatile and appropriate synthetic speech requires control over the output
expression separate from the spoken text. Important non-textual speech variation is seldom …

Cited by 67 · Related articles · All 2 versions

[PDF] arxiv.org

Emotionless: Privacy-preserving speech analysis for voice assistants

R Aloufi, H Haddadi, D Boyle - arXiv preprint arXiv:1908.03632, 2019 - arxiv.org

Voice-enabled interactions provide more human-like experiences in many popular IoT
systems. Cloud-based speech analysis services extract useful information from voice input …

Cited by 50 · Related articles · All 2 versions

F0 Estimation and Voicing Detection With Cascade Architecture in Noisy Speech

Y Zhang, H Wang, DL Wang - IEEE/ACM Transactions on …, 2023 - ieeexplore.ieee.org

As a fundamental problem in speech processing, pitch tracking has been studied for
decades. While strong performance has been achieved on clean speech, pitch tracking in …

Cited by 2 · Related articles · All 2 versions