- Academic resource search

WORLD: a vocoder-based high-quality speech synthesis system for real-time applications

M Morise, F Yokomori, K Ozawa - IEICE TRANSACTIONS on …, 2016 - search.ieice.org

A vocoder-based speech synthesis system, named WORLD, was developed in an effort to
improve the sound quality of real-time applications using speech. Speech analysis …

Cited by 1432 Related articles All 11 versions

[PDF] arxiv.org

Non-autoregressive sequence-to-sequence voice conversion

T Hayashi, WC Huang, K Kobayashi… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

This paper proposes a novel voice conversion (VC) method based on non-autoregressive
sequence-to-sequence (NAR-S2S) models. Inspired by the great success of NAR-S2S …

Cited by 20 Related articles All 3 versions

[PDF] arxiv.org

Using instantaneous frequency and aperiodicity detection to estimate F0 for high-quality speech synthesis

H Kawahara, Y Agiomyrgiannakis, H Zen - arXiv preprint arXiv …, 2016 - arxiv.org

This paper introduces a general and flexible framework for F0 and aperiodicity (additive non
periodic component) analysis, specifically intended for high-quality speech synthesis and …

Cited by 39 Related articles All 16 versions

[HTML] springer.com

[HTML] Noise and acoustic modeling with waveform generator in text-to-speech and neutral speech conversion

MS Al-Radhi, TG Csapó, G Németh - Multimedia Tools and Applications, 2021 - Springer

This article focuses on developing a system for high-quality synthesized and converted
speech by addressing three fundamental principles. Although the noise-like component in …

Cited by 5 Related articles All 7 versions

[PDF] isca-archive.org

[PDF] SparkNG: Interactive MATLAB Tools for Introduction to Speech Production, Perception and Processing Fundamentals and Application of the Aliasing-Free LF …

H Kawahara - INTERSPEECH, 2016 - isca-archive.org

This article introduces a set of interactive tools for studying fundamentals of speech
production, perception and processing. In addition to this voice production simulator, it …

Cited by 13 Related articles All 3 versions

[HTML] springer.com

[HTML] Continuous vocoder applied in deep neural network based voice conversion

MS Al-Radhi, TG Csapó, G Németh - Multimedia Tools and Applications, 2019 - Springer

In this paper, a novel vocoder is proposed for a Statistical Voice Conversion (SVC)
framework using deep neural network, where multiple features from the speech of two …

Cited by 7 Related articles All 5 versions

[PDF] academia.edu

Analysis and synthesis of strong vocal expressions: Extension and application of audio texture features to singing voice

H Kawahara, M Morise - 2012 IEEE International Conference …, 2012 - ieeexplore.ieee.org

Realistic reconstruction and manipulation of strong vocal expressions found in singing
voices is a challenging and exciting topic. A speech analysis, modification and resynthesis …

Cited by 12 Related articles All 4 versions

[PDF] arxiv.org

MF-PAM: Accurate Pitch Estimation through Periodicity Analysis and Multi-level Feature Fusion

WJ Chung, D Kim, SW Chung, HG Kang - arXiv preprint arXiv:2306.09640, 2023 - arxiv.org

We introduce Multi-level feature Fusion-based Periodicity Analysis Model (MF-PAM), a novel
deep learning-based pitch estimation model that accurately estimates pitch trajectory in …

Cited by 1 Related articles All 6 versions

[PDF] github.io

[PDF] Continuous vocoder in feed-forward deep neural network based speech synthesis

MS Al-Radhi, TG Csapó, G Németh - Proceedings of digital …, 2017 - malradhi.github.io

http://smartlab.tmit.bme.hu Continuous vocoder in feed-forward
deep neural network based speech synthesis. Mohammed Salah Al-Radhi, Tamás Gábor …

Cited by 6 Related articles All 3 versions

[PDF] isca-archive.org

[PDF] A Fast and Accurate Fundamental Frequency Estimator Using Recursive Moving Average Filters.

R Daido, Y Hisaminato - INTERSPEECH, 2016 - isca-archive.org

We propose a fundamental frequency (F0) estimation method which is fast, accurate and
suitable for real-time use. While the proposed method is based on the same framework as …

Cited by 6 Related articles All 3 versions