Representation of complex spectrogram via phase conversion

K Yatabe, Y Masuyama, T Kusano… - Acoustical Science and …, 2019 - jstage.jst.go.jp
As importance of the phase of complex spectrogram has been recognized widely, many
techniques have been proposed for handling it. However, several definitions and …

Temporally variable multi-aspect N-way morphing based on interference-free speech representations

H Kawahara, M Morise, H Banno… - 2013 Asia-Pacific Signal …, 2013 - ieeexplore.ieee.org
Voice morphing is a powerful tool for exploratory research and various applications. A
temporally variable multi-aspect morphing is extended to enable morphing of arbitrarily …

Aliasing-free implementation of discrete-time glottal source models and their applications to speech synthesis and F0 extractor evaluation

H Kawahara, KI Sakakibara, H Banno… - 2015 Asia-Pacific …, 2015 - ieeexplore.ieee.org
A closed-form representation of anti-aliased LF model is derived for a LPF function family
based on cosine series. The Matlab based implementation of the derived form provides …

An investigation of the effectiveness of phase for audio classification

S Hidaka, K Wakamiya… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
While log-amplitude mel-spectrogram has widely been used as the feature representation
for processing speech based on deep learning, the effectiveness of another aspect of …

Inharmonic speech: A tool for the study of speech perception and separation

JH McDermott, DPW Ellis… - SAPA-SCALE Conference, 2012 - isca-archive.org
Sounds created by a periodic process have a Fourier representation with harmonic structure,
i.e., components at multiples of a fundamental frequency. Harmonic frequency relations are a …

Postural control of the vocal tract affects auditory speech perception

HH Yeung, M Scott - Journal of Experimental Psychology: General, 2021 - psycnet.apa.org
Many researchers have proposed that sensorimotor information about the dynamic
production of speech gestures can supplement the auditory perception of speech. Here we …

Higher order waveform symmetry measure and its application to periodicity detectors for speech and singing with fine temporal resolution

H Kawahara, M Morise, R Nisimura… - 2013 IEEE International …, 2013 - ieeexplore.ieee.org
Another simple and high-speed F0 extractor with high temporal resolution based on our
previous proposal has been developed by adding a higher-order symmetry measure. This …

Application of the velvet noise and its variant for synthetic speech and singing

H Kawahara - IPSJ SIG Technical Report, 2018 - ipsj.ixsq.nii.ac.jp
The Velvet noise is a sparse signal which sounds smoother than Gaussian white noise. We
propose the direct use of the velvet noise and application of its variant for speech and …

Analysis and synthesis of strong vocal expressions: Extension and application of audio texture features to singing voice

H Kawahara, M Morise - 2012 IEEE International Conference …, 2012 - ieeexplore.ieee.org
Realistic reconstruction and manipulation of strong vocal expressions found in singing
voices is a challenging and exciting topic. A speech analysis, modification and resynthesis …

Excitation source analysis for high-quality speech manipulation systems based on an interference-free representation of group delay with minimum phase …

H Kawahara, M Morise, T Toda, H Banno… - … Annual Conference of …, 2014 - isca-archive.org
A group delay-based excitation source analysis and design method is introduced for
extension of TANDEM-STRAIGHT, a speech analysis, modification and synthesis system …