Processing rhythm in speech and music: Shared mechanisms and implications for developmental speech and language disorders.

A Fiveash, N Bedoin, RL Gordon, B Tillmann - Neuropsychology, 2021 - psycnet.apa.org
Objective: Music and speech are complex signals containing regularities in how they unfold
in time. Similarities between music and speech/language in terms of their auditory features …

Ten years of research on automatic voice and speech analysis of people with Alzheimer's disease and mild cognitive impairment: a systematic review article

I Martínez-Nicolás, TE Llorente… - Frontiers in …, 2021 - frontiersin.org
Background: The field of voice and speech analysis has become increasingly popular over
the last 10 years, and articles on its use in detecting neurodegenerative diseases have …

A survey on neural speech synthesis

X Tan, T Qin, F Soong, TY Liu - arXiv preprint arXiv:2106.15561, 2021 - arxiv.org
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …

Towards end-to-end prosody transfer for expressive speech synthesis with Tacotron

RJ Skerry-Ryan, E Battenberg, Y Xiao… - international …, 2018 - proceedings.mlr.press
We present an extension to the Tacotron speech synthesis architecture that learns a latent
embedding space of prosody, derived from a reference acoustic representation containing …

[BOOK][B] Intonation and prosodic structure

C Féry - 2016 - books.google.com
This book provides a state-of-the-art survey of intonation and prosodic structure. Taking a
phonological perspective, it shows how morpho-syntactic constituents are mapped to …

[BOOK][B] Cognitive psychology: A student's handbook

MW Eysenck, MT Keane - 2020 - taylorfrancis.com
The fully updated eighth edition of Cognitive Psychology: A Student's Handbook provides
comprehensive yet accessible coverage of all the key areas in the field ranging from visual …

Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis

G Sun, Y Zhang, RJ Weiss, Y Cao… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
This paper proposes a hierarchical, fine-grained and interpretable latent variable model for
prosody based on the Tacotron 2 text-to-speech model. It achieves multi-resolution …

Prosody in context: A review

J Cole - Language, Cognition and Neuroscience, 2015 - Taylor & Francis
Prosody conveys information about the linguistic context of an utterance at every level of
linguistic organisation, from the word up to the discourse context. Acoustic correlates of …

Controllable neural text-to-speech synthesis using intuitive prosodic features

T Raitio, R Rasipuram, D Castellani - arXiv preprint arXiv:2009.06775, 2020 - arxiv.org
Modern neural text-to-speech (TTS) synthesis can generate speech that is indistinguishable
from natural speech. However, the prosody of generated utterances often represents the …

Brain mechanisms of acoustic communication in humans and nonhuman primates: an evolutionary perspective

H Ackermann, SR Hage, W Ziegler - Behavioral and Brain Sciences, 2014 - cambridge.org
Any account of “what is special about the human brain” (Passingham 2008) must specify the
neural basis of our unique ability to produce speech and delineate how these remarkable …