A linear memory CTC-based algorithm for text-to-voice alignment of very long audio recordings

G Doras, Y Teytaut, A Roebel - Applied Sciences, 2023 - mdpi.com
Synchronisation of a voice recording with the corresponding text is a common task in speech
and music processing, and is used in many practical applications (automatic subtitling …

Probabilistic kernels for improved text-to-speech alignment in long audio tracks

G Bordel, M Penagarikano… - IEEE Signal …, 2015 - ieeexplore.ieee.org
The synchronization of text transcripts with audio tracks is typically solved by forced
alignment at the phonetic level. However, when dealing with either very long audio tracks or …

The development of the Cambridge University alignment systems for the Multi-Genre Broadcast challenge

P Lanchantin, MJF Gales, P Karanasou… - … IEEE Workshop on …, 2015 - ieeexplore.ieee.org
We describe the alignment systems developed both for the preparation of data for the Multi-
Genre Broadcast (MGB) challenge and for our participation in the transcription and …

Finger tracking: facilitating non-commercial content production for mobile e-reading applications

CD Epp, C Munteanu, B Axtell, K Ravinthiran… - Proceedings of the 19th …, 2017 - dl.acm.org
Limited literacy and visual impairment reduce the ability of many to read on their own.
Current e-reader solutions rely on either unnatural synthetic voices or professionally …

Enhancing Data-Driven Phone Confusions Using Restricted Recognition.

M Kane, J Carson-Berndsen - INTERSPEECH, 2016 - isca-archive.org
This paper presents a novel approach to address data sparseness in standard confusion
matrices and demonstrates how enhanced matrices, which capture additional similarities …

Improving a long audio aligner through phone-relatedness matrices for English, Spanish and Basque

A Álvarez, P Ruiz, H Arzelus - … 2014, Brno, Czech Republic, September 8 …, 2014 - Springer
A multilingual long audio alignment system is presented in the automatic subtitling domain,
supporting English, Spanish and Basque. Pre-recorded contents are recognized at …

Automatic alignment of phonetic transcriptions for Russian

D Kocharov - Speech and Computer: 16th International Conference …, 2014 - Springer
This paper presents automatic alignment of Russian phonetic pronunciations using the
information about phonetic nature of speech sounds in the aligned transcription sequences …

Speech technologies for the audiovisual and multimedia interaction environments

A Álvarez Muniain - 2016 - portalcientifico.uvigo.gal
This thesis analyzes the current state of several audio analysis and speech processing
technologies applied to sectors such as the audiovisual sector and that of …

Phoneme alignment using the information on phonological processes in continuous speech

D Kocharov - Proceedings of the Tenth International Conference …, 2016 - aclanthology.org
The current study focuses on optimization of Levenshtein algorithm for the purpose of
computing the optimal alignment between two phoneme transcriptions of spoken utterance …

On automatic cross-lingual subtitle timing

M Bohac, M Rott, K Blavka - 2015 IEEE International Workshop …, 2015 - ieeexplore.ieee.org
In the last years automatic speech recognition (ASR) technologies are applied as assistant
technologies for hearing-impaired people. ASR technologies typically produce some kind of …