Improved subword modeling for WFST-based speech recognition

P Smit, S Virpioja, M Kurimo - Interspeech, 2017 - research.aalto.fi
Because in agglutinative languages the number of observed word forms is very high,
subword units are often utilized in speech recognition. However, the proper use of subword …

Leveraging a Character, Word and Prosody Triplet for an ASR Error Robust and Agglutination Friendly Punctuation Approach.

G Szaszák, MA Tündik - Interspeech, 2019 - isca-archive.org
Punctuating ASR transcript has received increasing attention recently, and well-performing
approaches were presented based on sequence-to-sequence modelling, exploiting textual …

User-centric evaluation of automatic punctuation in ASR closed captioning

MÁ Tündik, G Szaszák, G Gosztolya, A Beke - 2018 - real.mtak.hu
Punctuation of ASR-produced transcripts has received increasing attention in the recent
years; RNN-based sequence modelling solutions which exploit textual and/or acoustic …

Joint word- and character-level embedding CNN-RNN models for punctuation restoration

MÁ Tündik, G Szaszák - 2018 9th IEEE International …, 2018 - ieeexplore.ieee.org
The sequence-to-sequence modelling paradigm has been successfully used in automatic
punctuation of text generated by Automatic Speech Recognizers (ASR), using bidirectional …

Assessing the Semantic Space Bias Caused by ASR Error Propagation and its Effect on Spoken Document Summarization.

MA Tündik, V Kaszás, G Szaszák - INTERSPEECH, 2019 - isca-archive.org
Ambitions in artificial intelligence involve machine understanding of human language. The
state-of-the-art approach for Spoken Language Understanding is using an Automatic …

A prosody inspired RNN approach for punctuation of machine produced speech transcripts to improve human readability

A Moró, G Szaszák - 2017 8th IEEE International Conference …, 2017 - ieeexplore.ieee.org
Speech communication human-machine interfaces exploit automatic speech recognition to
implement speech-to-text conversion. Unfortunately, in the past, not much effort has been …

Towards abstractive summarization in Hungarian

M Makrai, ÁM Tündik, B Indig, G Szaszák - XVIII. Magyar Számítógépes …, 2022 - hlt.bme.hu
We publish an abstractive summarizer for Hungarian, an encoder-decoder model initialized
with huBERT, and fine-tuned on the ELTE.DH corpus of former Hungarian news portals …

An audio-based sequential punctuation model for ASR and its effect on human readability

G Szaszák - Acta Polytechnica Hungarica, 2019 - epa.niif.hu
Inserting punctuation marks into the word chain hypothesis produced by automatic speech
recognition (ASR) has long been a neglected task. In several application domains of ASR …

A bilingual comparison of MaxEnt- and RNN-based punctuation restoration in speech transcripts

MÁ Tündik, B Tarján, G Szaszák - 2017 8th IEEE International …, 2017 - ieeexplore.ieee.org
Closed captioning is a common method to improve accessibility of TV programs for people
who are hearing impaired or hard of hearing, while representing an application relevant for …

Low Latency MaxEnt- and RNN-Based Word Sequence Models for Punctuation Restoration of Closed Caption Data

MÁ Tündik, B Tarján, G Szaszák - … , SLSP 2017, Le Mans, France, October …, 2017 - Springer
Automatic Speech Recognition (ASR) rarely addresses the punctuation of the
obtained transcriptions. Recently, Recurrent Neural Network (RNN) based models were …