Real-world sentence boundary detection using multitask learning: A case study on French

KT Lim, J Park - Natural Language Engineering, 2024 - cambridge.org
We propose a novel approach for sentence boundary detection in text datasets in which
boundaries are not evident (e.g., sentence fragments). Although detecting sentence …

Improving speech-based end-of-turn detection via cross-modal representation learning with punctuated text data

R Masumura, M Ihori, T Tanaka, A Ando… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org
This paper presents a novel training method for speech-based end-of-turn detection for
which not only manually annotated speech data sets but also punctuated text data sets are …

An Untold Story of Preprocessing Task Evaluation: An Alignment-based Joint Evaluation Approach

EL Jo, AY Park, GT Zhang, IX Wang… - Proceedings of the …, 2024 - aclanthology.org
Preprocessing tasks such as tokenization and sentence boundary detection (SBD) have
commonly been considered NLP challenges that have already been solved. This …