Vietnamese capitalization and punctuation recovery models

HTT Uyen, NA Tu, TD Huy - arXiv preprint arXiv:2207.01312, 2022 - arxiv.org
Despite the rise of recent performant methods in Automatic Speech Recognition (ASR), such
methods do not ensure proper casing and punctuation for their outputs. This problem has a …

CDCPP: Cross-Domain Chinese Punctuation Prediction

P Liu, W Wang, L Qiu, B Du - Proceedings of the 19th Chinese …, 2020 - aclanthology.org
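Punctuation marks play a major role in text understanding. At present, however, punctuation in Chinese text,
especially in social media and question-answering text, is very often wrong or missing, which seriously affects semantic analysis, machine translation, and other …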

Punctuation prediction in Bangla text

H Rahman, MRS Rahin, AM Mahbub… - ACM Transactions on …, 2023 - dl.acm.org
Punctuation prediction is critical as it can enhance the readability of machine-transcribed
speeches or texts significantly by adding appropriate punctuation. Furthermore, systems like …

An efficient transformer-based model for Vietnamese punctuation prediction

H Tran, CV Dinh, Q Pham, BT Nguyen - International conference on …, 2021 - Springer
In both formal and informal texts, missing punctuation marks make the texts confusing and
challenging to read. This paper aims to conduct exhaustive experiments to investigate the …

Transformer Based Punctuation Restoration for Turkish

U Kurt, A Çayır - … on Computer Science and Engineering (UBMK …, 2023 - ieeexplore.ieee.org
Mobile devices and social media platforms make communication faster than humans have
had before, thanks to the technologies such as automatic speech recognition (ASR) …

Vietnamese punctuation prediction using deep neural networks

T Pham, N Nguyen, Q Pham, H Cao… - SOFSEM 2020: Theory and …, 2020 - Springer
Adding appropriate punctuation marks into text is an essential step in speech-to-text where
such information is usually not available. While this has been extensively studied for …

CDCPP: Cross-Domain Chinese Punctuation Prediction

刘鹏远, 王伟康, 邱立坤, 杜冰洁 - 中文信息学报 (Journal of Chinese Information Processing), 2021 - jcip.cipsc.org.cn
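In Chinese text, especially in social media and question-answering text, punctuation marks are very often wrong or missing,
which seriously degrades natural language processing tasks such as semantic analysis and machine translation. Current work on punctuation …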

Fake Advertisements Detection Using Automated Multimodal Learning: A Case Study for Vietnamese Real Estate Data

DV Nguyen-Duc, TT Nguyen, CV Nguyen - Available at SSRN 4579070 - papers.ssrn.com
The popularity of e-commerce has given rise to fake advertisements that can expose users
to financial and data risks while damaging the reputation of these e-commerce platforms. For …

STHAL: Location-mention Identification in Tweets of Indian-context

K Verma, S Sinha, MS Akhtar… - Proceedings of the 17th …, 2020 - aclanthology.org
We investigate the problem of extracting Indian-locations from a given crowd-sourced textual
dataset. The problem of extracting fine-grained Indian-locations has many challenges. One …

PREDICTION OF PUNCTUATION MARKS FOR CLASSICAL AND MODERN STANDARD ARABIC

M Ala'a - 2016 - researchgate.net
This research aims to investigate and argue that automatic punctuation annotation including
sentence terminal prediction is possible using machine learning algorithms on Classical …