Tokenizing, POS tagging, lemmatizing and parsing UD 2.0 with UDPipe

M Straka, J Straková - Proceedings of the CoNLL 2017 shared …, 2017 - aclanthology.org
Many natural language processing tasks, including the most advanced ones, routinely start
by several basic processing steps – tokenization and segmentation, most likely also POS …

Evaluating the state-of-the-art of end-to-end natural language generation: The E2E NLG challenge

O Dušek, J Novikova, V Rieser - Computer Speech & Language, 2020 - Elsevier
This paper provides a comprehensive analysis of the first shared task on End-to-End Natural
Language Generation (NLG) and identifies avenues for future research based on the results …

Findings of the third shared task on multimodal machine translation

L Barrault, F Bougares, L Specia, C Lala… - Third Conference on …, 2018 - hal.science
We present the results from the third shared task on multimodal machine translation. In this
task a source sentence in English is supplemented by an image and participating systems …

RobeCzech: Czech RoBERTa, a monolingual contextualized language representation model

M Straka, J Náplava, J Straková, D Samuel - Text, Speech, and Dialogue …, 2021 - Springer
We present RobeCzech, a monolingual RoBERTa language representation model trained
on Czech data. RoBERTa is a robustly optimized Transformer-based pretraining approach …

Assessing the quality of multiple-choice questions using GPT-4 and rule-based methods

S Moore, HA Nguyen, T Chen, J Stamper - European Conference on …, 2023 - Springer
Multiple-choice questions with item-writing flaws can negatively impact student learning and
skew analytics. These flaws are often present in student-generated questions, making it …

Neural sign language synthesis: Words are our glosses

J Zelinka, J Kanis - Proceedings of the IEEE/CVF winter …, 2020 - openaccess.thecvf.com
This paper deals with a text-to-video sign language synthesis. Instead of direct video
production, we focused on skeletal models production. Our main goal in this paper was to …

ParaBank: Monolingual bitext generation and sentential paraphrasing via lexically-constrained neural machine translation

JE Hu, R Rudinger, M Post, B Van Durme - Proceedings of the AAAI …, 2019 - ojs.aaai.org
We present PARABANK, a large-scale English paraphrase dataset that surpasses prior work
in both quantity and quality. Following the approach of PARANMT (Wieting and Gimpel …

CzEng 1.6: enlarged Czech-English parallel corpus with processing tools dockered

O Bojar, O Dušek, T Kocmi, J Libovický… - Text, Speech, and …, 2016 - Springer
We present a new release of the Czech-English parallel corpus CzEng. CzEng 1.6 consists
of about 0.5 billion words (“gigaword”) in each language. The corpus is equipped with …

Rapid detection of fake news based on machine learning methods

B Probierz, P Stefański, J Kozak - Procedia Computer Science, 2021 - Elsevier
Nowadays, it is very important to quickly recognize the false information referred to as fake
news. This is especially important in the case of news appearing on the Internet because of …

From discourse to pathology: automatic identification of Parkinson's disease patients via morphological measures across three languages

E Eyigoz, M Courson, L Sedeño, K Rogg… - Cortex, 2020 - Elsevier
Embodied cognition research on Parkinson's disease (PD) points to disruptions of
frontostriatal language functions as sensitive targets for clinical assessment. However, no …