BERTje: A Dutch BERT model

W De Vries, A van Cranenburgh, A Bisazza… - arXiv preprint arXiv …, 2019 - arxiv.org
The transformer-based pre-trained language model BERT has helped to improve
state-of-the-art performance on many natural language processing (NLP) tasks. Using the same …

RobBERT: a Dutch RoBERTa-based language model

P Delobelle, T Winters, B Berendt - arXiv preprint arXiv:2001.06286, 2020 - arxiv.org
Pre-trained language models have been dominating the field of natural language
processing in recent years, and have led to significant performance gains for various …

What's so special about BERT's layers? A closer look at the NLP pipeline in monolingual and multilingual models

W De Vries, A Van Cranenburgh, M Nissim - arXiv preprint arXiv …, 2020 - arxiv.org
Peeking into the inner workings of BERT has shown that its layers resemble the classical
NLP pipeline, with progressively more complex tasks being concentrated in later layers. To …

All Mixed Up? Finding the Optimal Feature Set for General Readability Prediction and Its Application to English and Dutch

O De Clercq, V Hoste - Computational Linguistics, 2016 - direct.mit.edu
Readability research has a long and rich tradition, but there has been too little focus on
general readability prediction without targeting a specific audience or text genre. Moreover …

A robust transformation-based learning approach using ripple down rules for part-of-speech tagging

DQ Nguyen, DQ Nguyen, DD Pham… - AI …, 2016 - content.iospress.com
In this paper, we propose a new approach to construct a system of transformation rules for
the Part-of-Speech (POS) tagging task. Our approach is based on an incremental …

SICK-NL: a dataset for Dutch natural language inference

G Wijnholds, M Moortgat - arXiv preprint arXiv:2101.05716, 2021 - arxiv.org
We present SICK-NL (read: signal), a dataset targeting Natural Language Inference in
Dutch. SICK-NL is obtained by translating the SICK dataset of Marelli et al. (2014) from …

Towards adaptive support for self‐regulated learning of causal relations: Evaluating four Dutch word vector models

HJ Pijeira‐Díaz, S Braumann… - British Journal of …, 2024 - Wiley Online Library
Advances in computational language models increasingly enable adaptive support for
self‐regulated learning (SRL) in digital learning environments (DLEs; e.g., via automated …

Translationese and post-editese: How comparable is comparable quality?

J Daems, O De Clercq, L Macken - Linguistica Antverpiensia New …, 2017 - biblio.ugent.be
Whereas post-edited texts have been shown to be either of comparable quality to human
translations or better, one study shows that people still seem to prefer human-translated …

Using the crowd for readability prediction

O De Clercq, V Hoste, B Desmet… - Natural Language …, 2014 - cambridge.org
While human annotation is crucial for many natural language processing tasks, it is often
very expensive and time-consuming. Inspired by previous work on crowdsourcing, we …

LeTs Preprocess: The multilingual LT3 linguistic preprocessing toolkit

M Van de Kauter, G Coorman, E Lefever… - … Linguistics in the …, 2013 - clinjournal.org
This paper presents the LeTs Preprocess Toolkit, a suite of robust high-performance
preprocessing modules including Part-of-Speech Taggers, Lemmatizers and Named Entity …