On the proper treatment of tokenization in psycholinguistics

M Giulianelli, L Malagutti, JL Gastaldi, B DuSell… - arXiv preprint arXiv …, 2024 - arxiv.org
Language models are widely used in computational psycholinguistics to test theories that
relate the negative log probability (the surprisal) of a region of interest (a substring of …

Mouse Tracking for Reading (MoTR): A new naturalistic incremental processing measurement tool

EG Wilcox, C Ding, M Sachan, LA Jäger - Journal of Memory and Language, 2024 - Elsevier
We introduce Mouse Tracking for Reading (MoTR), a new incremental processing
measurement tool that can be used to collect word-by-word reading times. In a MoTR trial …

EMTeC: A Corpus of Eye Movements on Machine-Generated Texts

LS Bolliger, P Haller, ICR Cretton, DR Reich… - arXiv preprint arXiv …, 2024 - arxiv.org
The Eye Movements on Machine-Generated Texts Corpus (EMTeC) is a naturalistic
eye-movements-while-reading corpus of 107 native English speakers reading machine …