Locally typical sampling
Today's probabilistic language generators fall short when it comes to producing coherent
and fluent text despite the fact that the underlying models perform well under standard …
Testing the predictions of surprisal theory in 11 languages
Surprisal theory posits that less-predictable words should take more time to process, with
word predictability quantified as surprisal, i.e., negative log probability in context. While …
Revisiting the uniform information density hypothesis
The uniform information density (UID) hypothesis posits a preference among language
users for utterances structured such that information is distributed uniformly across a signal …
On the probability-quality paradox in language generation
When generating natural language from neural probabilistic models, high probability does
not always coincide with high quality: It has often been observed that mode-seeking …
Revisiting the optimality of word lengths
Zipf (1935) posited that wordforms are optimized to minimize utterances' communicative
costs. Under the assumption that cost is given by an utterance's length, he supported this …
A Cross-Linguistic Pressure for Uniform Information Density in Word Order
While natural languages differ widely in both canonical word order and word order flexibility,
their word orders still follow shared cross-linguistic statistical patterns, often attributed to …
An information-theoretic analysis of self-supervised discrete representations of speech
Self-supervised representation learning for speech often involves a quantization step that
transforms the acoustic input into discrete units. However, it remains unclear how to …
Quantifying the redundancy between prosody and text
Prosody--the suprasegmental component of speech, including pitch, loudness, and tempo--
carries critical aspects of meaning. However, the relationship between the information …
Using linguistic typology to enrich multilingual lexicons: the case of lexical gaps in kinship
This paper describes a method to enrich lexical resources with content relating to linguistic
diversity, based on knowledge from the field of lexical typology. We capture the …
Grammatical cues to subjecthood are redundant in a majority of simple clauses across languages
Grammatical cues are sometimes redundant with word meanings in natural language. For
instance, English word order rules constrain the word order of a sentence like “The dog …