Data-driven sentence simplification: Survey and benchmark

F Alva-Manchego, C Scarton, L Specia - Computational Linguistics, 2020 - direct.mit.edu
Sentence Simplification (SS) aims to modify a sentence in order to make it easier to read
and understand. In order to do so, several rewriting transformations can be performed such …

ASSET: A dataset for tuning and evaluation of sentence simplification models with multiple rewriting transformations

F Alva-Manchego, L Martin, A Bordes… - arXiv preprint arXiv …, 2020 - arxiv.org
In order to simplify a sentence, human editors perform multiple rewriting transformations:
they split it into several shorter sentences, paraphrase words (i.e., replacing complex words or …

Wikis and collaborative writing applications in health care: a scoping review

PM Archambault, TH Van De Belt, FJ Grajales III… - Journal of medical …, 2013 - jmir.org
Background: Collaborative writing applications (e.g., wikis and Google Documents) hold the
potential to improve the use of evidence in both public health and health care. The rapid rise …

Quantifying Wikipedia Usage Patterns Before Stock Market Moves

HS Moat, C Curme, A Avakian, DY Kenett… - Scientific reports, 2013 - nature.com
Financial crises result from a catastrophic combination of actions. Vast stock market datasets
offer us a window into some of the actions that have led to these crises. Here, we investigate …

Early prediction of movie box office success based on Wikipedia activity big data

M Mestyán, T Yasseri, J Kertész - PLoS ONE, 2013 - journals.plos.org
Use of socially generated “big data” to access information about collective states of the
minds in human societies has become a new paradigm in the emerging field of …

Dynamics of conflicts in Wikipedia

T Yasseri, R Sumi, A Rung, A Kornai, J Kertész - PLoS ONE, 2012 - journals.plos.org
In this work we study the dynamical features of editorial wars in Wikipedia (WP). Based on
our previously established algorithm, we build up samples of controversial and peaceful …

Perusal of readability with focus on web content understandability

PK Ojha, A Ismail, KK Srinivasan - … of King Saud University-Computer and …, 2021 - Elsevier
The Web has become a popular and important medium of transmitting information from one
place to another place. For making information accessible to all, we need to check their …

How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech

A Yedetore, T Linzen, R Frank, RT McCoy - arXiv preprint arXiv …, 2023 - arxiv.org
When acquiring syntax, children consistently choose hierarchical rules over competing
non-hierarchical possibilities. Is this preference due to a learning bias for hierarchical structure …

A standardized Project Gutenberg corpus for statistical analysis of natural language and quantitative linguistics

M Gerlach, F Font-Clos - Entropy, 2020 - mdpi.com
The use of Project Gutenberg (PG) as a text corpus has been extremely popular in statistical
analysis of language for more than 25 years. However, in contrast to other major linguistic …

Comparing the topological properties of real and artificially generated scientific manuscripts

DR Amancio - Scientometrics, 2015 - Springer
Recent years have witnessed the increase of competition in science. While promoting the
quality of research in many cases, an intense competition among scientists can also trigger …