On the state of the art in authorship attribution and authorship verification

J Tyo, B Dhingra, ZC Lipton - arXiv preprint arXiv:2209.06869, 2022 - arxiv.org
Despite decades of research on authorship attribution (AA) and authorship verification (AV),
inconsistent dataset splits/filtering and mismatched evaluation methods make it difficult to …

A case study of sockpuppet detection in Wikipedia

T Solorio, R Hasan, M Mizan - Proceedings of the Workshop on …, 2013 - aclanthology.org
This paper presents preliminary results of using authorship attribution methods for the
detection of sockpuppeteering in Wikipedia. Sockpuppets are fake accounts created by …

Weak and strong discourse markers in speech, chat, and writing: Do signals compensate for ambiguity in explicit relations?

L Crible - Discourse processes, 2020 - Taylor & Francis
Ambiguity in discourse is pervasive, yet mechanisms of production and processing suggest
that it tends to be compensated in context. The present study sets out to analyze the …

Developing a benchmark for reducing data bias in authorship attribution

B Murauer, G Specht - Proceedings of the 2nd Workshop on …, 2021 - aclanthology.org
Authorship attribution is the task of assigning an unknown document to an author from a set
of candidates. In the past, studies in this field use various evaluation datasets to demonstrate …

Sockpuppet detection in Wikipedia: A corpus of real-world deceptive writing for linking identities

T Solorio, R Hasan, M Mizan - arXiv preprint arXiv:1310.6772, 2013 - arxiv.org
This paper describes the corpus of sockpuppet cases we gathered from Wikipedia. A
sockpuppet is an online user account created with a fake identity for the purpose of covering …

Person identification from text and speech genre samples

J Goldstein, R Winder, R Sabin - … of the 12th Conference of the …, 2009 - aclanthology.org
In this paper, we describe experiments conducted on identifying a person using a novel
unique correlated corpus of text and audio samples of the person's communication in six …

A comparison of several AI techniques for authorship attribution on Romanian texts

SM Avram, M Oltean - Mathematics, 2022 - mdpi.com
Determining the author of a text is a difficult task. Here, we compare multiple Artificial
Intelligence techniques for classifying literary texts written by multiple authors by taking into …

Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

O Shaikh, M Lam, J Hejna, Y Shao, M Bernstein… - arXiv preprint arXiv …, 2024 - arxiv.org
Language models are aligned to emulate the collective voice of many, resulting in outputs
that align with no one in particular. Steering LLMs away from generic output is possible …

The syntax and semantics of coherence relations: From relative configurations to predictive signals

L Crible - International Journal of Corpus Linguistics, 2022 - jbe-platform.com
This corpus-based study investigates the inter-relation between discourse markers (DMs)
and other contextual signals that contribute to the interpretation of coherence relations. The …

When do we leave discourse relations underspecified? The effect of formality and relation type

L Crible, V Demberg - Discours. Revue de linguistique …, 2020 - journals.openedition.org
Speakers have several options when they express a discourse relation: they can leave it
implicit, or make it explicit, usually through a connective. Although not all connectives can go …