Creating and using a correlated corpus to glean communicative commonalities

J Tyo, B Dhingra, ZC Lipton - arXiv preprint arXiv:2209.06869, 2022 - arxiv.org

Despite decades of research on authorship attribution (AA) and authorship verification (AV),
inconsistent dataset splits/filtering and mismatched evaluation methods make it difficult to …

Cited by 26  Related articles  All 2 versions

[PDF] aclanthology.org

[PDF] A case study of sockpuppet detection in wikipedia

T Solorio, R Hasan, M Mizan - Proceedings of the Workshop on …, 2013 - aclanthology.org

This paper presents preliminary results of using authorship attribution methods for the
detection of sockpuppeteering in Wikipedia. Sockpuppets are fake accounts created by …

Cited by 103  Related articles  All 6 versions

[PDF] ed.ac.uk

Weak and strong discourse markers in speech, chat, and writing: Do signals compensate for ambiguity in explicit relations?

L Crible - Discourse processes, 2020 - Taylor & Francis

Ambiguity in discourse is pervasive, yet mechanisms of production and processing suggest
that it tends to be compensated in context. The present study sets out to analyze the …

Cited by 21  Related articles  All 11 versions

[PDF] aclanthology.org

Developing a benchmark for reducing data bias in authorship attribution

B Murauer, G Specht - Proceedings of the 2nd Workshop on …, 2021 - aclanthology.org

Authorship attribution is the task of assigning an unknown document to an author from a set
of candidates. In the past, studies in this field use various evaluation datasets to demonstrate …

Cited by 12  Related articles  All 7 versions

[PDF] arxiv.org

Sockpuppet detection in wikipedia: A corpus of real-world deceptive writing for linking identities

T Solorio, R Hasan, M Mizan - arXiv preprint arXiv:1310.6772, 2013 - arxiv.org

This paper describes the corpus of sockpuppet cases we gathered from Wikipedia. A
sockpuppet is an online user account created with a fake identity for the purpose of covering …

Cited by 39  Related articles  All 11 versions

[PDF] aclanthology.org

[PDF] Person identification from text and speech genre samples

J Goldstein, R Winder, R Sabin - … of the 12th Conference of the …, 2009 - aclanthology.org

In this paper, we describe experiments conducted on identifying a person using a novel
unique correlated corpus of text and audio samples of the person's communication in six …

Cited by 45  Related articles  All 8 versions

[PDF] mdpi.com

A comparison of several AI techniques for authorship attribution on Romanian texts

SM Avram, M Oltean - Mathematics, 2022 - mdpi.com

Determining the author of a text is a difficult task. Here, we compare multiple Artificial
Intelligence techniques for classifying literary texts written by multiple authors by taking into …

Cited by 6  Related articles  All 9 versions

[PDF] arxiv.org

Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

O Shaikh, M Lam, J Hejna, Y Shao, M Bernstein… - arXiv preprint arXiv …, 2024 - arxiv.org

Language models are aligned to emulate the collective voice of many, resulting in outputs
that align with no one in particular. Steering LLMs away from generic output is possible …

Cited by 4  Related articles  All 2 versions

[PDF] ugent.be

The syntax and semantics of coherence relations: From relative configurations to predictive signals

L Crible - International Journal of Corpus Linguistics, 2022 - jbe-platform.com

This corpus-based study investigates the inter-relation between discourse markers (DMs)
and other contextual signals that contribute to the interpretation of coherence relations. The …

Cited by 6  Related articles  All 5 versions

[PDF] openedition.org

When do we leave discourse relations underspecified? The effect of formality and relation type

L Crible, V Demberg - Discours. Revue de linguistique …, 2020 - journals.openedition.org

Speakers have several options when they express a discourse relation: they can leave it
implicit, or make it explicit, usually through a connective. Although not all connectives can go …

Cited by 10  Related articles  All 11 versions