OffensEval 2023: Offensive language identification in the age of Large Language Models

M Zampieri, S Rosenthal, P Nakov… - Natural Language …, 2023 - cambridge.org
The OffensEval shared tasks organized as part of SemEval-2019–2020 were very popular,
attracting over 1300 participating teams. The two editions of the shared task helped advance …

Automatic readability assessment of German sentences with transformer ensembles

PG Blaneck, T Bornheim, N Grieger… - arXiv preprint arXiv …, 2022 - arxiv.org
Reliable methods for automatic readability assessment have the potential to impact a variety
of fields, ranging from machine translation to self-informed learning. Recently, large …

The problem of varying annotations to identify abusive language in social media content

N Seemann, YS Lee, J Höllig… - Natural Language …, 2023 - cambridge.org
With the increase of user-generated content on social media, the detection of abusive
language has become crucial and is therefore reflected in several shared tasks that have …

MBTI personality prediction based on BERT classification

H Zhang - Highlights in Science, Engineering and Technology, 2023 - drpress.org
Young people today tend to express their feelings and socialize on the internet instead of in
real life, which makes social media practical in defining one's personality since their …

POLygraph: Polish Fake News Dataset

D Dzienisiewicz, F Graliński, P Jabłoński… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents the POLygraph dataset, a unique resource for fake news detection in
Polish. The dataset, created by an interdisciplinary team, is composed of two parts: the "fake …

LCT-1 at SemEval-2023 Task 10: Pre-training and Multi-task Learning for Sexism Detection and Classification

K Chernyshev, E Garanina, D Bayram, Q Zheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Misogyny and sexism are growing problems in social media. Advances have been made in
online sexism detection but the systems are often uninterpretable. SemEval-2023 Task 10 …

Assessing In-context Learning and Fine-tuning for Topic Classification of German Web Data

J Schelb, R Ulloa, A Spitz - arXiv preprint arXiv:2407.16516, 2024 - arxiv.org
Researchers in the political and social sciences often rely on classification models to
analyze trends in information consumption by examining browsing histories of millions of …

Detecting Sexism in German Online Newspaper Comments with Open-Source Text Embeddings (Team GDA, GermEval2024 Shared Task 1: GerMS-Detect, Subtasks …

F Bremm, PG Blaneck, T Bornheim, N Grieger… - arXiv preprint arXiv …, 2024 - arxiv.org
Sexism in online media comments is a pervasive challenge that often manifests subtly,
complicating moderation efforts as interpretations of what constitutes sexism can vary …

Constructing ensembles for hate speech detection

IE Kucukkaya, C Toraman - Natural Language Processing, 2024 - cambridge.org
Hate speech against individuals and groups with certain demographics is a major issue in
social media. Supervised models for hate speech detection mostly utilize labeled data …

Offensive text detection across languages and datasets using rule-based and hybrid methods

KA Gemes, Á Kovács, G Recski - # …, 2023 - repositum.tuwien.at
We investigate the potential of rule-based systems for the task of offensive text detection in
English and German, and demonstrate their effectiveness in low-resource settings, as an …