Automatic genre identification: a survey

T Kuzman, N Ljubešić - Language Resources and Evaluation, 2023 - Springer
Automatic genre identification (AGI) is a text classification task focused on genres, i.e., text
categories defined by the author's purpose, common function of the text, and the text's …

Automatic genre identification for robust enrichment of massive text collections: Investigation of classification methods in the era of large language models

T Kuzman, I Mozetič, N Ljubešić - Machine Learning and Knowledge …, 2023 - mdpi.com
Massive text collections are the backbone of large language models, the main ingredient of
the current significant progress in artificial intelligence. However, as these collections are …

User‐based identification of Web genres

MA Rosso - Journal of the American Society for Information …, 2008 - Wiley Online Library
This research explores the use of genre as a document descriptor in order to improve the
effectiveness of Web searching. A major issue to be resolved is the identification of what …

The GINCO training dataset for web genre identification of documents out in the wild

T Kuzman, P Rupnik, N Ljubešić - arXiv preprint arXiv:2201.03857, 2022 - arxiv.org
This paper presents a new training dataset for automatic genre identification GINCO, which
is based on 1,125 crawled Slovenian web documents that consist of 650 thousand words …

Zero, single, or multi? Genre of web pages through the users' perspective

M Santini - Information Processing & Management, 2008 - Elsevier
The goal of the study presented in this article is to investigate to what extent the classification
of a web page by a single genre matches the users' perspective. The extent of agreement on …

Algorithms and programs for automatic text processing

ВА Яцко - Вестник Иркутского государственного …, 2012 - cyberleninka.ru
The paper reviews the most common algorithms and programs for automatic text
processing. It describes the features of the algorithms and programs used at …

Theorizing about genre and cybergenre

R Caballero - CORELL: Computer resources for language …, 2008 - researchgate.net
The present paper provides an overview of several approaches and definitions of the
concept of genre as a means to discuss whether the defining traits of genre (particularly, its …

Exploiting task-document relations in support of information retrieval in the workplace

L Freund - 2008 - collectionscanada.gc.ca
Increasingly, workplace information seeking takes place in digital information environments
and is reliant upon search systems. Existing systems are designed to retrieve information …

Cross-testing a genre classification model for the web

M Santini - Genres on the web: Computational models and …, 2011 - Springer
The main aim of the experiments described in this chapter is to investigate ways of
assessing the robustness and stability of an Automatic Genre Identification (AGI) model for …

Towards a Reference Corpus of Web Genres for the Evaluation of Genre Identification Systems.

G Rehm, M Santini, A Mehler, P Braslavski, R Gleim… - LREC, 2008 - academia.edu
We present initial results from an international and multi-disciplinary research collaboration
that aims at the construction of a reference corpus of web genres. The primary application …