NLP for the Greek language: A brief survey

K Papantoniou, Y Tzitzikas - 11th Hellenic Conference on Artificial …, 2020 - dl.acm.org
There is a plethora of methods, tools and resources for processing text in the English
language, however this is not the case for other languages, like Greek. Due to the increasing …

A survey on handwritten documents word spotting

R Ahmed, WG Al-Khatib, S Mahmoud - International Journal of Multimedia …, 2017 - Springer
Along with the explosive growth of the amount of handwritten documents that are preserved,
processed and accessed in a digital form, handwritten document images word spotting has …

A survey of historical document image datasets

K Nikolaidou, M Seuret, H Mokayed… - International Journal on …, 2022 - Springer
This paper presents a systematic literature review of image datasets for document image
analysis, focusing on historical documents, such as handwritten manuscripts and early …

DIVA-HisDB: A precisely annotated large dataset of challenging medieval manuscripts

F Simistira, M Seuret, N Eichenberger… - … on Frontiers in …, 2016 - ieeexplore.ieee.org
This paper introduces a publicly available historical manuscript database DIVA-HisDB for
the evaluation of several Document Image Analysis (DIA) tasks. The database consists of …

U-DIADS-Bib: a full and few-shot pixel-precise dataset for document layout analysis of ancient manuscripts

S Zottin, A De Nardin, E Colombi, C Piciarelli… - Neural Computing and …, 2024 - Springer
Document Layout Analysis, which is the task of identifying different semantic
regions inside of a document page, is a subject of great interest for both computer scientists …

Optical character recognition of 19th century classical commentaries: the current state of affairs

M Romanello, S Najem-Meyer… - Proceedings of the 6th …, 2021 - dl.acm.org
Together with critical editions and translations, commentaries are one of the main genres of
publication in literary and textual scholarship, and have a century-long tradition. Yet, the …

CArDIS: A Swedish historical handwritten character and word dataset

A Yavariabdi, H Kusetogullari, T Celik… - IEEE …, 2022 - ieeexplore.ieee.org
This paper introduces a new publicly available image-based Swedish historical handwritten
character and word dataset named Character Arkiv Digital Sweden (CArDIS) (https …

Few-shot symbol classification via self-supervised learning and nearest neighbor

M Alfaro-Contreras, A Ríos-Vila, JJ Valero-Mas… - Pattern Recognition …, 2023 - Elsevier
The recognition of symbols within document images is one of the most relevant steps
involved in the Document Analysis field. While current state-of-the-art methods based on …

Recognition of historical Greek polytonic scripts using LSTM networks

F Simistira, A Ul-Hassan… - 2015 13th …, 2015 - ieeexplore.ieee.org
This paper reports on high-performance Optical Character Recognition (OCR) experiments
using Long Short-Term Memory (LSTM) Networks for Greek polytonic script. Even though …

Zoning aggregated hypercolumns for keyword spotting

G Sfikas, G Retsinas, B Gatos - 2016 15th international …, 2016 - ieeexplore.ieee.org
In this paper we present a novel descriptor and method for segmentation-based keyword
spotting. We introduce Zoning-Aggregated Hypercolumn features as pixel-level cues for …