Synthetic document generator for annotation-free layout recognition

N Raman, S Shah, M Veloso - Pattern Recognition, 2022 - Elsevier
Analyzing the layout of a document to identify headers, sections, tables, figures etc. is critical
to understanding its content. Deep learning based approaches for detecting the layout …

OCR improves machine translation for low-resource languages

O Ignat, J Maillard, V Chaudhary, F Guzmán - arXiv preprint arXiv …, 2022 - arxiv.org
We aim to investigate the performance of current OCR systems on low-resource languages
and low-resource scripts. We introduce and make publicly available a novel benchmark …

Scrambled text: training Language Models to correct OCR errors using synthetic data

J Bourne - arXiv preprint arXiv:2409.19735, 2024 - arxiv.org
OCR errors are common in digitised historical archives, significantly affecting their usability
and value. Generative Language Models (LMs) have shown potential for correcting these …

Recognition of Handwritten Swedish Sentences With Deep Learning

H Kara Fallah - 2023 - diva-portal.org
This study attempts the task of handwritten text recognition within the context of the Swedish
language. It examines the applicability of deep neural networks to comprehend handwritten …

Sim2Real Docs: Domain Randomization for Documents in Natural Scenes using Ray-traced Rendering

N Maddikunta, H Zhao, S Keswani, A Samuel… - arXiv preprint arXiv …, 2021 - arxiv.org
In the past, computer vision systems for digitized documents could rely on systematically
captured, high-quality scans. Today, transactions involving digital documents are more likely …

UNDERWATER AND DOCUMENT IMAGE ENHANCEMENT USING DEEP LEARNING

A GANESH - 2022 - eescholars.iitm.ac.in
Image Enhancement is a topic of key importance with a wide variety of uses, both for visual
perception and for follow-on tasks like optical character recognition. This thesis explores two …