Survey of post-OCR processing approaches
Optical character recognition (OCR) is one of the most popular techniques used for
converting printed documents into machine-readable ones. While OCR engines can do well …
An OCR post-correction approach using deep learning for processing medical reports
S Karthikeyan, AGS de Herrera… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
According to a recent Deloitte study, the COVID-19 pandemic continues to place a huge
strain on the global health care sector. Covid-19 has also catalysed digital transformation …
Using automated methods to detect safety problems with health information technology: a scoping review
Objective To summarize the research literature evaluating automated methods for early
detection of safety problems with health information technology (HIT). Materials and …
Automated misspelling detection and correction in Persian clinical text
Accurate electronic health records are important for clinical care, research, and patient
safety assurance. Correction of misspelled words is required to ensure the correct …
A hybrid solution for extracting information from unstructured data using optical character recognition (OCR) with natural language processing (NLP)
B Dash - Research Gate, 2021 - researchgate.net
With rapid digitalization, organizations are producing a lot of data as part of their day-to-day
operations. These data are stored either on their legacy platforms or in any cloud storage …
Correcting Arabic soft spelling mistakes using BiLSTM-based machine learning
Soft spelling errors are a class of spelling mistakes that is widespread among native Arabic
speakers and foreign learners alike. Some of these errors are typographical in nature. They …
Named entity recognition for Chinese biomedical patents
Y Hu, S Verberne - … of the 28th international conference on …, 2020 - aclanthology.org
There is a large body of work on Biomedical Entity Recognition (Bio-NER) for English but
there have only been a few attempts addressing NER for Chinese biomedical texts. Because …
Generating a training corpus for OCR post-correction using encoder-decoder model
In this paper we present a novel approach to the automatic correction of OCR-induced
orthographic errors in a given text. While current systems depend heavily on large training …
Upcycle your OCR: Reusing OCRs for post-OCR text correction in Romanised Sanskrit
We propose a post-OCR text correction approach for digitising texts in Romanised Sanskrit.
Owing to the lack of resources our approach uses OCR models trained for other languages …
Improving the quality of Persian clinical text with a novel spelling correction system
SMS Dashti, SF Dashti - BMC Medical Informatics and Decision Making, 2024 - Springer
Background The accuracy of spelling in Electronic Health Records (EHRs) is a critical factor
for efficient clinical care, research, and ensuring patient safety. The Persian language, with …