Authors
Fotini Simistira, Adnan Ul-Hassan, Vassilis Papavassiliou, Basilis Gatos, Vassilis Katsouros, Marcus Liwicki
Publication date
2015/8/23
Conference
2015 13th International Conference on Document Analysis and Recognition (ICDAR)
Pages
766-770
Publisher
IEEE
Description
This paper reports on high-performance Optical Character Recognition (OCR) experiments using Long Short-Term Memory (LSTM) Networks for Greek polytonic script. Even though there are many Greek polytonic manuscripts, the digitization of such documents has not been widely applied, and very limited work has been done on the recognition of such scripts. We have collected a large number of diverse document pages of Greek polytonic scripts in a novel database, called Polyton-DB, containing 15,689 textlines of synthetic and authentic printed scripts and performed baseline experiments using LSTM Networks. Evaluation results show that the character error rate obtained with LSTM varies from 5.51% to 14.68% (depending on the document) and is better than two well-known OCR engines, namely, Tesseract and ABBYY FineReader.
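The character error rate mentioned above is commonly computed as the edit distance between the recognized text and the ground truth, divided by the ground-truth length. The following is a minimal sketch under that assumption; the abstract does not describe the paper's exact evaluation protocol, and the function names `levenshtein` and `character_error_rate` are hypothetical. Normalizing both strings to Unicode NFC is also an assumption, added because polytonic Greek characters can be encoded either precomposed or as a base letter plus combining marks.

```python
import unicodedata


def levenshtein(ref: str, hyp: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / reference length, after NFC normalization (assumed)."""
    ref = unicodedata.normalize("NFC", reference)
    hyp = unicodedata.normalize("NFC", hypothesis)
    if not ref:
        raise ValueError("reference text must not be empty")
    return levenshtein(ref, hyp) / len(ref)
```

Under this definition, a text line with 100 ground-truth characters and 6 character edits would have a CER of 6%.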
Total citations
[Chart: citations per year, 2016 to 2024]
Scholar articles
F Simistira, A Ul-Hassan, V Papavassiliou, B Gatos… - 2015 13th International Conference on Document …, 2015