Development of a Robust Dataset for Printed Tamil Character Recognition
International Conference on Machine Learning, IoT and Big Data, 2023 • Springer
Abstract
Although character datasets are publicly available for many languages, very few standardized datasets exist for Tamil characters. This article presents a subset of the Mepco Tamil Character database, an isolated-character dataset of Tamil fonts representing printed characters. The dataset includes 124 glyphs representing the 247 characters of the Tamil language. Its robustness is tested through multiple experiments with an SVM classifier, and it is compared against UJTDchar, another dataset available for the Tamil language. We have also verified the robustness of DIGI-Net, a CNN architecture, for this Tamil character recognition problem using both the UJTDchar dataset and the Mepco Tamil Character dataset. We report accuracies of 90.59% and 97.66% with SVM and the DIGI-Net CNN, respectively, on our newly created dataset.