Khmer printed character recognition using attention-based Seq2Seq network


Authors: Buoy Rina, Chenda Sovisal, Kor Sokchea, Taing Nguonly

Language: eng

Classification code:

Publication information: Ho Chi Minh City Open University Journal of Science: Engineering and Technology, 2022

Physical description: p. 45732

Collection: Metadata

ID: 334060

This paper presents an end-to-end deep convolutional recurrent neural network solution for the Khmer optical character recognition (OCR) task. The proposed solution uses a sequence-to-sequence (Seq2Seq) architecture with an attention mechanism. The encoder extracts visual features from an input text-line image via layers of convolutional blocks and a layer of gated recurrent units (GRU). The features are encoded in a single context vector and a sequence of hidden states, which are fed to the decoder; the decoder emits one character at a time until a special end-of-sentence (EOS) token is reached. The attention mechanism allows the decoder network to adaptively select relevant parts of the input image while predicting a target character. The Seq2Seq Khmer OCR network is trained on a large collection of computer-generated text-line images covering multiple common Khmer fonts. Complex data augmentation is applied to both the training and validation datasets. On a validation set of 6,400 augmented images, the proposed model outperforms the state-of-the-art Tesseract OCR engine for the Khmer language, achieving a character error rate (CER) of 0.7% versus 35.9%.
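
The abstract reports results as a character error rate but does not say how it is computed. A common definition, assumed here, is the Levenshtein edit distance between the recognized text and the reference text, divided by the number of reference characters. The sketch below follows that definition and treats each Unicode code point as one character, which may differ from the unit the authors used for Khmer. The function names `edit_distance` and `character_error_rate` are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted over code points."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Total edit distance divided by total reference length (assumed definition)."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of lines")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars
```

Under this definition, a four-character reference line recognized with one wrong character gives a CER of 0.25. Summing edits and lengths over all lines before dividing weights long lines more heavily than averaging per-line rates would; which of the two the paper uses is not stated.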