Paper Title
HWRCNet: Handwritten Word Recognition in JPEG Compressed Domain using CNN-BiLSTM Network
Authors
Abstract
Handwritten word recognition from document images using deep learning is an active research area in the field of Document Image Analysis and Recognition. In the present era of big data, more and more documents are generated and archived in compressed form for better storage and transmission efficiency, so recognizing words directly in the compressed domain, without decompression, has become an important and challenging problem. Traditional methods first decompress the documents and then apply learning algorithms to the decompressed images; novel algorithms are therefore needed to apply learning techniques directly to the compressed representations. In this direction, this paper proposes HWRCNet, a novel model for handwritten word recognition directly in the compressed domain, focusing specifically on the JPEG format. The proposed model combines a Convolutional Neural Network (CNN) with a Bi-Directional Long Short-Term Memory (BiLSTM) based Recurrent Neural Network (RNN). We train the model on JPEG compressed word images and observe promising performance, with $89.05\%$ word recognition accuracy and $13.37\%$ character error rate.
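The abstract reports two metrics without defining them. As a minimal sketch, assuming the conventional definitions (word recognition accuracy as the fraction of exactly matched words, and character error rate as total Levenshtein edit distance divided by the total number of reference characters), they could be computed as follows. The function names and the sample strings are hypothetical and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i characters of ref
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(refs, hyps):
    """Total edit distance divided by total reference characters (assumed definition)."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return total_edits / total_chars


def word_accuracy(refs, hyps):
    """Fraction of predictions that match the reference word exactly (assumed definition)."""
    correct = sum(r == h for r, h in zip(refs, hyps))
    return correct / len(refs)


# Hypothetical ground-truth words and model predictions, for illustration only.
refs = ["word", "image", "jpeg"]
hyps = ["word", "imaqe", "jpg"]
print(f"word accuracy: {word_accuracy(refs, hyps):.4f}")
print(f"character error rate: {character_error_rate(refs, hyps):.4f}")
```

Under these definitions the two numbers are measured at different granularities, so they are not complements of each other: a single wrong character makes the whole word count as an error in the accuracy figure but contributes only one edit to the character error rate.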