Paper Title
Abstractive Information Extraction from Scanned Invoices (AIESI) using End-to-end Sequential Approach
Authors
Abstract
Recent progress in machine learning and deep learning makes it possible to build OCR models with higher accuracy. Optical Character Recognition (OCR) is the process of extracting text from documents and scanned images. For document data streamlining, we are interested in fields such as payee name, total amount, and address. The extracted information gives a fuller picture of the data, which helps with fast document search, efficient database indexing, and data analytics. Abstractive Information Extraction from Scanned Invoices (AIESI) is the process of extracting information such as date, total amount, and payee name from scanned receipts. Using AIESI, we can eliminate the human effort needed to extract key parameters from scanned documents. In this paper, we propose an improved method that combines the visual and textual features of invoices to extract key invoice parameters using a word-wise BiLSTM.