EATEN: Entity-aware Attention for Single Shot Visual Text Extraction
Extracting entities from images is a crucial part of many OCR applications,
such as entity recognition of cards, invoices, and receipts. Most of the
existing works employ the classical detection-and-recognition paradigm. This paper
proposes an Entity-aware Attention Text Extraction Network called EATEN, which
is an end-to-end trainable system to extract the entities without any
post-processing. In the proposed framework, each entity is parsed by its
corresponding entity-aware decoder. Moreover, we introduce
a state transition mechanism that further improves the robustness of
entity extraction. Given the absence of public benchmarks, we
construct a dataset of almost 0.6 million images in three real-world scenarios
(train tickets, passports, and business cards), which is publicly available at
https://github.com/beacandler/EATEN. To the best of our knowledge, EATEN is the
first single shot method to extract entities from images. Extensive experiments
on these benchmarks demonstrate the state-of-the-art performance of EATEN.
Comment: 7 pages