Automatic generation of a custom corpora for invoice analysis and recognition

Abstract

In this paper, we present a bill-type document generator capable of supplying, on demand, the large volume of documents that a learning system needs. The scarcity of administrative documents, which stems from their confidential nature, has long been a handicap. The generator also solves the annotation problem, since annotations are produced automatically during generation and written directly in XML-GEDI format. To demonstrate the value of the generator, we propose an invoice recognition system based on a graph convolutional neural network. The experiments took place under favorable conditions, since we were free to vary the classes, the samples in each class, and their parameters.
