Segmentation, labeling and optical character recognition applied on receipt images

Abstract

Science and companies do not always share the same objectives; however, a company’s needs can sometimes be satisfied by applying science. Two common problems for a company that works with data in the field of artificial intelligence are, first, how to obtain the data and, second, how to label it. This thesis presents a real case in which a company has to acquire data, label it and then build the model that best fits the data. After many experiments, this thesis shows how a small number of training inputs and an Arachnid model, which combines 45 specialized CNNs with a classifier that finds the pattern behind the outputs of those CNNs, can raise the test accuracy from the 93.85% of LeNet-5 to 99.00% when classifying 10 different classes of optical characters on a concrete dataset. 1,000 single characters were extracted at random from around 10,000 receipt images; 800 of them were used for training and 200 for testing. The approach of this thesis focuses on a real and concrete problem of a company, seeking the best solution through science while taking the company’s needs into account.
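
The figures quoted above can be checked with a short sketch. It rests on one assumption that the abstract does not confirm: that each of the 45 specialized CNNs handles one unordered pair of the 10 classes, since 10 choose 2 equals 45. The helper name split_indices and the seed are hypothetical and serve only as illustration; the thesis may have drawn its 800/200 split differently.

```python
import random
from itertools import combinations

NUM_CLASSES = 10  # ten character classes, as stated in the abstract

# ASSUMPTION (not confirmed by the source): one specialist per unordered
# pair of classes, which would account for the count of 45 CNNs.
class_pairs = list(combinations(range(NUM_CLASSES), 2))


def split_indices(n_samples, n_train, seed=0):
    """Shuffle the sample indices and split them into train and test parts.

    Hypothetical helper: the seed and the shuffling strategy are
    illustrative choices, not taken from the thesis.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return indices[:n_train], indices[n_train:]


# 1,000 extracted characters: 800 for training, 200 for testing.
train_idx, test_idx = split_indices(1000, 800)

print(len(class_pairs), len(train_idx), len(test_idx))  # prints: 45 800 200
```

Under this reading, the final classifier would take the 45 specialist outputs for a character as its input features and map them to one of the 10 classes.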
