Application of Recognition Input Squinting and Error-Correcting Output Coding to Convolutional Neural Networks

Abstract

The Convolutional Neural Network (CNN) is a type of artificial neural network that has been successful in addressing many computer vision classification problems. This thesis considers optical character recognition by CNN when few training samples are available. Two techniques are proposed to improve the application of CNNs to such problems, and their benefits are demonstrated experimentally on subsets of two labelled databases: MNIST (handwritten digits) and CENPARMI-MPC (machine-printed characters). The first technique is novel and is called “Recognition Input Squinting”. It takes the input image to be recognized and applies a set of geometric transformations to it, producing a set of squinted images. The trained CNN classifier then recognizes each of these generated images and computes an overall recognition confidence score. It is shown that this technique yields better recognition precision than recognizing a single input image without squinting. The second technique applies Error-Correcting Output Coding to the CNN. Each class to be recognized is assigned a codeword from the codebook of an appropriately chosen error-correcting code, and the CNN is trained using these codeword labels. At recognition time, the output class is selected according to a minimum code distance criterion. It is shown that this technique provides better recognition precision than the classic place output coding.
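
The two recognition-time steps described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the abstract does not specify the transformation set, the rule for combining scores, or the code used. The sketch assumes small pixel translations as the geometric transformations, a plain average as the overall confidence score, Hamming distance as the code distance, and a four-codeword subset of the (7,3) simplex code as the codebook. The names `squint`, `recognize_with_squinting`, `ecoc_decode`, and the `classify` callable are hypothetical.

```python
import numpy as np

def squint(image, shifts=((0, 0), (0, 1), (0, -1), (1, 0), (-1, 0))):
    """Return translated copies of `image`.

    Assumption: the transformation set is a handful of one-pixel shifts.
    np.roll wraps pixels around the border, which is acceptable for
    images with an empty margin.
    """
    return [np.roll(image, shift=s, axis=(0, 1)) for s in shifts]

def recognize_with_squinting(classify, image):
    """Classify every squinted image and average the per-class scores.

    `classify` is a hypothetical callable mapping an image to a 1-D
    array of class scores (for example, a trained CNN's outputs).
    Averaging is an assumed way to form the overall confidence score.
    """
    scores = np.mean([classify(view) for view in squint(image)], axis=0)
    return int(np.argmax(scores)), scores

def ecoc_decode(outputs, codebook):
    """Return the class whose codeword is closest to the network outputs.

    Outputs are thresholded at 0.5 to bits, then compared with each row
    of `codebook` by Hamming distance (assumed distance measure).
    """
    bits = (np.asarray(outputs) >= 0.5).astype(int)
    distances = np.sum(codebook != bits, axis=1)
    return int(np.argmin(distances))

# Illustrative codebook: four codewords with pairwise Hamming distance 4,
# so any single flipped output bit is still decoded to the right class.
CODEBOOK = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1, 0],
])
```

For instance, `ecoc_decode([0.9, 0.8, 0.7, 0.1, 0.2, 0.9, 0.6], CODEBOOK)` thresholds to the codeword of class 2 with its first bit flipped, and the minimum-distance rule still selects class 2.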
