Cutting the Error by Half: Investigation of Very Deep CNN and Advanced Training Strategies for Document Image Classification
We present an exhaustive investigation of recent Deep Learning architectures,
algorithms, and strategies for the task of document image classification to
finally reduce the error by more than half. Existing approaches, such as the
DeepDocClassifier, apply standard Convolutional Network architectures with
transfer learning from the object recognition domain. The contribution of the
paper is threefold: First, it investigates recently introduced very deep neural
network architectures (GoogLeNet, VGG, ResNet) using transfer learning (from
real images). Second, it proposes transfer learning from a huge set of document
images, i.e., 400,000 documents. Third, it analyzes the impact of the amount of
training data (document images) and other parameters on classification
performance. We use two datasets, Tobacco-3482 and the large-scale RVL-CDIP
dataset. We achieve an accuracy of 91.13% for the Tobacco-3482 dataset while
earlier approaches reach only 77.6%. Thus, a relative error reduction of more
than 60% is achieved. For the large dataset RVL-CDIP, an accuracy of 90.97% is
achieved, corresponding to a relative error reduction of 11.5%.
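The "relative error reduction" quoted in this abstract can be checked from the two reported Tobacco-3482 accuracies. The sketch below is an illustration only, not code from the paper: the function name `relative_error_reduction` is hypothetical, and it assumes the usual definition, where error is 100% minus accuracy and the reduction is measured against the baseline error.

```python
def relative_error_reduction(baseline_acc: float, new_acc: float) -> float:
    """Return the fraction of the baseline error removed by the new model.

    Both accuracies are percentages in [0, 100]. The definition used here
    (error = 100 - accuracy) is an assumption, not taken from the paper.
    """
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    if baseline_err <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_err - new_err) / baseline_err


# Accuracies reported in the abstract for Tobacco-3482.
tobacco = relative_error_reduction(77.6, 91.13)
print(f"Tobacco-3482 relative error reduction: {tobacco:.1%}")
```

With the reported figures, the error drops from 22.4% to 8.87%, which works out to roughly 60.4% and matches the "more than 60%" claim.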
Evaluation of Deep Convolutional Nets for Document Image Classification and Retrieval
This paper presents a new state of the art for document image classification
and retrieval, using features learned by deep convolutional neural networks
(CNNs). In object and scene analysis, deep neural nets are capable of learning
a hierarchical chain of abstraction from pixel inputs to concise and
descriptive representations. The current work explores this capacity in the
realm of document analysis, and confirms that this representation strategy is
superior to a variety of popular hand-crafted alternatives. Experiments also
show that (i) features extracted from CNNs are robust to compression, (ii) CNNs
trained on non-document images transfer well to document analysis tasks, and
(iii) enforcing region-specific feature-learning is unnecessary given
sufficient training data. This work also makes available a new labelled subset
of the IIT-CDIP collection, containing 400,000 document images across 16
categories, useful for training new CNNs for document analysis.
Analysis of Documents Using CNN
This thesis addresses the classification of scanned documents. It approaches the problem with convolutional neural networks in three different ways and surveys existing approaches to document classification. The thesis concludes with experiments that demonstrate how the different approaches perform on the problem.
460 - Department of Computer Science (Katedra informatiky)