Search CORE

14,816 research outputs found

Cutting the Error by Half: Investigation of Very Deep CNN and Advanced Training Strategies for Document Image Classification

Author: Afzal Muhammad Zeshan
Ahmed Sheraz
Kölsch Andreas
Liwicki Marcus
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/04/2017
Field of study

We present an exhaustive investigation of recent Deep Learning architectures, algorithms, and strategies for the task of document image classification to finally reduce the error by more than half. Existing approaches, such as the DeepDocClassifier, apply standard Convolutional Network architectures with transfer learning from the object recognition domain. The contribution of the paper is threefold: First, it investigates recently introduced very deep neural network architectures (GoogLeNet, VGG, ResNet) using transfer learning (from real images). Second, it proposes transfer learning from a huge set of document images, i.e. 400,000 documents. Third, it analyzes the impact of the amount of training data (document images) and other parameters to the classification abilities. We use two datasets, the Tobacco-3482 and the large-scale RVL-CDIP dataset. We achieve an accuracy of 91.13% for the Tobacco-3482 dataset while earlier approaches reach only 77.6%. Thus, a relative error reduction of more than 60% is achieved. For the large dataset RVL-CDIP, an accuracy of 90.97% is achieved, corresponding to a relative error reduction of 11.5%

arXiv.org e-Print Archive

Crossref

Math Search for the Masses: Multimodal Search Interfaces and Appearance-Based Retrieval

Author: Orakwue Awelemdy
Zanibbi Richard
Publication venue
Publication date: 11/05/2015
Field of study

We summarize math search engines and search interfaces produced by the Document and Pattern Recognition Lab in recent years, and in particular the min math search interface and the Tangent search engine. Source code for both systems are publicly available. "The Masses" refers to our emphasis on creating systems for mathematical non-experts, who may be looking to define unfamiliar notation, or browse documents based on the visual appearance of formulae rather than their mathematical semantics.Comment: Paper for Invited Talk at 2015 Conference on Intelligent Computer Mathematics (July, Washington DC

arXiv.org e-Print Archive

CiteSeerX

Digital Image Access & Retrieval

Author: Heidorn P. Bryan
Sandore Beth
Publication venue: Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Publication date: 01/01/1997
Field of study

The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio

Illinois Digital Environment for Access to Learning and Scholarship Repository

Evaluation of Deep Convolutional Nets for Document Image Classification and Retrieval

Author: Derpanis Konstantinos G.
Harley Adam W.
Ufkes Alex
Publication venue
Publication date: 25/02/2015
Field of study

This paper presents a new state-of-the-art for document image classification and retrieval, using features learned by deep convolutional neural networks (CNNs). In object and scene analysis, deep neural nets are capable of learning a hierarchical chain of abstraction from pixel inputs to concise and descriptive representations. The current work explores this capacity in the realm of document analysis, and confirms that this representation strategy is superior to a variety of popular hand-crafted alternatives. Experiments also show that (i) features extracted from CNNs are robust to compression, (ii) CNNs trained on non-document images transfer well to document analysis tasks, and (iii) enforcing region-specific feature-learning is unnecessary given sufficient training data. This work also makes available a new labelled subset of the IIT-CDIP collection, containing 400,000 document images across 16 categories, useful for training new CNNs for document analysis

arXiv.org e-Print Archive

Crossref

Query-dependent metric learning for adaptive, content-based image browsing and retrieval

Author: Han Junwei
McKenna Stephen
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/10/2014
Field of study

University of Dundee Online Publications