4,416 research outputs found
A Hybrid Deep Learning Approach for Texture Analysis
Texture classification is a problem that has various applications such as
remote sensing and forest species recognition. Solutions tend to be custom fit
to the dataset used but fails to generalize. The Convolutional Neural Network
(CNN) in combination with Support Vector Machine (SVM) form a robust selection
between powerful invariant feature extractor and accurate classifier. The
fusion of experts provides stability in classification rates among different
datasets
Learning to Read by Spelling: Towards Unsupervised Text Recognition
This work presents a method for visual text recognition without using any
paired supervisory data. We formulate the text recognition task as one of
aligning the conditional distribution of strings predicted from given text
images, with lexically valid strings sampled from target corpora. This enables
fully automated, and unsupervised learning from just line-level text-images,
and unpaired text-string samples, obviating the need for large aligned
datasets. We present detailed analysis for various aspects of the proposed
method, namely - (1) impact of the length of training sequences on convergence,
(2) relation between character frequencies and the order in which they are
learnt, (3) generalisation ability of our recognition network to inputs of
arbitrary lengths, and (4) impact of varying the text corpus on recognition
accuracy. Finally, we demonstrate excellent text recognition accuracy on both
synthetically generated text images, and scanned images of real printed books,
using no labelled training examples
Clustering Learning for Robotic Vision
We present the clustering learning technique applied to multi-layer
feedforward deep neural networks. We show that this unsupervised learning
technique can compute network filters with only a few minutes and a much
reduced set of parameters. The goal of this paper is to promote the technique
for general-purpose robotic vision systems. We report its use in static image
datasets and object tracking datasets. We show that networks trained with
clustering learning can outperform large networks trained for many hours on
complex datasets.Comment: Code for this paper is available here:
https://github.com/culurciello/CL_paper1_cod
Action recognition based on efficient deep feature learning in the spatio-temporal domain
© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.Hand-crafted feature functions are usually designed based on the domain knowledge of a presumably controlled environment and often fail to generalize, as the statistics of real-world data cannot always be modeled correctly. Data-driven feature learning methods, on the other hand, have emerged as an alternative that often generalize better in uncontrolled environments. We present a simple, yet robust, 2D convolutional neural network extended to a concatenated 3D network that learns to extract features from the spatio-temporal domain of raw video data. The resulting network model is used for content-based recognition of videos. Relying on a 2D convolutional neural network allows us to exploit a pretrained network as a descriptor that yielded the best results on the largest and challenging ILSVRC-2014 dataset. Experimental results on commonly used benchmarking video datasets demonstrate that our results are state-of-the-art in terms of accuracy and computational time without requiring any preprocessing (e.g., optic flow) or a priori knowledge on data capture (e.g., camera motion estimation), which makes it more general and flexible than other approaches. Our implementation is made available.Peer ReviewedPostprint (author's final draft
- …