74,001 research outputs found
A stochastic algorithm for feature selection in pattern recognition
Abstract We introduce a new model adressing feature selection from a large dictionary of variables that can be computed from a signal or an image. Features are extracted according to an efficiency criterion, on the basis of specified classification or recognition tasks. This is done by estimating a probability distribution P on the complete dictionary, which distributes its mass over the more efficient, or informative, components. We implement a stochastic gradient descent algorithm, using the probability as a state variable and optimizing a multitask goodness of fit criterion for classifiers based on variable randomly chosen according to P. We then generate classifiers from the optimal distribution of weights learned on the training set. The method is first tested on several pattern recognition problems including face detection, handwritten digit recognition, spam classification and micro-array analysis. We then compare our approach with other step-wise algorithms like random forests or recursive feature elimination
Feature Selection via Binary Simultaneous Perturbation Stochastic Approximation
Feature selection (FS) has become an indispensable task in dealing with
today's highly complex pattern recognition problems with massive number of
features. In this study, we propose a new wrapper approach for FS based on
binary simultaneous perturbation stochastic approximation (BSPSA). This
pseudo-gradient descent stochastic algorithm starts with an initial feature
vector and moves toward the optimal feature vector via successive iterations.
In each iteration, the current feature vector's individual components are
perturbed simultaneously by random offsets from a qualified probability
distribution. We present computational experiments on datasets with numbers of
features ranging from a few dozens to thousands using three widely-used
classifiers as wrappers: nearest neighbor, decision tree, and linear support
vector machine. We compare our methodology against the full set of features as
well as a binary genetic algorithm and sequential FS methods using
cross-validated classification error rate and AUC as the performance criteria.
Our results indicate that features selected by BSPSA compare favorably to
alternative methods in general and BSPSA can yield superior feature sets for
datasets with tens of thousands of features by examining an extremely small
fraction of the solution space. We are not aware of any other wrapper FS
methods that are computationally feasible with good convergence properties for
such large datasets.Comment: This is the Istanbul Sehir University Technical Report
#SHR-ISE-2016.01. A short version of this report has been accepted for
publication at Pattern Recognition Letter
FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stochastic Inference
The main obstacle to weakly supervised semantic image segmentation is the
difficulty of obtaining pixel-level information from coarse image-level
annotations. Most methods based on image-level annotations use localization
maps obtained from the classifier, but these only focus on the small
discriminative parts of objects and do not capture precise boundaries.
FickleNet explores diverse combinations of locations on feature maps created by
generic deep neural networks. It selects hidden units randomly and then uses
them to obtain activation scores for image classification. FickleNet implicitly
learns the coherence of each location in the feature maps, resulting in a
localization map which identifies both discriminative and other parts of
objects. The ensemble effects are obtained from a single network by selecting
random hidden unit pairs, which means that a variety of localization maps are
generated from a single image. Our approach does not require any additional
training steps and only adds a simple layer to a standard convolutional neural
network; nevertheless it outperforms recent comparable techniques on the Pascal
VOC 2012 benchmark in both weakly and semi-supervised settings.Comment: To appear in CVPR 201
- …