AutoEncoder Inspired Unsupervised Feature Selection
High-dimensional data, common in areas such as computer vision and machine
learning, brings computational and analytical difficulties. Feature
selection, which selects a subset of the observed features, is a widely used
approach for improving the performance and effectiveness of machine learning models
on high-dimensional data. In this paper, we propose a novel AutoEncoder
Feature Selector (AEFS) for unsupervised feature selection which combines
autoencoder regression and group lasso tasks. Compared to traditional feature
selection methods, AEFS selects the most important features by exploiting
both linear and nonlinear information among features, making it more flexible
than the conventional self-representation method for unsupervised feature
selection, which assumes only linear relationships. Experimental results on benchmark
datasets show that the proposed method is superior to the state-of-the-art
method. Comment: accepted by ICASSP 201
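To illustrate how a group lasso term can turn autoencoder weights into feature scores, here is a minimal NumPy sketch. It assumes the penalty is the L2,1 norm of a first-layer weight matrix `W1` with one row per input feature; the abstract does not spell out this layout, and the function names and the proximal step are illustrative choices rather than the authors' implementation.

```python
import numpy as np

def group_lasso_penalty(W1):
    """L2,1 norm: sum of the L2 norms of the rows of W1 (n_features x n_hidden)."""
    return float(np.linalg.norm(W1, axis=1).sum())

def prox_group_lasso(W1, threshold):
    """Row-wise soft-thresholding, the proximal operator of threshold * L2,1 norm.

    Rows whose norm is below `threshold` are set exactly to zero, which is
    what removes the corresponding input feature.
    """
    norms = np.linalg.norm(W1, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - threshold / np.maximum(norms, 1e-12))
    return W1 * scale

def top_k_features(W1, k):
    """Rank features by the L2 norm of their weight row and return the top k indices."""
    scores = np.linalg.norm(W1, axis=1)
    return np.argsort(-scores)[:k]

# Toy weight matrix (assumed values): 3 input features, 2 hidden units.
W1 = np.array([[3.0, 4.0],
               [0.1, 0.0],
               [0.0, 1.0]])
penalty = group_lasso_penalty(W1)
shrunk = prox_group_lasso(W1, threshold=0.5)
selected = top_k_features(W1, k=2)
```

In this toy case the second feature has a small weight row, so the proximal step zeroes it out entirely while only shrinking the other two rows.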
Ranking to Learn: Feature Ranking and Selection via Eigenvector Centrality
In an era where accumulating data is easy and storing it inexpensive, feature
selection plays a central role in helping to reduce the high-dimensionality of
huge amounts of otherwise meaningless data. In this paper, we propose a
graph-based method for feature selection that ranks features by identifying the
most important ones within an arbitrary set of cues. Mapping the problem onto an
affinity graph, where features are the nodes, the solution is given by assessing
the importance of nodes through indicators of centrality, in particular
Eigenvector Centrality (EC). The gist of EC is to estimate the importance
of a feature as a function of the importance of its neighbors. Ranking central
nodes identifies candidate features, which turn out to be effective from a
classification point of view, as shown by a thorough experimental section.
Our approach has been tested on 7 diverse datasets from recent literature
(e.g., biological data and object recognition, among others), and compared
against filter, embedded and wrapper methods. The results are remarkable in
terms of accuracy, stability and low execution time. Comment: Preprint version - Lecture Notes in Computer Science - Springer 201
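The idea of scoring a feature by the importance of its neighbors can be sketched with power iteration on a feature affinity matrix. The affinity used here, the absolute Pearson correlation between features, is a stand-in chosen for illustration and is not the weighting described in the paper; the function names are hypothetical, and constant (zero-variance) features are assumed to be absent.

```python
import numpy as np

def eigenvector_centrality(A, n_iter=1000, tol=1e-10):
    """Leading eigenvector of a symmetric, non-negative affinity matrix A
    with a positive diagonal, computed by power iteration."""
    A = np.asarray(A, dtype=float)
    v = np.full(A.shape[0], 1.0 / np.sqrt(A.shape[0]))
    for _ in range(n_iter):
        w = A @ v
        norm = np.linalg.norm(w)
        if norm == 0.0:
            return v
        w = w / norm
        if np.linalg.norm(w - v) < tol:
            return w
        v = w
    return v

def rank_features(X):
    """Rank the columns of X (n_samples x n_features) by eigenvector centrality.

    The diagonal of the correlation matrix (all ones) is kept on purpose:
    it keeps power iteration from oscillating.
    """
    affinity = np.abs(np.corrcoef(X, rowvar=False))
    scores = eigenvector_centrality(affinity)
    return np.argsort(-scores), scores

# Toy affinity (assumed values): features 0 and 1 are strongly related, 2 is isolated.
A = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
scores = eigenvector_centrality(A)
```

In the toy matrix, features 0 and 1 reinforce each other and receive equal, high scores, while the weakly connected feature 2 ranks last.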
A survey of handwritten character recognition with MNIST and EMNIST
This article belongs to the Special Issue Computer Vision and Pattern Recognition in the Era of Deep Learning. This paper summarizes the top state-of-the-art contributions reported on the MNIST dataset for handwritten digit recognition. This dataset has been extensively used to validate novel techniques in computer vision, and in recent years, many authors have explored the performance of convolutional neural networks (CNNs) and other deep learning techniques on this dataset. To the best of our knowledge, this paper is the first exhaustive and updated review of this dataset; there are some online rankings, but they are outdated, and most published papers survey only closely related works, omitting most of the literature. This paper distinguishes between works using some kind of data augmentation and works using the original dataset out-of-the-box. Also, works using CNNs are reported separately, as they are becoming the state-of-the-art approach for solving this problem. Nowadays, a significant number of works have attained a test error rate smaller than 1% on this dataset, which is becoming non-challenging. By mid-2017, a new dataset was introduced: EMNIST, which involves both digits and letters, with a larger amount of data acquired from a database different than MNIST's. In this paper, EMNIST is explained and some results are surveyed.
Automatic machine learning: methods, systems, challenges
This open access book presents the first comprehensive overview of general methods in Automatic Machine Learning (AutoML), collects descriptions of existing systems based on these methods, and discusses the first international challenge of AutoML systems. The book serves as a point of entry into this quickly-developing field for researchers and advanced students alike, as well as providing a reference for practitioners aiming to use AutoML in their work. The recent success of commercial ML applications and the rapid growth of the field have created a high demand for off-the-shelf ML methods that can be used easily and without expert knowledge. Many of the recent machine learning successes crucially rely on human experts, who select appropriate ML architectures (deep learning architectures or more traditional ML workflows) and their hyperparameters; however, the field of AutoML targets a progressive automation of machine learning, based on principles from optimization and machine learning itself.
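As a small illustration of automating hyperparameter selection, the sketch below uses random search, a common baseline for this task. The listing does not say which methods the book covers in detail, so this is an assumed example; the search space, the `toy_loss` objective, and all names are hypothetical.

```python
import math
import random

def random_search(objective, space, n_trials=100, seed=0):
    """Sample configurations at random and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_trials):
        # Each entry of `space` maps a hyperparameter name to a sampler.
        config = {name: sample(rng) for name, sample in space.items()}
        loss = objective(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss

# Hypothetical search space: log-uniform learning rate, integer depth.
space = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-5, 0),
    "depth": lambda rng: rng.randint(1, 8),
}

def toy_loss(config):
    """Stand-in for a validation loss; minimized at learning_rate=1e-2, depth=4."""
    return (math.log10(config["learning_rate"]) + 2) ** 2 + (config["depth"] - 4) ** 2

best_config, best_loss = random_search(toy_loss, space)
```

In a real setting, `toy_loss` would be replaced by a function that trains a model with the given configuration and returns its validation error.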
Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval
Where previous reviews on content-based image retrieval emphasize what can
be seen in an image to bridge the semantic gap, this survey considers what
people tag about an image. A comprehensive treatise of three closely linked
problems, i.e., image tag assignment, refinement, and tag-based image retrieval
is presented. While existing works vary in terms of their targeted tasks and
methodology, they rely on the key functionality of tag relevance, i.e.,
estimating the relevance of a specific tag with respect to the visual content
of a given image and its social context. By analyzing what information a
specific method exploits to construct its tag relevance function and how such
information is exploited, this paper introduces a taxonomy to structure the
growing literature, understand the ingredients of the main works, clarify their
connections and differences, and recognize their merits and limitations. For a
head-to-head comparison of state-of-the-art methods, a new experimental
protocol is presented, with training sets containing 10k, 100k and 1m images
and an evaluation on three test sets, contributed by various research groups.
Eleven representative works are implemented and evaluated. Putting all this
together, the survey aims to provide an overview of the past and foster
progress for the near future. Comment: to appear in ACM Computing Survey