103,543 research outputs found
DCNNs on a Diet: Sampling Strategies for Reducing the Training Set Size
Large-scale supervised classification algorithms, especially those based on
deep convolutional neural networks (DCNNs), require vast amounts of training
data to achieve state-of-the-art performance. Decreasing this data requirement
would significantly speed up the training process and possibly improve
generalization. Motivated by this objective, we consider the task of adaptively
finding concise training subsets which will be iteratively presented to the
learner. We use convex optimization methods, based on an objective criterion
and feedback from the current performance of the classifier, to efficiently
identify informative samples to train on. We propose an algorithm to decompose
the optimization problem into smaller per-class problems, which can be solved
in parallel. We test our approach on standard classification tasks and
demonstrate its effectiveness in decreasing the training set size without
compromising performance. We also show that our approach can make the
classifier more robust in the presence of label noise and class imbalance.
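The per-class decomposition described above can be illustrated with a much simpler stand-in: instead of the paper's convex programs, the sketch below just ranks each class's samples by the current classifier's loss (the feedback signal) and keeps a fixed per-class budget. All names are hypothetical; this is not the authors' algorithm, only the decomposition idea.

```python
import numpy as np

def select_subset_per_class(losses, labels, budget_per_class):
    """Rank each class's samples by current classifier loss and keep the
    top ones. Each class is handled independently, so the loop can run
    in parallel, mirroring the per-class decomposition."""
    selected = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        # highest-loss samples in this class are treated as most informative
        top = idx[np.argsort(losses[idx])[::-1][:budget_per_class]]
        selected.extend(int(i) for i in top)
    return sorted(selected)

losses = np.array([0.9, 0.1, 0.5, 0.8, 0.2, 0.7])
labels = np.array([0, 0, 0, 1, 1, 1])
subset = select_subset_per_class(losses, labels, budget_per_class=2)
```

Because each class is scored in isolation, the per-class problems could be dispatched to separate workers without coordination.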
MaxiMin Active Learning in Overparameterized Model Classes
Generating labeled training datasets has become a major bottleneck in Machine
Learning (ML) pipelines. Active ML aims to address this issue by designing
learning algorithms that automatically and adaptively select the most
informative examples for labeling so that human time is not wasted labeling
irrelevant, redundant, or trivial examples. This paper proposes a new approach
to active ML with nonparametric or overparameterized models such as kernel
methods and neural networks. In the context of binary classification, the new
approach is shown to possess a variety of desirable properties that allow
active learning algorithms to automatically and efficiently identify decision
boundaries and data clusters.
Comment: 43 pages, 12 figures
Learning From Less Data: A Unified Data Subset Selection and Active Learning Framework for Computer Vision
Supervised machine learning based state-of-the-art computer vision techniques
are in general data hungry. Their data curation poses the challenges of
expensive human labeling, inadequate computing resources, and longer experiment
turnaround times. Training data subset selection and active learning
techniques have been proposed as possible solutions to these challenges. A
special class of subset selection functions naturally models notions of
diversity, coverage, and representation, and can be used to eliminate
redundancy, thus lending itself well to training-data subset selection. These
functions can also
help improve the efficiency of active learning in further reducing human
labeling efforts by selecting a subset of the examples obtained using the
conventional uncertainty sampling based techniques. In this work, we
empirically demonstrate the effectiveness of two diversity models, namely the
Facility-Location and Dispersion models for training-data subset selection and
reducing labeling effort. We demonstrate this across the board for a variety of
computer vision tasks including Gender Recognition, Face Recognition, Scene
Recognition, Object Detection and Object Recognition. Our results show that
diversity based subset selection done in the right way can increase the
accuracy by up to 5-10% over existing baselines, particularly in settings in
which less training data is available. This allows the training of complex
machine learning models like Convolutional Neural Networks with much less
training data and labeling costs while incurring minimal performance loss.
Comment: Accepted to WACV 2019. arXiv admin note: substantial text overlap with arXiv:1805.1119
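The Facility-Location model mentioned above is typically maximized greedily, since the function is monotone submodular. A minimal sketch (not the authors' implementation), assuming a precomputed pairwise similarity matrix:

```python
import numpy as np

def greedy_facility_location(sim, k):
    """Greedy maximization of the facility-location function
    F(S) = sum_i max_{j in S} sim[i, j], which is monotone submodular,
    so greedy selection carries the classic (1 - 1/e) guarantee."""
    n = sim.shape[0]
    selected, covered = [], np.zeros(n)
    for _ in range(k):
        # total coverage if each candidate column were added
        gains = np.maximum(sim, covered[:, None]).sum(axis=0) - covered.sum()
        gains[selected] = -np.inf  # never re-pick a selected element
        j = int(np.argmax(gains))
        selected.append(j)
        covered = np.maximum(covered, sim[:, j])
    return selected

# toy data: two tight clusters {0, 1} and {2, 3}
sim = np.array([[1.0, 0.9, 0.1, 0.1],
                [0.9, 1.0, 0.1, 0.1],
                [0.1, 0.1, 1.0, 0.9],
                [0.1, 0.1, 0.9, 1.0]])
reps = greedy_facility_location(sim, k=2)
```

With a budget of two, the greedy rule picks one exemplar per cluster, which is exactly the diversity/coverage behavior the abstract appeals to.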
Active Learning for Convolutional Neural Networks: A Core-Set Approach
Convolutional neural networks (CNNs) have been successfully applied to many
recognition and learning tasks using a universal recipe: training a deep model
on a very large dataset of supervised examples. However, this approach is
rather restrictive in practice since collecting a large set of labeled images
is very expensive. One way to ease this problem is to devise smart ways of
choosing which images to label from a very large collection (i.e., active
learning).
Our empirical study suggests that many of the active learning heuristics in
the literature are not effective when applied to CNNs in the batch setting.
Inspired by these limitations, we define the problem of active learning as
core-set selection, i.e., choosing a set of points such that a model learned over
the selected subset is competitive for the remaining data points. We further
present a theoretical result characterizing the performance of any selected
subset using the geometry of the datapoints. As an active learning algorithm,
we choose the subset that is expected to yield the best result according to our
characterization. Our experiments show that the proposed method significantly
outperforms existing approaches in image classification experiments by a large
margin.
Comment: ICLR 2018 paper
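The core-set criterion reduces to the classic k-Center problem, for which the farthest-first traversal gives a 2-approximation. A minimal sketch on raw feature vectors (the paper applies it to CNN embeddings; the data here is hypothetical):

```python
import numpy as np

def k_center_greedy(X, k, seed=0):
    """Farthest-first traversal: a 2-approximation for k-Center that
    repeatedly picks the point farthest from everything chosen so far."""
    selected = [seed]
    # distance from every point to its nearest selected center
    min_dist = np.linalg.norm(X - X[seed], axis=1)
    for _ in range(k - 1):
        j = int(np.argmax(min_dist))  # the worst-covered point
        selected.append(j)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[j], axis=1))
    return selected

# two well-separated pairs of points on a line
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0]])
centers = k_center_greedy(X, k=2)
```

Starting from point 0, the second query lands in the far pair, so the labeled set covers both regions of the data.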
Stochastic Design and Analysis of Wireless Cloud Caching Networks
This paper develops a stochastic geometry-based approach for the modeling,
analysis, and optimization of wireless cloud caching networks comprised of
multiple-antenna radio units (RUs) inside clouds. We consider the Matern
cluster process to model RUs and the probabilistic content placement to cache
files in RUs. Accordingly, we study the exact hit probability for a user of
interest under two strategies: closest selection, where the user is served by the
closest RU that has its requested file, and best selection, where the serving
RU having the requested file provides the maximum instantaneous received power
at the user. As key steps for the analyses, the Laplace transform of out of
cloud interference, the desired link distance distribution in the closest
selection, and the desired link received power distribution in the best
selection are derived. Also, we approximate the derived exact hit probabilities
for both the closest and the best selections in such a way that the related
objective functions for the content caching design of the network can lead to
tractable concave optimization problems. Solving the optimization problems, we
propose algorithms to efficiently find their optimal content placements.
Finally, we investigate the impact of different parameters such as the number
of antennas and the cache memory size on the caching performance.
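The probabilistic content placement above can be illustrated in a drastically simplified, geometry-free setting: assuming each of a fixed number of candidate RUs caches file f independently with probability placement[f], the hit probability has a closed form. This toy model ignores the Matern clustering, fading, and multi-antenna aspects of the paper.

```python
import numpy as np

def hit_probability(popularity, placement, n_rus):
    """Overall cache-hit probability when each of n_rus candidate RUs
    independently stores file f with probability placement[f]:
    a request for f (popularity q_f) misses only if no RU cached it."""
    return float(np.sum(popularity * (1.0 - (1.0 - placement) ** n_rus)))

popularity = np.array([0.7, 0.3])  # hypothetical request shares
placement = np.array([0.5, 0.5])   # hypothetical caching probabilities
p_hit = hit_probability(popularity, placement, n_rus=2)
```

Because each term 1 - (1 - p)^n is concave in p, maximizing this objective over the placement probabilities under a cache-size budget is a tractable concave program, in the spirit of the approximations the abstract describes.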
Exploring Representativeness and Informativeness for Active Learning
How can we find a general way to choose the most suitable samples for
training a classifier, even with very limited prior information? Active
learning, which can be regarded as an iterative optimization procedure, plays a
key role to construct a refined training set to improve the classification
performance in a variety of applications, such as text analysis, image
recognition, social network modeling, etc.
Although combining representativeness and informativeness of samples has been
proven promising for active sampling, state-of-the-art methods perform well
only under certain data structures. Can we fuse the two active sampling
criteria without any assumptions on the data? This paper proposes a general
active learning framework that effectively fuses the two criteria. Inspired by
a two-sample discrepancy problem, triple measures are elaborately designed to
guarantee that the query samples not only possess the representativeness of the
unlabeled data but also reveal the diversity of the labeled data. Any
appropriate similarity measure can be employed to construct the triple
measures. Meanwhile, an uncertain measure is leveraged to generate the
informativeness criterion, which can be carried out in different ways.
Rooted in this framework, a practical active learning algorithm is proposed,
which exploits a radial basis function together with the estimated
probabilities to construct the triple measures and a modified
Best-versus-Second-Best strategy to construct the uncertain measure,
respectively. Experimental results on benchmark datasets demonstrate that our
algorithm consistently achieves superior performance over the state-of-the-art
active learning algorithms.
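The Best-versus-Second-Best idea scores a sample by the gap between its two largest predicted class probabilities: a small gap means the classifier is torn between two labels. A minimal sketch of the plain (unmodified) margin, with hypothetical probabilities:

```python
import numpy as np

def bvsb_uncertainty(probs):
    """Best-versus-Second-Best margin: for each row of class
    probabilities, return the negative gap between the top two
    classes, so higher scores mean more uncertain samples."""
    part = np.sort(probs, axis=1)
    margin = part[:, -1] - part[:, -2]
    return -margin

probs = np.array([[0.5, 0.3, 0.2],    # fairly uncertain
                  [0.9, 0.05, 0.05],  # confident
                  [0.4, 0.39, 0.21]]) # nearly tied: most uncertain
query_order = np.argsort(bvsb_uncertainty(probs))[::-1]
```

The nearly tied sample is queried first and the confident one last, which is the behavior an informativeness criterion should exhibit.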
Dissimilarity-based Sparse Subset Selection
Finding an informative subset of a large collection of data points or models
is at the center of many problems in computer vision, recommender systems,
bio/health informatics as well as image and natural language processing. Given
pairwise dissimilarities between the elements of a `source set' and a `target
set,' we consider the problem of finding a subset of the source set, called
representatives or exemplars, that can efficiently describe the target set. We
formulate the problem as a row-sparsity regularized trace minimization problem.
Since the proposed formulation is, in general, NP-hard, we consider a convex
relaxation. The solution of our optimization finds representatives and the
assignment of each element of the target set to each representative, hence,
obtaining a clustering. We analyze the solution of our proposed optimization as
a function of the regularization parameter. We show that when the two sets
jointly partition into multiple groups, our algorithm finds representatives
from all groups and reveals clustering of the sets. In addition, we show that
the proposed framework can effectively deal with outliers. Our algorithm works
with arbitrary dissimilarities, which can be asymmetric or violate the triangle
inequality. To efficiently implement our algorithm, we consider an Alternating
Direction Method of Multipliers (ADMM) framework, which results in quadratic
complexity in the problem size. We show that the ADMM implementation allows
the algorithm to be parallelized, further reducing the computational time.
Finally, by experiments on real-world datasets, we show that our proposed
algorithm improves the state of the art on the two problems of scene
categorization using representative images and time-series modeling and
segmentation using representative models.
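The row-sparsity regularized program can be mimicked (not reproduced) by a greedy forward selection that trades encoding cost against a per-representative penalty lam; like the original formulation, it accepts arbitrary, possibly asymmetric dissimilarities, and it recovers the clustering as the assignment of targets to representatives.

```python
import numpy as np

def greedy_representatives(D, lam):
    """Forward-select source rows so that the encoding cost
    sum_t min_{s in S} D[s, t] + lam * |S| stops improving.
    D is a source-by-target dissimilarity matrix, which may be
    asymmetric or violate the triangle inequality."""
    cost = np.full(D.shape[1], np.inf)  # each target's best dissimilarity so far
    selected = []
    while len(selected) < D.shape[0]:
        totals = np.minimum(D, cost).sum(axis=1)  # cost after adding each row
        j = int(np.argmin(totals))
        if selected and cost.sum() - totals[j] <= lam:
            break  # best remaining row does not pay for its lam penalty
        selected.append(j)
        cost = np.minimum(cost, D[j])
    # cluster targets by their nearest selected representative
    assignment = np.array(selected)[np.argmin(D[selected], axis=0)]
    return selected, assignment

# two groups of targets, each well described by one source row
D = np.array([[0., 1., 9., 9.],
              [1., 0., 9., 9.],
              [9., 9., 0., 1.],
              [9., 9., 1., 0.]])
reps, assign = greedy_representatives(D, lam=1.0)
```

The greedy pass picks one exemplar per group and stops, and the assignment vector reveals the two-cluster structure, echoing the grouping guarantee the abstract states for the convex program.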
Joint Active Learning with Feature Selection via CUR Matrix Decomposition
This paper presents an unsupervised learning approach for simultaneous sample
and feature selection, which is in contrast to existing works which mainly
tackle these two problems separately. In fact the two tasks are often
interleaved with each other: noisy and high-dimensional features have an
adverse effect on sample selection, while informative or representative samples
will be beneficial to feature selection. Specifically, we propose a framework
to jointly conduct active learning and feature selection based on the CUR
matrix decomposition. From the data reconstruction perspective, both the
selected samples and features can best approximate the original dataset
respectively, such that the selected samples characterized by the features are
highly representative. In particular, our method runs in one shot, without
iterative sample selection for progressive labeling. Thus, our
model is especially suitable when there are few labeled samples or even in the
absence of supervision, which is a particular challenge for existing methods.
As the joint learning problem is NP-hard, the proposed formulation involves a
convex but non-smooth optimization problem. We solve it efficiently by an
iterative algorithm, and prove its global convergence. Experimental results on
publicly available datasets corroborate the efficacy of our method compared
with the state-of-the-art.
Comment: Accepted by T-PAMI
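CUR-style selection is commonly driven by statistical leverage scores computed from a truncated SVD; the sketch below shows that generic recipe, not the paper's one-shot joint formulation. High-leverage rows mark representative samples and high-leverage columns mark informative features.

```python
import numpy as np

def cur_leverage_scores(X, k):
    """Rank-k statistical leverage scores, the standard importance
    weights behind CUR row/column sampling: samples with large norm in
    the top-k left singular subspace, and features with large norm in
    the top-k right singular subspace, are the most representative."""
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    row_scores = (U[:, :k] ** 2).sum(axis=1)   # per-sample leverage
    col_scores = (Vt[:k, :] ** 2).sum(axis=0)  # per-feature leverage
    return row_scores, col_scores

# hypothetical data: feature 2 carries almost no energy
X = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.01],
              [1.0, 1.0, 0.0]])
row_scores, col_scores = cur_leverage_scores(X, k=2)
```

The scores in each family sum to k by orthonormality, and the nearly empty third feature receives the smallest leverage, so it would be the first feature dropped.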
New S-norm and T-norm Operators for Active Learning Method
Active Learning Method (ALM) is a soft computing method used for modeling and
control based on fuzzy logic. All operators defined for fuzzy sets must serve
as either fuzzy S-norm or fuzzy T-norm. Despite being a powerful modeling
method, ALM lacks operators that serve as S-norms and T-norms, which
deprives it of a rigorous analytical formulation. This paper introduces two
new operators based on morphology which satisfy the following conditions:
First, they serve as a fuzzy S-norm and T-norm. Second, they satisfy De
Morgan's law, so they complement each other perfectly. These operators are
investigated from three viewpoints: mathematics, geometry, and fuzzy logic.
Comment: 11 pages, 20 figures, under review at Springer (Fuzzy Optimization and Decision Making)
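For reference, the standard min/max pair (not the paper's morphology-based operators) already satisfies both conditions, and the De Morgan duality under the complement c(x) = 1 - x is easy to verify numerically:

```python
def t_norm(a, b):
    """Minimum T-norm (fuzzy AND): commutative, associative,
    monotone, with 1 as identity."""
    return min(a, b)

def s_norm(a, b):
    """Maximum S-norm (fuzzy OR): the De Morgan dual of min
    under the standard complement c(x) = 1 - x."""
    return max(a, b)

def de_morgan_holds(a, b, eps=1e-12):
    """Check S(a, b) == 1 - T(1 - a, 1 - b)."""
    return abs(s_norm(a, b) - (1.0 - t_norm(1.0 - a, 1.0 - b))) < eps
```

Any candidate operator pair for ALM would need to pass the same axioms (boundary conditions, monotonicity, commutativity, associativity) plus this duality check.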
Optimal arrangements of hyperplanes for multiclass classification
In this paper, we present a novel approach to construct multiclass
classifiers by means of arrangements of hyperplanes. We propose different
mixed-integer (linear and nonlinear) programming formulations for the problem,
using extensions of widely used misclassification measures to which the kernel
trick can be adapted. Some dimensionality
reductions and variable fixing strategies are also developed for these models.
An extensive battery of experiments reveals the power of
our proposal compared with other previously proposed methodologies.
Comment: 8 figures, 2 tables