Temporal Patterns in Fine Particulate Matter Time Series in Beijing: A Calendar View
Supervised Learning with Similarity Functions
We address the problem of general supervised learning when data can only be
accessed through an (indefinite) similarity function between data points.
Existing work on learning with indefinite kernels has concentrated solely on
binary/multi-class classification problems. We propose a model that is generic
enough to handle any supervised learning task and also subsumes the model
previously proposed for classification. We give a "goodness" criterion for
similarity functions w.r.t. a given supervised learning task and then adapt a
well-known landmarking technique to provide efficient algorithms for supervised
learning using "good" similarity functions. We demonstrate the effectiveness of
our model on three important supervised learning problems: a) real-valued
regression, b) ordinal regression and c) ranking, where we show that our method
guarantees bounded generalization error. Furthermore, for the case of
real-valued regression, we give a natural goodness definition that, when used
in conjunction with a recent result in sparse vector recovery, guarantees a
sparse predictor with bounded generalization error. Finally, we report results
of our learning algorithms on regression and ordinal regression tasks using
non-PSD similarity functions and demonstrate the effectiveness of our
algorithms, especially that of the sparse landmark selection algorithm that
achieves significantly higher accuracies than the baseline methods while
offering reduced computational costs.
Comment: To appear in the proceedings of NIPS 2012, 30 pages
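The landmarking recipe in the abstract above can be sketched briefly: embed each point as its vector of similarities to a few landmark points, then fit an ordinary linear predictor in that embedding. A minimal sketch under assumed names (`rbf_sim`, `make_embedding`) and, for simplicity, a PSD Gaussian similarity rather than an indefinite one:

```python
import numpy as np

def rbf_sim(x, z, gamma=1.0):
    """A similarity function; the paper allows indefinite (non-PSD) ones too."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def make_embedding(X, landmarks, sim=rbf_sim):
    """Map each row of X to its vector of similarities to the landmarks."""
    return np.array([[sim(x, z) for z in landmarks] for x in X])

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1]          # a toy real-valued regression target

# Random landmark selection (the paper's sparse selection is more refined)
landmarks = X[rng.choice(len(X), size=20, replace=False)]
Phi = make_embedding(X, landmarks)

# Least-squares linear predictor in the landmark embedding
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
mse = np.mean((Phi @ w - y) ** 2)
```

The sketch only illustrates the embedding-plus-linear-predictor step; the paper's contribution is the goodness criterion and guarantees that make this work with arbitrary "good" similarity functions.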
Similarity-based Learning via Data Driven Embeddings
We consider the problem of classification using similarity/distance functions
over data. Specifically, we propose a framework for defining the goodness of a
(dis)similarity function with respect to a given learning task and propose
algorithms that have guaranteed generalization properties when working with
such good functions. Our framework unifies and generalizes the frameworks
proposed by [Balcan-Blum ICML 2006] and [Wang et al ICML 2007]. An attractive
feature of our framework is its adaptability to data - we do not promote a
fixed notion of goodness but rather let the data dictate it. We show, by giving
theoretical guarantees, that the goodness criterion best suited to a problem can
itself be learned, which makes our approach applicable to a variety of domains
and problems. We propose a landmarking-based approach to obtaining a classifier
from such learned goodness criteria. We then provide a novel diversity based
heuristic to perform task-driven selection of landmark points instead of random
selection. We demonstrate the effectiveness of our goodness criteria learning
method as well as the landmark selection heuristic on a variety of
similarity-based learning datasets and benchmark UCI datasets on which our
method consistently outperforms existing approaches by a significant margin.
Comment: To appear in the proceedings of NIPS 2011, 14 pages
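The diversity-based landmark selection mentioned above can be illustrated with a simple greedy farthest-point rule (an assumption for illustration; the paper's task-driven heuristic differs in detail): each new landmark is the point farthest, under the given dissimilarity, from those already chosen.

```python
import numpy as np

def diverse_landmarks(X, k, dist=None):
    """Greedy farthest-point selection: starting from an arbitrary seed
    (index 0 here), repeatedly add the point whose minimum dissimilarity
    to the current landmarks is largest."""
    if dist is None:
        dist = lambda a, b: np.linalg.norm(a - b)
    idx = [0]
    d_min = np.array([dist(x, X[0]) for x in X])
    for _ in range(k - 1):
        j = int(np.argmax(d_min))              # farthest from chosen landmarks
        idx.append(j)
        d_min = np.minimum(d_min, [dist(x, X[j]) for x in X])
    return idx
```

Compared with random selection, this spreads the landmarks over the data, which is the intuition behind preferring diverse landmark points.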
Good edit similarity learning by loss minimization
Similarity functions are a fundamental component of many learning algorithms. When dealing with string or tree-structured data, edit-distance-based measures are widely used, and there exist a few methods for learning them from data. However, these methods offer no theoretical guarantee as to the generalization ability and discriminative power of the learned similarities. In this paper, we propose a loss-minimization-based edit similarity learning approach, called GESL. It is driven by the notion of (ε, γ, τ)-goodness, a theory that bridges the gap between the properties of a similarity function and its performance in classification. We show that our learning framework is a suitable way to deal not only with strings but also with tree-structured data. Using the notion of uniform stability, we derive generalization guarantees for a large class of loss functions. We also provide experimental results on two real-world datasets which show that edit similarities learned with GESL induce more accurate and sparser classifiers than other (standard or learned) edit similarities.
Symmetry Induction in Computational Intelligence
Symmetry has been a very useful tool to researchers in various scientific fields. At its most basic,
symmetry refers to the invariance of an object to some transformation, or set of transformations.
Usually one searches for, and exploits, an existing symmetry within given data,
structure or concept to improve algorithm performance or compress the search space.
This thesis examines the effects of imposing or inducing symmetry on a search space. That is,
it asks whether only pre-existing symmetries can be useful, or whether shifting to an
intuition-based definition of symmetry over the evaluation function can also be of use.
Within the context of optimization, symmetry induction as defined in this thesis will have
the effect of equating the evaluations of a given set of objects.
Group theory is employed to explore possible symmetrical structures inherent in a search space.
Additionally, conditions when the search space can have a symmetry induced on it are examined. The
idea of a neighborhood structure then leads to opposition-based computing, which aims
to induce a symmetry of the evaluation function. In this context, the search space can be seen as
having a symmetry imposed on it. It is shown that, to be useful, an opposite map must be
defined so that it equates elements of the search space that have a relatively large
difference in their respective evaluations. Using this idea, a general framework for
employing opposition-based concepts is proposed. To show the efficacy of these ideas, the
framework is applied to popular computational
intelligence algorithms within the areas of Monte Carlo optimization, estimation of distribution and
neural network learning.
The first example application focuses on simulated annealing, a popular Monte Carlo optimization
algorithm. At a given iteration, a temporary symmetry is induced over the neighborhood
region by considering opposite neighbors.
This simple algorithm is benchmarked using common real optimization problems and compared against
traditional simulated annealing as well as a randomized version. The results highlight improvements
in accuracy, reliability and convergence rate. An application to image thresholding further
confirms the results.
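As a rough illustration of the opposite-neighbor idea (a sketch under assumed details, not the thesis's exact algorithm), a toy annealer can evaluate each proposed neighbor together with its type-I opposite over the search interval and keep the better of the two as the candidate move:

```python
import math
import random

def opposite(x, lo, hi):
    """Type-I opposite of x over the interval [lo, hi]."""
    return lo + hi - x

def anneal(f, lo, hi, steps=500, step=0.5, seed=0):
    """Toy simulated annealing where each proposed neighbor is compared
    against its opposite, inducing a temporary symmetry over the
    neighborhood (illustrative sketch only)."""
    rng = random.Random(seed)
    x = rng.uniform(lo, hi)
    best = x
    for t in range(1, steps + 1):
        T = 1.0 / t                                    # simple cooling schedule
        n = min(hi, max(lo, x + rng.uniform(-step, step)))
        cand = min(n, opposite(n, lo, hi), key=f)      # neighbor vs. its opposite
        if f(cand) < f(x) or rng.random() < math.exp(-(f(cand) - f(x)) / T):
            x = cand
        if f(x) < f(best):
            best = x
    return best
```

The only change from plain simulated annealing is the `opposite(...)` comparison; the thesis's benchmarking suggests this can improve accuracy and convergence rate.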
Another example application, population-based incremental learning, is rooted in estimation of
distribution algorithms. A major problem with these techniques is a rapid loss of diversity within
the samples after a relatively low number of iterations. The opposite sample is introduced as a
remedy to this problem. After proving that it increases diversity, a new probability update procedure is
designed. This opposition-based version of the algorithm is benchmarked using common binary
optimization problems which exhibit the deceptiveness and attractive basins
characteristic of difficult real-world problems. Experiments reveal improvements in diversity,
accuracy, reliability and convergence rate over the traditional approach. Ten instances of the
traveling salesman problem and six image thresholding problems are used to further highlight the
improvements.
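The opposite-sample idea can be sketched for a PBIL-style update (illustrative only; the thesis's probability-update procedure differs): draw a sample from the probability vector, form its bitwise complement, and update toward whichever of the two is fitter.

```python
import random

def pbil_step(p, fitness, lr=0.1, rng=random):
    """One PBIL-style iteration with an opposite sample: the complement of
    the drawn sample is also evaluated, and the fitter of the two drives
    the probability-vector update (a sketch, not the thesis's algorithm)."""
    s = [1 if rng.random() < pi else 0 for pi in p]
    s_opp = [1 - b for b in s]                     # the opposite sample
    best = max((s, s_opp), key=fitness)
    return [pi + lr * (bi - pi) for pi, bi in zip(p, best)]
```

Evaluating the complement for free roughly doubles the spread of candidate solutions per iteration, which is the diversity argument made above.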
Finally, gradient-based learning for feedforward neural networks is improved using opposition-based
ideas. The opposite transfer function is presented as a simple adaptive neuron which easily allows
for efficiently jumping in weight space. It is shown that each possible opposite network represents
a unique input-output mapping, each having an associated effect on the numerical conditioning of
the network. Experiments confirm the potential of opposite networks during pre- and early training
stages. A heuristic for efficiently selecting one opposite network per epoch is presented.
Benchmarking focuses on common classification problems and reveals improvements in accuracy,
reliability, convergence rate and generalization ability over common backpropagation variants. To
further show the potential, the heuristic is applied to resilient propagation, where similar
improvements are also found.
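The opposite-network idea can be illustrated with a tiny MLP in which a per-unit sign flip applies the opposite transfer function tanh(-z); flipping any unit yields a distinct input-output mapping. Names and shapes here are assumptions for illustration, not the thesis's formulation:

```python
import numpy as np

def forward(X, W1, b1, W2, b2, flips):
    """Forward pass of a one-hidden-layer network; flips[j] = -1 applies
    the opposite transfer function tanh(-z) to hidden unit j."""
    H = np.tanh((X @ W1 + b1) * flips)
    return H @ W2 + b2

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=4)
W2, b2 = rng.normal(size=(4, 1)), rng.normal(size=1)

base = forward(X, W1, b1, W2, b2, np.ones(4))                    # original net
flipped = forward(X, W1, b1, W2, b2, np.array([-1.0, 1, 1, 1]))  # one opposite unit
```

Because the flip costs nothing to evaluate, a training heuristic can cheaply probe such "jumps" in weight space, which is the mechanism described above.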
On learning with dissimilarity functions
We study the problem of learning a classification task in which only a dissimilarity function over the objects is accessible. That is, data are not represented by feature vectors but in terms of their pairwise dissimilarities. We investigate sufficient conditions for dissimilarity functions to allow building accurate classifiers. Our results have the advantage that they apply to unbounded dissimilarities and are invariant to order-preserving transformations. The theory immediately suggests a learning paradigm: construct an ensemble of decision stumps, each depending on a pair of examples, and then find a convex combination of them that achieves a large margin. We next develop a practical algorithm, called Dissimilarity-based Boosting (DBoost), for learning with dissimilarity functions under this theoretical guidance. Experimental results demonstrate that DBoost compares favorably with several existing approaches on a variety of databases and under different conditions.
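The pairwise decision stumps described above can be sketched as follows (an illustration of the ensemble members only, not the full DBoost boosting procedure): each stump labels a point by which of its two defining examples it is less dissimilar to.

```python
def pair_stump(a, b, dissim):
    """A decision stump defined by a pair of training examples: label x +1
    if it is less dissimilar to `a` than to `b`, else -1."""
    return lambda x: 1 if dissim(x, a) < dissim(x, b) else -1

# Toy 1-D demo with absolute difference as the dissimilarity
d = lambda x, y: abs(x - y)
h = pair_stump(0.0, 5.0, d)
votes = [h(x) for x in (0.5, 1.0, 4.5, 6.0)]
```

Boosting would then reweight examples and pick stump pairs to maximize the margin of the convex combination; note the stump only requires `dissim` to be order-consistent, matching the invariance claim in the abstract.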