10 research outputs found

    Temporal Patterns in Fine Particulate Matter Time Series in Beijing: A Calendar View

    Supervised Learning with Similarity Functions

    We address the problem of general supervised learning when data can only be accessed through an (indefinite) similarity function between data points. Existing work on learning with indefinite kernels has concentrated solely on binary/multi-class classification problems. We propose a model that is generic enough to handle any supervised learning task and also subsumes the model previously proposed for classification. We give a "goodness" criterion for similarity functions w.r.t. a given supervised learning task and then adapt a well-known landmarking technique to provide efficient algorithms for supervised learning using "good" similarity functions. We demonstrate the effectiveness of our model on three important supervised learning problems: a) real-valued regression, b) ordinal regression and c) ranking, where we show that our method guarantees bounded generalization error. Furthermore, for the case of real-valued regression, we give a natural goodness definition that, when used in conjunction with a recent result in sparse vector recovery, guarantees a sparse predictor with bounded generalization error. Finally, we report results of our learning algorithms on regression and ordinal regression tasks using non-PSD similarity functions and demonstrate the effectiveness of our algorithms, especially that of the sparse landmark selection algorithm, which achieves significantly higher accuracies than the baseline methods while offering reduced computational costs. Comment: To appear in the proceedings of NIPS 2012, 30 pages.
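    The landmarking idea mentioned in the abstract above can be sketched roughly as follows: pick a handful of landmark points, represent every data point by its similarities to them, and hand the resulting vectors to any ordinary learner. This is only an illustration under stated assumptions; the function names, the tanh-based similarity, the 1/sqrt(d) scaling and the random landmark choice are hypothetical and are not taken from the paper.

```python
import math
import random


def landmark_embedding(x, landmarks, similarity):
    """Map x to its vector of similarities to each landmark.

    The 1/sqrt(d) scaling is an assumed normalisation, used here only
    to keep the embedded vectors bounded in norm.
    """
    d = len(landmarks)
    return [similarity(x, l) / math.sqrt(d) for l in landmarks]


def tanh_similarity(a, b):
    # Sigmoid-style similarity; it is not positive semi-definite in
    # general, so it stands in for an "indefinite" similarity function.
    return math.tanh(sum(ai * bi for ai, bi in zip(a, b)) - 1.0)


random.seed(0)
# Toy data: 20 points in 3 dimensions (purely illustrative).
data = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
# Landmarks chosen uniformly at random from the data (an assumption).
landmarks = random.sample(data, 5)
# Each point becomes a 5-dimensional feature vector usable by any
# standard regressor or classifier.
features = [landmark_embedding(x, landmarks, tanh_similarity) for x in data]
```

    Because the embedding only ever calls the similarity function, nothing here requires the similarity to be a valid kernel.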

    Similarity-based Learning via Data Driven Embeddings

    We consider the problem of classification using similarity/distance functions over data. Specifically, we propose a framework for defining the goodness of a (dis)similarity function with respect to a given learning task and propose algorithms that have guaranteed generalization properties when working with such good functions. Our framework unifies and generalizes the frameworks proposed by [Balcan-Blum ICML 2006] and [Wang et al. ICML 2007]. An attractive feature of our framework is its adaptability to data: we do not promote a fixed notion of goodness but rather let data dictate it. We show, by giving theoretical guarantees, that the goodness criterion best suited to a problem can itself be learned, which makes our approach applicable to a variety of domains and problems. We propose a landmarking-based approach to obtaining a classifier from such learned goodness criteria. We then provide a novel diversity-based heuristic to perform task-driven selection of landmark points instead of random selection. We demonstrate the effectiveness of our goodness criteria learning method as well as the landmark selection heuristic on a variety of similarity-based learning datasets and benchmark UCI datasets, on which our method consistently outperforms existing approaches by a significant margin. Comment: To appear in the proceedings of NIPS 2011, 14 pages.

    Good edit similarity learning by loss minimization

    Similarity functions are a fundamental component of many learning algorithms. When dealing with string or tree-structured data, edit distance-based measures are widely used, and there exist a few methods for learning them from data. However, these methods offer no theoretical guarantee as to the generalization ability and discriminative power of the learned similarities. In this paper, we propose a loss minimization-based edit similarity learning approach, called GESL. It is driven by the notion of (ε, γ, τ)-goodness, a theory that bridges the gap between the properties of a similarity function and its performance in classification. We show that our learning framework is a suitable way to deal not only with strings but also with tree-structured data. Using the notion of uniform stability, we derive generalization guarantees for a large class of loss functions. We also provide experimental results on two real-world datasets which show that edit similarities learned with GESL induce more accurate and sparser classifiers than other (standard or learned) edit similarities.

    Symmetry Induction in Computational Intelligence

    Symmetry has been a very useful tool to researchers in various scientific fields. At its most basic, symmetry refers to the invariance of an object to some transformation, or set of transformations. Usually one searches for, and uses information concerning an existing symmetry within given data, structure or concept to somehow improve algorithm performance or compress the search space. This thesis examines the effects of imposing or inducing symmetry on a search space. That is, the question being asked is whether only existing symmetries can be useful, or whether changing reference to an intuition-based definition of symmetry over the evaluation function can also be of use. Within the context of optimization, symmetry induction as defined in this thesis will have the effect of equating the evaluation of a set of given objects. Group theory is employed to explore possible symmetrical structures inherent in a search space. Additionally, conditions when the search space can have a symmetry induced on it are examined. The idea of a neighborhood structure then leads to the idea of opposition-based computing which aims to induce a symmetry of the evaluation function. In this context, the search space can be seen as having a symmetry imposed on it. To be useful, it is shown that an opposite map must be defined such that it equates elements of the search space which have a relatively large difference in their respective evaluations. Using this idea a general framework for employing opposition-based ideas is proposed. To show the efficacy of these ideas, the framework is applied to popular computational intelligence algorithms within the areas of Monte Carlo optimization, estimation of distribution and neural network learning. The first example application focuses on simulated annealing, a popular Monte Carlo optimization algorithm. At a given iteration, symmetry is induced on the system by considering opposite neighbors. 
Using this technique, a temporary symmetry over the neighborhood region is induced. This simple algorithm is benchmarked using common real optimization problems and compared against traditional simulated annealing as well as a randomized version. The results highlight improvements in accuracy, reliability and convergence rate. An application to image thresholding further confirms the results. Another example application, population-based incremental learning, is rooted in estimation of distribution algorithms. A major problem with these techniques is a rapid loss of diversity within the samples after a relatively low number of iterations. The opposite sample is introduced as a remedy to this problem. After proving an increased diversity, a new probability update procedure is designed. This opposition-based version of the algorithm is benchmarked using common binary optimization problems which have characteristics of deceptivity and attractive basins characteristic of difficult real world problems. Experiments reveal improvements in diversity, accuracy, reliability and convergence rate over the traditional approach. Ten instances of the traveling salesman problem and six image thresholding problems are used to further highlight the improvements. Finally, gradient-based learning for feedforward neural networks is improved using opposition-based ideas. The opposite transfer function is presented as a simple adaptive neuron which easily allows for efficiently jumping in weight space. It is shown that each possible opposite network represents a unique input-output mapping, each having an associated effect on the numerical conditioning of the network. Experiments confirm the potential of opposite networks during pre- and early training stages. A heuristic for efficiently selecting one opposite network per epoch is presented. 
Benchmarking focuses on common classification problems and reveals improvements in accuracy, reliability, convergence rate and generalization ability over common backpropagation variants. To further show the potential, the heuristic is applied to resilient propagation, where similar improvements are also found.

    On learning with dissimilarity functions

    We study the problem of learning a classification task in which only a dissimilarity function of the objects is accessible. That is, data are not represented by feature vectors but in terms of their pairwise dissimilarities. We investigate the sufficient conditions for dissimilarity functions to allow building accurate classifiers. Our results have the advantages that they apply to unbounded dissimilarities and are invariant to order-preserving transformations. The theory immediately suggests a learning paradigm: construct an ensemble of decision stumps, each depending on a pair of examples, then find a convex combination of them to achieve a large margin. We next develop a practical algorithm called Dissimilarity based Boosting (DBoost) for learning with dissimilarity functions under the theoretical guidance. Experimental results demonstrate that DBoost compares favorably with several existing approaches on a variety of databases and under different conditions.
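    The learning paradigm described in the abstract above (decision stumps that each depend on a pair of examples, combined convexly) might look like the following minimal sketch. The stump form used here, "is x closer to a than to b?", is an assumption chosen because it depends only on the ordering of dissimilarities; the abstract does not spell out the exact stump definition, and the weights below are fixed by hand rather than learned by boosting. All names are hypothetical.

```python
def pair_stump(a, b, dissimilarity):
    """Stump tied to the example pair (a, b).

    Returns +1 if x is strictly less dissimilar to a than to b, else -1.
    Only the comparison matters, so any order-preserving transformation
    of the dissimilarity leaves the output unchanged.
    """
    def h(x):
        return 1 if dissimilarity(x, a) < dissimilarity(x, b) else -1
    return h


def convex_vote(stumps, weights, x):
    """Classify x by a convex combination of stump outputs."""
    # Convex combination: non-negative weights that sum to one.
    assert all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) < 1e-9
    score = sum(w * h(x) for w, h in zip(weights, stumps))
    return 1 if score >= 0 else -1


def abs_diff(u, v):
    # Toy dissimilarity on real numbers (illustrative only).
    return abs(u - v)


# Each pair holds one positive-looking and one negative-looking example.
pairs = [(1.0, -1.0), (2.0, -3.0), (0.5, -0.5)]
stumps = [pair_stump(a, b, abs_diff) for a, b in pairs]
# Hand-picked weights; a boosting procedure would learn these instead.
weights = [0.5, 0.25, 0.25]
```

    Replacing `abs_diff` with a monotone transformation of itself, for instance its cube, yields the same predictions, which mirrors the invariance property the abstract highlights.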